Generating the Parquet file format in Java: writing Parquet to HDFS with the Java API, without using Avro or MR

What is a simple way to write the Parquet format to HDFS (using the Java API) by directly creating the Parquet schema of a POJO, without using Avro or MR?

The samples I found were outdated, use deprecated methods, and all rely on Avro, Spark, or MR.

Solution

Effectively, there are not many samples available for reading and writing Apache Parquet files without the help of an external framework.

You then just need to use the same functionality with an HDFS file. You can follow this SO question for that: Accessing files in HDFS using Java

UPDATE: regarding the deprecated parts of the API: AvroWriteSupport should be replaced by AvroParquetWriter. I also checked ParquetWriter: it is not deprecated and can be used safely.
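For the "Parquet schema of a POJO" part of the question, here is a minimal sketch that needs only the JDK. It derives a Parquet message-type string from a class by reflection. The `User` class, the `PojoSchemaSketch` name, and the type mapping (primitives as `required`, boxed types and `String` as `optional`) are my own assumptions for illustration, not something taken from the Parquet library. As far as I know, the resulting text is in the form that `MessageTypeParser.parseMessageType` from parquet-column accepts, and a writer such as `ExampleParquetWriter` from parquet-hadoop could then write `Group` records to an `hdfs://` path without Avro or MR. Those calls are left out here because they need the Parquet and Hadoop dependencies on the classpath.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class PojoSchemaSketch {

    // Hypothetical POJO, used only to illustrate the mapping.
    static class User {
        long id;
        String name;
        int age;
        Double score;
    }

    // Maps a Java field type to a Parquet primitive type name.
    // Only a handful of types are covered; anything else is rejected.
    static String parquetType(Class<?> t) {
        if (t == int.class || t == Integer.class) return "int32";
        if (t == long.class || t == Long.class) return "int64";
        if (t == float.class || t == Float.class) return "float";
        if (t == double.class || t == Double.class) return "double";
        if (t == boolean.class || t == Boolean.class) return "boolean";
        if (t == String.class) return "binary";
        throw new IllegalArgumentException("Unsupported field type: " + t.getName());
    }

    // Builds the textual Parquet schema ("message Name { ... }") for a flat POJO.
    // Assumption: primitives can never be null, so they become "required";
    // reference types become "optional".
    static String toParquetSchema(Class<?> pojo) {
        StringBuilder sb = new StringBuilder("message ")
                .append(pojo.getSimpleName())
                .append(" {\n");
        // Field order from reflection is usually declaration order,
        // but the JDK does not guarantee it.
        for (Field f : pojo.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // skip constants and compiler-generated fields
            }
            Class<?> type = f.getType();
            String repetition = type.isPrimitive() ? "required" : "optional";
            // Strings are stored as binary with a UTF8 annotation.
            String annotation = (type == String.class) ? " (UTF8)" : "";
            sb.append("  ")
              .append(repetition).append(' ')
              .append(parquetType(type)).append(' ')
              .append(f.getName())
              .append(annotation)
              .append(";\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        System.out.println(toParquetSchema(User.class));
    }
}
```

Nested objects, collections, dates and decimals are not handled by this sketch; they would need Parquet groups and logical-type annotations, which is where a framework-provided schema converter usually earns its keep.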

Regards,

Loïc