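Apache Druid can batch-ingest data from the local file system or from HDFS. As of the latest release at the time of writing (0.18), it can also parse data in ORC and Parquet format directly, but using this feature requires a small amount of configuration first.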
What the official documentation says
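Apache Druid ships with all of the core extensions (see the appendix at the end of this post). To load one, add its name to `druid.extensions.loadList` in `common.runtime.properties`. For example, to load the `postgresql-metadata-storage` and `druid-hdfs-storage` extensions, use this configuration: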
druid.extensions.loadList=["postgresql-metadata-storage", "druid-hdfs-storage"]
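Before restarting any Druid process, it can be handy to check what a properties file actually loads. The following is a minimal Python sketch, not part of Druid itself: it assumes simple `key=value` lines (no line continuations), and the helper names `parse_load_list` and `missing_extensions` are hypothetical. The single dependency entry is taken from the appendix table, which says `druid-parquet-extensions` requires `druid-avro-extensions`.

```python
import json

# Hypothetical dependency map; its only entry comes from the appendix table,
# which states that druid-parquet-extensions requires druid-avro-extensions.
REQUIRES = {"druid-parquet-extensions": ["druid-avro-extensions"]}


def parse_load_list(properties_text):
    """Return the names in druid.extensions.loadList, or [] if the key is absent."""
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "druid.extensions.loadList":
            # The value is written as a JSON array of strings.
            return json.loads(value)
    return []


def missing_extensions(load_list, wanted):
    """List the wanted extensions (plus known dependencies) that are not loaded."""
    needed = []
    for name in wanted:
        for ext in [name] + REQUIRES.get(name, []):
            if ext not in needed:
                needed.append(ext)
    return [ext for ext in needed if ext not in load_list]


sample = 'druid.extensions.loadList=["postgresql-metadata-storage", "druid-hdfs-storage"]'
loaded = parse_load_list(sample)
print(missing_extensions(loaded, ["druid-orc-extensions", "druid-parquet-extensions"]))
```

Run against the example line above, the script reports that neither the ORC nor the Parquet extension (nor the Avro dependency) is loaded yet.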
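So when we need Druid to parse data in ORC and Parquet format, the configuration looks like this:
druid.extensions.loadList=["druid-hdfs-storage", "druid-kafka-indexing-service", "druid-datasketches", "druid-orc-extensions", "druid-parquet-extensions"]
Note that the appendix lists `druid-avro-extensions` as a requirement of `druid-parquet-extensions`, so you may need to add it to this list as well.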
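Once the extensions are loaded, an ingestion task can name the format directly. The sketch below builds what a native batch (`index_parallel`) spec reading ORC files from HDFS might look like, assuming the `inputSource`/`inputFormat` layout used by Druid 0.18. The datasource name, HDFS path, and column names are placeholders invented for illustration, not values from this post.

```python
import json

# Hypothetical values: the datasource name, HDFS path, and column names
# below are placeholders and must be replaced with your own.
spec = {
    "type": "index_parallel",
    "spec": {
        "dataSchema": {
            "dataSource": "example_orc_datasource",
            "timestampSpec": {"column": "timestamp", "format": "auto"},
            "dimensionsSpec": {"dimensions": ["dim1", "dim2"]},
            "granularitySpec": {
                "type": "uniform",
                "segmentGranularity": "day",
                "queryGranularity": "none",
            },
        },
        "ioConfig": {
            "type": "index_parallel",
            # The "hdfs" input source is provided by druid-hdfs-storage.
            "inputSource": {
                "type": "hdfs",
                "paths": "hdfs://namenode:8020/path/to/orc/",
            },
            # Provided by druid-orc-extensions; for Parquet files use
            # {"type": "parquet"} (druid-parquet-extensions) instead.
            "inputFormat": {"type": "orc"},
        },
    },
}

# Print the spec as JSON so it can be saved and submitted as a task.
print(json.dumps(spec, indent=2))
```

The printed JSON is only a starting point; check it against the ingestion documentation for your Druid version before submitting it.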
Appendix
Name | Description | Docs |
---|---|---|
druid-avro-extensions | Support for data in Apache Avro data format. | link |
druid-azure-extensions | Microsoft Azure deep storage. | |
druid-basic-security | Support for Basic HTTP authentication and role-based access control. | |
druid-bloom-filter | Support for providing Bloom filters in druid queries. | |
druid-datasketches | Support for approximate counts and set operations with Apache DataSketches. | |
druid-google-extensions | Google Cloud Storage deep storage. | |
druid-hdfs-storage | HDFS deep storage. | |
druid-histogram | Approximate histograms and quantiles aggregator. Deprecated, please use the DataSketches quantiles aggregator from the druid-datasketches extension instead. | |
druid-kafka-extraction-namespace | Apache Kafka-based namespaced lookup. Requires namespace lookup extension. | |
druid-kafka-indexing-service | Supervised exactly-once Apache Kafka ingestion for the indexing service. | |
druid-kinesis-indexing-service | Supervised exactly-once Kinesis ingestion for the indexing service. | |
druid-kerberos | Kerberos authentication for druid processes. | |
druid-lookups-cached-global | A module for lookups providing a jvm-global eager caching for lookups. It provides JDBC and URI implementations for fetching lookup data. | |
druid-lookups-cached-single | Per-lookup caching module to support use cases where a lookup needs to be isolated from the global pool of lookups. | |
druid-orc-extensions | Support for data in Apache ORC data format. | |
druid-parquet-extensions | Support for data in Apache Parquet data format. Requires druid-avro-extensions to be loaded. | |
druid-protobuf-extensions | Support for data in Protobuf data format. | |
druid-ranger-security | Support for access control through Apache Ranger. | |
druid-s3-extensions | Interfacing with data in AWS S3, and using S3 as deep storage. | |
druid-ec2-extensions | Interfacing with AWS EC2 for autoscaling middle managers | UNDOCUMENTED |
druid-stats | Statistics related module including variance and standard deviation. | |
mysql-metadata-storage | MySQL metadata store. | |
postgresql-metadata-storage | PostgreSQL metadata store. | |
simple-client-sslcontext | Simple SSLContext provider module to be used by Druid's internal HttpClient when talking to other Druid processes over HTTPS. | |
druid-pac4j | OpenID Connect authentication for druid processes. | |
aliyun-oss-extensions | Aliyun OSS deep storage | |
ambari-metrics-emitter | Ambari Metrics Emitter | |
druid-cassandra-storage | Apache Cassandra deep storage. | |
druid-cloudfiles-extensions | Rackspace Cloudfiles deep storage and firehose. | |
druid-distinctcount | DistinctCount aggregator | |
druid-redis-cache | A cache implementation for Druid based on Redis. | |
druid-time-min-max | Min/Max aggregator for timestamp. | |
sqlserver-metadata-storage | Microsoft SQL Server metadata store. | |
graphite-emitter | Graphite metrics emitter | |
statsd-emitter | StatsD metrics emitter | |
kafka-emitter | Kafka metrics emitter | |
druid-thrift-extensions | Support thrift ingestion | |
druid-opentsdb-emitter | OpenTSDB metrics emitter | |
materialized-view-selection, materialized-view-maintenance | Materialized View | |
druid-moving-average-query | Support for Moving Average and other Aggregate Window Functions in Druid queries. | |
druid-influxdb-emitter | InfluxDB metrics emitter | |
druid-momentsketch | Support for approximate quantile queries using the momentsketch library | |
druid-tdigestsketch | Support for approximate sketch aggregators based on T-Digest | |
gce-extensions | GCE Extensions | |