Analyze stream data at scale
Druid was designed from the outset for rapid ingestion and immediate querying of stream data. No connectors are needed: Druid includes built-in, exactly-once ingestion for data streams using the Apache Kafka® and Amazon Kinesis APIs.
Streaming analytics with Druid
Apache Druid is purpose-built for stream ingestion. It ingests event by event, not a series of batched data files sent sequentially to mimic a stream. This means Druid supports query-on-arrival: true real-time analytics, with no waiting for data to be batched and then delivered.
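To illustrate query-on-arrival, a standard Druid SQL request (submitted to the /druid/v2/sql API) can count events that arrived in the last minute, seconds after they land. This is a minimal sketch; the datasource name "clickstream" is hypothetical.

```json
{
  "query": "SELECT COUNT(*) AS events FROM \"clickstream\" WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' MINUTE"
}
```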
- Massive scalability: Druid handles data streams of up to millions of events per second with ease, ideal for highly dynamic data.
- Exactly-once semantics: Druid guarantees data consistency - preventing duplicates and data loss - through its native indexing service.
- Continuous backup: Druid ensures no loss of streaming data by automatically persisting data segments to deep storage.
Apache Kafka
Kafka is natively integrated with Druid, so there is no need for a connector. Data is loaded into Druid from a Kafka stream using Druid's Kafka indexing service; the connection to Kafka topics is part of Druid. Define an ingestion spec with "type": "kafka" that specifies the topic and parameters you want.
Whenever an event is added to the topic, it immediately becomes available for your queries in Druid. After an interval, the events are indexed into a segment and committed to both data nodes and durable deep storage. The Kafka offsets are committed atomically along with the segment metadata, so every event is written to Druid once and only once.
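As a rough sketch, a Kafka supervisor spec might look like the following. The datasource name "clickstream", the topic "clickstream-events", the broker address, and the column layout are illustrative, and some field names can vary between Druid versions.

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "clickstream",
      "timestampSpec": { "column": "timestamp", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["user", "page", "country"] },
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": "NONE"
      }
    },
    "ioConfig": {
      "topic": "clickstream-events",
      "inputFormat": { "type": "json" },
      "consumerProperties": { "bootstrap.servers": "kafka-broker-1:9092" },
      "taskCount": 1,
      "useEarliestOffset": true
    },
    "tuningConfig": { "type": "kafka" }
  }
}
```

A spec like this is typically submitted to the Overlord's supervisor endpoint (POST /druid/indexer/v1/supervisor), after which Druid manages the ingestion tasks continuously.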
Apache Kafka APIs
Any streaming service that supports the Kafka APIs can be used for Druid stream ingestion in the same way as open source Kafka.
This includes:
- Confluent Enterprise
- Confluent Cloud
- Amazon Managed Streaming for Apache Kafka (MSK)
- Azure Event Hubs
- Redpanda
- Aiven for Apache Kafka
- Alibaba Cloud
- Instaclustr
Amazon Kinesis
A native connection to Kinesis data streams is part of Druid. Define an ingestion spec with "type": "kinesis" that specifies the stream and parameters you want.
Whenever an event is added to the stream, it immediately becomes available for your queries in Druid. After an interval, the events are indexed into a segment and committed to both data nodes and durable deep storage. The Kinesis sequence numbers are committed atomically along with the segment metadata, so every event is written to Druid once and only once.
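A Kinesis supervisor spec follows the same shape as the Kafka spec, with the stream and endpoint in place of the topic and brokers. This is a minimal sketch; the datasource name, stream name, region endpoint, and column layout below are illustrative.

```json
{
  "type": "kinesis",
  "spec": {
    "dataSchema": {
      "dataSource": "clickstream",
      "timestampSpec": { "column": "timestamp", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["user", "page", "country"] },
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": "NONE"
      }
    },
    "ioConfig": {
      "stream": "clickstream-events",
      "inputFormat": { "type": "json" },
      "endpoint": "kinesis.us-east-1.amazonaws.com",
      "taskCount": 1,
      "useEarliestSequenceNumber": true
    },
    "tuningConfig": { "type": "kinesis" }
  }
}
```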
“Druid has native Kafka integration out of the box…we don't need anything to make Apache Kafka and Apache Druid work together. It just works.”