Dec 06, 2023

Druid Operator: Bridging Kubernetes and Apache Druid

Apache Druid is a real-time, distributed data store designed for low-latency queries. It ingests data in real time and makes it available for querying as soon as an event occurs. The standard service level agreement (SLA) for running Druid requires continuous data ingestion, uninterrupted data availability, and constant queryability.

However, running Druid on Kubernetes poses significant challenges. Druid comprises multiple components, each with a specific role. Implementing scaling logic for each component, choosing between StatefulSets and Deployments, and managing external dependencies such as ZooKeeper and object storage all add complexity. Even brief downtime can cause significant latency, affecting everything from data ingestion to query responsiveness.

The Druid Operator addresses these challenges by encapsulating Druid-specific logic for Kubernetes deployment. It serves as a bridge between the two systems, providing insight into the current state of the Druid cluster and how to interpret that state within Kubernetes.

This talk gives Druid engineers and data operations professionals practical insight into using the Druid Kubernetes Operator. The operator enables efficient management of a distributed Druid cluster’s state, facilitates complex scaling operations, supports ordered rolling upgrades, and automates maintenance actions for seamless operations.
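
To make "ordered rolling upgrades" concrete, here is a minimal Python sketch of the general idea: upgrade one tier at a time and move on only once that tier reports ready. It is not the operator's actual implementation. The function name, the callbacks, and the tier order are all hypothetical and chosen for illustration only.

```python
from typing import Callable, List, Sequence

# Illustrative tier order only (an assumption, not taken from the talk
# or from the operator's source code).
ASSUMED_ORDER = ["historical", "middlemanager", "broker",
                 "router", "overlord", "coordinator"]


def rolling_upgrade(order: Sequence[str],
                    upgrade: Callable[[str], None],
                    is_ready: Callable[[str], bool],
                    max_checks: int = 10) -> List[str]:
    """Upgrade tiers one at a time, in the given order.

    Returns the tiers that were upgraded and became ready. The rollout
    halts at the first tier that does not report ready within
    max_checks polls, leaving later tiers untouched.
    """
    completed: List[str] = []
    for tier in order:
        upgrade(tier)
        # Poll readiness up to max_checks times; a real controller
        # would wait between polls instead of checking back-to-back.
        if not any(is_ready(tier) for _ in range(max_checks)):
            return completed
        completed.append(tier)
    return completed
```

The point of halting at the first unhealthy tier is that a failed upgrade never spreads to the tiers behind it, which is how an ordered rollout protects ingestion and query availability.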
