Dec 06, 2023

Moving ingestion from 3 hours to 5 minutes – Challenges and Mitigations

This is a real-world account from a production Druid cluster: a story of 48 hours of debugging, learning, and understanding Druid better, filing a couple of issues on the Druid GitHub, and finally, thanks to the Druid community, a stable production pipeline again. We will discuss the bottlenecks we hit in the Overlord, slot issues for Peons in MiddleManagers, Coordinator bottlenecks, how we mitigated task and segment flooding, and which configs we changed, sprinkled with real-world numbers and snapshots from our Grafana dashboards. Finally, we will list all the latest awesomeness in Druid 25.0 that helped us in this endeavour, how we discovered it at midnight, and our learnings.

