Apr 26, 2023

Introducing KUDRAS – Kubernetes Druid Autoscaler for Maximum Resource Utilization and Speed

In this session I will talk about the large volume of data we ingest into Druid (9 TB of raw data per day) using EMR, all orchestrated by Airflow. As the data grew, we started running into many problems. After trying many scaling options, we decided to change our approach and came up with KUDRAS, the Kubernetes Druid Autoscaler.

This project is written in Python and runs in our Apache Druid production environment. KUDRAS is a service built with FastAPI that scales MiddleManager nodes up and down, keeping ingestion task costs to a minimum while maximizing ingestion speed.
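The abstract does not describe how KUDRAS decides on a node count, so the following is only a minimal sketch of the kind of calculation a task-driven autoscaler might make, not the project's actual logic. It assumes each MiddleManager exposes a fixed number of task slots (in Druid this is set by `druid.worker.capacity`) and that the counts of pending and running ingestion tasks are already known. The function name, parameters, and the `min_nodes`/`max_nodes` bounds are hypothetical.

```python
import math


def desired_middlemanagers(pending_tasks: int,
                           running_tasks: int,
                           slots_per_node: int,
                           min_nodes: int = 0,
                           max_nodes: int = 50) -> int:
    """Return a MiddleManager count that fits all current tasks.

    Hypothetical helper: the bounds and the one-slot-per-task model
    are assumptions made for illustration only.
    """
    if slots_per_node <= 0:
        raise ValueError("slots_per_node must be positive")
    if pending_tasks < 0 or running_tasks < 0:
        raise ValueError("task counts must be non-negative")

    # Every pending or running task needs one slot; round up to whole nodes.
    needed = math.ceil((pending_tasks + running_tasks) / slots_per_node)

    # Clamp to the allowed range so the cluster never scales past its limits.
    return max(min_nodes, min(max_nodes, needed))
```

With 7 pending tasks, 3 running tasks, and 4 slots per node, this returns 3 nodes. With no tasks at all it returns `min_nodes`, which is what lets idle ingestion capacity scale down to nothing.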

See similar videos

Jan 29, 2024

Physical Hardware, Digital Analytics: IoT Challenges, Best Practices, and Solutions

Electric vehicle maker Rivian and German startup Thing-it were kind enough to talk us through how real-time data and analytics play a key part in the evolving landscape of IoT (Internet of Things). The wealth...

Dec 11, 2023

Analyzing streaming data with Apache Druid

Streaming data is not only data in motion—it’s a potential source of valuable insights, ready to be harvested and utilized. The challenge is to analyze streaming data at scale and extract these insights—before...

Dec 06, 2023

Real-Time Analytics in the Real World

Engineering teams increasingly have to deliver insights in real-time. But as they aim to reduce latency from event-to-insight, they also face the challenge of dealing with larger and more complex data and concurrent...
