Imply Videos

Dec 9, 2022

Introducing KUDRAS – Kubernetes Druid Autoscaler for Maximum Resource Utilization and Speed

In this session I would like to talk about the huge amount of data we ingested into Druid (9 TB of raw data per day) using EMR, all orchestrated by Airflow. As the data grew, we started running into many problems. After trying many scaling options, we decided to change our approach and came up with KUDRAS, the Kubernetes Druid Autoscaler.

This project is written in Python and runs in our Apache Druid production environment. KUDRAS is a service built with FastAPI that scales MiddleManager nodes up and down, keeping ingestion task costs to a minimum while maximizing ingestion speed.
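
To make the scaling idea concrete, here is a minimal sketch of the kind of decision such an autoscaler has to make: given the number of pending and running ingestion tasks, how many MiddleManager nodes are needed? This is not the KUDRAS algorithm, which the abstract does not describe. The `ScalingPolicy` class, the `desired_middlemanagers` function, and all default values are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical tuning knobs; none of these are real KUDRAS settings."""
    slots_per_node: int = 4   # assumed task slots offered by one MiddleManager
    min_nodes: int = 0        # allow scaling to zero when nothing is queued
    max_nodes: int = 50       # assumed upper bound to cap cost


def desired_middlemanagers(pending_tasks: int,
                           running_tasks: int,
                           policy: ScalingPolicy = ScalingPolicy()) -> int:
    """Return the node count needed to give every task a slot.

    The result is clamped to the policy's [min_nodes, max_nodes] range.
    """
    if pending_tasks < 0 or running_tasks < 0:
        raise ValueError("task counts must be non-negative")
    if policy.slots_per_node <= 0:
        raise ValueError("slots_per_node must be positive")

    total_tasks = pending_tasks + running_tasks
    # Round up: a partially used node is still a whole node.
    needed = math.ceil(total_tasks / policy.slots_per_node)
    return max(policy.min_nodes, min(policy.max_nodes, needed))


if __name__ == "__main__":
    # 10 pending + 3 running tasks at 4 slots per node need 4 nodes.
    print(desired_middlemanagers(pending_tasks=10, running_tasks=3))
```

In a real service, a function like this would sit behind an HTTP endpoint and its result would be applied to the cluster; both of those steps are left out here because the abstract gives no detail about them.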