Oct 22, 2024

Optimizing Druid Configurations at Netflix through Parallel Testing and Metrics Analysis

As a data-driven company, Netflix continually seeks to improve the performance and reliability of its data infrastructure. This talk covers our approach to optimizing Apache Druid configurations through parallel runs and A/B testing. We will explore how Netflix tests different Druid setups by running them concurrently on dual systems, so that the key performance metrics reported by each cluster can be compared directly. Attendees will gain insights into the following areas:

1. Cluster Management and Deployment: An overview of Netflix’s strategies for managing and deploying Druid clusters, emphasizing automation and scalability.
2. Centralized Logging and Metrics: Techniques for aggregating and analyzing logs and metrics to facilitate real-time monitoring and post-mortem analysis.
3. Cluster Architecture Patterns: Best practices and patterns employed by Netflix to architect Druid clusters for optimal performance and reliability.
4. Parallel Testing Framework: Detailed methodologies for executing parallel runs and conducting A/B testing to evaluate different Druid configurations, including the tools and frameworks used.

This session will provide practical knowledge and actionable insights, empowering attendees to apply similar strategies within their own organizations to optimize Druid deployments. Join us to learn how Netflix leverages advanced testing and analytical techniques to push the boundaries of what is possible with Apache Druid.
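To make the parallel-run idea concrete, here is a minimal sketch of the comparison step: given query latencies collected from a control cluster and from a cluster running a candidate configuration, it summarizes each and reports the relative change. It is an illustration under stated assumptions, not the framework presented in the talk. The names `percentile`, `summarize`, and `compare_clusters`, the choice of p50/p90 as metrics, and the nearest-rank percentile method are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class LatencySummary:
    """Latency percentiles for one cluster, in milliseconds."""
    p50_ms: float
    p90_ms: float


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1..100."""
    ordered = sorted(samples)
    # Multiply before dividing so whole-number ranks stay exact.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples):
    """Reduce raw latency samples to the percentiles being compared."""
    return LatencySummary(percentile(samples, 50), percentile(samples, 90))


def compare_clusters(control_ms, candidate_ms):
    """Relative change of the candidate versus the control, per metric.

    Negative values mean the candidate configuration was faster.
    Both inputs are assumed to come from the same queries, run
    concurrently against each cluster.
    """
    control = summarize(control_ms)
    candidate = summarize(candidate_ms)
    return {
        "p50_change": (candidate.p50_ms - control.p50_ms) / control.p50_ms,
        "p90_change": (candidate.p90_ms - control.p90_ms) / control.p90_ms,
    }
```

Calling `compare_clusters` with two lists of latencies returns a dictionary of relative changes. In practice the samples would come from each cluster's emitted query metrics rather than hand-built lists, and a real comparison would also need to account for noise between runs.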

Speaker:
Ben Sykes, Software Engineer, Netflix

[Timestamp] Table of Contents:
[0:00] Introduction
[1:54] Cluster Architecture Pattern
[4:00] Cluster Management and Deployment
[15:48] Centralized Logging and Metrics
[17:15] Config Testing Framework
[22:15] Parallel Testing Framework

