How Druid Provides Internet-Scale Observability at Netflix

Netflix

Overview

Netflix is a leading subscription-based streaming service that allows its users to watch TV shows and movies on internet-connected devices. Founded in 1997 and based in Los Gatos, California, Netflix is now a household name globally. To ensure a consistently great experience for more than 100 million members in more than 190 countries enjoying 125 million hours of TV shows and movies each day, Netflix built an analytics application powered by Apache Druid. By turning log streams into real-time metrics, Netflix is able to see how over 300 million devices (across four major UIs) are performing in the field at all times.

Netflix chose Druid because it meets their requirements for high-rate data ingestion, high-cardinality data, and fast queries.

Challenge

An ongoing challenge for Netflix is to consistently deliver a great streaming entertainment experience while continuously pushing innovative technology updates.

As Netflix’s adoption has skyrocketed, this challenge has grown more complex. With over 300 million devices spanning four major UIs (iOS, Android, Smart TVs, and its own website), Netflix has a constant need to identify and isolate issues that may affect only a certain group, such as one version of the app, certain types of devices, or particular countries.

Netflix needed to be sure that its updates didn’t interfere with or degrade the user experience, while also ensuring that changes, fixes, and improvements added to that experience in a meaningful and measurable way.

Solution

Netflix chose Apache Druid as the database behind its real-time analytics application because Druid can ingest event data at a high rate, handle high-cardinality dimensions, and answer queries quickly. To quantify how well users’ devices handle browsing and playback, Netflix derives measurements from real-time logs sent by playback devices, treating those logs as a source of events.

Once it has these measures, Netflix feeds them into Druid. Every measure is tagged with anonymized details about the kind of device being used, for example whether the device is a Smart TV, an iPad, or an Android phone. This enables Netflix to classify devices and view the data along various dimensions. With Druid, this aggregated data is available immediately for querying, either via dashboards or ad-hoc queries.
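To make the idea of tagged measures concrete, here is a minimal sketch in plain Python of how events carrying device details can be aggregated per time bucket and then sliced by one dimension. It is not Netflix's pipeline or Druid's implementation; the field names (device_type, app_version, country, startup_ms), the sample values, and the helper functions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical measures derived from device logs (illustrative values only).
EVENTS = [
    {"ts": "2020-03-01T12:00:05Z", "device_type": "smart_tv",
     "app_version": "7.1", "country": "US", "startup_ms": 800},
    {"ts": "2020-03-01T12:00:40Z", "device_type": "smart_tv",
     "app_version": "7.1", "country": "US", "startup_ms": 1000},
    {"ts": "2020-03-01T12:00:50Z", "device_type": "android_phone",
     "app_version": "5.2", "country": "BR", "startup_ms": 600},
    {"ts": "2020-03-01T12:01:10Z", "device_type": "ipad",
     "app_version": "6.0", "country": "US", "startup_ms": 400},
]

DIMENSIONS = ["device_type", "app_version", "country"]


def minute_bucket(ts):
    """Truncate an ISO-8601 UTC timestamp to the start of its minute."""
    parsed = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:00Z")


def rollup(events, dimensions):
    """Pre-aggregate events that share a time bucket and dimension values."""
    table = defaultdict(lambda: {"count": 0, "sum_ms": 0})
    for event in events:
        key = (minute_bucket(event["ts"]),) + tuple(event[d] for d in dimensions)
        table[key]["count"] += 1
        table[key]["sum_ms"] += event["startup_ms"]
    return dict(table)


def average_by(rolled, dimensions, dimension):
    """Average the measure across all buckets, grouped by one dimension."""
    idx = dimensions.index(dimension) + 1  # +1 skips the time bucket in the key
    totals = defaultdict(lambda: [0, 0])
    for key, row in rolled.items():
        totals[key[idx]][0] += row["sum_ms"]
        totals[key[idx]][1] += row["count"]
    return {value: total / count for value, (total, count) in totals.items()}


ROLLED = rollup(EVENTS, DIMENSIONS)
BY_DEVICE = average_by(ROLLED, DIMENSIONS, "device_type")
```

Grouping on a different entry of DIMENSIONS, such as "country", gives the same data viewed along another aspect, which is the kind of slicing the dashboards and ad-hoc queries described above rely on.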

Netflix also uses Druid to support A/B testing, assessing how updates and changes affect different user groups. It compares how a new version performs against the older one to decide whether users on different systems should receive the update.
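The comparison described above can be sketched roughly as follows. This is an illustrative example only, not Netflix's actual method: the sample data, the "control" and "treatment" labels, the startup_ms measure, and the 5% regression threshold are all assumptions, and the sketch compares simple means rather than applying a statistical significance test as a real A/B analysis would.

```python
from statistics import mean

# Hypothetical samples: (device_type, variant, startup_ms).
SAMPLES = [
    ("smart_tv", "control", 900), ("smart_tv", "control", 1100),
    ("smart_tv", "treatment", 1300), ("smart_tv", "treatment", 1500),
    ("android_phone", "control", 600), ("android_phone", "control", 640),
    ("android_phone", "treatment", 580), ("android_phone", "treatment", 600),
]


def compare_variants(samples, max_regression=0.05):
    """Compare mean startup time of treatment vs. control per device type.

    max_regression is an assumed tolerance: the new version is flagged for
    rollout only if it is at most 5% slower than the old one on that device.
    """
    groups = {}
    for device, variant, value in samples:
        groups.setdefault(device, {}).setdefault(variant, []).append(value)

    report = {}
    for device, variants in groups.items():
        control = mean(variants["control"])
        treatment = mean(variants["treatment"])
        change = (treatment - control) / control  # positive means slower
        report[device] = {"change": change, "roll_out": change <= max_regression}
    return report


REPORT = compare_variants(SAMPLES)
```

With this made-up data, the new version is slower on Smart TVs and slightly faster on Android phones, so the per-device report would support rolling the update out to one group and not the other.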
