Apache Druid Roadmap

May 12, 2020

The community defines Apache Druid’s roadmap. Apache Druid PMC Chair and Imply co-founder Gian Merlino shares what he and the Imply team are working on regarding the Druid roadmap. He emphasizes his intent to continue to invest in real-time experiences and to continue to build on Druid’s performance, ease of use, and query capabilities.

Three reasons why Target built a custom enterprise analytics platform

October 21, 2020

Target explains why they built a custom analytics platform, focusing on self-service development, data discovery and collaboration, and scalability.

Target explains unions in Apache Druid

October 21, 2020

In this short video, Jeremy Woelfel of Target explains how unions allow them to blend their data.

Key Apache Druid features Target utilizes

October 21, 2020

Jeremy Woelfel of Target explains how they ingest their data, the impact of cardinality, metrics collection, Theta sketch approximations, and more!

ApacheCon @Home Thursday Keynote: Gian Merlino, Imply

October 14, 2020

Gian Merlino discusses why you can never find that single perfect system and how to think about & evaluate data on a temperature-based spectrum.

What's inside Netflix’s metrics pipeline & Apache Druid cluster?

October 12, 2020

Ben Sykes discusses how Netflix created its metrics pipeline to ensure a high-quality streaming experience & how they structure their Apache Druid cluster.

An explainer about Druid rollup, cardinality and segments from Netflix

October 12, 2020

In this short video, Ben Sykes of Netflix explains Druid rollup, the impact of high cardinality, and segment sizing.

Benchmarking Apache Druid II: What’s Under the Hood?

October 5, 2020

Learn how Druid provides sub-second response times over billions of rows and why performance scales predictably with data growth.

TrafficGuard uses Imply Clarity to monitor Apache Druid performance

October 1, 2020

Raigon Jolly explains how they use Imply Clarity to monitor Druid's performance and pinpoint and resolve issues.

TrafficGuard analyzes ad fraud data “at the speed of thought” w/ Imply

October 1, 2020

Raigon Jolly explains how Imply enables TrafficGuard's technical and non-technical employees to analyze data at the speed of thought.

How TrafficGuard Fights Ad Fraud with Imply

October 1, 2020

In this Virtual Druid Summit excerpt, Raigon Jolley, Head of Analytics @ TrafficGuard, shares their success with Imply.

Why Twitch needed self-service analytics

September 21, 2020

At Twitch, data is at the center of all decisions. To empower data-driven decisions at the company, they needed a self-service analytics platform.

Twitch explains the benefits of Imply Pivot

September 21, 2020

Learn the main advantages Twitch has experienced while using Pivot.

Inside Twitch's Apache Druid Architecture

September 21, 2020

Nicholas Ngorok of Twitch gives an in-depth explanation of Twitch’s Apache Druid architecture and ingest self-service tooling.

Apache Druid Hits 10,000 Star Milestone (Gource Video)

September 10, 2020

Here’s a little treat: a fun Gource video that shows the journey the Druid code has taken.

Demo: How Imply Plus Kafka Enables Self-service Hot Analytics

August 31, 2020

In this demo from Kafka Summit 2020, we'll show you how Imply plus Kafka enables self-service hot analytics through a real-time data platform.

A Demonstration of How to Use Imply for Retail Analytics

August 31, 2020

See how anyone can leverage Imply to reveal insights in milliseconds, regardless of their technical ability or data scale.

Some Like It Hot

August 25, 2020

In this talk, we'll reflect on why you can never seem to find that single perfect system, and how to think about systems' capabilities on a spectrum.

Apache Druid Lightning Fast Analytics on Real-time and Historical Data

August 24, 2020

In this talk, we will start with an overview of Apache Druid, followed by a look at several examples of how Druid is being used in the real world.

Why Imply instead of open-source Apache Druid | NTT

August 21, 2020

A key reason NTT chose Imply’s self-service analytics platform over open-source Apache Druid was Imply Pivot.

Why NTT Chose Apache Druid over Elasticsearch, InfluxDB and Cassandra

August 20, 2020

In this webinar excerpt, learn why NTT chose Apache Druid over Elasticsearch, InfluxDB, and Cassandra.

Imply Demo - Network Monitoring at Scale

August 19, 2020

A common use case of Imply’s self-service analytics platform is to store, analyze, and visualize different types of networking data.

What’s inside an Apache Druid cluster?

August 12, 2020

In this video excerpt, Imply Technology Evangelist Peter Marshall provides an in-depth technical explanation of what's inside an Apache Druid cluster.

What happens once data is in Apache Druid?

August 12, 2020

In this video clip, Imply Technology Evangelist Peter Marshall breaks down the step-by-step processes that occur once data is in Druid.

How NTT improves operations with self-service analytics

August 6, 2020

Learn how NTT, the owner and operator of one of the largest global tier-1 IP backbones, uses Imply for self-service analytics for network telemetry.

Virtual Apache Druid Meetup featuring SuperAwesome

July 31, 2020

In this talk, each of the components of an Apache Druid cluster is described.

Benchmarking Apache Druid

July 21, 2020

This talk explains how we evaluated Druid’s performance using the Star Schema Benchmark (SSB).

Introduction to Hot Analytics

July 9, 2020

In this webinar excerpt, Imply co-founder Gian Merlino explains the attributes of hot, warm and cold analytics and use cases for hot analytics in particular.

Apache Druid vs Google BigQuery: price performance benchmark

July 7, 2020

Industry-standard benchmarks show that Imply delivers 12 times the price/performance of Google BigQuery.

Migrating From Elasticsearch to Apache Druid (WalkMe)

June 30, 2020

For WalkMe's use case, Elasticsearch did not meet their criteria. They needed a system that could provide real-time analytics.

Why data warehouses cannot support hot analytics

June 24, 2020

Data warehouses struggle with hot analytics use cases because they are too slow, unable to scale, or too expensive.

Query laning in Apache Druid

June 23, 2020

Query laning works like an HOV lane. It provides prioritized access to a subset of resources for urgent queries.

Why we added JOINs in Apache Druid

June 22, 2020

The addition of JOINs simplifies data pipelines and creates substantial cost savings by reducing storage costs, data ingestion costs, and maintenance costs.

Apache Druid JOINs Introduction

June 22, 2020

The addition of JOINs helps reduce cloud data storage and compute costs for Apache Druid users.

How Cisco Safely Queries Multi-tenant Data Sources

May 22, 2020

Learn how Cisco safely queries multi-tenant data sources using an API proxy and gRPC calls that are translated into SQL queries for Druid.

Cisco's Real-time Ingestion Architecture with Kafka and Druid

May 22, 2020

See Cisco’s real-time ingestion architecture, which includes applications that ingest real-time streaming data to a set of Kafka topics.

What’s New in Imply 3.3 & Apache Druid 0.18

May 21, 2020

The most recent Imply 3.3 release, based on Apache Druid 0.18, brings several major new features, including joins, query laning and Clarity Alerts.

How DBS uses Apache Druid in its event-driven ecosystem

May 15, 2020

A key reason Druid fits in DBS's ecosystem is its "native real-time capabilities and integration with Kafka."

DBS’s Data River Architecture

May 15, 2020

Arpit Dubey shares how they solve the toughest problems around anti-money laundering (AML) utilizing a data river architecture.

How Athena Health Automated CI/CD for Apache Druid Clusters

May 14, 2020

Ramesh Kampanna walks us through an implementation that includes Bitbucket, Terraform and Jenkins.

Athena Health’s Apache Druid Architecture

May 13, 2020

Karthik Urs, Lead Member of Technical Staff from Athena Health, quickly runs through their Apache Druid architecture.

Athena Health augments Snowflake and Cassandra with Apache Druid

May 13, 2020

What separated Druid from other options they were using such as Cassandra & Snowflake was its sub-second response, low latency and support for high concurrency.

What is Apache Druid used for?

May 12, 2020

Druid is great for helping analysts troubleshoot and find the root causes of issues, as well as for monitoring and looking into the current status of a system.

The Difference Between Hot, Warm and Cold Data

May 12, 2020

Apache Druid PMC Chair and Imply co-founder Gian Merlino classifies data into three different temperatures (Hot, Warm, and Cold).

Why BT decided to partner with Imply

April 29, 2020

BT partnered with Imply to help reduce their overall maintenance costs and free up valuable engineering resources to work on higher value business problems.

Why BT chose Druid over Cassandra

April 29, 2020

The main deciding factors in choosing Druid were use case fit and the ability to support streaming data.

BT's Apache Druid architecture

April 29, 2020

In this clip, Pankaj Tiwari gives a high-level overview of BT's Druid architecture and walks us through its components.

Using Tiering to reduce infrastructure costs by 20 percent at Twitter

April 28, 2020

In this Virtual Druid Summit clip, Twitter shares their experience of dividing their cluster up into multiple tiers.

How Twitter uses Imply Clarity to Monitor its Druid Cluster

April 28, 2020

Twitter uses Clarity to monitor their production Druid cluster. In some cases, severe and difficult issues were found because of Clarity.

Best Practices for Deploying Druid on the Cloud

April 10, 2020

In this clip, we share common best practices for deploying Druid on the cloud.

Best Practices for Deploying Apache Druid on Kubernetes

April 10, 2020

In this clip, we share best practices for deploying Druid in a Kubernetes cluster.

Best Practices for Deploying Apache Druid on Azure

April 10, 2020

This video clip shares recommendations for running Apache Druid on services such as Azure VM, Azure Blob Storage, and Azure Database Service.

Best Practices for Deploying Apache Druid on AWS

April 10, 2020

We share best practices for running Apache Druid on services such as S3, Amazon Aurora, MySQL, and more.

Best Practices for Deploying Apache Druid on GCP

April 10, 2020

In this clip, we share best practices for leveraging GCP services such as Compute Engine, Cloud Storage, and Cloud SQL.

Apache Druid Architecture

April 10, 2020

This video provides an overview of the Druid architecture, delving into the roles and components of each process and how they interact with each other.

Deploying Apache Druid as a service on any cloud

March 29, 2020

Learn some of the best practices for deploying Druid on Amazon Web Services (AWS), Google, Microsoft, or your own infrastructure.

Real-Time Analytics Stack with Apache Kafka and Apache Druid

February 19, 2020

In this webinar, a creator of Apache Druid covers the current state of real-time and streaming analytics, architectural challenges and design considerations.

Scalable Incremental Index for Druid

February 3, 2020

Dr. Bortnikov @ Verizon Media: Ingestion and queries of real-time data in Druid are performed by a software component named Incremental Index (I^2).

Analyzing 1 Billion Gamers w/ Apache Druid - GameAnalytics (Tech Talk)

January 22, 2020

This talk will cover how GameAnalytics built a cloud architecture using AWS Kinesis & Apache Druid via Imply Cloud to analyze activity on 16 billion sessions.

Real Time Analytics with Druid, Spark, and Kafka (Outbrain)

January 13, 2020

Real Time Analytics with Druid, Apache Spark, and Kafka, by Daria Litvinov. A talk from the Druid meetup at Outbrain in November 2019.

Why Nielsen Chose Druid Over Elasticsearch

January 6, 2020

In this 3-minute video, Itai Yaffe from Nielsen explains why they moved from Elasticsearch to Apache Druid as infrastructure for their marketing analytics.

Theta Sketches for Fast Approximation on Large Data Sets

December 18, 2019

A short explanation of how Theta Sketches, a fast approximation algorithm for Druid, work.

Webinar: Marketing Performance Analytics Using Apache Druid (Nielsen)

December 11, 2019

Learn how Nielsen Marketing Cloud used Apache Druid to build a cloud-based architecture that serves real-time query responses against tens of terabytes per day.

Webinar: WalkMe | Druid & Imply Cloud for user behavior analysis

November 13, 2019

Yotam Spencer, Head of Data Engineering at WalkMe, discusses how they use Druid / Imply Cloud for user behavior analytics, with a dive into Bloom filters.

Meetup: How Pinterest Powers Advertising Analytics with Apache Druid

October 23, 2019

Filip Jaros at Pinterest discusses how they went from nothing to serving all ads data requests via Druid.

Druid Demos and Roadmap at Bay Area Meetup Hosted by Pinterest

October 22, 2019

Vadim Ogievetsky shows off the new Data Loader and SQL Query Builder and Gian Merlino discusses coming enhancements including SuperBatch, Druid SQL and JOINs.

Apache Druid's Rollup Feature

October 19, 2019

A brief look at Apache Druid's rollup feature that greatly speeds queries and reduces storage requirements.

Druid “Sweet 0.16” and Imply 3.1 Webinar

October 9, 2019

In this webinar, Imply Chief Product Officer and co-founder Vadim Ogievetsky demonstrates some of the new features in the latest Apache Druid release.

Apache Druid 0.16.0 Quickstart

September 26, 2019

A quick run through installing Druid on a single server and using the new Data Loader and SQL View to ingest and query a Wikipedia edits file.

Real-Time Clickstream Data Exploration with Apache Druid

August 28, 2019

This webinar covers clickstream analytics using Apache Druid across a broad set of use cases, including analytics for marketing, e-commerce, apps, games and more.

Intro to Imply 3.0: Delivering Ease of Druid

July 30, 2019

In this webinar, Vadim Ogievetsky, CPO at Imply, will demo new Imply 3.0 capabilities including Data Loader, Alerts and Imply Manager.

Apache Druid and Imply: Analyzing Network Telemetry

July 29, 2019

A demonstration of ad hoc interactive analysis (OLAP operations) on network data such as network telemetry, Netflow and syslog data.

Imply Manager Walk-through

July 15, 2019

A walk-through of the Druid cluster management functionality in Imply Manager.

Architecting Microservices Applications with Instant Analytics

June 20, 2019

An online talk about microservices and analytics featuring Confluent (Kafka) and Imply (Druid) experts.

Swimming in the Data River; When Streaming Analytics Isn't

June 12, 2019

This talk covers the current state of the streaming analytics world.

Apache Druid & An Introduction to Data Rivers

June 12, 2019

FJ Yang, CEO of Imply, spoke at Data Driven NYC in June 2019, where he shared the history of Apache Druid and introduced the concept of data rivers.

Introduction to Imply Cloud Webinar

May 16, 2019

Learn about Druid and Imply through an end-to-end Imply Cloud demo including deployment, use and management with Imply's co-founder, Vadim Ogievetsky.

First Look: Apache Druid (incubating) Data Loader with Apache Kafka

May 16, 2019

Connecting Druid to an Apache Kafka topic using the Data Loader.

a16z Podcast: The Future of Decision-Making—3 Startup Opportunities

May 1, 2019

In this podcast episode, Jad Naous (@jadtnaous) and Frank Chen (@withfries2) discuss this change and the startup opportunities these changes create.

How To Use Apache Kafka® and Druid to Tame Your Router Data

April 2, 2019

In this talk given at Kafka Summit 2019, Eric Graham and Rachel Pedreschi of Imply will discuss and demonstrate streaming analytics pipelines.

Imply Cloud: A walk-through

February 19, 2019

A walk-through of Imply Cloud, Imply's AWS-based managed Apache Druid service.

Sub-second Analytics on Apache Kafka with Confluent and Imply

November 19, 2018

Learn about how Apache Kafka and Apache Druid can be combined to form an end-to-end streaming analytics stack.

Tutorial: Imply Quickstart

November 15, 2018

A quick walk-through of the Imply quickstart. Learn how to set up Imply and load some example data.

Druid Meetup at Netflix

November 14, 2018

Netflix shares their experience of using Druid and how it has helped provide the best streaming experience to their users through a series of lightning talks.

The Rise of the Operational Analytic Data Stores

November 1, 2018

Imply's co-founder and CTO, Gian Merlino, presents on Druid and operational analytic databases.

SF Scala: Gian Merlino Interview

October 31, 2018

A brief interview with Imply's CTO and co-founder, Gian Merlino.

New data architectures for high performance netflow analytics

October 10, 2018

Learn about how open source streaming technologies such as Apache Kafka and Apache Druid can be combined to analyze network traffic data.

Introduction to Imply

July 18, 2018

Imply is an operational data analytics platform that is designed from the ground up for event-driven data.

Interactive Exploratory Analytics with Druid

July 21, 2017

Learn more about how Druid can power exploratory workflows that go beyond dashboarding and reporting.

Building an Open Source Streaming Analytics Stack with Kafka and Druid

February 23, 2017

Learn how to build an end-to-end streaming analytic stack by combining a message bus (Apache Kafka), a stream processor, and a query engine (Apache Druid).

OLAP for Big Data, SF Data Mining Meetup

April 13, 2016

A look into what is needed to support OLAP for a big distributed data platform.