Some Like It Hot

August 25, 2020 in Industry

Today’s world has no shortage of systems that claim to help with the analysis of large amounts of data. Under the hood, today’s popular systems have a variety of interesting and unique architectures. In this talk, we’ll reflect on why you can never seem to find that single perfect system, and how to think about their capabilities on a spectrum.

Imply x Kafka: Capture, interact and scale streaming data

July 29, 2021

Imply and Kafka together form an architecture for capturing and surfacing streaming data through interactive queries at unlimited scale.

Real-time data analysis and visualization with Imply, integrated with Confluent Cloud

July 22, 2021

Live demo: how to build an architecture that captures and surfaces streaming data through interactive queries and unlimited scale.

Making Real-Time Data a Reality for your Business

June 9, 2021

Technaura takes you through an hour of practical, real-time, data-driven outcomes.

Apache Druid Engine Roadmap

June 3, 2021

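If you are still waiting every day for your data to refresh, it is time to try Apache Druid, a sub-second database. This open-source database is widely deployed at Twitter, Pinterest, and Snapchat. In this webinar, Imply product manager Will Xu will share common use cases, along with the unique features and future product roadmap of Imply's enterprise edition of Druid.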

Comparing drill-down workflows between Tableau & Snowflake with Imply

May 10, 2021

In this video, Jad Naous demonstrates drill-down workflows between Tableau & Snowflake with Imply.

How Spideo turbo-charged data analytics by using Imply

May 7, 2021

In this video, Spideo, a humanized recommendation provider, shares with us its data analytics journey.

Pivot 2.0 - The next gen visualization tool

April 30, 2021

In this webinar, we will walk you through the exciting new features that are coming soon to Pivot.

Introducing... Druid's Components

April 22, 2021

In this talk, we look at the three families of components that create “Apache Druid.”


April 21, 2021


Analyzing Massive Data Streams with Imply

April 14, 2021

NOVAGEN will present a case study with IMPLY, inspired by its projects in the large-scale retail sector, that successfully addressed these new challenges:

Using Imply to prevent fraud

April 9, 2021

Learn a new methodology for anomaly detection and analysis that can be applied to everything from fraud detection to factory accident prevention.

Inside Apache Druid’s storage and query engine

March 2, 2021

We’ll cover how Druid stores data, what kinds of compression it uses, how it indexes data, and how the storage engine is linked with the query processing engine.

First look: Imply CrossTab (beta)

March 1, 2021

In this short video, Imply Chief Product Officer Vadim Ogievetsky demonstrates a cool new beta feature, Imply CrossTab.

Imply CrossTab: PivotTables to infinity and beyond

February 24, 2021

In this webinar we will introduce Imply CrossTab, a new visualization that makes the pivot table user experience feel instantaneous.

Where do Imply and Elasticsearch fit in my big data stack?

February 22, 2021

Itai Yaffe shares how Imply and Elasticsearch fit in a modern architecture for big data.

How Imply fulfills the requirements of real-time analytics

February 22, 2021

Data wizard Itai Yaffe explains how Imply's platform fills the requirements of general-purpose operational analytics.

Elasticsearch: pros and cons for real-time analytics

February 22, 2021

In this video, solution architect Itai Yaffe shares what Elasticsearch provides for analytic use cases compared to Imply.

What’s new in Apache Druid?

February 17, 2021

Looking for a recap of the key improvements to Apache Druid in its most recent releases? Here’s the video for you.

Near term Apache Druid roadmap for 2021

February 17, 2021

Gian Merlino, Apache Druid PMC Chair, lays out the near-term Druid roadmap.

5 minute explanation of Imply’s advantages for self service analytics

January 29, 2021

We describe the Imply full-stack, multi-cloud data platform, and Itai breaks down the unique combination of features Imply brings to self-service analytics.

The Superstars of Apache Druid meetup

January 28, 2021

The Superstars of Apache Druid meetup was recorded live on 1-21-2021, with talks by Dan Prince of Target, Gian Merlino, and Vadim Ogievetsky.

Comparing Elasticsearch and Imply for operational analytics

January 21, 2021

If you’re considering using Elasticsearch for interactive analytics, be advised that it’s quite common to see companies struggling with Elastic-based analytic solutions.

3 different approaches to multi-tenant applications

January 19, 2021

The three main challenges with multi-tenant applications are cost, performance, and data management.

How Apache Druid ensures quality of service

January 19, 2021

Gian Merlino covers Druid’s use of tiers, lanes and priorities to address QoS and deliver consistent performance as you scale demand.

Building interactive data applications for event streams w/Confluent

January 14, 2021

Data apps let business users explore and investigate all of a company's event data and come to insights that impact day-to-day and long-range decision making.

Outbrain's real-time analytics architecture

January 6, 2021

Outbrain’s real-time analytics architecture consists of modern big data technologies like Kafka, Spark, Druid and Imply Pivot for query and visualization.

Technical reasons why Lyft chose Apache Druid for real time analytics

January 5, 2021

In this video, Sharanya Santhanam from Lyft explains the key technical reasons why they chose Druid as the engine for their real-time data pipeline.

Lyft's modern data architecture feat. Apache Druid, Kafka and Flink

January 5, 2021

Take a look at Lyft’s modern data architecture. They have an app implemented in Flink that reads real-time event data from Kinesis and transforms the data.

Lyft's Apache Druid use cases

January 5, 2021

Lyft uses Apache Druid for three core use cases: geo-spatial data lookups for rideshare customer service, analytics on AWS cloud infrastructure spend.

Achieve the event-driven Nirvana with Apache Druid

January 5, 2021

We will explain how Apache Druid enables self-service BI on event data and allows business users to ask their own questions leading to real-time insights.

Apache Druid and database evolution

December 24, 2020

A look at how Druid builds on the long history of database engines, and what that means for technologists and decision-makers.

Apache Druid data optimisations

December 10, 2020

A telling of the tale of how Druid can optimize data as it is ingested, turning it into a format perfect for interactive, real-time analytics.

Virtual Druid Summit 2020 highlights

December 8, 2020

Catch all of the amazing highlights from our free Virtual Druid Summit. We had four amazing events that took place between April 10 and November 20, 2020.

A demo of how Imply powers real-time data applications

November 25, 2020

Learn more about how Imply powers real-time data applications. In this demonstration video, Sales Engineer Vasilis Vagias walks you through the product.

Imply Cloud on AWS demonstration

November 11, 2020

Want to see how to set up, use and manage Imply Cloud for real-time self-service analytics at scale?

Apache Druid 101

November 11, 2020

This talk provides an introduction to Apache Druid including: Druid's core architecture and its advantages.

What is a data application?

October 29, 2020

Data applications are visually-oriented and designed for non-analysts to investigate and explore large, fast-moving data sets.

The 5 things data applications need

October 29, 2020

What do data applications need? Imply CTO Gian Merlino boils it down to 5 key things.

SuperAwesome Returns for a SuperAwesome Apache Druid Meetup

October 29, 2020

In this talk, we walk through how we run Druid on spot instances.

How Apache Druid addresses the 5 requirements of data applications

October 29, 2020

Data applications are visually-oriented and designed for non-analysts to investigate and explore large, fast-moving data sets.

Three reasons why Target built a custom enterprise analytics platform

October 21, 2020

Target explains why they built a custom analytics platform, focusing on self-service development, data discovery and collaboration, and scalability.

Target explains unions in Apache Druid

October 21, 2020

In this short video, Jeremy Woelfel of Target explains how unions allow them to blend their data.

Key Apache Druid features Target utilizes

October 21, 2020

Jeremy Woelfel of Target explains how they ingest their data, the impact of cardinality, metrics collection, Theta sketch approximations, and more!

ApacheCon @Home Thursday Keynote: Gian Merlino, Imply

October 14, 2020

Gian Merlino discusses why you can never find that single perfect system and how to think about and evaluate data on a temperature-based spectrum.

What's inside Netflix’s metrics pipeline & Apache Druid cluster?

October 12, 2020

Ben Sykes discusses how Netflix created its metrics pipeline to ensure a high-quality streaming experience & how they structure their Apache Druid cluster.

An explainer about Druid rollup, cardinality and segments from Netflix

October 12, 2020

In this short video, Ben Sykes of Netflix explains Druid roll-up, the impact of high cardinality, and segment sizing.

Benchmarking Apache Druid II: What’s Under the Hood?

October 5, 2020

Learn how Druid provides sub-second response times over billions of rows and why performance scales predictably with data growth.

TrafficGuard uses Imply Clarity to monitor Apache Druid performance

October 1, 2020

Raigon Jolly explains how they use Imply Clarity to monitor Druid's performance and pinpoint and resolve issues.

TrafficGuard analyzes ad fraud data “at the speed of thought” w/ Imply

October 1, 2020

Raigon Jolly explains how Imply enables TrafficGuard's technical and non-technical employees to analyze data at the speed of thought.

How TrafficGuard Fights Ad Fraud with Imply

October 1, 2020

In this Virtual Druid Summit excerpt, Raigon Jolly, Head of Analytics at TrafficGuard, shares their success with Imply.

Why Twitch needed self-service analytics

September 21, 2020

At Twitch, data is at the center of all decisions. To empower data-driven decisions at the company, they needed a self-service analytics platform.

Twitch explains the benefits of Imply Pivot

September 21, 2020

Learn the main advantages Twitch has experienced while using Pivot.

Inside Twitch's Apache Druid Architecture

September 21, 2020

Nicholas Ngorok of Twitch gives an in-depth explanation of Twitch’s Apache Druid architecture and ingest self-service tooling.

Apache Druid Hits 10,000 Star Milestone (Gource Video)

September 10, 2020

Here’s a little treat: a fun Gource video that shows the journey that Druid code has taken.

Demo: How Imply Plus Kafka Enables Self-service Hot Analytics

August 31, 2020

In this demo from Kafka Summit 2020, we'll show you how Imply plus Kafka enables self-service hot analytics through a real-time data platform.

A Demonstration of How to Use Imply for Retail Analytics

August 31, 2020

See how anyone can leverage Imply to reveal insights in milliseconds, regardless of their technical ability or data scale.

Some Like It Hot

August 25, 2020

In this talk, we'll reflect on why you can never seem to find that single perfect system, and how to think about their capabilities on a spectrum.

Apache Druid Lightning Fast Analytics on Real-time and Historical Data

August 24, 2020

In this talk, we will start with an overview of Apache Druid followed by a look at several examples of how Druid is being used in the real-world.

Why Imply instead of open-source Apache Druid | NTT

August 21, 2020

A key reason NTT chose Imply’s self-service analytics platform over open-source Apache Druid was Imply Pivot.

Why NTT Chose Apache Druid over Elasticsearch, InfluxDB and Cassandra

August 20, 2020

In this webinar excerpt, learn why NTT chose Apache Druid over Elasticsearch, InfluxDB, and Cassandra.

Imply Demo - Network Monitoring at Scale

August 19, 2020

A common use case of Imply’s self-service analytics platform is to store, analyze, and visualize different types of networking data.

What’s inside an Apache Druid cluster?

August 12, 2020

In this video excerpt, Imply Technology Evangelist Peter Marshall provides an in-depth technical explanation of what's inside an Apache Druid cluster.

What happens once data is in Apache Druid?

August 12, 2020

In this video clip, Imply Technology Evangelist Peter Marshall breaks down the step-by-step processes that occur once data is in Druid.

How NTT improves operations with self-service analytics

August 6, 2020

Learn how NTT, the owner and operator of one of the largest global tier-1 IP backbones, uses Imply for self-service analytics for network telemetry.

Virtual Apache Druid Meetup featuring SuperAwesome

July 31, 2020

This talk describes each of the components of an Apache Druid cluster.

Benchmarking Apache Druid

July 21, 2020

This talk explains how we evaluated Druid’s performance using the Star Schema Benchmark (SSB).

Introduction to Hot Analytics

July 9, 2020

In this webinar excerpt, Imply co-founder Gian Merlino explains the attributes of hot, warm and cold analytics and use cases for hot analytics in particular.

Apache Druid vs Google BigQuery: price performance benchmark

July 7, 2020

Industry standard benchmarks show that Imply delivers 12 times the price / performance of Google BigQuery.

Migrating From Elasticsearch to Apache Druid (WalkMe)

June 30, 2020

For WalkMe's use case, Elasticsearch did not meet their criteria. They needed a system that could provide real-time analytics.

Why data warehouses cannot support hot analytics

June 24, 2020

Data warehouses struggle with hot analytics use cases because they are too slow, unable to scale, or too expensive.

Query laning in Apache Druid

June 23, 2020

Query laning works like an HOV lane. It provides prioritized access to a subset of resources for urgent queries.

Why we added JOINs in Apache Druid

June 22, 2020

The addition of JOINs simplifies data pipelines and creates substantial cost savings by reducing storage costs, data ingestion costs, and maintenance costs.

Apache Druid JOINs Introduction

June 22, 2020

The addition of JOINs helps reduce cloud data storage and compute costs for Apache Druid users.

How Cisco Safely Queries Multi-tenant Data Sources

May 22, 2020

Learn how Cisco safely queries multi-tenant data sources using an API proxy and gRPC calls that are translated into SQL queries for Druid.

Cisco's Real-time Ingestion Architecture with Kafka and Druid

May 22, 2020

See Cisco’s real-time ingestion architecture, which includes applications that ingest real-time streaming data to a set of Kafka topics.

What’s New in Imply 3.3 & Apache Druid 0.18

May 21, 2020

The most recent Imply 3.3 release, based on Apache Druid 0.18, brings several major new features, including joins, query laning and Clarity Alerts.

How DBS uses Apache Druid in its event-driven ecosystem

May 15, 2020

A key reason Druid fits in DBS's ecosystem is its native real-time capabilities and integration with Kafka.

DBS’s Data River Architecture

May 15, 2020

Arpit Dubey shares how they solve the toughest problems around anti-money laundering (AML) utilizing a data river architecture.

How Athena Health Automated CI/CD for Apache Druid Clusters

May 14, 2020

Ramesh Kampanna walks us through an implementation that includes Bitbucket, Terraform and Jenkins.

Athena Health’s Apache Druid Architecture

May 13, 2020

Karthik Urs, Lead Member of Technical Staff from Athena Health quickly runs through their Apache Druid architecture.

Athena Health augments Snowflake and Cassandra with Apache Druid

May 13, 2020

What separated Druid from other options they were using such as Cassandra & Snowflake was its sub-second response, low latency and support for high concurrency.

What is Apache Druid used for?

May 12, 2020

Druid is great for helping analysts troubleshoot and find root causes to issues as well as monitoring and looking into the current status of a system.

The Difference Between Hot, Warm and Cold Data

May 12, 2020

Apache Druid PMC Chair and Imply co-founder Gian Merlino classifies data into three different temperatures (Hot, Warm, and Cold).

Apache Druid Roadmap

May 12, 2020

Apache Druid PMC Chair and Imply co-founder Gian Merlino shares what he and the Imply team are working on regarding the Druid roadmap.

Why BT decided to partner with Imply

April 29, 2020

BT partnered with Imply to help reduce their overall maintenance costs and free up valuable engineering resources to work on higher value business problems.

Why BT chose Druid over Cassandra

April 29, 2020

The main deciding factors in choosing Druid were use-case fit and the ability to support streaming data.

BT's Apache Druid architecture

April 29, 2020

In this clip, you will receive a high-level overview of BT's Druid Architecture. Pankaj Tiwari walks us through the components of BT’s Druid architecture.

Using Tiering to reduce infrastructure costs by 20 percent at Twitter

April 28, 2020

In this Virtual Druid Summit clip, Twitter shares their experience of dividing their cluster up into multiple tiers.

How Twitter uses Imply Clarity to Monitor its Druid Cluster

April 28, 2020

Twitter uses Clarity to monitor their production Druid cluster. In some cases, severe and difficult issues were found because of Clarity.

Best Practices for Deploying Druid on the Cloud

April 10, 2020

In this clip, we share common best practices for deploying Druid on the cloud.

Best Practices for Deploying Apache Druid on Kubernetes

April 10, 2020

In this clip, we share best practices for deploying Druid in a Kubernetes cluster.

Best Practices for Deploying Apache Druid on Azure

April 10, 2020

This video clip shares recommendations for running Apache Druid on services such as Azure VM, Azure Blob Storage, and Azure Database Service.

Best Practices for Deploying Apache Druid on AWS

April 10, 2020

We share best practices for running Apache Druid on services such as S3, Amazon Aurora, MySQL, and more.

Best Practices for Deploying Apache Druid on GCP

April 10, 2020

In this clip, we share best practices for leveraging GCP services such as Compute Engine, Cloud Storage, and Cloud SQL.

Apache Druid Architecture

April 10, 2020

This video provides an overview of the Druid architecture, delving into the roles and components of each process and how they interact with each other.

Deploying Apache Druid as a service on any cloud

March 29, 2020

Learn some of the best practices for deploying Druid on Amazon Web Services (AWS), Google, Microsoft, or your own infrastructure.

Real-Time Analytics Stack with Apache Kafka and Apache Druid

February 19, 2020

In this webinar, a creator of Apache Druid covers the current state of real-time and streaming analytics, architectural challenges and design considerations.

Scalable Incremental Index for Druid

February 3, 2020

Dr. Bortnikov @ Verizon Media: Ingestion and queries of real-time data in Druid are performed by a software component named Incremental Index (I^2).

Analyzing 1 Billion Gamers w/ Apache Druid - GameAnalytics (Tech Talk)

January 22, 2020

This talk will cover how GameAnalytics built a cloud architecture using AWS Kinesis & Apache Druid via Imply Cloud to analyze activity on 16 billion sessions.

Real Time Analytics with Druid, Spark, and Kafka (Outbrain)

January 13, 2020

Real Time Analytics with Druid, Apache Spark, and Kafka by Daria Litvinov. A talk from the Druid meetup at Outbrain in November 2019.

Why Nielsen Chose Druid Over Elasticsearch

January 6, 2020

In this 3 minute video, Itai Yaffe from Nielsen explains why they moved from Elasticsearch to Apache Druid as infrastructure for their marketing analytics.

Theta Sketches for Fast Approximation on Large Data Sets

December 18, 2019

A short explanation of how Theta Sketches, a fast approximation algorithm for Druid, work.

Webinar: Marketing Performance Analytics Using Apache Druid (Nielsen)

December 11, 2019

Learn how Nielsen Marketing Cloud used Apache Druid to build a cloud-based architecture that serves real-time query responses against tens of terabytes per day.

Webinar: WalkMe | Druid & Imply Cloud for user behavior analysis

November 13, 2019

Yotam Spencer, Head of Data Engineering at WalkMe, discusses how they use Druid and Imply Cloud for user behavior analytics, including a dive into Bloom filters.

Meetup: How Pinterest Powers Advertising Analytics with Apache Druid

October 23, 2019

Filip Jaros of Pinterest discusses how they went from nothing to serving all ads data requests via Druid.

Druid Demos and Roadmap at Bay Area Meetup Hosted by Pinterest

October 22, 2019

Vadim Ogievetsky shows off the new Data Loader and SQL Query Builder and Gian Merlino discusses coming enhancements including SuperBatch, Druid SQL and JOINs.

Apache Druid's Rollup Feature

October 19, 2019

A brief look at Apache Druid's rollup feature that greatly speeds queries and reduces storage requirements.

Druid “Sweet 0.16" and Imply 3.1 Webinar

October 9, 2019

In this webinar, Imply Chief Product Officer and co-founder Vadim Ogievetsky demonstrates some of the new features in the latest Apache Druid release.

Apache Druid 0.16.0 Quickstart

September 26, 2019

A quick run through installing Druid on a single server and using the new Data Loader and SQL View to ingest and query a Wikipedia edits file.

Real-Time Clickstream Data Exploration with Apache Druid

August 28, 2019

This webinar covers clickstream analytics using Apache Druid across a broad set of use cases, including analytics for marketing, e-commerce, apps, games, and more.

Intro to Imply 3.0: Delivering Ease of Druid

July 30, 2019

In this webinar, Vadim Ogievetsky, CPO at Imply, will demo new Imply 3.0 capabilities including Data Loader, Alerts and Imply Manager.

Apache Druid and Imply: Analyzing Network Telemetry

July 29, 2019

A demonstration of ad hoc interactive analysis (OLAP operations) on network data such as network telemetry, Netflow and syslog data.

Imply Manager Walk-through

July 15, 2019

A walk-through of the Druid cluster management functionality in Imply Manager.

Architecting Microservices Applications with Instant Analytics

June 20, 2019

Online talk about microservices and analytics featuring Confluent (Kafka) and Imply (Druid) experts.

Swimming in the Data River; When Streaming Analytics Isn't

June 12, 2019

This talk covers the current state of the streaming analytics world.

Apache Druid & An Introduction to Data Rivers

June 12, 2019

FJ Yang, CEO of Imply, spoke at Data Driven NYC in June 2019 where he shared the history of Apache Druid and introduced the concept of data rivers.

Introduction to Imply Cloud Webinar

May 16, 2019

Learn about Druid and Imply through an end-to-end Imply Cloud demo including deployment, use and management with Imply's co-founder, Vadim Ogievetsky.

First Look: Apache Druid (incubating) Data Loader with Apache Kafka

May 16, 2019

Connecting Druid to an Apache Kafka topic using the Data Loader

a16z Podcast: The Future of Decision-Making—3 Startup Opportunities

May 1, 2019

In this podcast episode, Jad Naous (@jadtnaous) and Frank Chen (@withfries2) discuss this change and the startup opportunities these changes create.

How To Use Apache Kafka® and Druid to Tame Your Router Data

April 2, 2019

In this talk given at Kafka Summit 2019, Eric Graham and Rachel Pedreschi of Imply will discuss and demonstrate streaming analytics pipelines.

Imply Cloud: A walk-through

February 19, 2019

A walk-through of Imply Cloud, Imply's AWS based managed Apache Druid service.

Sub-second Analytics on Apache Kafka with Confluent and Imply

November 19, 2018

Learn about how Apache Kafka and Apache Druid can be combined to form an end-to-end streaming analytics stack.

Tutorial: Imply Quickstart

November 15, 2018

A quick walk-through of Imply quickstart. Learn how to set up Imply and load some example data.

Druid Meetup at Netflix

November 14, 2018

Netflix shares their experience of using Druid and how it has helped provide the best streaming experience to their users through a series of lightning talks.

The Rise of the Operational Analytic Data Stores

November 1, 2018

Imply's cofounder and CTO, Gian Merlino, gives a presentation on Druid and operational analytic databases.

SF Scala: Gian Merlino Interview

October 31, 2018

A brief interview with Imply's CTO and cofounder, Gian Merlino.

New data architectures for high performance netflow analytics

October 10, 2018

Learn about how open source streaming technologies such as Apache Kafka and Apache Druid can be combined to analyze network traffic data.

Introduction to Imply

July 18, 2018

Imply is an operational data analytics platform that is designed from the ground up for event-driven data.

Interactive Exploratory Analytics with Druid

July 21, 2017

Learn more about how Druid can power exploratory workflows that go beyond dashboarding and reporting.

Building an Open Source Streaming Analytics Stack with Kafka and Druid

February 23, 2017

Learn how to build an end-to-end streaming analytic stack by combining a message bus (Apache Kafka), a stream processor, and a query engine (Apache Druid).

OLAP for Big Data, SF Data Mining Meetup

April 13, 2016

A look into what is needed to support OLAP for a big distributed data platform.

Technical deep dive into how Outbrain scales its real-time analytics

Because Outbrain processes billions of impressions and events a day, they risk running into scaling problems.