Lessons

Advance working knowledge of Druid with lessons designed to navigate technical approaches and best practices across the development lifecycle.

Planning a Druid Project

Aimed at developers getting started with Apache Druid, this lesson walks through the key planning phases of a Druid project, including purpose, queries, data sources, visualization, infrastructure, and documentation.

Generating Synthetic Data for Development and Testing

This lesson outlines various approaches to generating synthetic data for nonproduction uses, including data requirements and programmatic generation.

Ingesting Batch Data

It’s easy to ingest batch data into Druid, turning files into fully indexed, high-performance tables using SQL commands. This lesson provides an overview of ingesting data from batch sources, including in-database transformation capabilities and overall workflow automation.

Ingesting Stream Data

This lesson outlines the key considerations when ingesting data from streaming pipelines, including Apache Kafka and Amazon Kinesis. It covers Data Sources, Schema Definition, Ingestion Spec, and Monitoring.

How the Time Series Extension Can Enable IoT Use Cases in Imply Polaris

Temporal and real-time data have always been the cornerstones of Apache Druid, making it a natural fit for IoT applications that collect and analyze real-time sensor data. However, some functionality available in other time-series databases was missing from Druid. The recently released time series extension from Imply closes that gap, adding the ability to perform advanced time series analysis.

Working with Changing Data

This lesson covers key considerations when working with data sources that mutate over time, including creating tables from streaming data and using schema auto-discovery.

Working with Nested JSON in Druid

This lesson covers key considerations for how to use nested data in Apache Druid.

Scaling a Druid Cluster

This lesson walks through the design and implementation of Druid infrastructure, including Initial Cluster Creation, Infrastructure Provider Choice, Adding and Removing Capacity, Landscape Design, and Performance Tuning.

Monitoring a Druid Cluster

This lesson covers how to consider and implement monitoring for internal Druid operations and cluster infrastructure.

Securing a Druid Cluster

This lesson describes how to secure a Druid cluster, including authorization, authentication, network security, and best practices.

Providing High Availability and Disaster Recovery

This lesson provides details on designing and implementing Druid to ensure always-on operations and high data durability.
