Lessons
Advance your working knowledge of Druid with lessons that cover technical approaches and best practices across the development lifecycle.
How to Monitor Your Data in Real Time with AWS IoT Core and Imply
Learn how to use AWS IoT Core and Imply to monitor your data in real time.
How to Incrementally Encode String Columns
Learn how to incrementally encode string columns.
Planning a Druid Project
For people getting started with Apache Druid, this lesson takes a developer through all the key planning phases for a Druid project, including purpose, queries, data sources, visualization, infrastructure, and documentation.
Generating Synthetic Data for Development and Testing
This lesson outlines various approaches to generating synthetic data for nonproduction uses, including data requirements and programmatic generation.
Ingesting Batch Data
It’s easy to ingest batch data into Druid, turning files into fully indexed, high-performance tables using SQL commands. This lesson provides an overview of ingesting data from batch sources, including in-database transformation capabilities and overall workflow automation.
Ingesting Stream Data
This lesson outlines the key considerations when ingesting data from streaming pipelines, including Apache Kafka and Amazon Kinesis. It includes information on Data Sources, Schema Definition, Ingestion Spec, and Monitoring.
How the Time Series Extension Can Enable IoT Use Cases in Imply Polaris
Temporal and real-time data have always been the cornerstones of Apache Druid, making it a natural fit for IoT applications that collect and analyze real-time sensor data. However, Druid lacked some functionality that was available in other time-series databases. The recently released time series extension from Imply closes that gap, enabling advanced time series analysis.
How to Execute Window Functions on Sketches
Learn how to use SQL window functions in Apache Druid.
Working with Changing Data
This lesson covers key considerations when working with data sources that mutate over time, including creating tables from streaming data and using schema auto-discovery.
Working with Nested JSON in Druid
This lesson covers key considerations for how to use nested data in Apache Druid.
How to Reduce Credential Iterations—and Improve API Performance
Learn how to speed up your API by reducing credential iterations.
How to Tune Apache Druid for Speed and Concurrency
Learn how to optimize Apache Druid for peak performance and concurrency.
Scaling a Druid Cluster
This lesson walks through the design and implementation of Druid infrastructure, including Initial Cluster Creation, Infrastructure Provider Choice, Adding and Removing Capacity, Landscape Design, and Performance Tuning.
Monitoring a Druid Cluster
This lesson covers how to consider and implement monitoring for internal Druid operations and cluster infrastructure.
Securing a Druid Cluster
This lesson describes how to make a Druid cluster secure, including authorization, authentication, network security, and best practices.
Providing High Availability and Disaster Recovery
This lesson provides details on designing and implementing Druid to ensure always-on operations and high data durability.