Why Data Needs More Than CRUD
After more than 30 years of working in data analytics, we’ve witnessed (and sometimes participated in) three major shifts in how we find insights from data, and now we’re looking at the fourth.
Planning a Druid Project
For people getting started with Apache Druid, this lesson takes a developer through all the key planning phases for a Druid project, including purpose, queries, data sources, visualization, infrastructure, and documentation.
Generating Synthetic Data for Development and Testing
This lesson outlines various approaches to generating synthetic data for nonproduction uses, including data requirements and programmatic generation.
Ingesting Batch Data
It’s easy to ingest batch data into Druid, turning files into fully indexed, high-performance tables using SQL commands. This lesson provides an overview of ingesting data from batch sources, including in-database transformation capabilities and overall workflow automation.
Ingesting Stream Data
This lesson outlines the key considerations when ingesting data from streaming pipelines, including Apache Kafka and Amazon Kinesis. It includes information on Data Sources, Schema Definition, Ingestion Spec, and Monitoring.
Working with Changing Data
This lesson covers key considerations when working with data sources that mutate over time, including creating tables from streaming data and using schema auto-discovery.
Working with Nested JSON in Druid
This lesson covers key considerations for using nested data in Apache Druid.
Scaling a Druid Cluster
This lesson walks through the design and implementation of Druid infrastructure, including Initial Cluster Creation, Infrastructure Provider Choice, Adding and Removing Capacity, Landscape Design, and Performance Tuning.
Monitoring a Druid Cluster
This lesson covers how to plan and implement monitoring for internal Druid operations and cluster infrastructure.
Securing a Druid Cluster
This lesson describes how to make a Druid cluster secure, including authorization, authentication, network security, and best practices.
Providing High Availability and Disaster Recovery
This lesson provides details on designing and implementing Druid to ensure always-on operations and high data durability.