Ingestion

Learn best practices for ingesting data into Druid across both batch and streaming methods.

Keeping up with changing schemas in streaming data

Discover how Apache Druid handles schema changes in streaming data. Its approach eases the operational burden of evolving schemas, reinforcing Druid’s strength as a database for real-time analytics.

Upserts and Data Deduplication with Druid

This article explains several options to accomplish upserts and data deduplication in Apache Druid.
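
The article walks through the full set of options; as one illustration, the hedged sketch below (datasource and column names are hypothetical) deduplicates at query time by keeping the most recent row per key with Druid SQL's LATEST aggregator, submitted over the standard SQL API.

```python
# A minimal sketch (not the article's exact approach): query-time deduplication
# with LATEST, which returns the value from the row with the most recent __time
# in each group. Datasource and column names here are placeholders.
import requests

SQL_URL = "http://localhost:8888/druid/v2/sql"  # Druid SQL API (via the Router)

query = """
SELECT
  "order_id",
  LATEST("status", 128) AS latest_status,  -- string variant needs a max byte size
  LATEST("amount")      AS latest_amount
FROM "orders"
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
GROUP BY "order_id"
"""

resp = requests.post(SQL_URL, json={"query": query})
resp.raise_for_status()
for row in resp.json():
    print(row)
```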

Exploring Unnest in Druid

This article explores the UNNEST operator and shows how Druid supports multi-value strings through multi-value dimensions (MVDs), which automatically flatten during a GROUP BY.
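
To make the distinction concrete, the hedged sketch below (hypothetical "pages" datasource with a multi-value "tags" column) contrasts the implicit flattening that happens under GROUP BY with an explicit UNNEST, available in recent Druid releases.

```python
# A minimal sketch (names are placeholders): multi-value dimensions flatten
# implicitly under GROUP BY, while UNNEST makes the row expansion explicit.
import requests

SQL_URL = "http://localhost:8888/druid/v2/sql"

# Implicit flattening: each value in the multi-value "tags" column becomes its own group.
mvd_group_by = 'SELECT "tags", COUNT(*) AS cnt FROM "pages" GROUP BY "tags"'

# Explicit flattening with UNNEST over the same column.
unnest_query = """
SELECT t.tag, COUNT(*) AS cnt
FROM "pages" CROSS JOIN UNNEST(MV_TO_ARRAY("tags")) AS t(tag)
GROUP BY t.tag
"""

for q in (mvd_group_by, unnest_query):
    rows = requests.post(SQL_URL, json={"query": q}).json()
    print(rows[:5])
```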

Ingesting Batch Data

It’s easy to ingest batch data into Druid, turning files into fully indexed, high-performance tables using SQL commands. This lesson provides an overview of ingesting data from batch sources, with in-database transformation capabilities and overall workflow automation.
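
As a taste of what the lesson covers, the hedged sketch below submits a SQL-based batch ingestion statement to the multi-stage query task endpoint; the source URL, column list, and datasource name are placeholders rather than the lesson's own example.

```python
# A minimal sketch of SQL-based batch ingestion: EXTERN points at the external
# file, the SELECT applies in-flight transformations, and PARTITIONED BY sets
# time partitioning. All names and URLs here are placeholders.
import requests

TASK_URL = "http://localhost:8888/druid/v2/sql/task"  # multi-stage query task API

ingest_sql = """
INSERT INTO "web_events"
SELECT
  TIME_PARSE("timestamp") AS __time,
  "page",
  LOWER("country") AS "country",
  "bytes"
FROM TABLE(
  EXTERN(
    '{"type":"http","uris":["https://example.com/events.json.gz"]}',
    '{"type":"json"}',
    '[{"name":"timestamp","type":"string"},{"name":"page","type":"string"},{"name":"country","type":"string"},{"name":"bytes","type":"long"}]'
  )
)
PARTITIONED BY DAY
"""

task = requests.post(TASK_URL, json={"query": ingest_sql}).json()
print("submitted ingestion task:", task.get("taskId"))
```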

Ingesting Stream Data

This lesson outlines the key considerations when ingesting data from streaming pipelines such as Apache Kafka and Amazon Kinesis. It covers Data Sources, Schema Definition, Ingestion Spec, and Monitoring.
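
For orientation, the hedged sketch below shows the rough shape of a Kafka supervisor spec and how it is submitted to Druid; the topic, schema, and connection details are placeholders, not the lesson's example.

```python
# A minimal sketch of starting Kafka streaming ingestion: build a supervisor
# spec (data source, timestamp, dimensions, topic, consumer properties) and
# POST it to the supervisor API. All names here are placeholders.
import requests

SUPERVISOR_URL = "http://localhost:8888/druid/indexer/v1/supervisor"

spec = {
    "type": "kafka",
    "spec": {
        "dataSchema": {
            "dataSource": "clicks",
            "timestampSpec": {"column": "timestamp", "format": "iso"},
            "dimensionsSpec": {"dimensions": ["user", "page", "country"]},
            "granularitySpec": {"segmentGranularity": "hour", "queryGranularity": "none"},
        },
        "ioConfig": {
            "topic": "clicks",
            "inputFormat": {"type": "json"},
            "consumerProperties": {"bootstrap.servers": "localhost:9092"},
            "taskCount": 1,
            "useEarliestOffset": True,
        },
        "tuningConfig": {"type": "kafka"},
    },
}

resp = requests.post(SUPERVISOR_URL, json=spec)
resp.raise_for_status()
print("supervisor started:", resp.json())
```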

Migrating Data From S3 To Apache Druid

This article covers the rationale, advantages, and step-by-step process for transferring data from Amazon S3 to Apache Druid for faster real-time analytics and querying.
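
The main ingestion-side difference from any other batch load is the input source; the hedged sketch below shows the S3 input source fragment a batch task or an EXTERN call would read from (bucket, prefix, and format are placeholders, and cluster-side S3 credentials are assumed to be configured).

```python
# A minimal sketch (placeholder bucket and format): the S3 input source used
# when pulling data out of Amazon S3 into Druid.
s3_input_source = {
    "type": "s3",
    "prefixes": ["s3://example-bucket/exports/2024-06/"],
}

# Plugged into a native parallel batch ioConfig ...
io_config = {
    "type": "index_parallel",
    "inputSource": s3_input_source,
    "inputFormat": {"type": "json"},
}
# ... or serialized as the first argument of EXTERN in SQL-based ingestion.
```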

Migrate Analytics Data from Snowflake to Apache Druid

This tutorial outlines how to migrate data from Snowflake to Apache Druid. It covers extracting data from Snowflake, ingesting it into Druid, and querying it there.

Migrate Analytics Data from MongoDB to Apache Druid

This tutorial steps through the procedure for migrating data from MongoDB to Apache Druid. It covers extracting data from MongoDB, ingesting it into Druid, and handling changing data.

Automatic Kafka Stream Topic Detection and Ingestion

This tutorial outlines how to automate ingestion of new Kafka topics into Apache Druid. It shows how to develop a monitoring agent that detects new topics and automatically creates ingestion tasks for them in Druid.
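
As a much-reduced illustration of the same idea (not the tutorial's actual agent), the sketch below polls Kafka for topics it has not seen before and starts a Druid supervisor for each one; it assumes the confluent-kafka client, placeholder connection settings, and schema auto-discovery on the Druid side.

```python
# A much-reduced sketch of the idea (not the tutorial's code): poll Kafka for
# new topics and submit a Druid Kafka supervisor for each one.
import time
import requests
from confluent_kafka.admin import AdminClient

KAFKA_BOOTSTRAP = "localhost:9092"
SUPERVISOR_URL = "http://localhost:8888/druid/indexer/v1/supervisor"

admin = AdminClient({"bootstrap.servers": KAFKA_BOOTSTRAP})
known_topics: set[str] = set()

def supervisor_spec(topic: str) -> dict:
    """Build a minimal Kafka supervisor spec; schema details are placeholders."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": topic,
                "timestampSpec": {"column": "timestamp", "format": "auto"},
                "dimensionsSpec": {"useSchemaDiscovery": True},
                "granularitySpec": {"segmentGranularity": "hour"},
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": KAFKA_BOOTSTRAP},
            },
            "tuningConfig": {"type": "kafka"},
        },
    }

while True:
    topics = set(admin.list_topics(timeout=10).topics) - {"__consumer_offsets"}
    for topic in sorted(topics - known_topics):
        requests.post(SUPERVISOR_URL, json=supervisor_spec(topic)).raise_for_status()
        print(f"started supervisor for new topic: {topic}")
    known_topics |= topics
    time.sleep(60)
```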

Schema Auto-Discovery with Apache Druid

This blog details how to ingest data into Apache Druid without specifying a schema. It also covers how to use the Druid Console to query and visualize data.
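
For reference, the hedged sketch below shows the dimensionsSpec fragment that turns this behavior on in an ingestion spec; the excluded field name is hypothetical.

```python
# A minimal sketch: with useSchemaDiscovery enabled, Druid infers column names
# and types from the incoming data instead of a hand-written dimension list.
dimensions_spec = {
    "useSchemaDiscovery": True,
    "dimensions": [],                          # explicitly listed columns still take precedence
    "dimensionExclusions": ["debug_payload"],  # hypothetical field to leave out
}
```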

An in-depth look at streaming ingestion

An in-depth review of how streaming ingestion works in Apache Druid, covering the internals of each ingestion task and how the tasks work together to deliver scale and throughput.

Pruning in Druid

Fourth and final video in the Partitioning Series that describes and demonstrates how different partitioning strategies enable segment pruning at query time.

Range Partitioning in Druid

Third video in the Partitioning Series that explains how single-dimension and range partitioning work.

Hash Partitioning in Druid

Second video in the Partitioning Series that describes the details of how hash partitioning works.

Intro & Dynamic Partitioning in Druid

First video in the Partitioning Series that explains the concepts of time partitioning and secondary partitioning.
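
As a companion to the four-part Partitioning Series above, the hedged sketch below shows roughly how the secondary partitioning strategies it covers are expressed as a native batch tuningConfig's partitionsSpec; dimension names and row targets are placeholders.

```python
# A minimal sketch of the secondary partitioning strategies as partitionsSpec
# objects. Dimension names and sizes are placeholders.
dynamic_spec = {"type": "dynamic", "maxRowsPerSegment": 5_000_000}

hashed_spec = {
    "type": "hashed",
    "partitionDimensions": ["channel"],        # omit to hash across all dimensions
    "targetRowsPerSegment": 5_000_000,
}

range_spec = {
    "type": "range",                           # "single_dim" is the one-column variant
    "partitionDimensions": ["country", "city"],
    "targetRowsPerSegment": 5_000_000,
}
```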

Schema Auto Discovery Demo Video

This technical video shows the new Apache Druid schema auto-discovery feature in action and how it can be leveraged to simplify data ingestion.
