Articles
Gain a clear understanding of the data ecosystem, best practices, design patterns, and key decision criteria for various technical objectives and use cases.
Building a Real-Time Analytics Architecture with Imply Polaris on Azure
Imply Polaris is now available on Azure! Learn more about what Polaris can do for your Azure-based applications.
Things to Consider When Scaling Analytics for High QPS
In the era of analytics where query volume is the sine qua non “V” of data, how should we think about system architecture – what matters and why?
Why Analytics Needs More than a Data Warehouse
For decades, analytics were defined by business intelligence and executive-style reports powered by read-optimized data warehouses. This article dives into how analytics are shifting from batch reporting workflows to real-time application workflows.
Why Data needs more than CRUD
After over 30 years of working with data analytics, we’ve been witness (and sometimes participant) to three major shifts in how we find insights from data – and now we’re looking at the fourth.
Overcome tradeoffs with schemaless databases
In this article, we explore the challenges posed by schemaless databases and introduce Druid, a groundbreaking database that seamlessly combines schema flexibility with high-performance capabilities, eliminating the need for trade-offs.
Three Ways to Use Apache Druid for Machine Learning Workflows
Apache Druid is an excellent addition to any machine learning environment and can facilitate analytics, streamline monitoring, and add real-time data to operations and training.
Distributed by Nature: Druid at Scale
This blog explains how Druid’s architecture and built-in automation make it easy to operate and scale in cloud and k8s environments.
Real-Time Analytics: Building Blocks and Architecture
There’s an increasing need for immediacy in data analytics, and it’s happening at scale on large data sets. This post unpacks the key building blocks and data architecture for real-time analytics.
Apache Druid: Making 1000+ QPS for Analytics Look Easy
This post dives into Apache Druid’s architecture with details on how it can efficiently handle analytics applications needing high QPS.
Apache Kafka, Flink, and Druid: Open Source Essentials for Real-Time Data Products
Apache Kafka, Flink, and Druid, when used together, create a real-time data architecture for a wide range of streaming data-powered use cases, including alerting, monitoring, dashboards, ad-hoc exploration, and decisioning workflows.
Keeping up with changing schemas in streaming data
Discover how Apache Druid delivers a unique solution for managing schema changes in streaming data. Its approach helps alleviate challenges, solidifying Druid’s position as the top database for real-time analytics.
Exploring Unnest in Druid
This article shows how Druid supports multi-value strings through multi-value dimensions (MVDs), which are automatically flattened during a group-by.
The Promise (and Limitations) of Range Partitions
Learn how to improve read times with range partitions.
An Introduction to Window Functions
Learn all about window functions.
Multi-dimensional range partitioning in Druid
Druid always partitions data by the timestamp dimension to benefit time-based analytical queries. A secondary partitioning is available to further break down the time chunks into manageable partition sizes.
Joins in Apache Druid
This blog explores Druid’s multiple options for joins, including ingestion-time and query-time joins, catering to different use cases and data scenarios.
Learn how to achieve sub-second responses with Apache Druid
A review of Druid’s query processing engine with an eye on performance. Provides many data modeling and query tips that improve response times.
The Significance of Schema Auto-Discovery in Apache Druid
This article provides a technical overview of the schema auto-discovery feature in Apache Druid through a practical IoT telemetry use case.
Four Key Considerations for Customer-Facing Analytics
Analytics aren’t just for internal stakeholders anymore. If you’re building an analytics application for customers, then you’re probably wondering…what’s the right database backend?
What Developers Can Build with Apache Druid
Exploring the popular uses for Apache Druid, a high-performance, real-time analytics database, and how it could fit in your data stack.