Federated Queries, Separation of Compute and Storage, and Real-Time Analytics — All from Imply Polaris

Sep 09, 2024
Matt Morrissey

Modern, real-time analytics applications must deliver a predictable experience to users, requiring always-on infrastructure. This means the underlying systems need to be continuously operational, with minimal latency and downtime, to ensure users can access data and insights whenever they need them.

However, maintaining always-on infrastructure for seldom-accessed data is not cost-effective, even though access to this data remains essential. Instead of using different infrastructures for real-time and long-term data, it’s more efficient to load the data once and use it from a single source.

This is where Imply Polaris comes in. It offers a powerful solution that combines federated queries, separation of compute and storage, and real-time analytics. With Imply Polaris, you can achieve fast, interactive queries and cost-effective options for aged data, all from a single, efficient platform.

Quick Overview of Imply Polaris

Imply Polaris is a database-as-a-service (DBaaS) built on Apache Druid, the leading real-time analytics database trusted by organizations like Confluent, Netflix, Target, and Salesforce. Druid excels at serving sub-second queries on vast amounts of streaming and batch data, processing hundreds to thousands of queries per second. Developers choose Imply Polaris for its ability to reduce time to market, enhance productivity, and lower operational costs.

Introducing Async Query 

Async Query for Imply Polaris allows for on-demand queries leveraging ephemeral infrastructure. In addition to querying data from always-on infrastructure, Polaris can now query data from deep storage (e.g. storage on an object store like S3) or external systems, ensuring efficient and cost-effective access to both fresh and historical data.

How It Works

Async Query leverages on-demand, ephemeral infrastructure that is provisioned temporarily and released when no longer needed. This approach mirrors the ‘Separation of Compute and Storage’ model popularized by Cloud Data Warehouses, bringing similar benefits to Imply Polaris users. The cost efficiency comes from using the most cost-effective storage options (hot or cold) while enabling infrastructure to be turned on/off and compute to be scaled independently—capabilities already built into Polaris.
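From a client's point of view, an asynchronous query typically follows a submit, poll, and fetch pattern: the query is handed off, compute is provisioned in the background, and results are collected once the job finishes. The sketch below illustrates that general pattern only. `FakeAsyncBackend`, `run_async`, and the status strings are hypothetical names invented for this example, and the backend is an in-memory stand-in, not the Polaris API.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FakeAsyncBackend:
    """Hypothetical in-memory stand-in for an async query service.

    This is NOT the Polaris API; it only mimics a job that takes
    a few status checks before it completes.
    """
    polls_until_done: int = 2
    jobs: dict = field(default_factory=dict)

    def submit(self, sql: str) -> str:
        # Register the query and hand back an identifier immediately.
        job_id = f"job-{len(self.jobs) + 1}"
        self.jobs[job_id] = {"sql": sql, "polls": 0}
        return job_id

    def status(self, job_id: str) -> str:
        # Each status check counts as one poll; the job "finishes"
        # after polls_until_done checks.
        job = self.jobs[job_id]
        job["polls"] += 1
        return "SUCCESS" if job["polls"] >= self.polls_until_done else "RUNNING"

    def results(self, job_id: str) -> list:
        # Placeholder result set that just echoes the submitted query.
        return [{"job_id": job_id, "sql": self.jobs[job_id]["sql"]}]


def run_async(backend, sql: str, poll_interval: float = 0.0, timeout: float = 5.0) -> list:
    """Submit a query, poll until it finishes, then fetch the results."""
    job_id = backend.submit(sql)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = backend.status(job_id)
        if state == "SUCCESS":
            return backend.results(job_id)
        if state == "FAILED":
            raise RuntimeError(f"query {job_id} failed")
        time.sleep(poll_interval)
    raise TimeoutError(f"query {job_id} did not finish within {timeout} seconds")


backend = FakeAsyncBackend(polls_until_done=3)
rows = run_async(backend, "SELECT COUNT(*) FROM transactions")
```

Because the caller only waits on a job identifier, the compute behind the query can be started when the job is submitted and released when it completes, which is the property the ephemeral-infrastructure model relies on.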

New Use Cases Enabled by On-Demand, Federated Queries

By leveraging ephemeral infrastructure and enabling seamless access to both fresh and historical data, Polaris opens up a range of innovative and cost-effective use cases. Here are some of the key ways Async Query can be applied:

Cost-Effective Ad-Hoc Analysis with Optimal Price-Performance Ratio

Analysts can now leverage Polaris for historical analysis without workarounds, simplifying the process and making historical data exploration more accessible and cost-effective.

Example: A financial services company uses Imply Polaris to analyze transaction data in real-time to detect fraud. With Async Query, analysts can also conduct historical analysis on transaction data from the past five years to identify long-term fraud patterns, trends in customer behavior, and the effectiveness of past fraud prevention measures.

Exports and Downloads with a Simplified Data Architecture

Developers can integrate export and download features into applications, allowing users to access and download large datasets or reports generated from historical data.

Example: A market research firm needs to track consumer sentiment in real-time through social media feeds. With Async Query, developers can leverage Polaris to integrate functionality into their applications that allows users to download large datasets or comprehensive reports. For instance, a user might download a report summarizing sentiment analysis over the last two years, including detailed insights on specific events or trends.

Complex Reporting with Low-Cost Queries

With Async Query, Polaris now supports resource-intensive queries for in-depth reporting, such as large complex joins, alongside real-time analytics.

Example: A digital advertising technology company can now effortlessly prepare an annual report on ad performance using data spanning a full year. They can easily schedule the generation of these annual reports asynchronously, with the data stored in deep storage. This enables cost-effective querying, allowing the company to efficiently access and analyze extensive datasets without incurring high costs.

How to Take Advantage of This

It’s very easy to take advantage of this feature through the use of data storage policies:

  • Hot Storage (Cache): Keep data in hot storage for any desired period, ensuring sub-second performance.
  • Deep Storage: Retain data in deep storage based on your retention policy, making it available for analysis.

This approach ensures that fresh data needed for real-time queries is kept in memory or local storage, while older, infrequently accessed data is stored in deep storage. Since deep storage uses object storage with higher latency, queries on this data can be run asynchronously, avoiding any impact on the latency of real-time queries.
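As a rough illustration of how an application might act on such a policy, the sketch below picks a query mode based on whether the requested time range falls inside the hot storage period. The function name `choose_query_mode`, the 30-day hot period, and the "sync"/"async" labels are assumptions made for this example; they are not Polaris settings or API values.

```python
from datetime import datetime, timedelta, timezone


def choose_query_mode(query_start: datetime, now: datetime, hot_period: timedelta) -> str:
    """Return "sync" if the whole query range is in hot storage, else "async".

    A query that reaches back past the hot storage boundary has to read
    from deep storage, so it is routed to the asynchronous path.
    """
    hot_boundary = now - hot_period
    return "sync" if query_start >= hot_boundary else "async"


# Hypothetical policy: keep the most recent 30 days in hot storage.
now = datetime(2024, 9, 9, tzinfo=timezone.utc)
hot_period = timedelta(days=30)

last_week = choose_query_mode(now - timedelta(days=7), now, hot_period)
last_year = choose_query_mode(now - timedelta(days=365), now, hot_period)
```

With these assumed values, a dashboard query over the last week stays on the always-on path, while a report reaching back a full year is sent to the asynchronous path.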

For the first time, you can maintain historical and real-time data efficiently within a single system, optimizing the price-performance ratio.

Summary

In summary, on-demand, federated queries with Imply Polaris provide significant advantages for modern analytics applications. By ensuring data consistency between real-time and historical data, businesses can reduce the risks associated with data discrepancies. The ease of use is another key benefit, as Imply Polaris consolidates real-time and historical data management into a single system, eliminating the need for multiple tools and simplifying the overall workflow. This unified approach not only enhances operational efficiency but also lowers costs by enabling more cost-effective storage of historical data. With Imply Polaris, organizations can optimize their price-performance ratio, ensuring they get the most value from their data infrastructure investments.

Learn More and Get Started for Free!

Ready to explore Polaris? Start your journey by signing up for a free account. You’ll automatically receive a US$500 credit to use within your first 30 days, with no credit card required. Or, take Polaris for a test drive and experience firsthand how easy it is to build your next analytics application. If you have questions or want to learn more, set up a demo with an Imply expert. We’re here to help you make the most of Imply Polaris for your real-time analytics needs.
