RixEngine

RixEngine halved costs and query latency with Imply Enterprise Hybrid

“The transition to Imply Enterprise Hybrid has significantly enhanced our operational efficiency and flexibility for data roll-ups, drill-downs, and multi-dimensional data analysis. Imply has become an indispensable tool for our business and marketing teams, providing data-driven insights to optimize ad strategies and deliver real-time, actionable insights.”
Zeyu Huang, Data Engineer | RixEngine

Summary

AlgoriX and RixEngine both upgraded from Apache Druid to Imply Enterprise Hybrid to enhance real-time analytics capabilities for their global ad exchange business. Imply significantly enhanced their flexibility for data roll-ups, drill-down analysis, and query performance for multi-dimensional analysis. These improvements have made Imply Hybrid an indispensable tool for their business and marketing teams, who heavily rely on its advanced analytics capabilities to gain insights, optimize advertising strategies, and make data-driven decisions. This upgrade has not only improved operational efficiency, but also strengthened RixEngine’s ability to deliver real-time, actionable insights to various stakeholders.

Highlights

  • Decreased infrastructure costs by 50%+ due to efficient data compression, tiered storage, streamlined cluster management, and a 70% reduction in bandwidth usage
  • Improved query speed from >10 seconds to <5 seconds
  • Saved 8 hours of engineering resources per week
  • Accelerated data analysis across 100+ dimensions with real-time dashboards
  • Lowered operational overhead with simplified ingestion, management, and support

Background

AlgoriX is an independent global media and technology company that provides mobile monetization and advertising solutions to publishers, buyers, and advertisers. RixEngine, an AI-driven advertising exchange platform incubated by AlgoriX, operates independently as a full-stack SaaS solution for programmatic advertising. This advanced platform requires robust analytics capabilities to facilitate real-time bidding and generate actionable insights for performance optimization.

Prior to adopting Imply Enterprise Hybrid for real-time analytics, RixEngine relied on Apache Druid® and Amazon Redshift as their core database technologies. For visualization, they used Metabase and an internal operations platform.

Challenge

Before switching to Imply Enterprise Hybrid, RixEngine relied heavily on open-source Druid as their primary database. Although Druid provided a performant foundation for real-time analytics, they faced several challenges that prompted them to upgrade to Imply Hybrid as a more robust and scalable solution:

  • Unstable Query Performance: Initially, RixEngine attempted to improve query performance by adding more data nodes, but this approach proved to be costly and only provided marginal improvements. They tried tuning parameters for the Historical and Broker nodes, but couldn’t achieve the desired performance levels.
  • Data Ingestion Bottlenecks: With daily data ingestion volumes reaching hundreds of terabytes, RixEngine frequently experienced data consumption delays. The ingestion pipeline was often overwhelmed, and the service was fragile. Even minor changes, such as adding a new field, could lead to unexpected issues and disruptions.
  • Operational Complexity: Managing and maintaining the Druid cluster required significant effort, and the system’s limitations made it difficult to scale efficiently or adapt to evolving business needs.
  • Data Silos: Data was spread across various systems, including Kafka streams (ad request and reporting logs, blocked ad request logs, machine monitoring data), AWS S3, Google Cloud Storage (GCS), MySQL databases (operational platform data), and third-party platforms providing anti-fraud information. As each source had different urgency and importance profiles, the distributed landscape made comprehensive analysis difficult, slowed down reporting, and increased operational overhead.

Solution

Overall, Imply Hybrid provided stable and high-performance queries, efficient handling of large-scale data ingestion, and flexible, in-depth data exploration. Imply’s managed services also reduced operational overhead for the RixEngine team, allowing them to focus more on delivering value to the business. Beyond these core capabilities, Imply delivered two key improvements for RixEngine: (1) unified data analysis and (2) flexible data management.

(1) Unified data analysis: Imply helped RixEngine eliminate data silos by centralizing data into a unified platform and improving data quality through cleansing, transformation, and enrichment. This consolidation replaced a fragmented architecture that spanned multiple databases, BI tools, and self-managed systems, thereby reducing operational overhead and the risk of data errors. Imply provides a single source of truth via:

  • Real-time Ingestion (Kafka): Imply’s robust Kafka integration allows RixEngine to ingest streaming data with low latency, which is crucial for near real-time dashboards, monitoring, and alerting on critical events.
  • Batch Ingestion (MSQ): For less time-sensitive data in S3 and GCS, RixEngine leverages Imply’s MSQ (Multi-Stage Query engine) for efficient batch ingestion and reindexing. This gives them the flexibility to load historical data or update existing datasets.
  • Lookups: Imply’s Lookup feature further enriches their data by augmenting event data with contextual information stored in an external system (MySQL).
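
As a rough illustration of the streaming and lookup paths above, the Python sketch below builds an Apache Druid style Kafka supervisor spec and a SQL query that uses the LOOKUP function. It is not RixEngine’s actual configuration: the datasource `ad_requests`, the topic, the broker address, the dimension names, and the lookup name `publisher_names` are all hypothetical, and the lookup is assumed to be registered already against the MySQL table.

```python
import json


def kafka_supervisor_spec(datasource, topic, bootstrap_servers, dimensions):
    """Build a Kafka supervisor spec as a plain dict (all values are illustrative)."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": list(dimensions)},
                # Roll events up to one row per minute per dimension combination.
                "granularitySpec": {
                    "segmentGranularity": "hour",
                    "queryGranularity": "minute",
                    "rollup": True,
                },
                "metricsSpec": [{"type": "count", "name": "events"}],
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": False,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


def enriched_query(datasource, id_column, lookup_name):
    """SQL that maps an ID dimension to a readable name through a registered lookup."""
    return (
        f"SELECT LOOKUP({id_column}, '{lookup_name}') AS name, SUM(events) AS events\n"
        f'FROM "{datasource}"\n'
        "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR\n"
        "GROUP BY 1\n"
        "ORDER BY 2 DESC"
    )


if __name__ == "__main__":
    spec = kafka_supervisor_spec(
        datasource="ad_requests",          # hypothetical datasource name
        topic="ad_requests",               # hypothetical Kafka topic
        bootstrap_servers="kafka:9092",    # hypothetical broker address
        dimensions=["publisher_id", "country", "ad_format"],
    )
    print(json.dumps(spec, indent=2))
    print(enriched_query("ad_requests", "publisher_id", "publisher_names"))
```

In a Druid-based deployment, a spec like this is typically posted to the Overlord’s supervisor endpoint (`/druid/indexer/v1/supervisor`) and the query is sent to the SQL endpoint; the exact URLs and authentication depend on the cluster.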

(2) Flexible data management: Imply Hybrid’s flexibility allows RixEngine to tailor data management to their specific needs for optimal performance. To illustrate:

  • Adaptive Data Compression: For data with low query frequency, they leverage Imply’s auto-compaction feature, minimizing storage costs without impacting performance. Conversely, for frequently queried data, they use MSQ to periodically compress data older than three hours. This ensures optimal query performance for recent data while maintaining cost-effective storage for historical data.
  • Real-time Data Enrichment with Lookups: For contextual data or ID-to-name mappings stored in MySQL, Imply’s lookup functionality enables rapid implementation and enrichment of data, requiring minimal code changes and accelerating time to market for new features.
  • Efficient Data Synchronization with MSQ: As RixEngine maintains a separate copy of critical data in Redshift and other databases, they use Imply’s MSQ to efficiently export key dimensions and metrics from various sources to S3, enabling seamless data synchronization between systems. This streamlines their data pipelines and ensures data consistency across their organization.
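
One plausible reading of the periodic MSQ job described above is a scheduled REPLACE statement that rewrites each hour of data once it is at least three hours old. The Python sketch below only generates such a statement; it is an assumption about how the job could look, not RixEngine’s implementation. The datasource `ad_requests` and the clustering column `publisher_id` are hypothetical, and timestamps are treated as UTC.

```python
from datetime import datetime, timedelta


def floor_to_hour(ts):
    """Truncate a datetime to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def compaction_statement(datasource, now, min_age_hours=3, cluster_by=("publisher_id",)):
    """Build an MSQ REPLACE statement that rewrites one hour-aligned window.

    The window is the most recent full hour that is at least `min_age_hours` old.
    OVERWRITE WHERE is kept hour-aligned so it matches PARTITIONED BY HOUR.
    """
    end = floor_to_hour(now - timedelta(hours=min_age_hours))
    start = end - timedelta(hours=1)
    fmt = "%Y-%m-%d %H:%M:%S"
    window = (
        f"__time >= TIMESTAMP '{start.strftime(fmt)}' "
        f"AND __time < TIMESTAMP '{end.strftime(fmt)}'"
    )
    clustered = ", ".join(f'"{column}"' for column in cluster_by)
    return (
        f'REPLACE INTO "{datasource}"\n'
        f"OVERWRITE WHERE {window}\n"
        f'SELECT * FROM "{datasource}"\n'
        f"WHERE {window}\n"
        "PARTITIONED BY HOUR\n"
        f"CLUSTERED BY {clustered}"
    )


if __name__ == "__main__":
    sql = compaction_statement("ad_requests", datetime.utcnow())
    # An MSQ task is usually submitted as a JSON body of this shape.
    payload = {"query": sql}
    print(payload["query"])
```

A scheduler could run this once an hour and submit the payload to the SQL task endpoint (`/druid/v2/sql/task` in Apache Druid). Clustering by a frequently filtered dimension is a design choice in this sketch; the right columns depend on the actual query patterns.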

Impact

Switching to Imply Enterprise Hybrid helped RixEngine decrease machine costs by over 50%, while reducing query times from over 10 seconds to under 5 seconds. The 50% reduction in total cost of ownership (TCO) for their infrastructure stemmed from four sources: data node optimization (50% fewer nodes deployed for equivalent workloads with Imply vs. self-hosted Druid), bandwidth optimization across machines (a multi-stage aggregation pipeline co-developed with Imply reduced bandwidth usage by 70%), tool consolidation, and operational efficiency gained from fewer data replay incidents and lower DevOps overhead.

Additionally, Imply saved RixEngine’s engineering team 8 hours per week by improving data analysis, monitoring, visualization and support processes via:

  • Enhanced Data Analysis: The data cube, alert, and dashboard modules have enriched their data analysis capabilities, enabling quick issue identification across nearly 100 dimensions. This has allowed them to pinpoint root causes, whether they originate from RixEngine’s own service, network providers, or upstream/downstream users.
  • Superior Support: The support team’s proactive standby during upgrades, scenario-based testing, and assistance in testing new features like window functions have enhanced RixEngine’s efficiency.
  • Improved Monitoring with Clarity: Clarity has provided better service and query monitoring. RixEngine can observe which dimensions have the poorest query performance and consider adding them as partition keys to optimize performance.
  • User-Friendly UI: The Pivot design in Imply’s UI is more intuitive and user-friendly compared to alternatives like Metabase and Superset, making it easier for users to perform queries and navigate the platform.

Moving forward, RixEngine plans to continue advancing along the path of data- and algorithm-driven development by combining Imply’s data analysis capabilities (e.g. more SQL quantile-related and window functions) with large language models. They’re also investigating how to make their visualizations more accessible via Pivot. The overall goal is to guide the team toward deeper data analysis and present it in an easily understandable way, lowering the barrier for non-developers to use these functions.

