Oct 2, 2020
Archmage, Pinterest’s Real-time Analytics Platform on Druid
In this talk, we will cover:
- The motivation for switching from an HBase-backed analytics system to Druid
- The architecture of Druid as a platform at Pinterest (Archmage, Hadoop, Kafka). Archmage, the query interface, is a Thrift service in front of Druid that exposes a Thrift API to company-wide clients, handles discovery of Druid broker hosts, relays requests to the brokers to abstract away the async HTTP connection, and applies query optimizations that are transparent to clients, such as directly translating fixed-pattern SQL into Druid native JSON queries to save planning time. We will also cover the production Hadoop batch and Kafka real-time ingestion pipelines, and why we picked a pull-based rather than a push-based solution for real-time ingestion.
- The use cases currently running in production on this platform, including their data volume, QPS, and Druid cluster setup, the unique challenges we met while onboarding them, how we addressed those challenges with extensive tuning to meet SLAs, and the lessons learned. The use cases are: partner insights, which provides partners with stats on organic pins; real-time spam detection, which detects anomalous user-login events and pin-related spamming events such as pin creation and repin; and the migration of the backend for Ads-related experiment data analysis from Presto to Druid.
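As a rough illustration of the fixed-pattern SQL translation mentioned above, the sketch below maps one hypothetical SQL shape (a row count over a time range) straight to a Druid native timeseries query, skipping SQL planning. The pattern, the function name `translate_count_query`, the table name, and the output name `cnt` are all assumptions made for this example; the abstract does not describe how Archmage actually implements the translation.

```python
import json
import re

# Hypothetical fixed pattern: a row count over a time range on a single table.
_COUNT_PATTERN = re.compile(
    r"^\s*SELECT\s+COUNT\(\*\)\s+FROM\s+(?P<table>\w+)\s+"
    r"WHERE\s+__time\s*>=\s*TIMESTAMP\s+'(?P<start>[^']+)'\s+"
    r"AND\s+__time\s*<\s*TIMESTAMP\s+'(?P<end>[^']+)'\s*$",
    re.IGNORECASE,
)


def _to_iso(ts):
    """Turn a SQL timestamp literal ('YYYY-MM-DD HH:MM:SS') into ISO 8601."""
    return ts.strip().replace(" ", "T")


def translate_count_query(sql):
    """Return a Druid native query dict, or None if the SQL does not match.

    A None result means the caller should fall back to the normal SQL path.
    """
    m = _COUNT_PATTERN.match(sql)
    if m is None:
        return None
    interval = f"{_to_iso(m.group('start'))}/{_to_iso(m.group('end'))}"
    return {
        "queryType": "timeseries",
        "dataSource": m.group("table"),
        "intervals": [interval],
        "granularity": "all",
        "aggregations": [{"type": "count", "name": "cnt"}],
    }


if __name__ == "__main__":
    sql = (
        "SELECT COUNT(*) FROM pin_events "
        "WHERE __time >= TIMESTAMP '2020-10-01 00:00:00' "
        "AND __time < TIMESTAMP '2020-10-02 00:00:00'"
    )
    print(json.dumps(translate_count_query(sql), indent=2))
```

Queries that do not match the pattern return `None`, so a relay service built this way could forward them unchanged to the regular SQL endpoint.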
Modern Analytics Applications in the Financial Industry
Join Eric Tschetter, Field CTO of Imply and co-creator of Apache Druid, along with Ravi Maurya and Shubham Gupta from Paytm, in this inspiring virtual event.
Keynote – Building Modern Analytics Applications with Apache Druid
Fangjin Yang, Co-Founder and Chief Executive Officer of Imply, presents the opening keynote at the Druid Summit 2021 virtual conference.
Technical deep dive into how Outbrain scales its real-time analytics
Because Outbrain processes billions of impressions and events a day, they risk running into scaling problems.
Imply x Kafka: Capture, interact and scale streaming data
Imply and Kafka together form an architecture for capturing and surfacing streaming data through interactive queries at unlimited scale.
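Real-Time Data Analysis and Visualization with Imply, Integrated with Confluent Cloud
Live demo: how to build an architecture that captures and surfaces streaming data through interactive queries and unlimited scale.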
Making Real-Time Data a Reality for your Business
Technaura takes you through an hour of practical, real-time, data-driven outcomes.
Apache Druid Engine Roadmap
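If you are still waiting every day for your data to refresh, it is time to try Apache Druid, a sub-second database. This open-source database is widely deployed at Twitter, Pinterest, and Snapchat. In this webinar, Imply product manager Will Xu will share common use cases, the unique features of the Imply Druid enterprise edition, and the future product roadmap.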
Comparing drill-down workflows between Tableau & Snowflake with Imply
In this video, Jad Naous demonstrates drill-down workflows between Tableau & Snowflake with Imply.