Two weeks ago, with heavy hearts, we decided to postpone the first face-to-face Druid Summit to the fall to ensure the health and safety of the global Druid community. The silver lining for those wishing to attend is that, since the conference is not until November 2-4, we have reinstated Druid Summit early bird pricing, so tickets are available at very affordable prices to hear talks from dozens of Druid practitioners and contributors and to get advanced Druid training.
However, in this time of regional lockdowns, a week can feel like a month and six months like an eternity. So we decided to pull together a select group of speakers to give shortened versions of their originally planned Druid Summit talks in a virtual setting.
I am happy to tell you that, 10 days after we officially postponed Druid Summit, we have launched Virtual Druid Summit, which will take place as a series of online talks on April 15. Each talk will be a spicy 30 minutes of great real-world information, followed by Q&A. There are 5 talks from Druid practitioners in a variety of industries, spanning 3 continents. The summit will open with an Apache Druid Vision and Roadmap talk from Apache Druid PMC Chair Gian Merlino and will close with a two-way, voice-interactive “ask us anything” session featuring Druid authors and contributors.
I very much hope you can join us for the Virtual Druid Summit in a few weeks, and then again face to face in the fall. In the meantime, we wish the best for you, your families, and your communities.
Visit the Virtual Druid Summit registration page to sign up and select your talks.
Virtual Druid Summit
April 15, 2020 8 AM - 3 PM Pacific Time
Apache Druid Vision and Roadmap
Gian Merlino, Apache Druid PMC Chair
8:00am - 8:30am PT
Gian will offer his reflections on the Druid journey to date, plus describe his vision for what Druid will become. He will lay out the near-term Druid roadmap and take your questions.
Automating CI/CD for Druid Clusters at Athena Health
Shyam Mudambi, Sr. Architect, Athena Health
9:00am - 9:30am PT
At Athena Health, we are creating a new performance management application for our clients, and one of its key components is Apache Druid. Since we are deploying this new application in the cloud, we needed an automated, CI/CD-based approach to create, update, and delete Druid clusters, as well as to scale different node groups within a cluster based on expected load. In this talk, we will go over how we implemented this process on AWS, using Terraform to deploy and update clusters within minutes.
Druid for Anti-Money Laundering (AML) Investigation
Arpit Dubey, SVP, Big Data Platform Lead & Architect, DBS
10:00am - 10:30am PT
DBS uses Druid to support AML (anti-money laundering) investigations for the compliance team. The AML workflow generates alerts, which are tracked within Druid. Transactional data is exported from an RDBMS to S3 and then ingested into Druid at regular intervals. Investigators can now slice and dice millions of records with low latency. Currently, over 4 million transactions per day are recorded in Druid.
In addition, Druid powers the following use cases:
Real-time dashboards for monitoring.
Serving the aggregated data required for machine learning modelling in the AML module.
No other data store provides similar performance for these use cases.
How Druid Powers Real-Time Analytics at BT
Pankaj Tiwari, Head of Engineering, BT
11:00am - 11:30am PT
We began this journey in Q2 2019 by asking Imply to help us onboard an in-house Network Performance Management project, and it has been an amazing journey with its fair share of ups and downs. Druid has plenty of features we could talk about; however, the ones that led us to choose it as our database are:
Highly distributed and horizontally scalable architecture
Shared-nothing architecture
Support for aggregation, post-aggregation, and statistical functions
Analytics over Terabytes of Data
Swapnesh Gandhi, Senior Software Engineer, Twitter
12:00pm - 12:30pm PT
MoPub, a Twitter company, provides monetization solutions for mobile app publishers and developers around the globe. MoPub receives over 33 billion ad requests per day, generating over 200 TB of raw logs every day. We built MoPub Analytics, our analytics platform, using Druid and Imply. Its end users are publishers, demand-side partners, and internal users.
We will talk about the architecture of the analytics platform, our Druid cluster setup, hardware choices, monitoring, use cases, limiting factors, challenges with lookups and solutions we used.
Using Druid for Network Monitoring and Trust Analytics at Cisco
TJ Giuli, Principal Engineer, and Abhishek Balaji Radhakrishnan, Software Engineer, Cisco Systems
1:00pm - 1:30pm PT
At Cisco’s Crosswork Cloud, we use Apache Druid for several use cases, including monitoring internet routing updates, tracking device inventory statistics, and ingesting trusted device events. In our talk, we share our experiences and insights on how we deploy, monitor, and integrate Druid with our applications. We describe the technical challenges that led us to migrate from a key-value data store to Druid, our pipeline architecture, and an overview of our streaming and batch workloads.
Experiences deploying, running, and monitoring Druid in production at Cisco.
Methods for safely querying multi-tenant data sources.
Techniques for using code generation to manage ingestion and data source schemas and to provide a strongly typed, end-to-end data flow throughout our system.
Apache Druid Fireside Chat (Ask Us Anything)
Fangjin Yang, Druid co-author; Gian Merlino, Apache Druid PMC Chair; Vadim Ogievetsky, Imply Chief Product Officer and Druid Contributor
2:00pm - 2:45pm PT
Take advantage of a rare opportunity to talk to some of the world’s most adept Druid experts. During this session, we open the mic to take your questions. Ask us anything. Really.