How we made long-running queries work in Apache Druid

by Charles Smith · in Community · October 25, 2021

When you think of querying with Apache Druid, you probably imagine queries over massive data sets that run in less than a second. However, a little birdie told us they wanted to use Druid for long-running queries with downloadable results. As you can imagine, the system wasn’t designed for this use case. This post covers what we did as a team to discover the user stories, define an asynchronous download API, and deliver it in a monthly STS release.

Hourglass image from Pixabay user xaviandrew.

Here at Imply, we have a fantastic mini demo session where anyone in the organization can present what they’re working on. The mini demos let us get early feedback so that we can build useful products that people love. Back in July, the co-founders Gian and Vadim surprised us with a proof of concept of a feature called “async downloads” seemingly out of the blue. The feature generated lots of interest and excitement, and it served the requested use case. It was time to figure out how to transform Gian and Vad’s prototype into a full-fledged feature.

Within a couple of weeks, the Druid engineering team had taken over the project. Siva Singaram in product management talked to customers to figure out precisely what the feature should do and, based on his research, outlined the customer requirements. Those conversations also meant that we had customers ready to use the feature even before we’d written a line of code.

With customers now clamoring for async downloads, we needed to take the initial design from PM and turn it into a technically sound, executable course of action. To move as quickly as possible, we sketched out high-level requirements for the API so that Dev, QE, and Docs could all work in parallel. With our game plan in place, we divided up the work across functions. Jihoon Son, Maytas Monsereenusorn, and Karan Kumar started the development tasks, keeping Andy Tsai from QE involved throughout the process.

The team is geographically distributed: Maytas is in Thailand and Karan is in India, while Jihoon and Andy are in California. This enabled us to submit a PR for review at the end of the day in California and have the review ready for us the next day. Merging happened more quickly, making the feature available for testing as soon as possible. When Andy broke things in his spectacular way, the team could address bugs and add guardrails to help improve stability and user experience.

Hourglass image from Unsplash user Surface.

Each time we finished some feature work, it was back to the mini demo session to show what we’d done and gather feedback. We adjusted our design and requirements in-flight to help drive usability and quality. This cycle meant we were able to deliver an alpha release to the customer in the September STS release. Now the feature is out and available to all the birds in the wild. That means listening to customer feedback and going through the cycle again and again to move “async downloads” to GA.

Hourglass image from Pixabay user dendoktoor.

Want to work on cool stuff? Learn more about Imply at https://imply.io.
Explore careers @ Imply at https://imply.io/careers.

Tags: #apachedruid