Druid Nails Cost Efficiency Challenge Against ClickHouse & Rockset

Nov 22, 2021
Eric Tschetter

The Star Schema Benchmark (SSB) is a popular benchmark. We know it very well: we previously released benchmark results from it. After that, Rockset, another real-time analytics company, published results claiming much better performance and efficiency than Druid. Then Altinity published their results for ClickHouse. We decided to take a closer look at the ClickHouse and Rockset approaches and see how Druid performs today.

To make a long story short, we were pleased to confirm that Druid is 2 times faster than ClickHouse and 8 times faster than Rockset with fewer hardware resources!

Methodology

In the previous run, the data hadn’t actually been organized in any fashion. No time or energy was spent on partitioning the data or otherwise setting up sorting patterns. This time, we borrowed from what Altinity did with ClickHouse and chose to hash-partition by the tuple (“s_region”, “c_region”, “p_mfgr”, “s_nation”).
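As a rough illustration of what hash partitioning by that tuple means, here is a minimal Python sketch. It is not Druid's or ClickHouse's actual implementation: the `partition_for` helper, the choice of SHA-256, the partition count of 8, and the sample rows are all assumptions made for the example. Only the four column names come from the post.

```python
import hashlib

# Partition columns named in the post.
PARTITION_COLUMNS = ("s_region", "c_region", "p_mfgr", "s_nation")


def partition_for(row, num_partitions):
    """Return a partition index from a stable hash of the row's partition tuple."""
    # Join with a separator that is unlikely to appear in the values, so that
    # ("ab", "c") and ("a", "bc") do not collapse into the same key.
    key = "\x1f".join(str(row[col]) for col in PARTITION_COLUMNS)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Hypothetical sample rows; the values are illustrative only.
rows = [
    {"s_region": "ASIA", "c_region": "ASIA", "p_mfgr": "MFGR#1",
     "s_nation": "CHINA", "lo_revenue": 100},
    {"s_region": "ASIA", "c_region": "ASIA", "p_mfgr": "MFGR#1",
     "s_nation": "CHINA", "lo_revenue": 250},
    {"s_region": "AMERICA", "c_region": "EUROPE", "p_mfgr": "MFGR#2",
     "s_nation": "UNITED STATES", "lo_revenue": 75},
]

for row in rows:
    print(partition_for(row, num_partitions=8), row["lo_revenue"])
```

Rows that share the same four values always land in the same partition, so a query that filters or groups on those columns can touch fewer partitions.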

Rockset did a warm-up run, then 3 queries, and took the median. ClickHouse did a warm-up run, then 3 queries, and took the mean. We do a warm-up run and then take the median of 3 queries, as this shaves about 5 milliseconds off the aggregate result compared to taking averages. So, we obviously pick the methodology that makes us look even more better.
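The warm-up-then-three-runs procedure can be sketched in a few lines of Python. This is only an illustration of the methodology described above, not the harness any of the three vendors used. The `time_query` and `summarize` helpers and the sample timings are hypothetical.

```python
import statistics
import time


def summarize(timings):
    """Return both candidate aggregates for a list of per-run timings."""
    return {
        "median": statistics.median(timings),
        "mean": statistics.mean(timings),
    }


def time_query(run_query, runs=3):
    """Run one discarded warm-up, then time `runs` executions of run_query."""
    run_query()  # warm-up run, not measured
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return {"timings": timings, **summarize(timings)}


# Hypothetical timings in seconds: one slow run pulls the mean up,
# while the median ignores it.
sample = [0.095, 0.097, 0.120]
print(summarize(sample))
```

With one outlier among three runs, the median reports the middle run and the mean is dragged toward the outlier, which is why the choice between them can move an aggregate by a few milliseconds.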

We actually wanted to do the benchmark on the same hardware, an m5.8xlarge, but the only pre-baked configuration we have for m5.8xlarge is actually the m5d.8xlarge, which ends up looking more expensive even though the CPU and RAM are the same and disk doesn’t actually impact this benchmark as the data is fairly small.  Instead, we run on a c5.9xlarge instance which costs $1.53/h, thus achieving our primary goal of running on some node type that we can claim is cheaper than the nodes used by the other benchmarks.  

With the partitioned table Druid runs at full throttle, while both ClickHouse and Rockset are left far behind.

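| Query | Druid, $1.53/h | Rockset, $2.16/h | % diff vs Rockset | ClickHouse, $1.54/h | % diff vs ClickHouse |
|---|---|---|---|---|---|
| Q1.1 | 0.097 | 0.944 | 973.20% | 0.112 | 115.46% |
| Q1.2 | 0.011 | 0.254 | 2309.09% | 0.02 | 181.82% |
| Q1.3 | 0.006 | 0.296 | 4933.33% | 0.03 | 500.00% |
| Q2.1 | 0.095 | 0.161 | 169.47% | 0.165 | 173.68% |
| Q2.2 | 0.074 | 0.136 | 183.78% | 0.079 | 106.76% |
| Q2.3 | 0.064 | 0.129 | 201.56% | 0.066 | 103.13% |
| Q3.1 | 0.134 | 0.696 | 519.40% | 0.186 | 138.81% |
| Q3.2 | 0.058 | 0.598 | 1031.03% | 0.09 | 155.17% |
| Q3.3 | 0.042 | 0.343 | 816.67% | 0.165 | 392.86% |
| Q3.4 | 0.015 | 0.032 | 213.33% | 0.012 | 80.00% |
| Q4.1 | 0.089 | 0.384 | 431.46% | 0.08 | 89.89% |
| Q4.2 | 0.039 | 0.132 | 338.46% | 0.039 | 100.00% |
| Q4.3 | 0.023 | 0.041 | 178.26% | 0.068 | 295.65% |
| Total run time: | 0.747 | 4.146 | 946.08% | 1.112 | 187.17% |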

Druid is simply 2 times faster than ClickHouse and 8 times faster than Rockset!

Or maybe you take offense at averaging percentage differences and instead think that the total aggregate runtime for a chunk of queries is a meaningful way to compare two systems. In that case, Druid is still 1.5x faster than ClickHouse and 5.5x faster than Rockset!
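The gap between the two headline numbers comes purely from how the per-query results are aggregated. The sketch below recomputes both aggregates from the per-query timings in the table above. The function names and the tuple layout are choices made for this example.

```python
from statistics import mean

# (query, Druid, Rockset, ClickHouse) timings copied from the table above.
RESULTS = [
    ("Q1.1", 0.097, 0.944, 0.112),
    ("Q1.2", 0.011, 0.254, 0.02),
    ("Q1.3", 0.006, 0.296, 0.03),
    ("Q2.1", 0.095, 0.161, 0.165),
    ("Q2.2", 0.074, 0.136, 0.079),
    ("Q2.3", 0.064, 0.129, 0.066),
    ("Q3.1", 0.134, 0.696, 0.186),
    ("Q3.2", 0.058, 0.598, 0.09),
    ("Q3.3", 0.042, 0.343, 0.165),
    ("Q3.4", 0.015, 0.032, 0.012),
    ("Q4.1", 0.089, 0.384, 0.08),
    ("Q4.2", 0.039, 0.132, 0.039),
    ("Q4.3", 0.023, 0.041, 0.068),
]

DRUID, ROCKSET, CLICKHOUSE = 1, 2, 3  # column positions in each tuple


def mean_of_ratios(results, other):
    """Average the per-query ratio other/Druid (the '% diff' columns)."""
    return mean(row[other] / row[DRUID] for row in results)


def ratio_of_totals(results, other):
    """Divide the other system's total runtime by Druid's total runtime."""
    return sum(row[other] for row in results) / sum(row[DRUID] for row in results)


for name, col in (("Rockset", ROCKSET), ("ClickHouse", CLICKHOUSE)):
    print(name,
          f"mean of ratios: {mean_of_ratios(RESULTS, col):.2f}x,",
          f"ratio of totals: {ratio_of_totals(RESULTS, col):.2f}x")
```

The mean of ratios is dominated by the very short queries (Q1.2 and Q1.3), where a few milliseconds of difference turn into a ratio in the tens, while the ratio of totals weights each query by how long it actually runs.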

Conclusions

Benchmarks exist. They are a thing and might make for an interesting data point sometimes, but code changes, data layouts change, and indexes change. While these results are a correct and accurate reflection of Druid’s performance, all published benchmark results are merely a snapshot of one specific configuration at one point in time. Whether that configuration actually aligns with anything that would be seen in reality is another question. For example, this SSB dataset is like no dataset in the real world: all columns exhibit a uniform distribution of values with absolutely no skew whatsoever.

Indeed, the updated results here came about because of updated code and paying attention to the data layout. Any sort of production deployment is going to have target SLAs for queries, and data will have to be modeled and adjusted to align with the use case. There is a lot more to infrastructure than just raw performance: there is availability, simplicity of scale, and fault tolerance as well. I hope that everyone who reads this tongue-in-cheek blog post will go away with an appreciation that myopia on performance alone leads to poor outcomes, and that it is much more important to see a system actually operating for you on production workloads than it is to compare arbitrary benchmark runs that were thrown together to try to get to the top of Hacker News.
