Druid Nails Cost Efficiency Challenge Against ClickHouse & Rockset
Nov 22, 2021
Eric Tschetter
The Star Schema Benchmark (SSB) is a popular benchmark. We know it very well: we previously released benchmark results from it. After that, Rockset, another real-time analytics company, published results claiming much better performance and efficiency than Druid. Then Altinity published results for ClickHouse. We decided to take a closer look at the ClickHouse and Rockset approaches and see how Druid performs today.
To make a long story short, we were pleased to confirm that Druid is 2 times faster than ClickHouse and 8 times faster than Rockset with fewer hardware resources!
Methodology
In the previous run, the data had not been organized in any fashion. No time or energy was spent on partitioning the data or otherwise setting up sorting patterns. This time we borrowed from what Altinity did with ClickHouse and chose hash partitioning by the tuple (“s_region”, “c_region”, “p_mfgr”, “s_nation”).
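To illustrate what hash partitioning by that tuple means, here is a minimal Python sketch. It is not Druid's or ClickHouse's actual partitioning code: the hash function, the separator, and the helper names `partition_for` and `partition_rows` are assumptions made only for this example. The one point it shows is that rows sharing the same four values always land in the same partition.

```python
import hashlib
from collections import defaultdict

# The four columns named in the post; everything else here is illustrative.
PARTITION_KEYS = ("s_region", "c_region", "p_mfgr", "s_nation")


def partition_for(row, num_partitions):
    """Map a row to a partition index by hashing the tuple of key columns."""
    # Join with a separator unlikely to appear in the data (an assumption).
    key = "\x1f".join(str(row[k]) for k in PARTITION_KEYS)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then wrap to the partition count.
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_rows(rows, num_partitions):
    """Group rows into buckets keyed by partition index."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[partition_for(row, num_partitions)].append(row)
    return buckets
```

A query that filters on one of these columns can then skip partitions that cannot contain matching rows, which is presumably why the layout matters for these results.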
Rockset did a warm-up run, then 3 queries, and took the median. ClickHouse did a warm-up run, then 3 queries, and took the mean. We do a warm-up run and then take the median of 3 queries, as this shaves about 5 milliseconds off the aggregate result compared with taking averages. So, we obviously pick the methodology that makes us look even more better.
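The difference between the two timing conventions can be sketched in a few lines of Python. This is a rough illustration, not the harness used for any of the three benchmarks: `run_query` is a hypothetical callable that executes one query, and "3 queries" is read here as three timed runs of the same query, which is an assumption.

```python
import statistics
import time


def time_query(run_query, warmups=1, runs=3):
    """Run `run_query` untimed `warmups` times, then return `runs` timed durations in seconds."""
    for _ in range(warmups):
        run_query()  # warm-up run, result and timing discarded
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report both conventions: median and mean of the same timed runs."""
    return {
        "median": statistics.median(timings),
        "mean": statistics.mean(timings),
    }
```

With made-up timings such as `[0.095, 0.097, 0.120]`, the median is 0.097 while the mean is 0.104: a single slow run pulls the mean up but leaves the median alone, which is how the choice of statistic can trim a few milliseconds from a total.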
We actually wanted to run the benchmark on the same hardware, an m5.8xlarge, but the only pre-baked configuration we have in that family is for the m5d.8xlarge, which ends up looking more expensive even though the CPU and RAM are the same and disk doesn’t affect this benchmark because the data is fairly small. Instead, we ran on a c5.9xlarge instance, which costs $1.53/h, thus achieving our primary goal of running on some node type that we can claim is cheaper than the nodes used by the other benchmarks.
With the partitioned table Druid runs at full throttle, while both ClickHouse and Rockset are left far behind.
| Query | Druid, $1.53/h | Rockset, $2.16/h | % diff vs Rockset | ClickHouse, $1.54/h | % diff vs ClickHouse |
|---|---|---|---|---|---|
| Q1.1 | 0.097 | 0.944 | 973.20% | 0.112 | 115.46% |
| Q1.2 | 0.011 | 0.254 | 2309.09% | 0.02 | 181.82% |
| Q1.3 | 0.006 | 0.296 | 4933.33% | 0.03 | 500.00% |
| Q2.1 | 0.095 | 0.161 | 169.47% | 0.165 | 173.68% |
| Q2.2 | 0.074 | 0.136 | 183.78% | 0.079 | 106.76% |
| Q2.3 | 0.064 | 0.129 | 201.56% | 0.066 | 103.13% |
| Q3.1 | 0.134 | 0.696 | 519.40% | 0.186 | 138.81% |
| Q3.2 | 0.058 | 0.598 | 1031.03% | 0.09 | 155.17% |
| Q3.3 | 0.042 | 0.343 | 816.67% | 0.165 | 392.86% |
| Q3.4 | 0.015 | 0.032 | 213.33% | 0.012 | 80.00% |
| Q4.1 | 0.089 | 0.384 | 431.46% | 0.08 | 89.89% |
| Q4.2 | 0.039 | 0.132 | 338.46% | 0.039 | 100.00% |
| Q4.3 | 0.023 | 0.041 | 178.26% | 0.068 | 295.65% |
| Total run time: | 0.747 | 4.146 | 946.08% | 1.112 | 187.17% |
Druid is simply 2 times faster than ClickHouse and 8 times faster than Rockset!
Or maybe you take offense at averaging percentage differences and instead think that the total aggregate runtime for a chunk of queries is a meaningful way to compare two systems. In that case, Druid is still 1.5x faster than ClickHouse and 5.5x faster than Rockset!
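The two headline numbers come from two different ways of aggregating the same per-query times. The Python sketch below recomputes both from the results table above; the function names `mean_of_ratios` and `ratio_of_totals` are introduced here only for illustration and are not taken from any of the published benchmarks.

```python
import statistics

# Per-query times copied from the results table: (Druid, Rockset, ClickHouse).
TIMES = {
    "Q1.1": (0.097, 0.944, 0.112),
    "Q1.2": (0.011, 0.254, 0.02),
    "Q1.3": (0.006, 0.296, 0.03),
    "Q2.1": (0.095, 0.161, 0.165),
    "Q2.2": (0.074, 0.136, 0.079),
    "Q2.3": (0.064, 0.129, 0.066),
    "Q3.1": (0.134, 0.696, 0.186),
    "Q3.2": (0.058, 0.598, 0.09),
    "Q3.3": (0.042, 0.343, 0.165),
    "Q3.4": (0.015, 0.032, 0.012),
    "Q4.1": (0.089, 0.384, 0.08),
    "Q4.2": (0.039, 0.132, 0.039),
    "Q4.3": (0.023, 0.041, 0.068),
}
DRUID, ROCKSET, CLICKHOUSE = 0, 1, 2


def mean_of_ratios(other):
    """Average the per-query ratios other/Druid (the "% diff" columns, as a fraction)."""
    return statistics.mean(t[other] / t[DRUID] for t in TIMES.values())


def ratio_of_totals(other):
    """Divide the other system's total run time by Druid's total run time."""
    total_other = sum(t[other] for t in TIMES.values())
    total_druid = sum(t[DRUID] for t in TIMES.values())
    return total_other / total_druid


if __name__ == "__main__":
    for name, idx in (("Rockset", ROCKSET), ("ClickHouse", CLICKHOUSE)):
        print(f"{name}: mean of ratios {mean_of_ratios(idx):.2f}, "
              f"ratio of totals {ratio_of_totals(idx):.2f}")
```

The mean of ratios works out to about 9.46 for Rockset and 1.87 for ClickHouse, matching the percentages in the last table row, while the ratio of totals is about 5.55 and 1.49. The gap comes from very short queries such as Q1.3, where a few milliseconds of difference produce a huge ratio that dominates the average but barely moves the total.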
Conclusions
Benchmarks exist. They are a thing and might make for an interesting data point sometimes, but code changes, data layouts change, and indexes change. While these results are a correct and accurate reflection of Druid’s performance, all published benchmark results are merely a snapshot of one specific configuration at one point in time. Whether that configuration actually aligns with anything that would be seen in reality is another question. For example, this SSB data set is like no other dataset in the real world: all columns exhibit a uniform distribution of values with absolutely no skew whatsoever.
Indeed, the updated results here came about because of updated code and paying attention to the data layout. Any sort of production deployment is going to have target SLAs for queries, and data will have to be modeled and adjusted to align with the use case. There is a lot more to infrastructure than just raw performance: there is availability, simplicity of scale, and fault tolerance as well. I hope that everyone who reads this tongue-in-cheek blog post will go away with an appreciation that myopia on performance alone leads to poor outcomes, and that it is much more important to see a system actually operating for you on production workloads than it is to compare arbitrary benchmark runs that were thrown together to try to get to the top of Hacker News.