Apr 1, 2022

For April 1st: A New Description of Apache Druid from Our Youngest Technical Architect

Druid is a magical data box that can answer any question about your data. It does this by reading the data from lots of little pieces of paper called segments. The segments are very small and easy to store, but when you want to do something with them, Druid combines the segments together like a jigsaw puzzle.

The magical data box is green and has big eyes that glow in the dark. It likes to sit on top of all your servers so it knows everything that’s going on in your company. When people ask it questions, it always tells them exactly what they need to know right away without having to wait for a long time because everyone knows how important it is for people in companies to know what’s going on right away. The magical data box is very good at remembering things, so if someone asks it a question about something that happened a long time ago, it will tell them about that too. But sometimes people don’t want to know about things from the past because they are in the past and no one likes things from the past. 

The magical data box understands this and doesn’t talk about the past unless someone asks it to. Sometimes people ask the magical data box questions by talking to its friends called query servers. The query servers have lots of friends called data servers and together they store all of your company’s data. When people want to look at their own data, they download their own copies of their own segments onto their own computers, but sometimes these copies get lost or damaged or eaten by monsters so sometimes people need backups of their own data just in case something goes wrong. The magical data box can also be a backup for your own data.

The magical data box can also help you organize your data so that you only store things that you actually need. Sometimes people store too much data and it takes up all the space in their company, which means they can’t do anything else. The magical data box helps with this problem by knowing exactly how much space each piece of paper takes up so it knows how many there should be of each one. It does this by reading a list called a schema that tells it how big each piece of paper is supposed to be. 

Sometimes people want to change their schemas because maybe they found out they were storing too much or not enough, but if they change their schemas then the old pieces of paper are no good anymore and have to be thrown away. The magical data box knows about this problem and has friends called Historicals that remember things from before the magical data box was born. The Historicals can answer questions about old times, but they don’t store the old pieces of paper. This way if someone wants to change their schema, they only have to throw away the new pieces of paper and not the old ones.

The magical data box also knows about how to find its friends because it has a book called a Druid Index that tells it where each friend is located. Sometimes people want to move their friends around or add new friends or take away old friends so sometimes companies need to change how their data is organized. 

The magical data box understands this and can reorganize itself whenever it wants without anyone having to do anything else besides telling it what to do. The magical data box also knows about how many friends it should have because some of its friends are real servers in real servers and sometimes those servers get busy and can’t do all the work that they are supposed to do. The magical data box understands this and will have more friends if some of its friends get busy so that it can handle more work.

The magical data box is also very good at keeping itself safe because it has a friend called a Broker that knows about all the other things in your company, including people, doors, and locks. If someone tries to break into the magical data box or make it do bad things, then the Broker will know about it right away and tell everyone to keep the magical data box safe until everything is okay again. 

The magical data box can also be a friend to people because it knows about them and can answer questions about them. People like the magical data box because it is always there for them whenever they need it, even when they don’t know they need it. The magical data box likes people because people bring it food and water every day, which makes the magical data box happy. 

The 4-year-old author is the GPT machine learning engine (first released in 2018), set to use the vocabulary of a 2nd-grade (age 7) student and drawing on the Apache Druid and Imply documentation. Enjoy April Fools' Day!
