What I wish I knew about Imply when I was developing in-house analytics

Wylie at the controls, future-proofing real-time analytics at scale

Like a lot of engineers at Imply, I got my start here after having worked on an analytics solution for a previous employer. In my case, it was a large non-tech company going through a digital transformation.

When I say “non-tech”, I’m talking about a company that makes and sells doors! You might not realize it, but there are tons of different metrics in that business: height, width, color, shape – the list goes on forever…

A door opens

The engineering team I was part of was mainly responsible for working on the company’s e-commerce business. We didn’t have any particular background in data engineering or analytics-related development, but it was natural for us to be the first port of call for this kind of work.

The project started with a request from the company’s analysts: they wanted our help to better understand what kinds of doors and door features people were buying. One of the first analysts we worked with wanted to know more about a popular glass-pane style used in many of the doors, as these panes were always ordered in bulk from a factory in China. Knowing which styles were selling helped her project demand.

The assignment

It was a pretty straightforward brief! We were asked to produce some basic charts and graphs to show information like the size of the glass in the doors, the color breakdown, material content – those sorts of things.

As it was quite simple, we didn’t spend a whole lot of time looking at the entire ecosystem of data analytics. We just decided to build something ourselves. I was proud of the outcome and everyone involved was happy with it, but in hindsight it probably wasn’t the greatest decision in terms of providing the analysts with a full self-service solution.

Build for success: build for scale

Since joining Imply, I’ve come to recognize two key scalability issues with what we built.

The first is that the project started out with a limited scope, but quickly grew. As I described earlier, we were originally asked to create a simple proof-of-concept solution with a few static pie charts and graphs. As soon as we delivered this, the feature requests started rolling in!

Before we knew it, we were being asked to visualize everything in the company’s database, and filter it by dates, dimensions, and a whole lot more. I would hear something like: “I want to see all the different door sizes for all the doors that were painted red.” While this wouldn’t be particularly difficult to engineer, it would be added to a long queue of similar feature requests. As a result, the analyst who wanted the information would have to wait weeks before the necessary feature would be selected for development and finally released.

All these requests completely blew out the scope of what we were originally asked to do. Suddenly we needed new visualization models and had to deal with a whole new level of complexity in data transformation.

Lesson learned: Your users are going to keep requesting features, so look for something off the shelf before you make your own!

At the end of the day, although you are building a solution for a specific problem, there is a universal problem space around it that analytics platforms are designed for. You’re never just building a door-analytics solution – you’re building an analytics solution.
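To make that concrete, here is a minimal Python sketch (not what we actually built) of how the red-door request from earlier reduces to a generic "filter, then group and count" operation. The sample rows, the field names, and the `breakdown` helper are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample rows; the field names are illustrative, not a real schema.
DOORS = [
    {"color": "red", "size": "30x80", "material": "oak"},
    {"color": "red", "size": "36x80", "material": "pine"},
    {"color": "blue", "size": "30x80", "material": "oak"},
    {"color": "red", "size": "30x80", "material": "steel"},
]


def breakdown(rows, group_by, **filters):
    """Count rows per value of `group_by`, keeping only rows that match every filter."""
    matching = (
        row for row in rows
        if all(row.get(field) == value for field, value in filters.items())
    )
    return Counter(row[group_by] for row in matching)


# "All the different door sizes for all the doors that were painted red."
red_sizes = breakdown(DOORS, "size", color="red")
print(dict(red_sizes))
```

Nothing in `breakdown` knows about doors. Swap the field names and the same function answers the next request in the queue, which is roughly the general problem an analytics platform is built to solve.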

A sturdier foundation

The other major issue was with the backend.

Our original prototype was backed by Postgres. It was about as straightforward and unsophisticated a solution as you could probably come up with: a simple table with all our analytics facts, driven by plain old SQL queries.
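As a rough illustration of that setup, the sketch below creates a single fact table and runs one plain SQL query against it. It uses Python's built-in `sqlite3` module in place of Postgres so that it runs on its own, and the table name, columns, and rows are assumptions made up for the example, not our real schema.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (an assumption for this sketch).
conn = sqlite3.connect(":memory:")

# One flat table holding every analytics fact; the columns are hypothetical.
conn.execute(
    """
    CREATE TABLE door_facts (
        order_date TEXT,
        color      TEXT,
        width_in   INTEGER,
        height_in  INTEGER,
        quantity   INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO door_facts VALUES (?, ?, ?, ?, ?)",
    [
        ("2020-01-05", "red", 30, 80, 2),
        ("2020-01-06", "red", 36, 80, 1),
        ("2020-01-06", "blue", 30, 80, 4),
        ("2020-01-07", "red", 30, 80, 3),
    ],
)

# Plain SQL: door sizes sold in a given color, most popular first.
rows = conn.execute(
    """
    SELECT width_in, height_in, SUM(quantity) AS doors_sold
    FROM door_facts
    WHERE color = ?
    GROUP BY width_in, height_in
    ORDER BY doors_sold DESC
    """,
    ("red",),
).fetchall()
print(rows)
conn.close()
```

With no index or pre-aggregation to lean on, a query like this has to read every row in the table, so the work it does grows along with the data.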

That was fine for the proof-of-concept prototype we were originally asked to create. But as time marched on, our system accumulated more and more data, so the same queries took longer and longer to run. We were constantly scrambling to come up with one performance optimization after another to cut down on query latency.

Parting thoughts

If I could go back in time, I’d tell myself not to start building out a custom stack before evaluating some of the existing options on the market.

For example, here at Imply we offer Pivot, an application that gives analysts true self-service visualization and data exploration.

Engineering time is expensive! Don’t waste it reinventing the wheel.
