Lucidity & Imply Druid – Detecting Ad Fraud in Real Time

Mar 26, 2021
Igor Leao

57 Million EVENTS PER DAY

3500 QUERIES PER DAY

20% REDUCTION IN ENGINEERING HOURS

About Lucidity

Lucidity brings transparency to digital advertising. It stitches together information from the buyer’s ad server (where it inserts its own pixel), the Demand Side Platform (DSP), the ad exchange, and the publisher. A media buyer can then look at a dashboard that shows every way in which the impression they thought they bought differed from what was actually delivered.

This approach gives media buyers access to insights derived from log-level data.

At Lucidity, we are creating a unified view of an ad impression across the online advertising supply chain, to ensure that the impression was delivered in the manner the advertiser requested or required. For example, we check that the impression ran consistently from the ad server all the way to the DSP and the exchange, and that the ad ran on the bid the advertiser won, including the correct placement and ad format.

We bring value to customers in two ways: the operational improvement of doing reconciliation in real time, and the insights we deliver to media buyers.

Problem we are solving with Imply Druid

Many kinds of fraud can happen when you advertise on the Web, and identifying fraudulent signals can be hard. We help our clients identify such signals, while also providing the data immutability and transparency that come from being powered by a blockchain.

Our data is a typical example of data that can be modeled as an OLAP cube:

  • It represents events that happened at a given time in the past
  • It is analyzed with time constraints
  • It contains multiple dimensions
  • It needs to be analyzed via different metrics for a given set of dimensions
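
As a rough illustration of the four properties above, the sketch below filters a handful of events by time and sums metrics per combination of dimensions. The event fields (`publisher`, `exchange`, `impressions`, `matched`) and the `aggregate` helper are hypothetical and chosen only for the example; they are not Lucidity's actual schema.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical events: a timestamp, two dimensions, and two raw measures.
EVENTS = [
    {"ts": "2021-03-01T10:05:00+00:00", "publisher": "pub-a", "exchange": "ex-1", "impressions": 1, "matched": 1},
    {"ts": "2021-03-01T10:40:00+00:00", "publisher": "pub-a", "exchange": "ex-2", "impressions": 1, "matched": 0},
    {"ts": "2021-03-01T11:15:00+00:00", "publisher": "pub-b", "exchange": "ex-1", "impressions": 1, "matched": 1},
    {"ts": "2021-03-02T09:00:00+00:00", "publisher": "pub-a", "exchange": "ex-1", "impressions": 1, "matched": 1},
]


def aggregate(events, start, end, dimensions, metrics):
    """Sum `metrics` per combination of `dimensions` for events in [start, end)."""
    totals = defaultdict(lambda: dict.fromkeys(metrics, 0))
    for event in events:
        ts = datetime.fromisoformat(event["ts"])
        if not (start <= ts < end):
            continue  # time constraint: skip events outside the window
        key = tuple(event[d] for d in dimensions)
        for m in metrics:
            totals[key][m] += event[m]
    return dict(totals)


day = datetime(2021, 3, 1, tzinfo=timezone.utc)
next_day = datetime(2021, 3, 2, tzinfo=timezone.utc)

# Same metrics, two different sets of dimensions.
by_publisher = aggregate(EVENTS, day, next_day, ["publisher"], ["impressions", "matched"])
by_path = aggregate(EVENTS, day, next_day, ["publisher", "exchange"], ["impressions", "matched"])
```

Changing the `dimensions` list is the in-memory equivalent of slicing the cube along a different axis; a store such as Druid does the same kind of grouping at much larger scale.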

At the moment we ingest data hourly, running multiple batch ingestion jobs throughout the day. Besides those hourly jobs, we also rebuild the last 30 days of data every day, ingesting it via a Hadoop batch job. We need this large batch job because new signals may depend on events that happened well before the current hour.
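
The two ingestion windows can be sketched as time intervals. Druid expresses time ranges as ISO 8601 intervals in the form `start/end`; the helper names below and the choice to align windows to full hours and to UTC midnight are assumptions made for this example, not details taken from Lucidity's jobs.

```python
from datetime import datetime, timedelta, timezone


def hourly_interval(now):
    """Interval covering the last complete hour before `now` (assumed alignment)."""
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=1)
    return f"{start.isoformat()}/{end.isoformat()}"


def rebuild_interval(now, days=30):
    """Interval covering the `days` complete days before the midnight preceding `now`."""
    end = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=days)
    return f"{start.isoformat()}/{end.isoformat()}"


now = datetime(2021, 3, 26, 14, 37, tzinfo=timezone.utc)
hourly = hourly_interval(now)    # one-hour window for the hourly job
rebuild = rebuild_interval(now)  # 30-day window for the daily rebuild
```

Because the daily window overlaps the hourly ones, the rebuild would overwrite what the hourly jobs produced for the same period, which is how late-arriving signals can be folded back into older data.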

We ingest log-level data from the exchange and the DSP, as well as from our ad-serving pixel. We then create a unified ID for each impression and use it to match the impression across the supply chain.

There are breakages in the supply chain, where the DSP data doesn’t match the exchange data, or the exchange data doesn’t match what’s in the pixel. We consider those non-matches. We identify supply paths that produce poor match rates. A publisher might have a great match rate on one supply path and a much worse one on another.
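
A minimal sketch of the matching idea, assuming each source can be reduced to a set of unified impression IDs and that the DSP log also names the supply path. The data shapes, the `match_rates` helper, and the choice of the DSP as the baseline are all hypothetical.

```python
from collections import defaultdict


def match_rates(dsp, exchange_ids, pixel_ids):
    """Return {supply_path: share of DSP impressions also seen by exchange and pixel}.

    `dsp` maps a unified impression ID to its supply path (assumed shape);
    `exchange_ids` and `pixel_ids` are sets of unified impression IDs.
    """
    seen = defaultdict(int)
    matched = defaultdict(int)
    for impression_id, path in dsp.items():
        seen[path] += 1
        # An impression is a match only if every source reports it.
        if impression_id in exchange_ids and impression_id in pixel_ids:
            matched[path] += 1
    return {path: matched[path] / seen[path] for path in seen}


dsp = {
    "imp-1": "pub-a via ex-1",
    "imp-2": "pub-a via ex-1",
    "imp-3": "pub-a via ex-2",
    "imp-4": "pub-a via ex-2",
}
exchange_ids = {"imp-1", "imp-2", "imp-3", "imp-4"}
pixel_ids = {"imp-1", "imp-2", "imp-3"}  # imp-4 never reached the pixel

rates = match_rates(dsp, exchange_ids, pixel_ids)
```

In this toy data the same publisher matches fully on one path and only half the time on the other, which is the kind of difference a buyer would want surfaced.
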
User devices generate the data, which is fed to Kafka and from there forwarded to Imply Druid. We retain all of our historical data in the Druid clusters. This allows us and our users to slice and dice the data in a variety of ways to look for signals and insights.
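
One way to slice data held in Druid is through its SQL API, which accepts a JSON body with a `query` field at `/druid/v2/sql`. The sketch below only builds such a body; the datasource name `impressions`, the `publisher` dimension, and the `build_sql_request` helper are hypothetical.

```python
import json
from datetime import datetime


def build_sql_request(datasource, dimension, start, end):
    """Build the JSON body for a POST to Druid's SQL endpoint (/druid/v2/sql).

    `start` and `end` are datetime objects bounding the __time column.
    """
    # Identifiers are interpolated into the SQL text, so accept simple names only.
    for name in (datasource, dimension):
        if not name.isidentifier():
            raise ValueError(f"unexpected identifier: {name!r}")
    fmt = "%Y-%m-%d %H:%M:%S"
    query = (
        f'SELECT "{dimension}", COUNT(*) AS row_count '
        f'FROM "{datasource}" '
        f"WHERE __time >= TIMESTAMP '{start.strftime(fmt)}' "
        f"AND __time < TIMESTAMP '{end.strftime(fmt)}' "
        f'GROUP BY "{dimension}" '
        "ORDER BY row_count DESC"
    )
    return json.dumps({"query": query})


body = build_sql_request(
    "impressions",  # hypothetical datasource
    "publisher",    # hypothetical dimension
    datetime(2021, 3, 1),
    datetime(2021, 3, 2),
)
```

The resulting string can be sent with any HTTP client using a `Content-Type: application/json` header. Swapping the dimension or the time bounds gives a different slice of the same data.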

Our platform is built on a blockchain, which makes the confirmation of an ad impression’s validity transparent. We do that with a transparent smart contract that sits on top of our blockchain. We write data onto the blockchain so that it is immutable and can’t be manipulated.
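
Why chained records resist manipulation can be shown with a toy hash chain. This is a generic illustration of tamper evidence, not Lucidity's smart contract or any specific blockchain; the record fields and helper names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash, record):
    """Hash a record together with the hash of the record before it."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain, record):
    """Append `record`, linking it to the previous entry by hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev_hash": prev_hash, "hash": _digest(prev_hash, record)})


def verify(chain):
    """Return True only if every entry still matches its stored hash and link."""
    prev_hash = GENESIS
    for block in chain:
        if block["prev_hash"] != prev_hash or block["hash"] != _digest(prev_hash, block["record"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_record(chain, {"impression_id": "imp-1", "matched": True})
append_record(chain, {"impression_id": "imp-2", "matched": False})
```

Editing any stored record afterwards makes `verify(chain)` return `False`, because the recomputed hash no longer matches the stored one. A real blockchain adds distributed consensus on top of this, so a single party cannot simply recompute every hash.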

Benefits to Lucidity from Imply Druid

Lucidity deployed Imply Druid to production in 2019. Our core product offering is powered by Imply Druid.

Druid was a known solution for the tech leadership at Lucidity: we were familiar with its advantages and have always had a relationship of deep trust with Imply. We tried to run Open Source Druid on our own setup in the past, but the maintenance overhead was too high.

By migrating from Open Source Druid to Imply Druid, we have been able to save about 20% of engineering hours. Imply helps our team, which is quite lean, focus on what really matters, while also adding reliability to such important components of our architecture.

Any data we offer via a self-service model to our clients comes from Imply. This includes the data used by our dashboards and the data used by our self-service reports.

All of our clients turn out to be Imply users, as Druid backs our main product. Druid and Imply are therefore essential to the value we offer to our clients.
