Lucidity & Imply Druid - Detecting Ad Fraud in Real Time

by Igor Leao · March 26, 2021

57 Million Events per day
3,500 Queries per day
20% Reduction in Engineering hours

About Lucidity

Lucidity is a digital advertising company that brings transparency to digital advertising. Lucidity stitches together information from the buyer’s ad server (where it inserts its own pixel), the Demand Side Platform (DSP), the ad exchange and the publisher. Then a media buyer can look at a dashboard that shows them all the ways the impression they thought they bought differed from what was actually delivered.

This approach gives media buyers access to insights derived from log-level data. At Lucidity, we are creating a unified view of an ad impression across the online advertising supply chain, to ensure that the impression was delivered in the manner the advertiser requested or required: for example, that the impression ran consistently through the ad server all the way to the DSP and the exchange, and that the ad ran on the bid the advertiser won, including the correct placement and ad format.

We bring value to customers in two ways: first, through the operational improvement of reconciling data in real time, and second, by delivering insights to media buyers.

Problem we are solving with Imply Druid

Several kinds of fraud occur when you advertise on the Web, and identifying fraudulent signals can be hard. We help our clients identify such signals, while also providing the data immutability and transparency that come from being powered by a blockchain.

Our data is a typical example of data that can be modeled as an OLAP cube:
  • It represents events that happened at a given time in the past
  • It is analyzed with time constraints
  • It contains multiple dimensions
  • It needs to be analyzed via different metrics for a given set of dimensions
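
As a rough illustration of the cube model described above, the sketch below rolls up a handful of events over a chosen set of dimensions, within an optional time window. The field names (hour, publisher, ad_format, impressions, matched) and the sample rows are hypothetical and are not Lucidity's actual schema; in production this kind of aggregation is what Druid does for us.

```python
from collections import defaultdict

# Hypothetical hourly events; field names and values are illustrative only.
EVENTS = [
    {"hour": "2021-03-26T10", "publisher": "pub-a", "ad_format": "banner", "impressions": 120, "matched": 110},
    {"hour": "2021-03-26T10", "publisher": "pub-b", "ad_format": "video", "impressions": 80, "matched": 40},
    {"hour": "2021-03-26T11", "publisher": "pub-a", "ad_format": "banner", "impressions": 100, "matched": 95},
    {"hour": "2021-03-26T11", "publisher": "pub-a", "ad_format": "video", "impressions": 50, "matched": 45},
]


def roll_up(events, dimensions, metrics, start=None, end=None):
    """Sum the given metrics per combination of dimension values.

    start is inclusive and end is exclusive; both are compared against the
    ISO-formatted "hour" field, so plain string comparison is sufficient.
    """
    totals = defaultdict(lambda: dict.fromkeys(metrics, 0))
    for event in events:
        if start is not None and event["hour"] < start:
            continue
        if end is not None and event["hour"] >= end:
            continue
        key = tuple(event[d] for d in dimensions)
        for metric in metrics:
            totals[key][metric] += event[metric]
    return dict(totals)


# Same data, two different "slices" of the cube.
by_publisher = roll_up(EVENTS, ["publisher"], ["impressions", "matched"])
by_format_last_hour = roll_up(EVENTS, ["ad_format"], ["impressions"], start="2021-03-26T11")
```

The same events answer different questions depending only on which dimensions, metrics and time window are chosen, which is the property that makes an OLAP store a natural fit for this data.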

At the moment we ingest data on an hourly basis. We run multiple batch ingestion jobs throughout the day. Besides those hourly jobs, we also reconstruct the last 30 days of data on a daily basis, ingesting it via a Hadoop batch job. We need this large batch job because new signals may depend on events that happened way back in the past.
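
To make the daily 30-day rebuild concrete, here is a minimal sketch of how the time window for such a job could be computed. The function name reprocess_interval and the choice to end the window at the run date are assumptions made for illustration; the output uses the ISO 8601 start/end form that Druid uses for intervals.

```python
from datetime import date, timedelta


def reprocess_interval(run_date, lookback_days=30):
    """Return an ISO 8601 interval covering the lookback window.

    Hypothetical helper: the window ends at run_date and starts
    lookback_days earlier. Whether the current day is included is an
    assumption, not something stated in the article.
    """
    start = run_date - timedelta(days=lookback_days)
    return f"{start.isoformat()}/{run_date.isoformat()}"


interval = reprocess_interval(date(2021, 3, 26))
print(interval)
```

A string of this shape is what a batch ingestion job would typically be given as the interval to overwrite, so that late-arriving signals replace the previously ingested data for the same period.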

We ingest log-level data from the exchange and the DSP, as well as from our ad-serving pixel. We can then create a unified ID for each impression and match it across the supply chain.

There are breakages in the supply chain, where the DSP data doesn’t match the exchange data, or the exchange data doesn’t match what’s in the pixel. We consider those non-matches, and we identify the supply paths that produce poor match rates. One publisher might have a great match rate in one supply path and a much poorer one in another.

User devices generate the data, which is fed to Kafka and forwarded from there to Imply Druid. We retain all of our historical data in the Druid clusters. This allows us and our users to slice and dice the data in a variety of ways to look for signals and insights.
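
The match and non-match idea described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not our production logic: it assumes each log record already carries the unified impression ID and a supply-path label, and it treats an impression as matched only when it appears in all three sources (pixel, DSP and exchange). All names and sample records are hypothetical.

```python
from collections import defaultdict

# Assumption: these are the three log sources that must agree.
SOURCES = frozenset({"pixel", "dsp", "exchange"})


def match_rate_by_path(records):
    """Compute the share of fully matched impressions per supply path.

    records is an iterable of (impression_id, supply_path, source) tuples.
    An impression counts as matched only if every source reported it.
    """
    seen_sources = defaultdict(set)
    path_of = {}
    for impression_id, supply_path, source in records:
        seen_sources[impression_id].add(source)
        path_of.setdefault(impression_id, supply_path)

    counts = defaultdict(lambda: [0, 0])  # supply_path -> [matched, total]
    for impression_id, sources in seen_sources.items():
        bucket = counts[path_of[impression_id]]
        bucket[1] += 1
        if sources >= SOURCES:
            bucket[0] += 1
    return {path: matched / total for path, (matched, total) in counts.items()}


# Hypothetical sample: imp-2 is missing from the exchange log (a non-match).
RECORDS = [
    ("imp-1", "path-x", "pixel"), ("imp-1", "path-x", "dsp"), ("imp-1", "path-x", "exchange"),
    ("imp-2", "path-x", "pixel"), ("imp-2", "path-x", "dsp"),
    ("imp-3", "path-y", "pixel"), ("imp-3", "path-y", "dsp"), ("imp-3", "path-y", "exchange"),
]

rates = match_rate_by_path(RECORDS)
```

Ranking supply paths by a rate like this is one simple way to surface the paths that produce poor matches for a given publisher.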

Our platform is built on a blockchain, which makes it possible to confirm the validity of an ad impression transparently. We do that through a transparent smart contract that sits on top of our blockchain. We write data onto the blockchain so that it is immutable and can’t be manipulated.

Benefits to Lucidity from Imply Druid

Lucidity deployed Imply Druid to production in 2019. Our core product offering is powered by Imply Druid.

Druid was a known solution for the tech leadership at Lucidity: we were familiar with its advantages and have always had a relationship of deep trust with Imply. We previously tried running open source Druid on our own setup, but the maintenance overhead was significant.

By migrating from open source Druid to Imply Druid, we have been able to save about 20% of engineering hours. Imply helps our team, which is quite lean, focus on what really matters, while also adding reliability to such important components of our architecture.

Any data we offer via a self-service model to our clients comes from Imply. This includes the data used by our dashboards and the data used by our self-service reports.

All of our clients turn out to be Imply users, since Druid backs our main product. Druid and Imply are therefore essential to the value we offer our clients.

