If you are using Apache Druid to analyze customer-oriented data, you are probably familiar with the General Data Protection Regulation (GDPR), which went into effect May 25, 2018. However, you may be less familiar with a newer law, the California Consumer Privacy Act (CCPA), which went into effect January 1, 2020 and is likely to become a de facto standard in the US.
If you are not familiar with these regulations, you need to learn about them now. The main reason to enforce GDPR and CCPA compliance in your Druid or Imply implementation is the fines: under GDPR, up to 4% of annual global revenue or 20 million euros, whichever is higher. You do not want to bring this onto your company or your career.
GDPR affects any company, anywhere in the world, that captures and stores personally identifiable information (PII) about individuals in the EU. CCPA provides similar protections for California residents. PII is no longer just a social security number or other ID. It is any information that can be used to identify an individual in any way.
The regulations state how you have to protect the data, control access to it and keep an audit trail. Additionally, you are obligated to remove an individual’s data upon request and verify that it has been removed.
The good news is that Druid and Imply have both evolved over the years to make compliance easier. This technical note was put together by a seasoned Druid practitioner who has worked on many large customer-facing deployments.