Enhancing Data Security with Role-Based Access Control in Druid and Imply

Jan 23, 2024
Jyoti Shekhar

Managing user access to data is a crucial aspect of any data platform. In a typical Role-Based Access Control (RBAC) setup, users are assigned roles, and those roles determine which data they can access.

How does one achieve RBAC in Druid? We will discuss two ways of doing this, one of which offers more granular control.

Basic RBAC in Druid restricts READ and/or WRITE access at the datasource level. For more granular control, especially restricting access to specific rows and columns, Imply View Manager comes into play.

NOTE: Imply View Manager is an experimental (alpha) feature available with the Imply Druid distribution.

RBAC in Druid:

READ and WRITE access can each be limited on datasources, system tables, configs, state, query_context, and external sources (for example, S3).

| RESOURCE TYPE | DESCRIPTION |
| --- | --- |
| DATASOURCE | Druid table (popularly known as a datasource) |
| SYSTEM_TABLE | System tables that contain metadata on the Druid cluster. Examples: TABLES, segments, server_segments, servers, supervisors, ... |
| CONFIG | Configuration resources exposed by the cluster components. Enabled by default for the ADMIN role |
| STATE | Cluster-wide state resources. Example: coordinator |
| QUERY_CONTEXT | For internal use, no action needed with this resource |
| EXTERNAL | Queries that access external data through the EXTERN function. Example: ingestion from S3 with MSQ |

In the examples below, we will use basic auth from the Druid console:

Under User Management, we see options for managing users as well as roles and privileges:

[ Some examples further down the document under “Examples with Auth at the Resource level” ]

From the table above, one can see that privileges can be assigned for a specific resource, such as a datasource.

But the privilege applies to the entire resource. What if we wanted to limit access to specific rows or columns?

In comes Imply View Manager …

Imply View Manager offers a sophisticated way to limit access to data for specific users. Similar to traditional database views, Imply views are object definitions that sit on top of a datasource, eliminating the need to duplicate a subset of the data. Imply Views are a way to implement row and column-level security.

**Imply View Manager is different from the Materialized Views feature in Apache Druid**

The View Manager is provided by the “imply-view-manager” extension, which is loaded in the cluster at startup.

For clusters where View Manager is enabled, a new resource type, VIEW, appears in the Resource types dropdown.

Note that the admin role (and any other relevant role) needs VIEW READ and WRITE access specifically granted via the Druid UI or APIs.
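When the grant is made through the APIs rather than the UI, permissions are expressed as a JSON list of resource/action pairs. The sketch below only builds such a list and does not call the cluster. It assumes the `VIEW` resource type name used in the Imply row-and-column security documentation; the helper name `view_permissions` and the `.*` name pattern are illustrative choices, not part of the setup described here.

```python
import json


def view_permissions(view_pattern, actions=("READ", "WRITE")):
    """Build resource/action entries for a VIEW resource.

    view_pattern is the view name (or name pattern) to grant on. This helper
    and its defaults are hypothetical and exist only for illustration.
    """
    return [
        {"resource": {"name": view_pattern, "type": "VIEW"}, "action": action}
        for action in actions
    ]


# Assumed example: give a role READ and WRITE on every view.
payload = view_permissions(".*")
print(json.dumps(payload, indent=2))
```

The printed JSON is the kind of body that would be sent as a role's permission list through the coordinator security API listed in the References. Any existing datasource permissions for the role would likely need to be included in the same list.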

Here is an example of creating a view:

curl -k -u admin:admin --location --request POST \
'https://imply-b23-elbexter-s9ruz7uko71p-10298291.us-east-1.elb.amazonaws.com:9088/proxy/coordinator/druid-ext/view-manager/v1/views/wiki_anonymous' \
--header 'Content-Type: application/json' --header 'Accept: application/json' \
--data-raw '{"viewSql": "SELECT \"__time\", \"channel\", \"page\", \"comment\", \"commentLength\" FROM druid.wikipedia WHERE \"isAnonymous\" = '\''true'\''" }'

Here, the API endpoint is the proxied coordinator API endpoint.

You can also use the coordinator API endpoint where applicable. Example below:

curl -k -u admin:admin --location --request POST \
'http://localhost:8081/druid-ext/view-manager/v1/views/wiki_anonymous' \
--header 'Content-Type: application/json' --header 'Accept: application/json' \
--data-raw '{"viewSql": "SELECT \"__time\", \"channel\", \"page\", \"comment\", \"commentLength\" FROM druid.wikipedia WHERE \"isAnonymous\" = '\''true'\''" }'
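The nested shell quoting in the request body is easy to get wrong. As a sketch of one way around it, the body can be generated with a JSON library and then handed to curl or any HTTP client. The helper name `view_body` is an assumption for illustration; the SQL is the statement used in the view definition above.

```python
import json


def view_body(view_sql):
    """Return the JSON request body for a view definition.

    json.dumps escapes the double quotes around SQL identifiers, so no
    manual backslashes are needed.
    """
    return json.dumps({"viewSql": view_sql})


# Same SQL as the curl examples, written without any shell escaping.
sql = (
    'SELECT "__time", "channel", "page", "comment", "commentLength" '
    "FROM druid.wikipedia WHERE \"isAnonymous\" = 'true'"
)
body = view_body(sql)
print(body)
```

The output could be saved to a file and passed to curl with `--data @file`, which avoids the `'\''` quoting sequence altogether.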

The view is now available, and we can grant read access to the view:

Now let us try and access the data as the user view-user-1:

curl -k -u view-user-1:viewuser1 --request POST \
--header 'Content-Type: application/json' \
'https://imply-b23-elbexter-s9ruz7uko71p-10298291.us-east-1.elb.amazonaws.com:9088/druid/v2/sql/' \
--data '{ "query": "SELECT * FROM view.wiki_anonymous limit 2"}' | jq

Results:

[
  {
    "__time": "2016-06-27T00:00:34.959Z",
    "channel": "#en.wikipedia",
    "page": "Bailando 2015",
    "comment": "/* Scores */",
    "commentLength": 12
  },
  {
    "__time": "2016-06-27T00:01:14.343Z",
    "channel": "#es.wikipedia",
    "page": "Sumo (banda)",
    "comment": "/* Línea de tiempo */",
    "commentLength": 21
  }
]

The same can be done from the Druid console by selecting the view namespace in the Query tab and running the SQL “select * from view.wiki_anonymous limit 2”.

[ Additional API examples for Imply views are further down under “RBAC examples with Imply View Manager” ]

Happy Viewing! 

Examples with Auth at the Resource level

The example below uses the wikiread user, which has only read privileges on wikipedia:

curl -k -u wikiread:wikiread --request POST --header 'Content-Type: application/json' 'https://imply-b23-elbexter-s9ruz7uko71p-10298291.us-east-1.elb.amazonaws.com:9088/druid/v2/sql/' --data '{
 "query": "SELECT count(*) FROM wikipedia"
}'

[{"EXPR$0":24433}] --> RESULT

We now try and read another datasource as the wikiread user:

curl -k -u wikiread:wikiread --request POST --header 'Content-Type: application/json' 'https://imply-b23-elbexter-s9ruz7uko71p-10298291.us-east-1.elb.amazonaws.com:9088/druid/v2/sql/' --data '{
 "query": "SELECT count(*) FROM \"wikipedia-kafka\""
}'

{"Access-Check-Result":"Unauthorized"} --> RESULT

The above failed as the user does not have access to the wikipedia-kafka datasource.

Example: Specifying multiple datasources for the role read-wikipedia in the Druid unified console:
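Outside the console, a role like this can also be described as a permission list for the coordinator security API. The following minimal sketch builds READ permissions for several datasources without contacting the cluster. The helper name `datasource_read_permissions` is hypothetical, and pairing `wikipedia` with `wikipedia-kafka` is an assumption made only for illustration.

```python
import json


def datasource_read_permissions(datasources):
    """Build one READ permission entry per datasource name."""
    return [
        {"resource": {"name": name, "type": "DATASOURCE"}, "action": "READ"}
        for name in datasources
    ]


# Assumed example: let one role read both datasources used in this post.
permissions = datasource_read_permissions(["wikipedia", "wikipedia-kafka"])
print(json.dumps(permissions, indent=2))
```

Keeping one entry per datasource makes it straightforward to add or remove a single datasource from the role later.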

Get user role details:

curl -k -u admin:admin --request GET 'https://imply-b23-elbexter-s9ruz7uko71p-10298291.us-east-1.elb.amazonaws.com:9088/proxy/coordinator/druid-ext/basic-security/authorization/db/basic/users/wikiread'

{"name":"wikiread","roles":["read-wikipedia"]} --> RESULT

RBAC examples with Imply View Manager

Get views (and definitions):

curl -k -u admin:admin --location --request GET 'https://imply-b23-elbexter-s9ruz7uko71p-10298291.us-east-1.elb.amazonaws.com:9088/proxy/coordinator/druid-ext/view-manager/v1/views' --header 'Content-Type: application/json' --header 'Accept: application/json' | jq

Response:

{
  "wiki-english": {
    "viewName": "wiki-english",
    "viewNamespace": null,
    "viewSql": "SELECT * FROM druid.wikipedia WHERE channel = '#en.wikipedia'",
    "lastModified": "2024-01-16T17:14:03.705Z"
  },
  "wiki_anonymous": {
    "viewName": "wiki_anonymous",
    "viewNamespace": null,
    "viewSql": "SELECT \"__time\", \"channel\", \"page\", \"comment\", \"commentLength\" FROM druid.wikipedia WHERE \"isAnonymous\" = 'true'",
    "lastModified": "2024-01-16T19:00:29.687Z"
  }
}

Delete view:

curl -k -u admin:admin --location --request DELETE \
'https://imply-b23-elbexter-s9ruz7uko71p-10298291.us-east-1.elb.amazonaws.com:9088/proxy/coordinator/druid-ext/view-manager/v1/views/wiki-english' \
--header 'Content-Type: application/json' --header 'Accept: application/json'

Links to unsupported use cases and a list of known issues in Imply View Manager are in the References table below.

References:

Coordinator security APIs: https://docs.imply.io/latest/druid/development/extensions-core/druid-basic-security/#coordinator-security-api
Adding view permissions to a role: https://docs.imply.io/latest/druid/operations/row-and-column-security/#add-permissions-for-sql-views-to-roles
Resources:
https://docs.imply.io/latest/druid/operations/security-user-auth/#datasource
https://docs.imply.io/latest/druid/operations/security-user-auth/#system_table
https://docs.imply.io/latest/druid/operations/security-user-auth/#config
https://docs.imply.io/latest/druid/operations/security-user-auth/#state
https://docs.imply.io/latest/druid/operations/security-user-auth/#external
Query privileges: https://docs.imply.io/latest/druid/operations/security-user-auth/#sql-permissions
View Manager APIs: https://docs.imply.io/latest/druid/operations/views/view-apis/#view-manager-endpoints
Imply View Manager known issues: https://docs.imply.io/latest/druid/operations/row-and-column-security/#known-issues
Unsupported View Manager use cases: https://docs.imply.io/latest/druid/operations/views/view-manager/#unsupported-use-cases
