Imply lookups for enhanced network flow visibility

Nov 26, 2018
Eric Graham

Modern TCP/IP networks continue to evolve, making management and monitoring ever harder. In addition, stricter uptime SLAs, mission-critical enterprise applications, and the need for fast MTTR make ultra-fast databases and easy-to-use UIs critical for business success. Modern tools need to be flexible, not just supporting basic network flow for IP visibility, but also providing visibility into hostnames, microservice names, usernames, and more. Imply and Druid were designed from the ground up to help solve these exact problems.

The founders of Imply built one of the most popular open source databases available today for operational analytics, Druid. Within Druid there are multiple ways to enhance visibility for existing network flow records. This how-to blog covers one way to do this using Druid lookup tables. You can think of a lookup table as a secondary fact table that you can query based on a key value.

If I want a hostname, which is not included in standard network flow records, I could use a lookup table to join the two tables on IP address and assign the latest hostname to a dimension in Pivot, the Imply UI. This works by performing a join at query time that maps an IP to a hostname (the name coming from the secondary lookup table), using IP as the common dimension between the tables.
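To make the query-time join concrete, here is a minimal Python sketch of what a lookup does conceptually (the record fields, IPs, and hostnames below are hypothetical, not from the blog's dataset):

```python
# Flow records carry only IPs; a lookup table maps IP -> hostname at query time.
flow_records = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.53", "bytes": 1200},
    {"src_ip": "10.0.0.7", "dst_ip": "10.0.0.5", "bytes": 340},
]

# Secondary "lookup table": key (IP) -> value (hostname).
hostnames = {"10.0.0.5": "my_mac_laptop", "10.0.0.53": "dns_server"}

def with_hostnames(record):
    """Resolve IPs to hostnames at query time, falling back to the raw IP."""
    return {
        **record,
        "src_host": hostnames.get(record["src_ip"], record["src_ip"]),
        "dst_host": hostnames.get(record["dst_ip"], record["dst_ip"]),
    }

resolved = [with_hostnames(r) for r in flow_records]
print(resolved[0]["dst_host"])  # dns_server
print(resolved[1]["src_host"])  # 10.0.0.7 (no mapping, falls back to the IP)
```

The primary data never changes; the mapping is applied only when the query runs, which is exactly why an updated lookup table immediately changes what you see in Pivot.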

The following steps define a basic lookup table using a csv for input. This how-to assumes you have a data source loaded in Pivot and that it has a dimension matching the key value in your lookup table.

  1. Include druid-lookups-cached-global in your extension list for Druid. This can be defined in imply/conf/druid/_common/ for the following variable.
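The snippet itself is not reproduced here; a minimal sketch of the relevant property, assuming the standard common.runtime.properties file in that directory (merge the extension into any loadList you already have):

```properties
druid.extensions.loadList=["druid-lookups-cached-global"]
```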


  2. Create your csv input file using a format similar to the following (primary key,value key). The creation of this file could easily be automated using DNS or IPAM systems. Your primary table should already include the IPs you are mapping.

    <ip_address>,my_mac_laptop
    <ip_address>,dns_server
    <ip_address>,dhcp_server
  3. You will need to create a json input file that imports the csv into your lookup table in Druid.

    {
      "__default": {
        "iana-ports": {
          "version": "v1",
          "lookupExtractorFactory": {
            "type": "cachedNamespace",
            "extractionNamespace": {
              "type": "uri",
              "uri": "file:/Users/egraham/Downloads/lookups/servports2.csv",
              "namespaceParseSpec": {
                "format": "csv",
                "columns": [
                  "key",
                  "value"
                ]
              },
              "pollPeriod": "PT30S"
            },
            "firstCacheTimeout": 0
          }
        }
      }
    }
    • iana-ports defines the lookup table name.
    • version defines the update revision in Druid. When updates are made to the lookup table this should be incremented.
    • type under extractionNamespace defines how you will be importing mappings. In this case we are using “uri” to define a file location.
    • uri defines the location of the csv file you created in step 2 that includes your mappings.
    • format defines what file format you will be importing. In this case we created a csv.

    • columns defines the column names for the csv; they can be left as key and value. Leave the csv header row out of your input file.
    • pollPeriod defines how often Druid will poll the lookup source for updates.

      For more information see the documentation for lookups in general
      and the lookups-cached-global extension specifically.

  4. Load the lookup json file into the Druid coordinator.

    curl -H "Content-Type: application/json" \
      --data @serv-port3.json \
      http://<your_coordinator_ip>:8081/druid/coordinator/v1/lookups/config
  5. Check to see that your lookup table was loaded into your broker.
    You may have to wait for your poll interval to expire before the table is updated on the broker.

    curl -X GET http://<your_broker_ip>:8082/druid/listen/v1/lookups

At this point, your lookup table should be ready to use in Pivot. To use the lookup table, you can create a new dimension that uses Plywood to query the lookup and primary tables. The Plywood syntax looks something like the following.
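The snippet itself is not reproduced here; a minimal sketch of a Plywood lookup expression, assuming the dimension and lookup-table names used in this example, would be:

```
$port_dst.lookup('iana-ports')
```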


port_dst is the dimension name in my primary table and “iana-ports” is the name I used to define my lookup table in step 3 above.

Save your dimension. Now you can use this dimension to display the lookup table value for each primary key. In the example below, I created multiple associations for IP-to-hostname and port-to-name mappings.

Note: Keep in mind that if you want to track changes to a mapped value over time, lookup tables are not the way to do it. A better way would be to merge your two tables together during ingest using Kafka or some other stream processing system.

In summary, lookup tables are a great way to provide additional visibility at query time. Combined with Imply's easy-to-use UI and Druid's very fast response time, lookup tables are a truly powerful feature. Although not perfect for every use case, they are a great way to provide additional visibility in many situations. Continue visiting our blog for future network flow related articles.
