Tutorial: An End-to-end Streaming Analytics Stack for Network Telemetry Data

Imply is a high-performance analytics system for many different types of event-driven data. One common use case is storing, analyzing, and visualizing networking data (Netlink/NFLOG, NetFlow v1/v5/v7/v8/v9, sFlow v2/v4/v5, IPFIX, etc.).

In this tutorial, we will step through how to set up Imply, Kafka, and pmacct to build an end-to-end streaming analytics stack that can handle many different forms of networking data. The setup described uses a single AWS instance for simplicity, but it can serve as a reference architecture for a fully distributed production deployment.

Prerequisites

A bare metal server or cloud instance (such as an AWS m5d.xlarge instance) with 16GB RAM, 100GB of disk, and an ethernet interface.
The server should be running Linux.
You should have sudo or root access on the server.
A router, switch, firewall, or host that can send networking data.

Install pmacct

Download pmacct from the following URL: http://www.pmacct.net/pmacct-1.7.2.tar.gz
Install libpcap. This should be available in your Linux repository.
```
$ sudo apt-get install libpcap0.8
```

Install librdkafka and librdkafka-dev.

$ sudo apt-get install librdkafka-dev
$ sudo apt-get install librdkafka1

Set environment variables for libraries.

$ export KAFKA_LIBS="-L/usr/lib/x86_64-linux-gnu -lrdkafka"
$ export KAFKA_CFLAGS="-I/usr/include/librdkafka"

Download and extract the following jansson file: http://www.digip.org/jansson/releases/jansson-2.12.tar.gz

Set environment variable for libraries:

$ export JANSSON_CFLAGS="-I/usr/local/include/"
$ export JANSSON_LIBS="-L/usr/local/lib -ljansson"

While in the jansson directory created during extract, compile jansson by running:
```
$ ./configure
$ make
$ sudo make install
```

Compile pmacct

Extract the pmacct-1.7.2.tar.gz tarball to a directory of your choice.
Change to the pmacct directory that was created by the extraction.

Run the following commands:

$ ./configure --enable-kafka --enable-jansson
$ sudo make
$ sudo make install
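Before moving on, it can be worth confirming that the installed binary was really built with Kafka and jansson support. The sketch below assumes that `nfacctd -V` prints the configure arguments used for the build, which should be checked against your pmacct version. The helper name `has_build_flag` is hypothetical.

```shell
# Succeeds if the text on stdin mentions --enable-<feature>.
# has_build_flag is a hypothetical helper, not part of pmacct.
has_build_flag() {
  grep -q -e "--enable-$1"
}

# Usage (assumes nfacctd is on PATH and that -V lists configure arguments):
#   nfacctd -V 2>&1 | has_build_flag kafka   && echo "Kafka support compiled in"
#   nfacctd -V 2>&1 | has_build_flag jansson && echo "jansson support compiled in"
```

If either check fails, rerun `./configure` with both flags and rebuild before continuing.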

Create a configuration file for nfacctd (a binary installed with pmacct) with the following contents. The start command later in this tutorial expects it at ./nfacct.conf. Replace the highlighted placeholders (the *<...>* values) with your own information. Leave daemonize commented out at first so that nfacctd stays in the foreground and its logs are displayed in the terminal. Once you have verified that everything is working, remove the # in front of daemonize and restart nfacctd.

!
kafka_topic: netflow
kafka_broker_host: *<IP where your kafka process is running>*
kafka_broker_port: *<consumer port, typically 9092>*
kafka_refresh_time: 1
#daemonize: true
plugins: kafka
pcap_interface: *<local interface to listen for netflow packets on>*
nfacctd_ip: *<local IP to listen for netflow packets on>*
nfacctd_port: *<port netflow is being sent to>*
aggregate: src_host, dst_host, in_iface, out_iface, timestamp_start, timestamp_end, src_port, dst_port, proto, tos, tcpflags
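A common mistake with a template like this is to leave one of the placeholders unfilled. As a minimal sketch, the following helper scans a configuration file and reports any non-comment line that still contains a `<...>` placeholder. The function name `check_placeholders` is hypothetical, and treating lines that begin with `!` or `#` as comments is an assumption based on the template above.

```shell
# Print every non-comment line that still contains a <...> placeholder,
# prefixed with file name and line number. Returns 1 if any was found.
# check_placeholders is a hypothetical helper, not part of pmacct.
check_placeholders() {
  awk '
    /^[!#]/   { next }                                   # skip comment lines
    /<[^>]*>/ { printf "%s:%d: %s\n", FILENAME, NR, $0; bad = 1 }
    END       { exit bad + 0 }
  ' "$1"
}
```

For example, `check_placeholders ./nfacct.conf && echo "no placeholders left"` prints the confirmation only when every value has been filled in.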

Install Imply

Download the most recent Imply distribution by going to the following URL: https://imply.io/get-started
Refer to the following quickstart for installation help and system requirements: https://docs.imply.io/on-prem/quickstart
Modify conf-quickstart/druid/_common/common.runtime.properties so that it points to the right directories for segments and logs. If you have plenty of local disk, you can keep the default configuration.
Start Imply from the Imply directory with the quickstart configuration by typing the following:
```
$ sudo bin/supervise -c conf/supervise/quickstart.conf &
```

Install Kafka

Download a Kafka distribution. This tutorial uses version 0.11.0.3, available from the following URL:
http://www-us.apache.org/dist/kafka/0.11.0.3/kafka_2.11-0.11.0.3.tgz

Note: The Imply distribution already includes Apache Zookeeper, which Kafka will use when you start it.
Start Kafka with the following command from within the Kafka directory:
```
$ sudo ./bin/kafka-server-start.sh config/server.properties &
```
Create a Kafka topic using the following command, where *<topic name>* is replaced with the name you want, such as netflow. The name must match the kafka_topic value in your nfacctd configuration. From the Kafka installation directory, run:
```
$ sudo ./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic *<topic name>*
```
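To confirm that the topic was created, you can list the topics known to Zookeeper and look for yours. The sketch below wraps that check in a small filter. `topic_exists` is a hypothetical helper name, and the usage lines assume the same Zookeeper address as the create command above.

```shell
# Succeeds if the topic name given as $1 appears as a whole line on stdin.
# topic_exists is a hypothetical helper, not part of Kafka.
topic_exists() {
  grep -qxF -e "$1"
}

# Usage, from the Kafka installation directory:
#   ./bin/kafka-topics.sh --list --zookeeper localhost:2181 | topic_exists netflow \
#     && echo "topic is present"
```

Matching the whole line avoids a false positive when another topic merely contains the same name as a prefix.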

Start nfacctd

Start nfacctd (included with pmacct) with the configuration file created above. This starts a process that listens for incoming network flows.
```
$ sudo nfacctd -f ./nfacct.conf
```
Start sending network flows to the system you have just set up. Make sure your security rules allow traffic from the source IP of the network flow sender to the destination port that you configured on the router.

If everything is working properly, nfacctd should display the number of received packets. This does not happen continuously: counters are incremented only when the router performs a flow export, which is driven by its flow export timers. Once you see packets registered, you can check that messages are reaching Kafka by running a console consumer from the Kafka installation directory:
```
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic *<topic name>* --from-beginning
```
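The consumer output scrolls quickly, so a rough shape check can be easier to read than the raw stream. The sketch below assumes the Kafka plugin is emitting one JSON object per message, which is what the jansson-enabled build is expected to produce. It only checks that each line starts with `{` and ends with `}`; it is not a JSON parser. `count_json_lines` is a hypothetical helper name.

```shell
# Read messages from stdin, one per line, and report how many look like
# JSON objects. Exits non-zero if any non-empty line does not.
# count_json_lines is a hypothetical helper, not part of Kafka or pmacct.
count_json_lines() {
  awk '
    /^[[:space:]]*$/        { next }          # ignore blank lines
    /^\{.*\}[[:space:]]*$/  { ok++; next }    # looks like a JSON object
                            { bad++ }         # anything else
    END { printf "json=%d other=%d\n", ok, bad; exit (bad > 0) }
  '
}

# Usage, from the Kafka installation directory (reads 10 messages, then stops):
#   bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
#     --topic netflow --from-beginning --max-messages 10 | count_json_lines
```

A non-zero `other` count suggests that something other than nfacctd is writing to the topic, or that the plugin output format differs from the assumption above.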

Connect Kafka and Imply

Open the Imply UI in a browser by going to localhost:9095 (if the browser is running on the same host as Imply) or to *<IP of your Imply server>*:9095. Remember to modify your security rules to allow destination port 9095 from your source IP.
Select the Data tab, then click +Load Data (upper right) to display the available data source options.
Select the Apache Kafka option.
Fill in the details for the Kafka process: the broker address as IP:consumer port (for example 192.168.1.2:9092; the port is typically 9092) and the topic name that you created previously.
Select Sample and continue.
Select Next for the remaining screens to start loading your network flows into Imply.
