A Builder’s Guide to Security Analytics

Apr 22, 2024
William To

What is security analytics?

Security analytics is a discipline that uses data analysis to investigate, prevent, or resolve cybersecurity threats and incidents. Security analysts typically collect and analyze data from network and system logs, security devices, and applications to establish baselines of behavior and activity trends—and to pinpoint and investigate anomalies when they occur.

This article is intended for business leaders, executives, IT professionals, and chief information security officers (CISOs). It will discuss the inner workings of security analytics platforms and workflows, some of the possible security threats teams may face, use cases, and finally, the dilemma that all security teams must grapple with: should you build or buy your own security solution?

The range of potential cybersecurity threats is vast. These include old-fashioned credential stuffing, ransomware, denial of service (DoS) attacks, phishing exploits, and more. The key, however, is that many types of cyberattacks do leave trails of evidence, such as user authentication logs, telemetry and network data, unexplained data transfers or movements, and suspicious process activity.

That’s where security analytics comes in. With the right platform and procedures—and assuming that they’ve already laid the groundwork for these types of incidents—a security analytics team can alert on an outlier, explore it further, and take the appropriate course of action, whether it’s freezing access to a compromised account or stopping data from being exfiltrated out of the environment.

Implemented properly, security analytics provides a host of benefits, including:

Early threat detection

By noticing (and notifying human security analysts) of emerging cyber threats in real time, security analytics platforms provide lead time for teams to get ahead of incidents before they escalate. This could be a financial institution closing a suspicious bank account before it can be used to launder money, or a credit card provider blocking dubious transactions and following up with users.

Improved incident response

Rapid time to insight shortens the gap between detection and action, giving teams the information they need to contain incidents sooner and to refine their response procedures over time.

Enhanced visibility

Security analytics platforms also provide comprehensive visibility into an organization’s environment—which can sometimes be a bit of a black box, with uncharted dependencies and variables that only come to light when something breaks (or when an intruder gets in). The right security analytics platform can help teams understand the relationships and vulnerabilities between and within networks, endpoints, applications, and data.


For this reason, many observability providers have also begun adding security capabilities to their core products. This is a logical approach, as some forms of attacks, such as denial of service, may first be detected as a performance issue rather than flagged as a security threat. 

Data-driven decision making

By analyzing security data en masse, security analytics platforms can provide a unified view into an organization’s security posture—and help executives or other decision makers come to conclusions about investments, resource allocation, and risk management. 

What are some security analytics use cases?

Threat detection and prevention

This use case includes real-time monitoring and analysis of logs, network traffic, and endpoint data in order to identify and preempt suspicious activity or potential threats. This workflow also relies heavily on automated alerts to notify human team members of situations requiring action, as well as interactive dashboards for employees, executives, and stakeholders alike to investigate outliers and coordinate on troubleshooting efforts.

Ideally, teams will also utilize machine learning algorithms to establish patterns of normal behavior, and to immediately recognize any deviations that may be indicators of malicious activity. As with other forms of artificial intelligence, teams have to maintain their anomaly detection models, updating them with new data, fine-tuning parameters to reduce false positives and negatives, and more.
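
As a rough illustration, the sketch below flags an hourly failed-login count that deviates sharply from a user’s historical baseline. The sample data, field values, and z-score threshold are all hypothetical—production systems typically rely on richer models and far more signals.

```python
# Minimal anomaly-detection sketch: flag an hourly failed-login count that
# deviates sharply from a user's historical baseline. The sample data and
# the z-score threshold are illustrative only.
from statistics import mean, pstdev

def is_anomalous(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` standard
    deviations above the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return latest > mu  # flat baseline: any increase is suspicious
    return (latest - mu) / sigma > threshold

# Hourly failed-login counts for one account over the past day (hypothetical).
baseline = [2, 1, 0, 3, 2, 1, 0, 2, 1, 3, 2, 1, 0, 1, 2, 3, 1, 0, 2, 1, 3, 2, 1]
current_hour = 47

if is_anomalous(baseline, current_hour):
    print("ALERT: failed-login spike, escalate for review")
```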

Lastly, some organizations may take a proactive approach, also known as threat hunting, where they search for hidden threats within historical and real-time data using advanced search, query, and aggregation features. This will usually involve threat intelligence feeds and behavior analytics to identify indicators of compromise (IOCs) and potential attack vectors.

Incident response and investigation

Because time is critical during a security crisis, organizations will generally have established workflows involving both automated triggers and manual human intervention. The goal of these procedures is to contain the threat, limit any damage, and finally, remediate whatever harm has occurred. To facilitate responses, teams may create documentation in the form of playbooks or runbooks, to serve as a quick reference or manual.
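
The sketch below illustrates the idea of an automated containment step keyed off alert type. The alert shape, alert types, and response actions are hypothetical placeholders, not a prescribed playbook.

```python
# Minimal sketch of an automated response step driven by alert type.
# The alert fields and the response actions are hypothetical placeholders.
def disable_account(user):
    print(f"Disabling account {user} pending review")

def isolate_host(host):
    print(f"Isolating host {host} from the network")

PLAYBOOK = {
    "credential_stuffing": lambda alert: disable_account(alert["user"]),
    "malware_beacon": lambda alert: isolate_host(alert["host"]),
}

alert = {"type": "credential_stuffing", "user": "jdoe", "host": "laptop-42"}
action = PLAYBOOK.get(alert["type"])
if action:
    action(alert)
else:
    print("No automated action defined; escalating to an analyst")
```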

Another important component is root cause analysis, which teams use to unearth the origin and source of an incident, and better understand how it occurred. This is crucial to preventing future security issues, and will generally involve long-running historical analysis of timelines, logs, and other forensic evidence. Conclusions will usually be presented during a postmortem.

A related step is forensic analysis, which is used to identify attackers, their activities, and past attacks, which can serve as evidence for future legal action. For this step, organizations must preserve the chain of custody and maintain the integrity of digital evidence so that it can be admitted in court.

Addressing security compliance and regulatory requirements

While this is less commonly associated with security analytics, it is an important use case nonetheless. Retaining data is essential for auditing and compliance, which means that old logs, events, and other data types should be archived in an accessible, secure location.

At this stage, teams may want to leverage external threat intelligence feeds to enrich security analytics with the latest in threat intelligence. If it is integrated into a security platform, this enables the platform to automatically block malicious IP addresses, domains, and URLs.
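
As a minimal illustration of this kind of enrichment, the sketch below checks event source IPs against a hypothetical set of indicators parsed from a threat intelligence feed; the IP addresses shown are reserved documentation ranges.

```python
# Illustrative sketch of enriching events with an external threat-intel feed:
# any event whose source IP matches a known indicator is flagged for blocking.
# The feed contents and event field names are assumptions for demonstration.
malicious_ips = {"203.0.113.7", "198.51.100.23"}  # e.g. parsed from a feed

events = [
    {"src_ip": "192.0.2.10", "action": "login"},
    {"src_ip": "203.0.113.7", "action": "file_download"},
]

for event in events:
    if event["src_ip"] in malicious_ips:
        print(f"Blocking {event['src_ip']}: matched threat-intel indicator")
```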

Teams may also take the opportunity to conduct risk assessments to consider potential cyber threats, vulnerabilities, and impacts. After all, it is never too late to prepare for the next possible attack, and identifying areas of high risk and gaming out possible incidents is a valuable learning experience.

Should you build or buy a security analytics platform?

For many organizations, off-the-shelf (OTS) security solutions are adequate for defending their digital environment. These products offer around-the-clock support, out-of-the-box functionality, and other important capabilities, such as polished user interfaces.

Still, commercial security platforms aren’t the best fit for all organizations and applications. Some infrastructure is so complex or unique that using external software for security could be prohibitively expensive. In these situations, applications may also generate lots of data or have specific compliance requirements, such as those governing privacy and data retention. For instance, some companies may be required to store sensitive data on on-premises servers.

This leads to a fundamental question: should a team build—or buy—its own security analytics platform? 

To be clear, each situation is different, and thus, each organization will take a different path to their final decision. Still, there are some important factors at play, including:

Flexibility

Not every vendor has the required flexibility and customization to adapt to every environment—at least not without the customer organization paying a premium in professional services, or utilizing employee hours to devise workarounds to keep everything functioning. 

Equally important, some teams have very distinct requirements: a security team at a financial institution might need to audit their data regularly, while another cybersecurity team may need to do a deep dive into a year’s worth of log data to provide historical context for a specific anomaly.

Pricing

Just as every security product has different strengths and weaknesses, so too does each one have different pricing models—some of which can be extremely complex and confusing. Customer organizations can be charged by throughput (such as the amount of data ingested or processed per hour), per unit of monitoring (by virtual machine or server), by unit of activity (such as per user or per log line), or any mix of the above.

Scale

The size of an organization’s data and operations can also create challenges for both pricing and performance. More data, more users, and more user activity will likely lead to more costs—but in addition, scale may also impact operations. 

After all, many platforms simply aren’t intended to secure massive, global environments ingesting millions of events per second or billions of events per hour. In this situation, can an off-the-shelf product query such a massive dataset and return results instantly? If there are hundreds of analysts or engineers all working together, will the product perform as usual—or start lagging immediately?

What features are essential to a security analytics platform?

Teams that intend to build their own platform for investigating security threats will require the following features:

Data collection

Whether it’s logs, metrics, or traces, any worthwhile solution will collect a wide range of data types from different sources, including network devices, endpoints, feeds, and more. Given that this is the first step in any security response, a platform needs a database capable of accommodating various kinds of data—structured, semi-structured, and unstructured.

It’s also important to ingest data quickly, ideally via streaming technologies like Amazon Kinesis or Apache Kafka (or one of its derivatives, like Confluent). Streaming remains the most reliable method for real-time data intake at speed.
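
As an example of what the intake side might look like, here is a minimal consumer sketch using the kafka-python client; the topic name, broker address, and payload shape are placeholders for whatever your environment actually uses.

```python
# Minimal consumer sketch using the kafka-python client (pip install kafka-python).
# The topic name, broker address, and JSON payload shape are placeholders.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "security-events",                     # hypothetical topic of security logs
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="latest",
)

for message in consumer:
    event = message.value
    # Hand each event to downstream parsing, enrichment, and analysis.
    print(event.get("source"), event.get("event_type"))
```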

Flexibility with data sources and formats

Still, streaming is only one of many different data sources that a security platform has to draw from. To maximize efficiency, any database used for a security solution must be widely compatible with various data sources, ingesting logs from network devices, servers, and applications; security events from security information and event management (SIEM) systems, antivirus software, and intrusion detection systems; metrics pertaining to CPU usage, network traffic, and resource utilization; and more.

All of this data also comes in a range of formats, spanning the spectrum of structured, unstructured, and semi-structured. These could include text-based log files with timestamps and other metadata, JSON, CSV, binary formats, Syslog messages, CEF, and more. If compatibility is an issue, then a database needs some way to rapidly transform these different formats into a single type that it can work with.
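
A minimal sketch of that kind of normalization is shown below, mapping both JSON and syslog-style text lines onto one common record shape; the regex and field names are illustrative only.

```python
# Sketch of normalizing heterogeneous inputs (JSON and syslog-style text)
# into one common record shape. The regex and field names are illustrative.
import json
import re

SYSLOG_PATTERN = re.compile(r"^(?P<ts>\S+ \S+) (?P<host>\S+) (?P<msg>.*)$")

def normalize(raw: str) -> dict:
    """Return a {'timestamp', 'host', 'message'} record from either format."""
    if raw.lstrip().startswith("{"):
        data = json.loads(raw)
        return {"timestamp": data.get("time"),
                "host": data.get("hostname"),
                "message": data.get("message")}
    match = SYSLOG_PATTERN.match(raw)
    if match:
        return {"timestamp": match.group("ts"),
                "host": match.group("host"),
                "message": match.group("msg")}
    return {"timestamp": None, "host": None, "message": raw}

print(normalize('{"time": "2024-04-22T10:00:00Z", "hostname": "web-1", "message": "login failed"}'))
print(normalize("2024-04-22 10:00:01 web-1 sshd: Failed password for root"))
```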

Data aggregation and correlation

After data is gathered, it has to be organized, processed, and aggregated to derive trends and patterns. At this stage, data may also need to be prepared prior to its aggregation, which means that the database in question should be able to perform real-time preparation of data, either via a stream processor like Apache Flink or its own built-in capabilities, to render data suitable for analysis.

This step is crucial, because this is when clues, usually some sort of anomalous behavior or performance, first surface. As with the previous step, this phase requires a database capable of rapid, real-time data analysis—every second counts in a developing situation.
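
For instance, a simple aggregation might count failed logins per source IP per minute and surface anything above a threshold. The sketch below shows the idea with illustrative data and an arbitrary threshold.

```python
# Sketch of a simple aggregation step: count failed logins per source IP per
# minute, then surface IPs that exceed a threshold. Data and threshold are
# illustrative.
from collections import Counter
from datetime import datetime

events = [
    {"ts": "2024-04-22T10:00:05", "src_ip": "198.51.100.23", "outcome": "failure"},
    {"ts": "2024-04-22T10:00:41", "src_ip": "198.51.100.23", "outcome": "failure"},
    {"ts": "2024-04-22T10:00:52", "src_ip": "192.0.2.10", "outcome": "success"},
]

failures_per_minute = Counter()
for event in events:
    if event["outcome"] != "failure":
        continue
    minute = datetime.fromisoformat(event["ts"]).strftime("%Y-%m-%d %H:%M")
    failures_per_minute[(event["src_ip"], minute)] += 1

THRESHOLD = 2
for (ip, minute), count in failures_per_minute.items():
    if count >= THRESHOLD:
        print(f"{ip} had {count} failed logins in minute {minute}")
```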

Advanced features

Automation, such as machine learning and alerting, is critical for reducing the workloads of human teams, identifying and notifying analysts of possible issues, and quickly extracting insights from large amounts of data. These machine learning components are vital to security analytics, if only because many modern digital environments generate such massive amounts of data—far more than any single person or any single team can deal with. Algorithms serve as a trusted partner, filtering out the noise and flagging only genuinely suspicious activity for review.

Another important requirement is tools for working with time series data. Time is an important element in security analysis because it aids in identifying patterns and outliers. As a result, any database needs to be compatible with timestamped data, with abilities such as backfill or padding to fill any gaps in time series data, or even the ability to create charts and time series from raw data points.
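
The sketch below shows one way to pad gaps in a per-minute count series using pandas, so that downstream charts and rolling statistics see a continuous timeline; the values are illustrative.

```python
# Sketch of filling gaps in timestamped event counts so that charts and
# rolling statistics don't mistake missing minutes for zero activity.
# Uses pandas; the series values are illustrative.
import pandas as pd

counts = pd.Series(
    [12, 15, 9],
    index=pd.to_datetime([
        "2024-04-22 10:00", "2024-04-22 10:01", "2024-04-22 10:04",
    ]),
)

# Reindex to a continuous one-minute grid; pad gaps with 0 (no events observed).
full_range = pd.date_range(counts.index.min(), counts.index.max(), freq="min")
padded = counts.reindex(full_range, fill_value=0)
print(padded)
```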

Scalability and performance

Scalability is also critical, because it ensures that the database can keep pace with data as it spikes up or down. During a rapidly unfolding crisis, data—in the form of logs, metrics, traces, and events—will increase exponentially, and any database that forms the foundation of a security solution must be able to cope. 

Simultaneously, any database has to maintain rapid response times even as data volumes and traffic ramp up. During an “all hands on deck” situation, a database would have to accommodate many more users and queries—a challenging situation that can easily introduce latency.

Visualizations

Dashboards and other graphics are important for both troubleshooting and reporting. An analyst might need to further break down suspicious activity, drilling down or zooming into outliers to understand them in more detail. Because so much of security analytics is time sensitive, this exploration should ideally occur in real time.

In addition, reports need to include graphics, so that executives, investors, and other stakeholders can better understand security situations and make the right decisions. A platform that can easily export interactive charts and other rich illustrations will facilitate this task.
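
As a small example of report-ready output, the sketch below renders an hourly alert count as a chart with matplotlib; the data points are invented.

```python
# Sketch of rendering a simple time series chart of alert counts for a
# report. Uses matplotlib; the data points are illustrative.
import matplotlib.pyplot as plt

hours = ["09:00", "10:00", "11:00", "12:00", "13:00"]
alerts = [4, 7, 23, 11, 5]

plt.figure(figsize=(6, 3))
plt.plot(hours, alerts, marker="o")
plt.title("Alerts per hour")
plt.xlabel("Hour")
plt.ylabel("Alert count")
plt.tight_layout()
plt.savefig("alerts_per_hour.png")  # embed in a report or dashboard export
```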

Why is Apache Druid the best real-time database for security analytics?

Should a team choose to build their own solution, they will need a real-time database capable of supporting fast analytics under load on massive datasets. With 400-odd databases on the market today, the sheer amount of choice can be overwhelming.

The best database is Apache Druid. An open source database, Druid is optimized for rapid analytics on massive datasets, providing fast query responses regardless of data size, query traffic, or user numbers. Today, Druid is known for its flexibility in addressing a wide range of use cases, from IoT to observability to security analytics.

Druid has a number of strengths that make it suitable as the basis of a security platform, including:

  • Stream compatibility with products such as Apache Kafka, Amazon Kinesis, and their derivatives.
  • A horizontally scalable architecture that provides distributed, parallel query processing, ensuring minimal latency even under load.
  • Support for interactive, complex operations (such as aggregations).
  • Advanced time series features, such as interpolation, padding, backfill, and more.
  • Ability to accommodate high cardinality dimensions and complex event hierarchies. 
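
To make this concrete, here is a hypothetical query against Druid’s SQL HTTP API (POST /druid/v2/sql) that surfaces the source IPs with the most failed logins in the last 15 minutes. The endpoint address, datasource name, and column names are placeholders for your own deployment and schema.

```python
# Illustrative query against Druid's SQL HTTP API (POST /druid/v2/sql).
# The endpoint address, datasource name, and columns are placeholders.
import requests

DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"  # router/broker address

query = """
SELECT src_ip, COUNT(*) AS failed_logins
FROM security_events
WHERE outcome = 'failure'
  AND __time >= CURRENT_TIMESTAMP - INTERVAL '15' MINUTE
GROUP BY src_ip
ORDER BY failed_logins DESC
LIMIT 10
"""

response = requests.post(DRUID_SQL_URL, json={"query": query}, timeout=30)
response.raise_for_status()
for row in response.json():
    print(row["src_ip"], row["failed_logins"])
```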

At the same time, Druid has some weaknesses, specifically where it concerns ease of use. Setting up open source Druid clusters can be a complex, time consuming process with a high learning curve. While Druid is supported by a thriving user community, getting answers and troubleshooting issues amidst a security crisis can be frustrating.

In addition, while Druid is open source and thus free to use, teams will still have to pay for the infrastructure on which the clusters will run (such as Amazon EKS or EC2). Given these considerations, many may instead decide to go with a paid distribution of Druid, such as Imply Polaris.

Given its features, Druid is versatile and meets most security analytics use cases, including alerting, security incident investigation, user behavior analytics, real-time threat detection, insider threat detection, and more.

Why should you choose Imply?

Built on Apache Druid, Imply products provide a seamless, intuitive experience for users that want the power of Druid without its operational complexities. As the fully managed version of Druid, Imply Polaris shares the same architecture, providing scalability, streaming compatibility, and features. As discussed throughout the article, all of these advantages are important for any team building their own security analytics platform.

As with Druid, Imply is natively compatible with Apache Kafka, Amazon Kinesis, and any derivative technologies, enabling users to set up streams with a few clicks (and no additional software). To help teams make the most of their streaming data, Imply provides exactly-once ingestion to guard against duplicate data, and query-on-arrival, so that users can query events as soon as they arrive rather than waiting for them to be persisted first.

Imply also continues Druid’s tradition of putting developers first, incorporating features such as schema autodetection, which automatically discovers data schema and updates tables as needed—so that developers don’t need to do so manually.

This is particularly useful for security analytics, given the sheer variety of data formats and their associated models—and provides the flexibility of a schemaless database alongside the performance advantages of a strongly-typed one.

Like Druid, Imply can return query results in milliseconds, even under load and at scale. Much of this has to do with the unique scatter-gather method, where queries are split up, sent to the data nodes where the relevant segments are stored, and finally pieced together by broker nodes before the results are sent back to users.

Because this process can occur simultaneously across columns and segments, users will not experience latency or degradations in performance, even as query traffic and data volumes rise.

Users can also directly query data from deep storage without having to first load data onto Druid’s data servers—enabling cost savings, a simplified data architecture, and faster, more flexible reporting and data analysis.

To learn more about why and how teams build their own security analytics solution with Imply, read this article.
Get started with the easiest way to use Apache Druid—sign up for a free trial of fully managed Imply Polaris.
