Should You Build or Buy Security Analytics for SecOps?

May 07, 2023
William To

In many ways, the story of the Internet is a story about security. After all, security threats have existed since the dawn of the World Wide Web, when hackers used comparatively primitive attacks, such as malicious executables and password stuffing, to bypass weak encryption and steal from banks, retailers, and even government ministries.

As cybersecurity has become more sophisticated, so too have malicious parties. Some approaches, such as social engineering, rely on winning the trust of unsuspecting targets to obtain their account logins, without ever challenging encryption protocols directly. Others, such as botnets, reimagine brute-force tactics, using malware to link together a network of unwitting devices and execute distributed denial-of-service (DDoS) attacks with their combined processing power.

Ultimately, it’s in any company’s best interest to detect and defend against threats as they arise. But the speed and volume of digital interactions raise a difficult question: how does your team protect its environment, with all of its unique loopholes, use cases, and weaknesses, against attackers? Does it make more sense to purchase a cybersecurity solution, or to build your own?

To build or not to build: that is the question

To be honest, there are no uniform answers, as each organization (or even separate teams within the same organization) will have to decide for themselves.

But one good place to start is the degree of customization that a team needs. As an example, a financial institution may be required, for compliance purposes, to conduct audits at predetermined intervals, such as monthly or quarterly. Alternatively, if an incident occurs, a cybersecurity team may want to dive into its log data from the past three months to see whether the anomaly was a one-off issue or part of a larger pattern of malicious activity.

Whatever your needs may be, not every ready-made solution is as flexible as you need it to be. After all, paying for versatility can quickly add up—whether it’s in the form of extra fees paid to vendors to unlock new abilities, or the workarounds that your engineers devise in order to keep this product functional for your security teams.

Vendors can also have complex pricing models, charging by throughput (GB of data per hour), per unit of monitoring (such as per virtual machine or server), by action (such as a single user session or log line), or some combination of the above. If you have 100 clusters generating millions of events per second, a security platform that charges by both traffic and unit of monitoring could quickly become prohibitively expensive.
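To make that concrete, here is a rough back-of-the-envelope sketch in Python. Every rate in it is invented for illustration (no vendor's actual price list is implied), and the `PricingModel` and `monthly_cost` names are hypothetical. It simply combines the three charging styles described above.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600
HOURS_PER_MONTH = 730  # approximation of an average month


@dataclass
class PricingModel:
    """Hypothetical vendor rates; real contracts will differ."""
    usd_per_gb: float              # charge by throughput
    usd_per_unit_month: float      # charge per monitored unit (cluster, VM, server)
    usd_per_million_events: float  # charge by action (e.g. per log line)


def monthly_cost(model, clusters, events_per_second, bytes_per_event):
    """Return the estimated monthly charge for each pricing dimension, in USD."""
    events_per_month = events_per_second * SECONDS_PER_HOUR * HOURS_PER_MONTH
    gb_per_month = events_per_month * bytes_per_event / 1e9
    return {
        "throughput": gb_per_month * model.usd_per_gb,
        "units": clusters * model.usd_per_unit_month,
        "actions": events_per_month / 1e6 * model.usd_per_million_events,
    }


# Assumed workload: 100 clusters, 1 million events per second, 500 bytes per event.
example = monthly_cost(
    PricingModel(usd_per_gb=0.10, usd_per_unit_month=500.0, usd_per_million_events=0.0),
    clusters=100,
    events_per_second=1_000_000,
    bytes_per_event=500,
)
print(f"Estimated monthly total: ${sum(example.values()):,.0f}")
```

Under these made-up rates the estimate comes to roughly $181,400 per month, most of it from throughput. Swapping in a vendor's real rates and your own event sizes is the quickest way to see which pricing dimension dominates.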

Aside from its impact on pricing, the scale of your operations can also affect performance. Many solutions are simply not built to handle millions of events per second (or billions of events per hour). If your security team has to query a terabyte’s worth of logs over the past six months, can an off-the-shelf product accommodate this quantity of data? If there are 100 security analysts and engineers all querying the same massive dataset at once, can this vendor’s offering return data in a timely manner? Can it even complete queries for 100 concurrent users, or will it just leave them unfinished?

Automation is another consideration. At times, finding and stopping a single fraudulent interaction amidst an unending, real-time stream of transactions can be as impossible as filtering out a single drop of water from a firehose. That’s where AI and machine learning can help, especially if security events are too fast for human teams to isolate or respond to in real time. It’s important for teams to determine how automation and human intervention fit into their response procedures, assuming that the security-as-a-service product supports building automatic responses such as triggers.
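As a simple illustration of what an automatic trigger might look like, here is a minimal Python sketch of a sliding-window rule. It is not taken from any particular product: the `FailedLoginTrigger` class, the threshold, and the window length are all assumptions chosen for the example.

```python
from collections import defaultdict, deque


class FailedLoginTrigger:
    """Fire a callback when an account has too many failed logins in a time window."""

    def __init__(self, threshold=5, window_seconds=60, on_trigger=None):
        self.threshold = threshold            # assumed value, tune per environment
        self.window_seconds = window_seconds  # assumed value, tune per environment
        self.on_trigger = on_trigger or (lambda account: None)
        self._events = defaultdict(deque)     # account -> recent failure timestamps

    def observe(self, account, timestamp):
        """Record one failed login; return True if the trigger fired."""
        window = self._events[account]
        window.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        if len(window) >= self.threshold:
            self.on_trigger(account)
            window.clear()  # reset so the rule does not re-fire on every later event
            return True
        return False


# Usage: flag an account after 3 failures within 10 seconds.
flagged = []
trigger = FailedLoginTrigger(threshold=3, window_seconds=10, on_trigger=flagged.append)
for ts in (0, 1, 20, 21, 22):
    trigger.observe("account-42", ts)
print(flagged)
```

In practice the callback is where human intervention fits in: it could lock the account outright, or merely open a ticket for an analyst to review.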

Why Druid?

If your organization decides to build and run its own security platform, one critical component is a database. Apache Druid is the database for speed, scale, and streaming data, and possesses unique features that make it an ideal foundation on which to build an in-house security monitoring solution.

Should a security situation arise, Druid can return subsecond responses, even under an influx of simultaneous queries. After all, performance under load is critical to security monitoring: as a situation escalates, more and more people, from analysts to executives to engineers to customers, will have to query data in order to detect and defend against an attack. Whether they’re drilling down, zooming in, or slicing and dicing, each of these parties will require rapid responses in order to keep up with evolving security conditions.
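For a sense of what such a drill-down could look like, here is a hedged Python sketch that prepares a request for Druid's SQL endpoint (`/druid/v2/sql`). The datasource name `security_logs`, its columns, and the router address are assumptions made up for this example. The request is only built, not sent.

```python
import json
import urllib.request

# Assumed address of a local Druid router; adjust for your deployment.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"

# Hypothetical datasource and columns: daily failed logins for one IP over ~3 months.
DRILLDOWN_SQL = """
SELECT TIME_FLOOR(__time, 'P1D') AS "day", COUNT(*) AS failed_logins
FROM "security_logs"
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '90' DAY
  AND event_type = 'login_failure'
  AND source_ip = ?
GROUP BY 1
ORDER BY 1
"""


def build_drilldown_request(source_ip):
    """Build (but do not send) an HTTP POST for the Druid SQL API."""
    payload = {
        "query": DRILLDOWN_SQL,
        # Bind the IP as a query parameter instead of splicing it into the SQL text.
        "parameters": [{"type": "VARCHAR", "value": source_ip}],
    }
    return urllib.request.Request(
        DRUID_SQL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_drilldown_request("203.0.113.7")
# To run it against a live cluster: urllib.request.urlopen(request).read()
```

Each analyst, dashboard, or alerting job would issue queries of this shape independently, which is why concurrency matters as much as single-query speed.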

Druid is also designed to scale massively, working with up to millions of events per second (or billions of events per hour). Equally important, Druid’s architecture is designed to operate at scale cheaply and efficiently. In fact, one user estimated that by switching to Druid, their company was able to accommodate 50 percent more data, while only increasing operating costs by 15 percent.
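To put that estimate in per-unit terms, the short Python calculation below works out how the cost of each unit of data changes when volume grows 50 percent and spend grows 15 percent. It assumes cost scales evenly across all data, which is a simplification; the function name is illustrative.

```python
def unit_cost_change(data_growth, cost_growth):
    """Relative change in cost per unit of data.

    data_growth=0.50 means 50% more data; cost_growth=0.15 means 15% higher cost.
    """
    return (1 + cost_growth) / (1 + data_growth) - 1


change = unit_cost_change(data_growth=0.50, cost_growth=0.15)
print(f"Cost per unit of data changed by {change:.1%}")  # about -23.3%
```

In other words, under that assumption each unit of data became roughly 23 percent cheaper to handle.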

Lastly, because security data (such as logs) are generated in quantity (and in real time), stream processors are the best way to ingest this data quickly for aggregation and analysis. Druid is compatible with two of the most popular streaming technologies today, Amazon Kinesis and Apache Kafka, requiring no workarounds or extra engineering work to ingest data from these products.
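As a rough illustration of how little glue is involved, the Python sketch below assembles a Kafka supervisor spec of the kind Druid accepts at `/druid/indexer/v1/supervisor`. The datasource, topic, broker address, and column names are placeholders invented for this example, and a production spec would likely need additional tuning.

```python
import json


def kafka_supervisor_spec(datasource, topic, bootstrap_servers):
    """Return a minimal Kafka ingestion supervisor spec as a Python dict."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumes each event carries an ISO 8601 "timestamp" field.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                # Hypothetical dimensions for security log events.
                "dimensionsSpec": {
                    "dimensions": ["source_ip", "user_id", "event_type"]
                },
                "granularitySpec": {
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "kafka",
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


spec = kafka_supervisor_spec("security_logs", "security-events", "localhost:9092")
print(json.dumps(spec, indent=2))  # POST this JSON to the supervisor endpoint
```

Once a spec like this is submitted, Druid manages the Kafka consumers itself, so there is no separate connector service to write or operate.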

Imply also provides a suite of Druid-based products, including Polaris, a Druid database-as-a-service; Pivot, a GUI for building rich, interactive dashboards; and Manager, for deploying, monitoring, and controlling Druid clusters.

Customer story: DBS

Established in 1968, the Development Bank of Singapore (DBS) is a leading multinational banking corporation, and one of the largest in Southeast Asia. Presiding over a portfolio totaling S$743 billion, DBS saw net profits of S$8.19 billion in 2022—a 15 percent return on equity.

Given its stellar reputation and global reach, one key responsibility of DBS is to combat money laundering, preventing malicious parties from disguising illegal profits as clean, legitimate income. The scale of this problem is massive: the United Nations Office on Drugs and Crime estimates that money laundering amounts to 2 to 5 percent of global GDP, or about $800 billion to $2 trillion annually. Banks that enable money laundering (accidentally or otherwise) can be penalized with heavy fines, audited by government agencies, or even put out of business.

The challenges of anti-money laundering efforts

While the consequences of money laundering are clear, combating it has become harder, owing to ever more complex regulation and exponential growth in data volume as the banking ecosystem is digitized. For human analysts and legacy banking systems alike, it is not possible to sift through this firehose of data and filter out fraudulent transactions or criminal actors before the fact.

Instead, anti-money laundering (AML) and fraud detection were both reactive, relying on analysis of batch data after the fact to go back and freeze stolen credit cards or ghost accounts, for instance. Historically, AML utilized massive file transfers between various banking systems to process and analyze batch data, an intensive and time-consuming process. Previously, DBS’ security environment also could not easily scale to keep pace with the high volume of user traffic, nor could it support advanced analytics or a high volume of simultaneous users.

Druid and afterwards

Druid was an attractive choice for several reasons. First, DBS relied on Kafka to stream security data, so Druid’s native compatibility was extremely helpful. DBS had also tripled its team of analysts and, as a result, needed a database that could provide millisecond response times for their queries. Lastly, DBS was mandated to retain its data for a longer period of time, and with Druid, it was able to retain data for four times as long as with its previous database.

By switching to Druid, DBS could attain 360-degree visibility into their banking environment, as well as shift to real-time compliance, analyzing event data rather than responding to issues after they occurred. As part of this transition, DBS introduced more interactive, responsive dashboards for a wide range of users, including executives, analysts, and even regulators, providing instantaneous insights into security alerts and anomalies.

There were also key benefits for customers. Because investigations became faster and more streamlined, wire transfers were also processed more quickly and confidently, leading to fewer business delays. DBS could also automate more processes such as account screening or digital lifestyle services reviews, running them more cheaply, efficiently, and frequently.

Lastly, DBS used Druid to improve data mining and machine learning use cases, with an eye toward reducing false positives and false negatives while increasing true positives and true negatives. In addition, DBS could train ML and AI models on customer behavior patterns, improving their ability to flag and block suspicious transactions, predict and preempt fraud, and build patterns of user behavior based on historical data.
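For readers less familiar with these terms, the Python sketch below shows how the four outcome counts are commonly turned into detection metrics. The counts in the usage example are invented for illustration and are not DBS figures.

```python
def detection_metrics(tp, fp, tn, fn):
    """Summarize alert quality from confusion-matrix counts.

    tp: fraudulent transactions correctly flagged
    fp: legitimate transactions wrongly flagged
    tn: legitimate transactions correctly passed
    fn: fraudulent transactions missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0            # flagged alerts that were real
    recall = tp / (tp + fn) if (tp + fn) else 0.0               # real fraud that was caught
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0  # legitimate traffic wrongly flagged
    return {
        "precision": precision,
        "recall": recall,
        "false_positive_rate": false_positive_rate,
    }


# Made-up counts for 10,000 transactions.
metrics = detection_metrics(tp=80, fp=20, tn=9880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Fewer false positives raise precision and spare analysts needless investigations, while fewer false negatives raise recall and mean less fraud slips through.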

To learn more about Druid, read the architecture guide.

Imply Polaris, a fully managed, database-as-a-service, is the easiest way to get started with Druid. Register for a free trial of Polaris today.
