Using Apache Hadoop to Meet Cybersecurity Challenges

Rocky DeStefano wrote an interesting post, "Using Apache Hadoop to Meet Cybersecurity Challenges," that I would like to share.

Apache Hadoop turned 10 this year.

What began as an experiment for processing massive volumes of data has steadily progressed to a mature enterprise solution being used to unlock the value of information in previously unimaginable ways. Today, enterprises use Hadoop to answer questions about everything from how to improve health and human well-being to how to get the most out of financial investments to … you name it. In but a decade, the industry has seen the birth and rapid growth of a vast ecosystem centered on Hadoop.

And in our modern enterprise landscape, security and risk professionals are quickly learning that data insights are key to understanding, identifying and addressing the threats plaguing their organizations.

We’ve come to learn that cybersecurity, in many ways, is a data analytics problem. It is precisely for this reason that Hadoop — with its ability to ingest, process and provide analytics for nearly untold amounts of data — is being adopted to solve many of the challenges we face.

When it comes to security, Hadoop allows users to get their arms around all of the data their modern enterprises generate. It gives them complete access to information created by networks, users, endpoints and even the Internet of Things (IoT) — exactly what’s needed to produce analytics about suspicious behaviors, anomalies and other threat indicators. Add to this the ability to leverage machine learning and flexible plug-and-play applications from both proprietary and open source markets, and you have a solution capable of meeting current and emerging challenges.

It wasn’t always this way. A decade ago, many of us in the security community thought SIEM and other legacy products would be sufficient. We thought they would provide all the relevant information needed to produce the requisite analytics to meet cybersecurity challenges. But as cloud, mobile, distributed computing and the IoT gained steam, we learned that those systems were not nearly robust enough to handle the variety of data sources or the scale at which they grew. SIEM and other legacy solutions simply weren’t designed for massive data volumes. They certainly weren’t designed to provide all the analytic answers or the context that analysts need for detection that keeps pace with the advancing capabilities of the adversaries we all face.

When security professionals were limited by the data and the analytics available through such tools, what could be done to address risk was equally constrained. Restricted capabilities left those of us in charge of defending our organizations able to find only commodity, known and moderately advanced types of attacks. With Hadoop, an expanded set of security use cases is now available.

Thanks to larger data sets and a wider array of accessible analytics techniques, we can answer questions about attacks, threats and risk. With Hadoop, enterprises can access user-behavior analytics to identify and mitigate insider threats, spot suspicious lateral movement in networks, and even share threat intelligence across siloed business units.
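To make the idea of user-behavior analytics concrete, here is a minimal sketch in plain Python. It is not taken from the original post and does not use any Hadoop API. It assumes you have already aggregated, per user, the number of distinct hosts that user connected to on each day. The function name `flag_unusual_hosts`, the threshold of 3 standard deviations, and the `min_sigma` floor are all illustrative assumptions. On a real cluster the same comparison would run over far more history and many more signals.

```python
from statistics import mean, stdev


def flag_unusual_hosts(history, today, threshold=3.0, min_sigma=1.0):
    """Return users whose distinct-host count today is far above their own baseline.

    history: dict mapping user -> list of daily distinct-host counts (hypothetical data)
    today:   dict mapping user -> today's distinct-host count
    threshold, min_sigma: illustrative tuning values, not recommendations
    """
    flagged = []
    for user, counts in history.items():
        if len(counts) < 2:
            continue  # a standard deviation needs at least two data points
        baseline = mean(counts)
        # Floor the spread so very steady users are not flagged for tiny changes.
        spread = max(stdev(counts), min_sigma)
        observed = today.get(user, 0)
        z_score = (observed - baseline) / spread
        if z_score > threshold:
            flagged.append(user)
    return flagged


if __name__ == "__main__":
    # Made-up example data: bob suddenly touches many more hosts than usual,
    # which is one possible sign of lateral movement.
    history = {"alice": [3, 4, 3, 5, 4], "bob": [2, 2, 3, 2, 2]}
    today = {"alice": 4, "bob": 25}
    print(flag_unusual_hosts(history, today))
```

Comparing each user against their own history, rather than against a single global limit, is what lets this kind of check surface an insider whose activity would look normal for someone else.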

Over the course of my career, my security theme, if you will, has always revolved around three points: faster incident detection, enhancing incident response, and understanding how these situations impact the business. With Hadoop, all three things are possible because, by its very design, it provides access to information, analytics and contextual understanding.

The security community is no longer limited by a single application’s view of risk. Hadoop’s flexibility lets teams answer the questions they actually have, rather than limiting them to what individual point security applications and systems are able to tell them. It is also worth noting that Hadoop is on the cutting edge in integrating both proprietary and open source security technologies to produce a comprehensive cybersecurity defense. On the open source side, for example, the Open Network Insights (ONI) project is one of the first to bring an advanced threat detection solution to the platform using open data models and big data analytics.

In this article, I’ve given an overview of how and why Hadoop is fast becoming the cybersecurity tool of choice. In the next several installments, I’ll focus on practical applications for the platform as well as how its openness allows cybersecurity professionals to introduce a broad array of capabilities proven to reduce overall risk and exposure.





Rocky DeStefano

