Security is a tricky thing and it means different things to different people. It is truly in the eye of the beholder. There is the checkbox kind, there is the “real” kind, there is the checkbox kind that holds up, and there is the “real” kind that is circumvented, and so on. Don’t kid yourself: the “absolute” kind does not exist.
I want to talk about security solutions based on log data. This is the kind of security that kicks in after the perimeter security (firewalls), intrusion detection (IDS/IPS), vulnerability scanners, and dozens of other security technologies have done their thing. It ties all of these technologies together, correlates their events, reduces false positives, and enables forensic investigation. This technology is sometimes called Log Management and/or Security Information and Event Management (SIEM). I built these technologies years ago, though it feels like decades.
A typical SIEM product is a hulking appliance, sharp edges, screaming colors – the kind of design that instills confidence and says “Don’t come close, I WILL SHRED YOU! GRRRRRRRRRR”.
Ahhhh, SIEM, makes you feel safe, doesn’t it? It should not. I proclaim this at the risk of being yet another one of those guys who wants to rag on SIEM, but I built one, and beat many, so I feel I’ve got some ragging rights. So, what’s wrong with SIEM? Where does it fall apart?
SIEM does not scale
It is hard enough to capture a terabyte of daily logs (40,000 Events Per Second, 3 Billion Events per Day) and store them. It is a couple of orders of magnitude harder to run correlation in real time and alert when something bad happens. SIEM tools are extraordinarily difficult to run at scales above 100GB of data per day. This is because they are designed to scale by adding more CPU, memory, and fast spindles to the same box. Over the two decades in which those SIEM tools were designed, exponential data growth has outpaced the ability to add CPU, memory, and fast spindles to the box.
Result: Data growth outpaces capacity → Data dropped from collection → Significant data dropped from correlation → Gap in analysis → Serious gap in security
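If you want to sanity-check the volume figures above, here is a quick back-of-the-envelope sketch. It assumes a steady event rate and a decimal terabyte (10^12 bytes), neither of which is stated above; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
TERABYTE = 10**12  # assumption: decimal terabyte


def events_per_day(events_per_second: int) -> int:
    """Events accumulated over one day at a steady rate."""
    return events_per_second * SECONDS_PER_DAY


def avg_event_bytes(bytes_per_day: int, events_per_second: int) -> float:
    """Average event size implied by a daily volume and a steady event rate."""
    return bytes_per_day / events_per_day(events_per_second)


daily = events_per_day(40_000)
print(f"{daily:,} events per day")
print(f"{avg_event_bytes(TERABYTE, 40_000):.0f} bytes per event on average")
```

At 40,000 events per second that comes to roughly 3.5 billion events per day, in line with the rounded "3 Billion" figure, and a terabyte spread over that many events implies an average of a bit under 300 bytes per event.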
SIEM normalization can’t keep pace
SIEM tools depend on normalization (shoehorning) of all data into one common schema so that you can write queries across all events. That worked fifteen years ago when sources were few. These days sources and infrastructure types are expanding like never before. One enterprise might have multiple vendors and versions of network gear, many versions of operating systems, open source technologies, workloads running in infrastructure as a service (IaaS), and many custom-written applications. Writing normalizers fast enough to keep pace with changing log formats is not possible.
Result: Too many data types and versions → Falling behind on adding new sources → Reduced source support → Gaps in analysis → Serious gaps in security
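To make the normalization problem concrete, here is a minimal sketch of the approach. Everything in it is hypothetical: the two log formats, the three-field schema, and the names `NORMALIZERS` and `normalize` are invented for illustration and do not come from any particular SIEM product.

```python
import re
from typing import Optional

# One hand-written pattern per source format (both formats are invented).
NORMALIZERS = {
    "fw_v1": re.compile(r"DENY src=(?P<src_ip>\S+) user=(?P<user>\S+)"),
    "app_v1": re.compile(r"login failed for (?P<user>\S+) from (?P<src_ip>\S+)"),
}
# The action each hypothetical source format maps to in the common schema.
ACTIONS = {"fw_v1": "deny", "app_v1": "login_failure"}


def normalize(line: str) -> Optional[dict]:
    """Map a raw line onto the common schema, or return None if no normalizer matches."""
    for name, pattern in NORMALIZERS.items():
        match = pattern.search(line)
        if match:
            return {
                "src_ip": match["src_ip"],
                "user": match["user"],
                "action": ACTIONS[name],
            }
    return None


lines = [
    "DENY src=10.0.0.5 user=alice",
    "login failed for bob from 10.0.0.9",
    # A new application version switches to JSON; no normalizer exists for it yet.
    '{"event":"login_failed","account":"carol","ip":"10.0.0.7"}',
]
for line in lines:
    print(normalize(line))
```

The first two lines come out as tidy schema records. The third returns `None` and silently drops out of every cross-source query until someone writes and ships a new pattern for it, which is the maintenance treadmill described above.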
SIEM is rule-only based
This is a tough one. Rules are useful, even required, but not sufficient. Rules catch only what you express in them: the things you already know to look for. To be secure, you must be ahead of new threats. A million monkeys writing rules in real time: not possible.
Result: Your rules are stale → You hire a million monkeys → Monkeys eat all your bananas → You analyze only a subset of relevant events → Serious gap in security
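Here is a small sketch of why rules alone fall short. The rule, the threshold, and the event fields are all hypothetical, chosen only to illustrate the point; real correlation rules are richer, but they share the same blind spot.

```python
from collections import Counter

# Hypothetical rule: flag any user with at least this many failed logins.
FAILED_LOGIN_THRESHOLD = 3


def brute_force_rule(events: list[dict]) -> set[str]:
    """Return the users whose failed-login count reaches the threshold."""
    failures = Counter(
        e["user"] for e in events if e["action"] == "login_failure"
    )
    return {user for user, count in failures.items() if count >= FAILED_LOGIN_THRESHOLD}


events = (
    [{"user": "alice", "action": "login_failure"}] * 3
    # Highly unusual behavior, but nobody wrote a rule that describes it.
    + [{"user": "bob", "action": "bulk_export"}] * 500
)
print(brute_force_rule(events))
```

The rule flags alice, exactly as written. Bob's 500 bulk exports pass without a peep, because the rule only knows about failed logins.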
SIEM is too complex
It is way too hard to run these things. I’ve had too many meetings and discussions with my former customers on how to keep the damned things running and too few meetings on how to get value out of the fancy features we provided. In reality most customers get to use only about 20% of the features because the rest of the stuff is not reachable. It is like putting your best tools on the shelf just out of reach. You can see them, you could do oh so much with them, but you can’t really use them because they are out of reach.
Result: You spend a lot of money → Your team spends a lot of time running SIEM → They don’t succeed in leveraging the cool capabilities → Value is low → Gaps in analysis → Serious gaps in security
So, what is an honest, forward-looking security professional who does not want to duct tape a solution to do? What you need is what we just started: Sumo Logic Enterprise Security Analytics. No, it is not absolute security, it is not checkbox security, but it is a more real security because it:
Scales
Processes terabytes of your data per day in real time. Evaluates rules regardless of data volume and does not restrict what you collect or analyze. Furthermore, no SIEM-style normalization: just add data, a pinch of savvy, a tablespoon of massively parallel compute, and voilà.
Result: you add all relevant data → you analyze it all → you get better security
Simple
It is SaaS, there are no appliances, there are no servers, there is no storage, there is just a browser connected to an elastic cloud.
Result: you don’t have to spend time on running it → you spend time on using it → you get more value → better analysis → better security
Machine Learning
Rules, check. What about that other unknown stuff? Answer: a machine that learns from data. It detects patterns without human input. It then figures out baselines and normal behavior across sources. In real time, it compares new data to the baseline and notifies you when things go sideways. Even if “things” are things you’ve NEVER even thought about and NOBODY in the universe has EVER written a single rule to detect. Sumo Logic detects those too.
Result: Skynet … nah, benevolent overlord, nah, not yet anyway. New stuff happens → machines go to work → machines notify you → you provide feedback → machines learn and get smarter → bad things are detected → better security
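The baseline idea can be sketched in a few lines. To be clear, this is not how Sumo Logic implements it; it is a deliberately simple stand-in (a z-score against a historical mean) with invented numbers and a made-up function name, meant only to show what "compare new data to the baseline" can look like.

```python
from statistics import mean, stdev


def is_anomalous(history: list[float], value: float, threshold: float = 3.0) -> bool:
    """Flag value if it lies more than `threshold` standard deviations from the baseline mean."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any deviation at all is unusual.
        return value != baseline
    return abs(value - baseline) / spread > threshold


# Invented hourly event counts for one source.
history = [100, 104, 98, 101, 97, 103, 99, 102]
print(is_anomalous(history, 101))  # within the usual range
print(is_anomalous(history, 180))  # far outside it
```

Notice that nobody told the function what "bad" looks like. It only learned what "normal" looks like from the data, so it can flag a spike that no rule was ever written for.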
Read more: Sumo Logic Enterprise Security Analytics