In cybersecurity, anomaly detection focuses on finding unusual events that may indicate cyber-attacks. The promise of anomaly detection is that the algorithm will detect attacks that have never been seen before. Network data makes this hard for three reasons: the volume is enormous, attacks are rare compared with benign traffic, and attack techniques evolve constantly.

Many intrusion detection applications of machine learning rely on supervised learning, which is good at detecting known attacks. With proper fitting, a supervised algorithm may even be able to find some novel attacks. Anomaly detection, by contrast, takes a fresh look at the data without relying on predefined attack signatures.

Anomalies Are Normal

When processing network traffic data, an anomaly detection algorithm may find events that are truly anomalies, but they might not be the events you were looking for. Large computer networks have a rhythm. They have processes that run periodically, and they generally have the same users doing the same things every day. But network data is far from regular. Many events interrupt the normal routine: operational events that result from system errors, defects, or misconfigurations; system changes deployed on servers to add software features or patch security vulnerabilities; new users who start working; and existing users whose jobs change and bring new responsibilities. With all of these changes occurring, which events are really anomalies?

Detecting cyber-attacks in a changing network ecosystem with an anomaly detection algorithm is very challenging, since an anomaly may have nothing to do with exploit attempts. Although it seems contradictory, anomalies are normal.
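To make this concrete, here is a minimal sketch of a simple outlier detector (a modified z-score based on the median and the median absolute deviation) applied to per-host outbound traffic. The host names, traffic volumes, cause labels, and the 3.5 threshold are all hypothetical and chosen only for illustration; a real deployment would use richer features and a more capable model.

```python
from statistics import median

# Hypothetical outbound traffic per host for one day, in megabytes.
# Every host name, value, and cause label is invented for illustration.
events = [
    ("ws-01", 120, "routine"),
    ("ws-02", 135, "routine"),
    ("ws-03", 110, "routine"),
    ("ws-04", 128, "routine"),
    ("ws-05", 142, "routine"),
    ("ws-06", 117, "routine"),
    ("ws-07", 131, "routine"),
    ("ws-08", 125, "routine"),
    ("db-01", 900, "misconfigured backup job"),
    ("ws-09", 640, "new employee syncing files"),
    ("ws-10", 1500, "data exfiltration"),
]

THRESHOLD = 3.5  # assumed cutoff; a common rule of thumb for modified z-scores


def robust_scores(values):
    """Return modified z-scores using the median and the MAD.

    Assumes the MAD is nonzero, which holds for this sample data.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [0.6745 * (v - med) / mad for v in values]


scores = robust_scores([mb for _, mb, _ in events])

# The detector only sees traffic volume; the cause is unknown to it.
flagged = [
    (host, cause)
    for (host, _, cause), score in zip(events, scores)
    if abs(score) > THRESHOLD
]

for host, cause in flagged:
    print(f"{host}: flagged (actual cause: {cause})")
```

On this sample data the detector flags three hosts, and only one of them corresponds to an attack. The other two are an operational error and a new user, which are exactly the kinds of everyday change that make anomalies normal.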

This applies to intrusion detection as well as to many other areas of unsupervised learning: just because a piece of data is different does not mean it is bad. In fraud detection, you might look at the buying patterns of customers and see a purchase at a store where the customer has never shopped before. It may be fraudulent activity, or it may just be that the customer decided to do something random or out of the ordinary. This is one of the reasons why anomaly detectors have such high false positive rates.
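A quick back-of-the-envelope calculation shows how false alarms can swamp real detections when attacks are rare. The sketch below uses made-up numbers (one million events, 100 attacks, a detector that catches 90% of attacks and misfires on 1% of benign events); none of these figures come from a real system, and the function name `alert_breakdown` is hypothetical.

```python
def alert_breakdown(total_events, attacks, recall, false_positive_rate):
    """Estimate true alerts, false alerts, and precision for a detector.

    recall: fraction of attacks that raise an alert.
    false_positive_rate: fraction of benign events that raise an alert.
    """
    benign = total_events - attacks
    true_alerts = attacks * recall
    false_alerts = benign * false_positive_rate
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision


# All inputs are assumed values for illustration only.
true_alerts, false_alerts, precision = alert_breakdown(
    total_events=1_000_000,
    attacks=100,
    recall=0.9,
    false_positive_rate=0.01,
)

print(f"true alerts:  {true_alerts:.0f}")
print(f"false alerts: {false_alerts:.0f}")
print(f"precision:    {precision:.2%}")
```

Under these assumptions, roughly 90 alerts are real and about 9,999 are false, so fewer than 1 in 100 alerts points at an actual attack. Even a detector that rarely misfires on any single benign event can bury analysts in alerts that turn out to be harmless.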


Anomaly Detection is in the Eye of the Beholder