Anomaly detection is the identification of rare items, events, or observations that raise suspicion by differing significantly from the majority of the data. It is an important problem in domains such as bank fraud and cellular networks. Anomalies are also referred to as outliers, noise, and exceptions. This blog reflects on what I learned in Week 2 of Intel’s Anomaly Detection course, which I have been taking lately. Feel free to take it yourself as well; it’s free and self-paced. Below are the two techniques that we will focus on:
This angle-based technique decides whether a point is an anomaly by measuring the angle formed at that point by pairs of other points in the data space. The variance of this enclosed angle turns out to be different for outliers than for normal points, and that variance becomes the metric we use to separate normal points from outliers. See the figure below for a visual explanation:
As you can see in the left figure, we essentially have two groups: normal points and an outlier _(single blue point)_. Suppose we choose an orange point as the point of interest and want to know whether it is an outlier. Following this idea, we calculate the angle enclosed at this point by any two other points in the space, and repeat that for many different pairs. If we do so, we observe a lot of variance in the enclosed angle across the different pairs that form a triangle with the orange point. Such a pattern shows that the point is surrounded by other points in its close vicinity, so the enclosed angle varies a lot. Now look at the right-hand figure and focus on the green point. If we repeat the process of choosing any two other points and measuring the angle they form with the green point at the vertex, we see very little angular variance. Such a pattern shows that the point of interest is far away from the majority cluster and could possibly be an outlier.
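
To make the idea concrete, here is a minimal sketch in plain Python. It assumes 2-D points and uses the simple, unweighted variance of the enclosed angles as the score; the helper names `angle_at` and `angle_variance` and the synthetic data are my own illustration, not something taken from the course, and practical implementations of angle-based outlier detection may weight the angles by distance.

```python
import math
import random
from itertools import combinations
from statistics import pvariance


def angle_at(p, a, b):
    """Angle in radians at vertex p between the rays p->a and p->b."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cos = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, cos)))


def angle_variance(p, others):
    """Variance of the angle at p over every pair of the other points."""
    angles = [angle_at(p, a, b) for a, b in combinations(others, 2)]
    return pvariance(angles)


# Synthetic data (an assumption for illustration): one tight cluster
# around the origin plus a single far-away point.
random.seed(0)
cluster = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(40)]
outlier = (10.0, 10.0)
points = cluster + [outlier]

# Score every point against all the remaining points.
scores = [
    angle_variance(p, points[:k] + points[k + 1:])
    for k, p in enumerate(points)
]

# Low angular variance suggests a possible outlier.
most_suspicious = min(range(len(points)), key=scores.__getitem__)
print(points[most_suspicious], scores[most_suspicious])
```

Points inside the cluster see their neighbours in every direction, so their angles spread widely, while the far-away point sees the whole cluster within a narrow cone and gets a low score. Note that this brute-force version looks at every pair for every point, so it is only practical for small datasets.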