Across many business use cases that generate data, it is often desirable to automatically identify data samples that deviate from “normal”. In many cases, these deviations indicate issues that need to be addressed. For example, an abnormally high cash withdrawal from a previously unseen location may indicate fraud, and an abnormally high CPU temperature may indicate impending hardware failure. The task of finding such anomalies is broadly referred to as anomaly detection, and many excellent approaches have been proposed (clustering-based approaches, nearest neighbors, density estimation, etc.). However, as data become high-dimensional and exhibit complex patterns, many existing approaches (for example, linear models that mostly focus on univariate data) can be unwieldy to apply. For such problems, deep learning can help.
This post discusses Anomagram, an interactive visualization of how autoencoders can be applied to the task of anomaly detection. I created it as both a learning tool and a prototype of what an ML product interface could look like (of course, what I cover is only a small slice of the entire process).
Full Demo here: https://victordibia.github.io/anomagram/
Project source code: https://github.com/victordibia/anomagram
Post on how Anomagram was designed: https://medium.com/@victor.dibia/anomagramdesign-c9391d29b58b
The remainder of the post covers the following topics (skip ahead as needed):
Why Anomagram?
Background and Dataset
Interface Affordances and insights
Conclusions
#machine-learning #visualization #anomaly-detection #deep-learning #autoencoder