What is DBSCAN

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a commonly used unsupervised clustering algorithm proposed in 1996. Unlike the better-known k-means, DBSCAN does not require you to specify the number of clusters in advance. It determines the number of clusters from the input data and its parameters. More importantly, DBSCAN can find arbitrarily shaped clusters that k-means cannot, for example a cluster completely surrounded by a different cluster.

DBSCAN can also handle noise and outliers. Outliers are identified and marked as noise without being assigned to any cluster. Therefore, DBSCAN can also be used for anomaly detection (outlier detection).
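To make these two properties concrete, here is a minimal pure-Python sketch of the usual DBSCAN procedure, not a tuned or authoritative implementation. The function names (`region_query`, `dbscan`, `ring`), the toy data (a small ring inside a larger ring, plus one far-away point), and the values `eps=0.9` and `min_pts=3` are all assumptions chosen for illustration. The label `-1` for noise is also just a convention picked for this sketch.

```python
import math

NOISE = -1        # label for points that end up in no cluster (assumed convention)
UNVISITED = None  # label for points that have not been processed yet


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Not dense enough to start a cluster; may still become a border point.
            labels[i] = NOISE
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # previously noise, now a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is dense too, so keep expanding
    return labels


def ring(radius, n):
    """n evenly spaced points on a circle of the given radius around the origin."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


# Toy data: an inner ring surrounded by an outer ring, plus one distant outlier.
points = ring(1.0, 12) + ring(5.0, 40) + [(10.0, 10.0)]
labels = dbscan(points, eps=0.9, min_pts=3)

n_clusters = len({label for label in labels if label != NOISE})
n_noise = labels.count(NOISE)
print("clusters found:", n_clusters)
print("noise points:", n_noise)
```

With this toy data the sketch separates the inner ring from the ring that surrounds it and leaves the distant point labeled as noise, without ever being told how many clusters to look for.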

Before we take a look at the pseudocode, we first need to understand some basic concepts and terms: Eps, MinPts, directly density-reachable, density-reachable, density-connected, core point, and border point.

First of all, there are two parameters we need to set for DBSCAN: Eps and MinPts.
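Eps is commonly described as the radius of the neighborhood around a point, and MinPts as the minimum number of points that neighborhood must contain for the point to count as a core point. The short sketch below illustrates that reading. The helper names `eps_neighborhood` and `is_core_point` and the sample coordinates are hypothetical, and it assumes the neighborhood count includes the point itself.

```python
import math


def eps_neighborhood(points, center, eps):
    """All points whose distance to center is at most eps (center included)."""
    return [p for p in points if math.dist(p, center) <= eps]


def is_core_point(points, center, eps, min_pts):
    """A point is treated as a core point if its Eps-neighborhood holds at least MinPts points."""
    return len(eps_neighborhood(points, center, eps)) >= min_pts


# Four points packed together and one far away (illustrative values only).
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (8, 8)]

dense = is_core_point(sample, (0, 0), eps=1.5, min_pts=4)
isolated = is_core_point(sample, (8, 8), eps=1.5, min_pts=4)
print(dense, isolated)
```

Here the point at `(0, 0)` has four points within the radius, so it meets the threshold, while `(8, 8)` has only itself. A larger Eps or a smaller MinPts makes it easier for a point to qualify, which is why the two parameters are usually tuned together.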

