Cluster Analysis with DBSCAN: Density-Based Spatial Clustering of Applications with Noise

Cluster analysis is an unsupervised machine learning method that divides data points into clusters (groups) so that the points in one cluster have similar attributes or characteristics. There are four major categories of cluster analysis: partitioning methods (e.g. K-means), hierarchical methods (e.g. BIRCH), density-based methods (e.g. DBSCAN) and grid-based methods.

Most clustering algorithms share the same basic approach: find similarities between data points and group similar points together. Here we will focus on the density-based clustering method DBSCAN (Density-Based Spatial Clustering of Applications with Noise).

*What is Density-Based Clustering?*

It is a method that identifies distinct clusters in the data, based on the key idea that a cluster is a region of high data point density, separated from other such clusters by regions of low data point density. The main idea is to find highly dense regions and treat each of them as one cluster. It can discover clusters of different shapes and sizes in large datasets that contain noise and outliers.

The DBSCAN algorithm uses two major parameters:

- **minPts:** The minimum number of points (a threshold) that must lie close together for a region to be considered dense, i.e. the minimum number of data points that can form a cluster.
- **eps (ε):** A distance threshold (radius) used to locate the points in the neighborhood of any point.
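To make the two parameters concrete, here is a minimal sketch in plain Python. The function names `eps_neighborhood` and `is_dense`, the toy points, and the choice of Euclidean distance are illustrative assumptions, not part of any library. The sketch also assumes that a point counts as a member of its own neighborhood, which is one common convention.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def eps_neighborhood(points, index, eps):
    """Return the indices of all points within eps of points[index].

    Assumption: the point itself is included in its own neighborhood.
    """
    center = points[index]
    return [j for j, p in enumerate(points) if dist(center, p) <= eps]


def is_dense(points, index, eps, min_pts):
    """True if the eps-neighborhood of points[index] holds at least min_pts points."""
    return len(eps_neighborhood(points, index, eps)) >= min_pts


# Hypothetical toy data: three points close together and one far away.
points = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0)]

print(is_dense(points, 0, eps=1.0, min_pts=3))  # True: neighborhood is points 0, 1, 2
print(is_dense(points, 3, eps=1.0, min_pts=3))  # False: neighborhood is only point 3
```

A larger eps or a smaller minPts makes more regions qualify as dense, so the two values are usually tuned together.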

The algorithm relies on two concepts called Density Reachability and Density Connectivity.

- **Density Reachability:** A point is reachable from another point if it lies within a particular distance (eps) of it. In the full DBSCAN definition, the point it is reached from must also be a core point, i.e. have at least minPts points in its eps-neighborhood.
- **Density Connectivity:** DBSCAN uses a transitivity-based chaining approach to determine whether points belong to the same cluster. For example, points a and d are connected if a->b->c->d, where p->q means q is in the neighborhood of p.
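The chaining idea can be sketched as a small, unoptimized DBSCAN in plain Python. This is an illustration under stated assumptions rather than a reference implementation: the function name `dbscan`, the `NOISE` label, the toy points, and the brute-force neighbor search with Euclidean distance are all choices made for this example.

```python
from collections import deque
from math import dist  # Euclidean distance, available in Python 3.8+

NOISE = -1  # label for points that belong to no cluster (assumed convention)


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""

    def neighbors(i):
        # Brute-force eps-neighborhood; the point itself is included.
        return [j for j, p in enumerate(points) if dist(points[i], p) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        labels[i] = cluster_id
        queue = deque(seeds)
        while queue:  # follow the chain p -> q -> ... from the core point
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # only dense points extend the chain
                queue.extend(j_neighbors)
        cluster_id += 1
    return labels


# Hypothetical points a, b, c, d on a line, plus one far-away outlier.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10)]
print(dbscan(points, eps=1.1, min_pts=2))  # [0, 0, 0, 0, -1]
```

Here a and d are more than eps apart, yet they end up in the same cluster because each step of the chain a->b->c->d stays within eps. The outlier has no neighbors besides itself and is labeled as noise.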

