This video demonstrates how to implement and use DBSCAN clustering in practice with Python on real data. It is one way to clean your data by removing noise or spatial outliers.

DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise. This unsupervised learning algorithm is well suited to detecting outliers when most of your data points are densely grouped and you need to separate out the noise (outliers).

DBSCAN groups points that are closely packed (points with many nearby neighbors) and marks as outliers the points that lie alone in low-density regions (whose nearest neighbors are too far away).
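To make the idea concrete, here is a minimal sketch using scikit-learn's `DBSCAN`. The data is synthetic (two dense blobs plus three isolated points), and the `eps` and `min_samples` values are assumptions chosen for this toy example, not the settings used in the video.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic data (an assumption for illustration): two dense groups
# and three isolated points that should be flagged as noise.
rng = np.random.default_rng(42)
cluster_a = rng.normal(loc=(0.0, 0.0), scale=0.1, size=(50, 2))
cluster_b = rng.normal(loc=(5.0, 5.0), scale=0.1, size=(50, 2))
isolated = np.array([[2.5, 2.5], [10.0, -3.0], [-6.0, 8.0]])
points = np.vstack([cluster_a, cluster_b, isolated])

# eps: neighborhood radius; min_samples: neighbors (including the point
# itself) required for a point to count as a core point.
model = DBSCAN(eps=0.5, min_samples=5)
labels = model.fit_predict(points)

# scikit-learn labels noise points with -1; clusters are numbered from 0.
outlier_mask = labels == -1
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)

print("clusters found:", n_clusters)
print("outliers found:", int(outlier_mask.sum()))
```

Points labeled `-1` are the ones DBSCAN treats as noise. On your own data, `eps` and `min_samples` usually need tuning, because they define what counts as "dense".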

For this lesson I used scikit-learn, one of the most popular machine learning packages, along with NumPy (for numerical transformations), pandas (for data manipulation), and Matplotlib (for plotting and visualizing clusters and outliers).

Content of the demonstration:
0:03 : Advantages of DBSCAN Clustering.
0:07 : Disadvantages of DBSCAN.
0:11 : STEP 01. Import modules, packages and dependencies.
0:54 : STEP 02. Load data (plot the geographical points - longitudes and latitudes).
2:17 : STEP 03. Prepare DBSCAN model (train the model and detect outliers).
4:22 : STEP 04. Visualize Clusters and Outliers (data noise).
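
The four steps above can be sketched roughly as follows. This is not the code from the video: the coordinates are made-up stand-in data, the column names `latitude` and `longitude` are hypothetical, and the haversine metric with a 5 km radius is one common choice for geographic points that I am assuming here.

```python
# STEP 01. Import modules (matplotlib is imported inside the plot helper).
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0088

# STEP 02. Load data. Stand-in points (hypothetical): two dense groups
# of locations plus two stray points far away from both.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=(54.687, 25.280), scale=0.01, size=(40, 2))
group_b = rng.normal(loc=(54.898, 23.904), scale=0.01, size=(40, 2))
stray = np.array([[55.70, 21.14], [56.00, 24.50]])
df = pd.DataFrame(
    np.vstack([group_a, group_b, stray]),
    columns=["latitude", "longitude"],
)

# STEP 03. Prepare the DBSCAN model and detect outliers.
# The haversine metric expects [lat, lon] in radians, so eps is an angle:
# the search radius in km divided by the Earth's radius.
eps_km = 5.0  # assumed radius for this toy data
coords_rad = np.radians(df[["latitude", "longitude"]].to_numpy())
model = DBSCAN(
    eps=eps_km / EARTH_RADIUS_KM,
    min_samples=5,
    metric="haversine",
    algorithm="ball_tree",
)
df["cluster"] = model.fit_predict(coords_rad)
df["is_outlier"] = df["cluster"] == -1  # -1 marks noise


# STEP 04. Visualize clusters and outliers.
def plot_clusters(frame):
    """Scatter-plot clustered points by label and mark outliers in red."""
    import matplotlib.pyplot as plt

    inliers = frame[~frame["is_outlier"]]
    outliers = frame[frame["is_outlier"]]
    plt.scatter(inliers["longitude"], inliers["latitude"],
                c=inliers["cluster"], cmap="viridis", s=15, label="clustered")
    plt.scatter(outliers["longitude"], outliers["latitude"],
                c="red", marker="x", s=60, label="outliers")
    plt.xlabel("longitude")
    plt.ylabel("latitude")
    plt.legend()
    plt.show()
```

Calling `plot_clusters(df)` draws the result, and `df[~df["is_outlier"]]` gives the cleaned data with the noise removed.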

Subscribe: https://www.youtube.com/c/VytautasBielinskas/featured

#python

Implement DBSCAN Clustering and detecting OUTLIERS with Python