Density-based spatial clustering of applications with noise, or DBSCAN, is one mouthful of a clustering algorithm. Created in 1996, it has withstood the test of time and is still one of the most useful approaches to clustering data points today. For fun, and to broaden my horizons, I took a stab at brewing up my own DBSCAN class in Python.

How does it work?

DBSCAN clusters data by sorting points into three types based on each point's relationship to its neighboring points: core points, border points, and noise points. A core point has at least a minimum number of points within a fixed radius of itself, a border point falls inside a core point's neighborhood without being a core point, and a noise point is neither. A cluster starts from a core point, and each of its neighboring points is then checked to see whether it is a core point or a border point. The cluster is only expanded beyond a single point's neighborhood when the neighboring points are themselves core points.
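To make the three point types concrete, here is a minimal pure-Python sketch. It is not the class from this post: the function names (`region_query`, `classify_points`, `dbscan`) and the parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a core point) are my own assumptions for illustration. It also assumes a point counts as part of its own neighborhood, and it uses a brute-force distance scan, so it is only suitable for small inputs.

```python
from math import dist  # Euclidean distance, Python 3.8+


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border', or 'noise'."""
    neighborhoods = [region_query(points, i, eps) for i in range(len(points))]
    core = {i for i, nbrs in enumerate(neighborhoods) if len(nbrs) >= min_pts}
    kinds = []
    for i, nbrs in enumerate(neighborhoods):
        if i in core:
            kinds.append("core")
        elif any(j in core for j in nbrs):
            kinds.append("border")  # not core, but inside a core point's neighborhood
        else:
            kinds.append("noise")
    return kinds


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    neighborhoods = [region_query(points, i, eps) for i in range(len(points))]
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        if len(neighborhoods[i]) < min_pts:
            labels[i] = -1  # provisional noise; may be claimed later as a border point
            continue
        cluster += 1  # i is an unvisited core point, so it seeds a new cluster
        labels[i] = cluster
        frontier = list(neighborhoods[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            # Only core points push the cluster past their own neighborhood.
            if len(neighborhoods[j]) >= min_pts:
                frontier.extend(neighborhoods[j])
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1), (2.4, 1), (10, 10)]
print(classify_points(points, eps=1.5, min_pts=4))
# ['core', 'core', 'core', 'core', 'border', 'noise']
print(dbscan(points, eps=1.5, min_pts=4))
# [0, 0, 0, 0, 0, -1]
```

In the sample data, the four corners of the unit square are each within 1.5 of the other three, so all four are core points. The point at (2.4, 1) only reaches (1, 1), which makes it a border point that joins the cluster without extending it, and (10, 10) has no neighbors and is left as noise.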

#python #data-science #algorithms

Homebrewing DBSCAN in Python