Clustering is the general study of grouping similar objects together. This definition is purposely very abstract, because what the objects are and how you define the “distance” between two objects both make a huge difference in the output, but the general algorithms remain similar in all cases. There are numerous clustering algorithms with different properties and different performance characteristics, but in this article I’ll talk about single link clustering, or SLINK for short. Single link is one of the simplest clustering algorithms: it starts out with N clusters consisting of one object each, and at each step merges the two “closest” clusters. Single link clustering is easy to implement and I often use it to gain insight into hard-to-grasp data sets.
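To make the merge loop concrete, here is a minimal sketch of single link clustering in plain JavaScript. It assumes the usual single link rule, where the distance between two clusters is the smallest distance between any pair of their members. The names `singleLinkCluster`, `items`, `distance`, and `k` are illustrative, and this naive version favors readability over speed.

```javascript
// Naive single link clustering sketch. Roughly O(N^3), so only
// suitable for small inputs. All names here are illustrative.
function singleLinkCluster(items, distance, k) {
  // Never try to merge below one cluster.
  const target = Math.max(1, k);

  // Start with N clusters consisting of one item each.
  const clusters = items.map((item) => [item]);

  // Single link distance between two clusters: the smallest
  // pairwise distance between their members.
  const clusterDistance = (a, b) => {
    let min = Infinity;
    for (const x of a) {
      for (const y of b) {
        min = Math.min(min, distance(x, y));
      }
    }
    return min;
  };

  // At each step, find the two closest clusters and merge them.
  while (clusters.length > target) {
    let best = { i: -1, j: -1, d: Infinity };
    for (let i = 0; i < clusters.length; i++) {
      for (let j = i + 1; j < clusters.length; j++) {
        const d = clusterDistance(clusters[i], clusters[j]);
        if (d < best.d) best = { i, j, d };
      }
    }
    // Merge cluster j into cluster i, then drop cluster j.
    clusters[best.i] = clusters[best.i].concat(clusters[best.j]);
    clusters.splice(best.j, 1);
  }

  return clusters;
}

// Usage: cluster numbers on a line, using absolute difference as the distance.
const numbers = [1, 2, 10, 11, 50];
console.log(singleLinkCluster(numbers, (a, b) => Math.abs(a - b), 3));
// → [ [ 1, 2 ], [ 10, 11 ], [ 50 ] ]
```

Passing the distance function as a parameter keeps the merge loop independent of what the objects are, which matches the point above that the algorithm stays the same while the objects and the metric change.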

### Clustering Geospatial Coordinates

An easy way to visualize how clustering works is by clustering points on a map. The distance metric is pretty obvious and the correct answer is easy to see at a glance. For example, below is a map of the approximate coordinates of 8 MongoDB offices: 4 in North America, 2 in Europe, 1 in the Middle East, and 1 in Australia. Intuitively, if you were to break these locations into 4 clusters based on distance, you’d cluster them into North America, Europe, Middle East, and Australia. Click the map below for an interactive view.
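One plausible choice for the "obvious" distance metric between two map points is the great-circle distance. The sketch below uses the haversine formula; whether the article uses this exact formula is an assumption, and the `haversineKm` name and the `{ lat, lng }` point shape are illustrative.

```javascript
// Mean Earth radius in kilometers (a common approximation).
const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two { lat, lng } points given in degrees,
// computed with the haversine formula. Names are illustrative.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  // Clamp to 1 to guard against floating point error before asin.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1, Math.sqrt(h)));
}

// Usage: one degree of longitude along the equator is roughly 111 km.
console.log(haversineKm({ lat: 0, lng: 0 }, { lat: 0, lng: 1 }));
```

A function like this can serve as the distance between any two locations when grouping points on a map.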
