An overview of agglomerative hierarchical clustering, dendrograms, and their implementation in Python

Agglomerative clustering is a type of hierarchical clustering algorithm. It is an unsupervised machine learning technique that partitions the data into clusters such that points in the same cluster are similar to one another and points in different clusters are dissimilar. When similarity is measured by distance, this means:

  • Points in the same cluster are close to each other.
  • Points in different clusters are far apart.

(Image by Author), Sample 2-dimension Dataset

In the sample two-dimensional dataset above, the points visibly form 3 clusters that are far apart from one another, while points within the same cluster are close together.
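
To make "close together" and "far apart" concrete, here is a small sketch that compares the average distance between points in the same group with the average distance between points in different groups. The data is a hand-picked toy set, not the dataset shown in the figure, and the names `groups`, `mean_within`, and `mean_between` are made up for this illustration. Euclidean distance is assumed.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

# Hypothetical toy data: three hand-picked groups of 2-D points
# (NOT the dataset from the figure).
groups = {
    "a": [(1.0, 1.0), (1.5, 1.2), (0.8, 1.4)],
    "b": [(8.0, 8.0), (8.4, 7.6), (7.7, 8.3)],
    "c": [(1.0, 8.0), (1.3, 8.5), (0.7, 7.8)],
}


def mean_within(groups):
    """Average distance between pairs of points that share a group."""
    d = [dist(p, q) for pts in groups.values() for p, q in combinations(pts, 2)]
    return sum(d) / len(d)


def mean_between(groups):
    """Average distance between pairs of points from different groups."""
    d = [
        dist(p, q)
        for (_, g1), (_, g2) in combinations(groups.items(), 2)
        for p in g1
        for q in g2
    ]
    return sum(d) / len(d)


within = mean_within(groups)
between = mean_between(groups)
print(f"mean within-cluster distance:  {within:.2f}")
print(f"mean between-cluster distance: {between:.2f}")
```

For well-separated groups like these, the within-cluster average comes out much smaller than the between-cluster average, which is the property a clustering algorithm tries to recover without being told the groups in advance.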

The intuition behind Agglomerative Clustering:

Agglomerative clustering is a bottom-up approach: initially, each data point is a cluster of its own, and pairs of clusters are then merged as one moves up the hierarchy.

Steps of Agglomerative Clustering:

  1. Initially, each data point is a cluster of its own.
  2. Take the two nearest clusters and merge them into a single cluster.
  3. Repeat step 2 until you obtain the desired number of clusters.
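
The steps above can be sketched in plain Python. This is a minimal illustration rather than an optimized implementation. It assumes Euclidean distance and single linkage (the distance between two clusters is the distance between their closest pair of points), neither of which the steps specify. The function names `single_linkage` and `agglomerative` and the toy points are made up for this example.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(c1, c2):
    """Cluster distance under the single-linkage assumption:
    the distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p in c1 for q in c2)


def agglomerative(points, n_clusters):
    """Merge the two nearest clusters until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")

    # Step 1: every data point starts as its own cluster.
    clusters = [[p] for p in points]

    # Step 3: repeat step 2 until the desired number of clusters remains.
    while len(clusters) > n_clusters:
        # Step 2: find the indices (i < j) of the two nearest clusters...
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # ...and merge cluster j into cluster i. Because j > i,
        # removing cluster j does not shift the position of cluster i.
        merged = clusters.pop(j)
        clusters[i].extend(merged)

    return clusters


# Hypothetical toy data: six 2-D points forming three obvious pairs.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 1.2), (1.0, 8.0), (8.4, 7.6), (1.3, 8.5)]
result = agglomerative(points, n_clusters=3)
for cluster in result:
    print(sorted(cluster))
```

This version recomputes every pairwise cluster distance on each pass, which is fine for a handful of points but slow for real datasets. Library implementations such as `scipy.cluster.hierarchy.linkage` or scikit-learn's `AgglomerativeClustering` are the practical choice there, and they also let you pick other linkage criteria.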

Agglomerative Clustering and Dendrograms — Explained