Overview of Clustering

Many things around us can be sorted into groups. Some groupings are binary and others have more than two categories, like the type of pizza base or the type of car you might want to purchase. In these cases the choices are known in advance. In technical terms, the groups are predefined, and predicting which group an item belongs to is an important task in the data science stack called classification.

But what if we don't have predefined choices to begin with and have to derive them instead? Such choices are based on hidden patterns, underlying similarities between the constituent variables, salient features of the data, and so on. In machine learning this process is known as clustering, or cluster analysis: we group the data into a number of groups that is not known in advance and later use that information in further business processes.

Put simply, clustering in machine learning is the process of creating groups in a dataset (of customers, products, employees, or text documents, for example) such that objects in the same group share many properties with each other and differ from objects in the other groups created during the process.
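To make "similar within a group, different across groups" concrete, here is a minimal sketch. It assumes Euclidean distance as the measure of dissimilarity, and the two groups of customers (age, monthly spend in hundreds) are made-up illustrative data, not results from any real dataset.

```python
from itertools import combinations, product
from math import dist
from statistics import mean

# Hypothetical, hand-labelled groups of customers: (age, monthly spend in hundreds).
group_a = [(22, 1.0), (25, 1.2), (24, 0.9)]
group_b = [(48, 4.1), (52, 3.8), (50, 4.4)]

# Average distance between pairs of objects inside the same group.
within_a = mean(dist(p, q) for p, q in combinations(group_a, 2))
within_b = mean(dist(p, q) for p, q in combinations(group_b, 2))

# Average distance between objects that belong to different groups.
between = mean(dist(p, q) for p, q in product(group_a, group_b))

print(f"within A: {within_a:.2f}, within B: {within_b:.2f}, between: {between:.2f}")
```

For a sensible grouping, the two "within" averages should come out clearly smaller than the "between" average. Note that age dominates the distance here because the features are on different scales; in practice features are usually rescaled first.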

Clustering algorithms take the data and, using some kind of similarity metric, form these groups. The groups can later be used in applications such as information retrieval, pattern recognition, image processing, data compression, and bioinformatics. As noted above, clustering rests on similarity between objects, and in many algorithms a distance-based similarity metric plays a pivotal role in deciding which objects end up in the same cluster.
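As an illustration of how a distance-based metric can drive the grouping, the sketch below implements a toy version of k-means, one common clustering algorithm that the text above does not name specifically. The function names, the sample data, and the choice of Euclidean distance are all assumptions made for this example.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iterations=20, seed=0):
    """Toy k-means sketch: returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return centroids, labels


# Hypothetical customers: (monthly visits, average basket size).
customers = [(1, 2), (2, 1), (1.5, 1.8), (8, 9), (9, 8), (8.5, 9.2)]
centroids, labels = kmeans(customers, k=2)
print(labels)
```

The first three customers should share one label and the last three the other. Which group is called 0 and which 1 depends on the random starting centroids, so the label numbers themselves carry no meaning. Unlike classification, the groups are not given up front here; only their number, `k`, is chosen by the user.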

In this article, we shall look at the various types of clustering and the clustering methods used in machine learning, and eventually see how they are key to solving various business problems.

Different Types of Clustering Methods and Applications