K-means Clustering Algorithm. Getting under the hood of k-means, so that a 7th grader can understand. The algorithm I choose to implement for this project was K-means clustering. The data generated attempted to model a water distribution scenario based on the distance each point is from a proposed water-well.
The algorithm I choose to implement for this project was K-means clustering. The data generated attempted to model a water distribution scenario based on the distance each point is from a proposed water-well. Essentially, the algorithm is ideal for a customer segmentation scenario that clusters customers around a particular water-well based on their location and _k _numbers of wells.
In this imaginary scenario, several local municipalities have struggled for years with its aging water infrastructure. Supply, storage, movement of water from one point to another are a challenge. A data scientist was called in to build some models about how customers will be affected and to shed some light on improving water delivery supply.
The goal was to understand the algorithm; however, when variables have real meaning it helps us to get under the hood to see how it works and the math behind it.
Example Problem: How will customers cluster with respect to their location? Create a visual model for the placement of 3 and 7 water-wells, each well serving as the centroid.
The location of customers were fixed equidistance points, generated using the built-in range function. There were a total of 1392 points.
Image by Author
After plotting the fixed location of each customer, I found what I considered to be the geographical center of the dataset. This I assumed would be the ideal place for the first water-well.
Master Applied Data Science with Python and get noticed by the top Hiring Companies with IgmGuru's Data Science with Python Certification Program. Enroll Now
Data Science and Analytics market evolves to adapt to the constantly changing economic and business environments. Our latest survey report suggests that as the overall Data Science and Analytics market evolves to adapt to the constantly changing economic and business environments, data scientists and AI practitioners should be aware of the skills and tools that the broader community is working on. A good grip in these skills will further help data science enthusiasts to get the best jobs that various industries in their data science functions are offering.
Kmeans is a widely used clustering tool for analyzing and classifying data. Often times, however, I suspect, it is not fully understood what is happening under the hood.
In the programming world, Data types play an important role. Each Variable is stored in different data types and responsible for various functions. Python had two different objects, and They are mutable and immutable objects.
This Edureka video on "Developing A K-Means Clustering Model Using Python" will help you understand the various aspects of clustering using K Means in Python.