About 5 months ago, I stumbled upon this article on TheScore. The summary: the traditional 5 positions are no longer enough to describe NBA players. The game has changed, after all. The authors come up with a way to classify players into 9 classes, based on the way they play the game.
In this article, I will take another shot at classifying players into clusters, depending on what they do on the court. However, I will do it using data science and, more precisely, K-Means clustering.
I will also take a deeper look at what makes a winning team, i.e. what types of players should be put together for a team to be successful.
Let’s get to it!
I began by scraping data directly from NBA.com. In total, I collected 28 stats for all 529 players who played in the league in 2019–2020.
Along with traditional stats (points per game, assists, rebounds, etc.), I also collected stats describing shot location, type of offensive play (drive, iso, etc.), defensive efficiency and usage rate.
Then, I decided to get rid of players who played 12 minutes per game or less, as I felt that classifying players based on how they play when they barely play was not going to provide accurate results.
# Keep only players averaging more than 12 minutes per game
df = df[df.MINUTES > 12]
df = df.reset_index(drop=True)
That leaves us with a total of 412 players.
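To make the filter concrete, here is a minimal sketch on a made-up four-player table. The player names and minute values are invented for illustration, and the only thing taken from the real data is the MINUTES column name and the 12-minute threshold. It shows that the comparison is strict, so a player averaging exactly 12.0 minutes is dropped, and that reset_index(drop=True) renumbers the remaining rows from 0.

```python
import pandas as pd

# Hypothetical toy data, NOT the scraped NBA.com table
toy = pd.DataFrame({
    "PLAYER": ["A", "B", "C", "D"],
    "MINUTES": [34.1, 12.0, 8.5, 12.1],
})

MIN_MINUTES = 12  # threshold used in the article

# Strict ">" as in the snippet above: exactly 12.0 minutes is excluded
filtered = toy[toy["MINUTES"] > MIN_MINUTES].reset_index(drop=True)

print(f"{len(toy)} players before, {len(filtered)} after")
print(filtered)
```

If players at exactly 12 minutes should be kept, swapping > for >= would do it.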
#nba #scikit-learn #python #clustering #k-means