Into A New Way to Classify NBA Players Using Analytics

About 5 months ago, I stumbled upon this article on TheScore. The summary: the traditional 5 positions are no longer enough to describe NBA players, because the game has changed. The authors came up with a way to classify players into 9 classes, based on how they play the game.

In this article, I will take another shot at grouping players into clusters, depending on what they do on the court. However, I will do it using data science, and more precisely K-Means clustering.

I will also take a deeper look at what makes a winning team, i.e. which types of players should be put together for a team to be successful.

Let’s get to it!

Preparing the data

I began by scraping data directly from NBA.com. In total, I collected 28 stats for all 529 players who played in the league in 2019–2020.

Along with traditional stats (points per game, assists, rebounds, etc.), I also collected stats describing shot location, type of offensive play (drive, iso, etc.), defensive efficiency and usage rate.
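Stats of different kinds have to end up in a single table with one row per player before any clustering can happen. Here is a minimal sketch of that step, assuming each stat family arrives as its own table and that the tables share a player identifier. The PLAYER_ID key, the column names and all the values below are hypothetical, not the actual NBA.com schema or the data used in this article.

```python
from functools import reduce

import pandas as pd

# Hypothetical tables, one per stat family; every value is invented.
traditional = pd.DataFrame({"PLAYER_ID": [1, 2, 3], "PTS": [25.3, 8.1, 14.7]})
shot_location = pd.DataFrame(
    {"PLAYER_ID": [1, 2, 3], "RESTRICTED_AREA_FGA": [6.2, 1.0, 3.4]}
)
play_type = pd.DataFrame({"PLAYER_ID": [1, 3], "DRIVES": [14.5, 7.9]})

tables = [traditional, shot_location, play_type]

# Left-join every table onto the first one so that no player is dropped.
# A player missing from a table gets NaN in that table's columns.
df = reduce(
    lambda left, right: left.merge(right, on="PLAYER_ID", how="left"),
    tables,
)

print(df)
```

A left join keeps every player from the first table, so gaps show up as NaN and can be handled explicitly instead of silently shrinking the player list.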

Then, I decided to drop players who averaged 12 minutes per game or fewer, as I felt that classifying players based on how they play when they barely play would not give accurate results.

# Keep only players averaging more than 12 minutes per game
df = df[df.MINUTES > 12]
# Renumber the rows 0..n-1 and discard the old index
df = df.reset_index(drop=True)

That leaves us with a total of 412 players.
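To see what the filter and the index reset actually do, here is a minimal sketch on a made-up table. The player names and minutes are invented for illustration; only the MINUTES column name comes from the snippet above.

```python
import pandas as pd

# Hypothetical toy data: names and minutes are invented.
toy = pd.DataFrame({
    "PLAYER": ["A", "B", "C", "D"],
    "MINUTES": [34.1, 8.5, 12.0, 19.7],
})

# Same filter as above: keep rows with strictly more than 12 minutes.
filtered = toy[toy.MINUTES > 12]

# Before the reset, the surviving rows keep their original labels.
old_labels = list(filtered.index)

# drop=True discards the old labels instead of storing them as a column.
filtered = filtered.reset_index(drop=True)
new_labels = list(filtered.index)

print(old_labels, new_labels)
```

Note that the comparison is strict, so a player averaging exactly 12.0 minutes (player C here) is removed along with the low-minute players.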

towardsdatascience.com