Introduction

“Music is the pleasure the human mind experiences from counting without being aware that it is counting.” - Leibniz

Music helps people connect with whatever they are doing. It lifts the mood and refreshes the mind. People listen to music all the time, whether they are commuting, working or trying to focus, and different people have different tastes. Over the years music has reached its listeners through many platforms: the Victrola, cassettes, the Walkman, iPods, FM radio and now streaming apps such as Spotify, Amazon Prime Music, Deezer, SoundCloud and Gaana.

The Internet has made it easy for users to pick the music they want, but algorithms are still needed to recommend music a user is likely to enjoy without requiring manual selection.


1. Business Problem and constraints:

Our business objective is to serve users songs that match their taste. The recommendation should not take hours: a few seconds should be enough to predict the chance that a user will listen to a song again.

  • ML Problem Formulation

We have to build a model that predicts whether a user will listen to a song again, based on the given features of the user and the song. This can be framed as a binary classification problem, to which various classification algorithms can be applied.
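To make the formulation concrete, here is a minimal sketch of what "binary classification" means for this data. It is not the model built for this problem: it is a toy baseline that estimates the replay probability by counting. The rows are invented, and the helper names `replay_rate_by_feature` and `predict_proba`, the use of `source_type` as the only feature and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy events shaped like rows of train.csv; all values are invented.
train_rows = [
    {"msno": "u1", "song_id": "s1", "source_type": "local-library", "target": 1},
    {"msno": "u1", "song_id": "s2", "source_type": "radio", "target": 0},
    {"msno": "u2", "song_id": "s1", "source_type": "local-library", "target": 1},
    {"msno": "u2", "song_id": "s3", "source_type": "radio", "target": 0},
    {"msno": "u3", "song_id": "s2", "source_type": "local-library", "target": 0},
    {"msno": "u3", "song_id": "s3", "source_type": "radio", "target": 1},
]


def replay_rate_by_feature(rows, feature):
    """Estimate P(target = 1 | feature value) by simple counting."""
    plays = defaultdict(int)
    replays = defaultdict(int)
    for row in rows:
        plays[row[feature]] += 1
        replays[row[feature]] += row["target"]
    return {value: replays[value] / plays[value] for value in plays}


def predict_proba(row, rates, feature, default=0.5):
    """Return the estimated replay probability; unseen values get `default`."""
    return rates.get(row[feature], default)


rates = replay_rate_by_feature(train_rows, "source_type")

# A new (user, song) event without a label, as in test.csv.
new_event = {"msno": "u4", "song_id": "s1", "source_type": "local-library"}
probability = predict_proba(new_event, rates, "source_type")
predicted_target = int(probability >= 0.5)  # 1 = expected to listen again
print(f"P(replay) = {probability:.2f}, predicted target = {predicted_target}")
```

A real classifier would replace the counting step with a learned model over many user and song features, but the input (one row per listening event) and the output (a probability that target is 1) keep the same shape.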


2. Data Discussion:

Dataset source: https://www.kaggle.com/c/kkbox-music-recommendation-challenge/data

The problem has 6 data files:

1. train.csv: This file includes user_id (msno), song_id, source_system_tab (the tab where the event was triggered), source_type (the entry point from which the user first played the music), source_screen_name (the name of the layout the user sees) and target (target=1 means a recurring listening event was triggered within a month after the user’s very first observable listening event; target=0 otherwise).

2. test.csv: This file includes user_id (msno), song_id, source_system_tab (the tab where the event was triggered), source_type (the entry point from which the user first played the music) and source_screen_name (the name of the layout the user sees).

3. songs.csv: This file has features like song_id, song_length, genre_ids, artist_name, composer, lyricist and language.

4. members.csv: This file has msno (user_id), city, bd (age; may contain outliers), gender, registered_via (registration method), registration_init_time (date) and expiration_date (date).

5. song_extra_info.csv: This file has features like song_id, song_name and ISRC (International Standard Recording Code), which is used to identify songs.
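Since the files share the keys song_id and msno, each listening event can be enriched with song and member features by joining on those keys. The following is a minimal sketch using only the Python standard library. The CSV contents are tiny invented stand-ins for the real files, only a few columns are included, and `read_rows` and `left_join` are hypothetical helper names, not part of any library.

```python
import csv
import io

# Tiny stand-ins for the real CSV files; every value below is invented.
TRAIN_CSV = """msno,song_id,source_type,target
u1,s1,local-library,1
u2,s2,online-playlist,0
"""
SONGS_CSV = """song_id,song_length,artist_name,language
s1,240000,Artist A,3
"""
MEMBERS_CSV = """msno,city,bd,gender
u1,1,25,female
u2,13,0,
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join(rows, other_rows, key):
    """Keep every row of `rows`; add columns from the matching row of
    `other_rows`, or None for each column when no match exists."""
    lookup = {row[key]: row for row in other_rows}
    extra_cols = [c for c in other_rows[0] if c != key] if other_rows else []
    joined = []
    for row in rows:
        match = lookup.get(row[key])
        merged = dict(row)
        for col in extra_cols:
            merged[col] = match[col] if match else None
        joined.append(merged)
    return joined


train = read_rows(TRAIN_CSV)
songs = read_rows(SONGS_CSV)
members = read_rows(MEMBERS_CSV)

# One row per listening event, enriched with song and member features.
events = left_join(left_join(train, songs, "song_id"), members, "msno")
for event in events:
    print(event)
```

A left join is used so that no training event is dropped when a song or user is missing from the lookup file; the missing features simply become None. With pandas, the same idea would be a `merge` with `how="left"` on `song_id` and then on `msno`.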


WSDM - KKBox’s Music Recommendation Challenge