Acquiring new customers is expensive, so companies are investing more in customer loyalty and retention. Estimating the total value a customer generates over the entire customer life cycle helps companies plan campaigns and other activities. Customer Relationship Management (CRM) has therefore become a key element of modern marketing strategies.
If we can predict a score that gives quantifiable information about a given population, the information system (IS) can use that score to personalize the customer relationship.
The KDD (Knowledge Discovery and Data Mining) Cup 2009 challenge consists of three tasks: predicting churn, appetency and upselling from data provided by the telecom company Orange. The business idea is to:
The challenge is to beat the in-house system developed by Orange Labs. For the large dataset, the in-house AUC scores are as follows:
The data comes in two versions, each with 50,000 samples: the large version has 15,000 features and the small version has only 230. The target variable takes the value +1 or -1, indicating the positive and negative class respectively.
In the small version of the dataset, 40 features are high-cardinality categorical features and the rest are numerical. Under the challenge rules, predictions were evaluated by the average area under the ROC curve (AUC) over the three tasks (churn, appetency and upselling); this average is called the score.
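To make the evaluation rule concrete, here is a minimal sketch of how such a score could be computed: one AUC per task, then the arithmetic mean of the three. It is not the organizers' evaluation code. The function names `auc` and `challenge_score` and the toy labels and predictions are assumptions made up for illustration, and the AUC is computed with the standard rank-sum (Mann-Whitney) formulation, with labels encoded as +1 / -1 as in the dataset.

```python
from statistics import mean


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: sequence of +1 / -1 class labels
    scores: sequence of predicted scores (higher = more likely positive)
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")

    # Walk over groups of tied scores and give each group its average rank
    # (1-based), so that ties contribute 0.5 to the AUC.
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        positives_in_group = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_group
        i = j

    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


def challenge_score(labels_by_task, scores_by_task):
    """Arithmetic mean of the per-task AUCs (hypothetical helper)."""
    return mean(
        auc(labels_by_task[task], scores_by_task[task])
        for task in labels_by_task
    )


# Toy data, invented purely for illustration (not from the Orange dataset).
toy_labels = {
    "churn": [1, -1, 1, -1],
    "appetency": [1, 1, -1, -1],
    "upselling": [1, -1, 1, -1],
}
toy_scores = {
    "churn": [0.9, 0.1, 0.4, 0.6],
    "appetency": [0.8, 0.7, 0.2, 0.1],
    "upselling": [0.5, 0.5, 0.5, 0.5],
}

print(challenge_score(toy_labels, toy_scores))
```

Because each task contributes equally to the mean, a model that does well on two tasks but is no better than random on the third is pulled noticeably toward 0.5.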
#machine-learning #data-science