The goal of this article is to outline projects that a professional Data Scientist will eventually perform, or should perform. I have taken many bootcamps and educational courses in Data Science. While they have all been useful in some way, I find that some forget to highlight real-world applications of Data Science. It is beneficial to know what to expect as you transition from an educational to a professional Data Scientist. Customer segmentation, text classification, sentiment analysis, time series forecasting, and recommender systems can all help the company you work for tremendously. I will take a deep dive and explain why these five projects come to mind, and hopefully motivate you to employ them where you work.
Customer segmentation is a form of Data Science where an unsupervised clustering technique is employed to develop groups, or segments, of a human population or of observations in data. The goal is to create groups that are clearly separated from one another, while the members within each group share closely related features. The technical terms for this separation and togetherness are, respectively:
Between-groups sum of squares (BGSS)
Within-group sum of squares (WGSS)
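To make these two terms concrete, here is a minimal sketch of how they could be computed by hand. The customer values, the segment names, and the helper functions (`centroid`, `squared_distance`, `wgss`, `bgss`) are all hypothetical and exist only for illustration; in practice a clustering library would report these numbers for you.

```python
# Minimal sketch: WGSS and BGSS for a toy data set (hypothetical values).

def centroid(points):
    """Return the coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def wgss(clusters):
    """Within-group sum of squares: how tightly points sit around their own centroid."""
    total = 0.0
    for points in clusters.values():
        c = centroid(points)
        total += sum(squared_distance(p, c) for p in points)
    return total


def bgss(clusters):
    """Between-groups sum of squares: size-weighted distance of each
    cluster centroid from the overall mean of the data."""
    all_points = [p for points in clusters.values() for p in points]
    overall = centroid(all_points)
    return sum(
        len(points) * squared_distance(centroid(points), overall)
        for points in clusters.values()
    )


# Hypothetical customers described by (age, monthly spend).
clusters = {
    "young_professionals": [(22, 40), (24, 42), (23, 38)],
    "established_buyers": [(45, 90), (47, 95), (46, 85)],
}

print("WGSS:", wgss(clusters))  # smaller means tighter groups
print("BGSS:", bgss(clusters))  # larger means better separated groups
```

A good segmentation has a small WGSS relative to its BGSS. The two values always add up to the total sum of squares of the data around its overall mean, so lowering one raises the other for a fixed data set.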
K-means clustering. Image by Author.
As you can see in the image above, these groups are well separated (high BGSS) and closely centered (low WGSS). This example is ideal. Think of each cluster as a group that you will target with a specific marketing advertisement, for example: ‘we want to appeal to recent college graduates by marketing our company product as young-professional centered’. Some useful clustering algorithms are:
Agglomerative Hierarchical Clustering
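As a rough illustration of what running such an algorithm looks like, here is a small sketch using scikit-learn. It assumes scikit-learn and NumPy are installed. The customer data is made up, and the choice of two clusters is an assumption for this toy example, not a recommendation.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

# Hypothetical customer features: (age, monthly spend), with two obvious groups.
X = np.array(
    [
        [22, 40], [24, 42], [23, 38],
        [45, 90], [47, 95], [46, 85],
    ],
    dtype=float,
)

# K-means: assigns each point to the nearest of k centroids.
# random_state fixes the random initialisation so the run is repeatable.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
kmeans_labels = kmeans.labels_

# Agglomerative hierarchical clustering: starts with every point as its own
# group and repeatedly merges the closest groups (Ward linkage by default).
agglo = AgglomerativeClustering(n_clusters=2).fit(X)
agglo_labels = agglo.labels_

print("K-means labels:      ", kmeans_labels)
print("Agglomerative labels:", agglo_labels)

# inertia_ is the sum of squared distances to the nearest centroid, i.e. the WGSS.
print("K-means WGSS:", kmeans.inertia_)
```

On real customer data you would usually scale the features first (for example with `StandardScaler`), since both algorithms rely on distances and a large-valued column such as spend can otherwise dominate. The label numbers themselves are arbitrary; only the grouping matters.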
What happens with customer segmentation results?
— finding insights about specific groups
— marketing towards specific groups
— defining groups in the first place
— tracking metrics about certain groups
This type of Data Science project is broadly used, but it is most useful in the marketing industry.