Learn how to code an (almost) one-liner Python function to manually calculate the cosine similarity or correlation matrices used in many data science algorithms, using the broadcasting feature of the NumPy library.

Would you say that a professional MotoGP rider and the kid in the picture share the same passion for motorsports, even if they will never meet and differ in every other aspect of their lives? If your answer is yes, then you have grasped the idea behind cosine similarity and correlation.

Now suppose you work for a pay-TV channel and you have the results of a survey of two groups of subscribers. One of the analyses could look at how similar the tastes of the two groups are. For this type of analysis we want to select people who share similar behaviours regardless of “how much time” they spend watching TV. This is well represented by the concept of **cosine similarity**, which lets us treat as “close” those observations that are aligned with the directions we are interested in, regardless of how much the magnitudes of their measurements differ.
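As a rough illustration of the idea, here is a minimal sketch of how such a matrix could be computed with NumPy broadcasting. The function names `cosine_similarity_matrix` and `correlation_matrix` and the sample viewing-hours data are assumptions made up for this example, not taken from the survey described above. Each row is one subscriber, and each column is the time spent on one type of programme.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Cosine similarity between every row of a (n x d) and every row of b (m x d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Broadcasting: (n, 1, d) * (1, m, d) -> (n, m, d); summing the last axis
    # gives the dot product of every row of a with every row of b.
    dots = (a[:, None, :] * b[None, :, :]).sum(axis=-1)
    # Outer product of the row norms, again via broadcasting: (n, 1) * (1, m).
    norms = np.linalg.norm(a, axis=1)[:, None] * np.linalg.norm(b, axis=1)[None, :]
    return dots / norms

def correlation_matrix(a, b):
    """Pearson correlation: cosine similarity of the row-centred data."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return cosine_similarity_matrix(a - a.mean(axis=1, keepdims=True),
                                    b - b.mean(axis=1, keepdims=True))

# Hypothetical hours watched per week: [sports, movies, news]
group_a = np.array([[1, 2, 3], [10, 20, 30]])
group_b = np.array([[2, 4, 6], [3, 2, 1]])

sim = cosine_similarity_matrix(group_a, group_b)
corr = correlation_matrix(group_a, group_b)
```

In this toy data the second subscriber of `group_a` watches ten times as much TV as the first, yet both get the same similarity scores against `group_b`, because only the direction of each row matters. Note that this sketch does not guard against all-zero rows (or constant rows in the correlation case), which would produce a division by zero.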

#data-science #marketing #machine-learning #analytics #python
