Sofia Maggio


Visualizing Machine Learning Tasks With Word Embeddings

On second look, the scary-looking plot uncovers meaningful spatial relationships (e.g., the tasks of detecting hate speech, abusive language, and emotions are embedded close to each other).

Machine learning is delivering value to a rapidly growing range of industries, often in the form of clever, novel use cases that exploit how broadly it can be applied. Contrary to intuition, the innovation usually rests on well-understood tasks; the creative leap lies in identifying those tasks and integrating them effectively into the unique problem at hand.


In this article, we peek into the landscape of those tasks as they are categorized under machine learning research datasets by Papers With Code (an amazing resource that compiles research papers along with the supplied code and data).

We’ve web scraped **well over 400 machine learning tasks** and embedded them into a 25-dimensional space (based on a semantic understanding of words) using a GloVe (Global Vectors for Word Representation) model, pre-trained on 2 billion Tweets. After reducing the dimensionality from 25D to 3D and 2D, using PCA (Principal Component Analysis), we’ve been able to visualize the tasks in a way that captures meaning within their spatial interrelations.
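As a rough illustration of the embedding step, the sketch below assumes that a multi-word task name is embedded by averaging the vectors of its individual words. The article does not spell out the exact method, so treat this as one plausible approach rather than the notebook's implementation. `TOY_VECTORS` and `embed_task` are hypothetical names, and the 3-dimensional toy vectors merely stand in for the real 25-dimensional GloVe Twitter vectors.

```python
from typing import Dict, List, Optional

# Hypothetical toy vectors standing in for pre-trained GloVe embeddings.
# The real GloVe Twitter vectors are 25-dimensional; 3 dimensions keep
# the example readable. The numbers are made up for illustration.
TOY_VECTORS: Dict[str, List[float]] = {
    "hate": [0.9, 0.1, 0.0],
    "speech": [0.7, 0.3, 0.1],
    "detection": [0.2, 0.8, 0.1],
    "emotion": [0.8, 0.2, 0.1],
    "image": [0.0, 0.2, 0.9],
    "segmentation": [0.1, 0.3, 0.8],
}


def embed_task(task: str, vectors: Dict[str, List[float]]) -> Optional[List[float]]:
    """Embed a task name as the mean of its known word vectors.

    Words missing from the vocabulary are skipped. Returns None when
    no word of the task name is in the vocabulary.
    """
    words = [w for w in task.lower().split() if w in vectors]
    if not words:
        return None
    dim = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]


if __name__ == "__main__":
    print(embed_task("Hate Speech Detection", TOY_VECTORS))
    print(embed_task("Image Segmentation", TOY_VECTORS))
```

Averaging is simple and ignores word order, which is usually acceptable for short task names of two or three words.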

You can, for instance, explore the neighbours of a chosen task to discover other semantically similar tasks.
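Neighbour exploration of this kind can be sketched with cosine similarity, a common choice for comparing word embeddings (the article does not say which distance measure the notebook uses, so this is an assumption). The task vectors in `TASK_EMBEDDINGS` are made-up, hypothetical values, and `nearest_neighbours` is an illustrative helper name.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical task embeddings (made-up 3D values for illustration only).
TASK_EMBEDDINGS: Dict[str, List[float]] = {
    "hate speech detection": [0.6, 0.4, 0.1],
    "abusive language": [0.7, 0.3, 0.1],
    "emotion recognition": [0.5, 0.5, 0.1],
    "image segmentation": [0.05, 0.25, 0.85],
    "object detection": [0.1, 0.5, 0.6],
}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbours(
    task: str, embeddings: Dict[str, List[float]], k: int = 2
) -> List[Tuple[str, float]]:
    """Return the k tasks most similar to `task`, most similar first."""
    query = embeddings[task]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != task
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for name, score in nearest_neighbours("hate speech detection", TASK_EMBEDDINGS):
        print(f"{name}: {score:.3f}")
```

With these toy values, the two language-related tasks rank above the vision tasks as neighbours of "hate speech detection".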


You can walk through the code and explore the visualizations hands-on in this Jupyter notebook 📔!

Web Scraping Machine Learning Tasks

Downloading GloVe Model and Preparing the Input
Embedding Tasks Into 25D Space
Reducing Dimensionality and Plotting Interactive Visualizations
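
To give a feel for what the dimensionality reduction does, here is a minimal from-scratch PCA sketch in plain Python. It is not the notebook's code: it assumes power iteration with deflation on the covariance matrix, which works when the leading eigenvalues are well separated, and all function names are illustrative. In practice a library implementation such as scikit-learn's `PCA` would be the usual choice.

```python
import math
from typing import List

Matrix = List[List[float]]


def center(rows: Matrix) -> Matrix:
    """Subtract the per-dimension mean from every row."""
    n, dim = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dim)]
    return [[row[j] - means[j] for j in range(dim)] for row in rows]


def covariance(centered: Matrix) -> Matrix:
    """Sample covariance matrix of already-centered rows."""
    n, dim = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def mat_vec(matrix: Matrix, v: List[float]) -> List[float]:
    """Multiply a square matrix by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]


def top_eigenvector(matrix: Matrix, iterations: int = 200) -> List[float]:
    """Approximate the dominant eigenvector by power iteration."""
    dim = len(matrix)
    start = [1.0 / (i + 1) for i in range(dim)]  # fixed, deterministic start
    start_norm = math.sqrt(sum(x * x for x in start))
    v = [x / start_norm for x in start]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    return v


def pca(rows: Matrix, n_components: int = 2) -> Matrix:
    """Project rows onto their first n_components principal directions."""
    centered = center(rows)
    cov = covariance(centered)
    dim = len(cov)
    components = []
    for _ in range(n_components):
        v = top_eigenvector(cov)
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
        components.append(v)
        # Deflation: remove the variance explained by v so that the next
        # power iteration converges to the following component.
        cov = [
            [cov[i][j] - eigenvalue * v[i] * v[j] for j in range(dim)]
            for i in range(dim)
        ]
    return [[sum(a * b for a, b in zip(row, c)) for c in components] for row in centered]


if __name__ == "__main__":
    # Made-up 3D points with most variance along x, then y, then z.
    points = [[2, 0, 0], [-2, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 0.5], [0, 0, -0.5]]
    for row in pca(points, n_components=2):
        print([round(x, 3) for x in row])
```

The same projection applied to 25-dimensional task vectors with `n_components` set to 2 or 3 yields coordinates that can be passed to any scatter-plot tool.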

