Easy Text Annotation in a Jupyter Notebook

At the heart of any sentiment analysis project is a good set of labeled data. Pre-labeled datasets can be found on various sites all over the internet. But…

What if you have come up with a custom dataset that has no labels?
What if you have to provide those labels before proceeding with your project?
What if you are not willing to pay to outsource the task of labeling?

I was recently faced with this very issue while retrieving text data from the Twitter Streaming API for a sentiment analysis project. I quickly discovered annotating the data myself would be a painful task without a good tool. This was the inspiration behind building tortus, a tool that makes it easy to label your text data within a Jupyter Notebook!

Begin by installing tortus and enabling the required nbextension.

$ pip install tortus
$ jupyter nbextension enable --py widgetsnbextension

After opening a Jupyter Notebook, read your data into a pandas DataFrame.

import pandas as pd
movie_reviews = pd.read_csv('movie_reviews.csv')
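
If you do not have a CSV on hand, a small in-memory frame is enough to try things out. The sketch below assumes a single text column named `reviews`; that column name and the sample rows are made up for illustration and are not taken from the actual movie_reviews.csv. It also drops empty and duplicated rows, which is an optional cleanup step so that you do not label the same text twice.

```python
import pandas as pd

# Hypothetical stand-in for movie_reviews.csv: one text column named 'reviews'.
movie_reviews = pd.DataFrame({
    'reviews': [
        'A moving story with a superb cast.',
        'Two hours I will never get back.',
        'It was fine. Nothing more, nothing less.',
    ]
})

# Drop empty rows and exact duplicates before annotating.
movie_reviews = (
    movie_reviews
    .dropna(subset=['reviews'])
    .drop_duplicates(subset=['reviews'])
    .reset_index(drop=True)
)

print(movie_reviews.shape)
```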

Import the package and create an instance of Tortus.

from tortus import Tortus

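# df is your DataFrame (movie_reviews above); text is the name of the column holding the text to annotate.
tortus = Tortus(df, text, num_records=10, id_column=None, annotations=None, random=True, labels=['Positive', 'Negative', 'Neutral'])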
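
To make the arguments more concrete, here is a rough plain-pandas sketch of what `num_records` and `random` suggest: drawing a subset of rows to label, either at random or from the top of the frame. This only illustrates the idea and is not how tortus is actually implemented; the helper `sample_for_annotation`, the `label` column, and the generated review strings are all hypothetical.

```python
import pandas as pd

def sample_for_annotation(df, text, num_records=10, random=True):
    """Return up to num_records rows of the text column with an empty label column."""
    n = min(num_records, len(df))  # never ask for more rows than exist
    subset = df[[text]].sample(n=n) if random else df[[text]].head(n)
    subset = subset.reset_index(drop=True)
    subset['label'] = None  # to be filled in by the annotator
    return subset

# Hypothetical data: 25 placeholder reviews.
reviews = pd.DataFrame({'reviews': [f'review {i}' for i in range(25)]})

batch = sample_for_annotation(reviews, 'reviews', num_records=10, random=False)
print(batch.head())
```

Capping the sample at the length of the frame keeps the helper from raising an error when you ask for more records than the dataset contains.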


towardsdatascience.com
