On the face of it, Natural Language Processing (NLP) and time series analysis do not necessarily appear to have that much in common.
In the context of data science, text is typically analysed for purposes such as text classification and sentiment analysis, and it is these two domains that I particularly wish to address.
Let’s consider an example. Suppose that one built a sentiment analysis model in 2019 in order to gauge sentiment on travel. Data might have been collated from a variety of social networks, such as Twitter and Reddit.
Chances are that sentiment on travel would still have been quite positive, notwithstanding a degree of concern about the impact of travel on climate change.
However, 2020 is a vastly different landscape for travel (or lack thereof), with air passenger numbers having plummeted as a result of the COVID-19 pandemic.
As a result, any sentiment model trained on 2019 data would likely perform quite poorly if run today. Travel restrictions, virus fears, and economic concerns are likely to have been under-represented in any corpus used to train a text classification model to gauge travel sentiment. Moreover, the term “COVID-19” did not exist before 2020, so such a model would not know to assign a negative sentiment to it in the context of travel.
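One rough way to see this vocabulary gap is to measure how many tokens in newer text never appeared in the training corpus. The snippet below is only a minimal sketch under stated assumptions: the posts are made-up toy examples rather than real data, and the helper names (`tokenize`, `build_vocabulary`, `oov_rate`) are hypothetical, not part of any particular library.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Hyphenated terms are kept whole, so "COVID-19" becomes "covid-19".
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def build_vocabulary(corpus):
    """Collect every distinct token seen in a list of texts."""
    return {token for text in corpus for token in tokenize(text)}


def oov_rate(corpus, vocabulary):
    """Return the fraction of tokens in `corpus` that are not in `vocabulary`."""
    tokens = [token for text in corpus for token in tokenize(text)]
    if not tokens:
        return 0.0
    unseen = sum(1 for token in tokens if token not in vocabulary)
    return unseen / len(tokens)


# Toy, invented posts for illustration only (assumption: not real data).
posts_2019 = [
    "Loved my flight to Lisbon, great holiday",
    "Booked a cheap flight for the summer holiday",
]
posts_2020 = [
    "Flight cancelled because of COVID-19 restrictions",
    "Quarantine rules ruined the holiday",
]

# The "training" vocabulary only knows what was written in 2019.
vocab_2019 = build_vocabulary(posts_2019)

print("covid-19" in vocab_2019)
print(f"2019 out-of-vocabulary rate: {oov_rate(posts_2019, vocab_2019):.2f}")
print(f"2020 out-of-vocabulary rate: {oov_rate(posts_2020, vocab_2019):.2f}")
```

In this toy setup the 2019 posts contain no unseen tokens, while most tokens in the 2020 posts, including “covid-19”, are missing from the 2019 vocabulary. A rising out-of-vocabulary rate over time is one possible hint that a text model may need retraining, though it is a crude signal and says nothing about sentiment on its own.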