In this article I explore using Reddit sentiment data to inform trading strategies. I derive market sentiment in two ways using the wallstreetbets subreddit:
Collecting comments from daily discussion submissions, then running the VADER sentiment model to assess overall daily positive/negative sentiment.
Collecting all submission titles per day, then assessing daily bullish/bearish sentiment using keyword analysis.
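To make the keyword approach concrete, here is a minimal sketch of how daily bullish/bearish sentiment could be scored from submission titles. The keyword sets, the `title_score` and `daily_sentiment` helpers, and the sample titles are all hypothetical, chosen for illustration; they are not the lists or data used in the article.

```python
import re
from collections import defaultdict

# Hypothetical keyword sets (assumption): the real lists would be tuned
# to the vocabulary actually seen on the subreddit.
BULLISH = {"calls", "moon", "buy", "bull", "long", "rally"}
BEARISH = {"puts", "crash", "sell", "bear", "short", "dump"}


def title_score(title):
    """Return (# bullish keywords) - (# bearish keywords) in one title."""
    words = re.findall(r"[a-z]+", title.lower())
    bullish = sum(word in BULLISH for word in words)
    bearish = sum(word in BEARISH for word in words)
    return bullish - bearish


def daily_sentiment(submissions):
    """Average the title scores per day.

    `submissions` is an iterable of (date_string, title) pairs.
    """
    scores_by_day = defaultdict(list)
    for day, title in submissions:
        scores_by_day[day].append(title_score(title))
    return {day: sum(s) / len(s) for day, s in scores_by_day.items()}


# Made-up sample titles, for illustration only.
sample = [
    ("2021-01-04", "SPY calls printing, going to the moon"),
    ("2021-01-04", "Loading puts before the crash"),
    ("2021-01-05", "Bear market incoming, sell everything"),
]
scores = daily_sentiment(sample)
```

A positive daily score suggests net bullish titles and a negative score net bearish ones. Averaging per day, rather than summing, keeps busy and quiet days on a comparable scale.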
In the featurization phase I apply Fourier transforms to smooth the two very noisy time-series datasets. Finally, in the strategy development phase I explore two possible strategies. The first involves exploiting the spread between the SPY (SPDR S&P 500 ETF Trust) price and daily positive/negative sentiment. The second involves training an LSTM (long short-term memory) model to predict the next day’s SPY price based on bullish/bearish sentiment.
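One common way to smooth a noisy series with a Fourier transform is a low-pass filter: transform the series, zero out the high-frequency components, and transform back. The sketch below assumes NumPy is available and that this is roughly the kind of smoothing intended. The `fourier_smooth` helper, the `keep` parameter, and the synthetic data are illustrative assumptions, not the article's actual code or data.

```python
import numpy as np


def fourier_smooth(series, keep=5):
    """Low-pass filter a 1-D series by keeping only the `keep`
    lowest-frequency Fourier components (including the mean)."""
    values = np.asarray(series, dtype=float)
    spectrum = np.fft.rfft(values)   # real-input FFT
    spectrum[keep:] = 0              # discard high-frequency components
    return np.fft.irfft(spectrum, n=len(values))


# Synthetic stand-in for a noisy daily sentiment series (assumption).
rng = np.random.default_rng(0)
t = np.arange(200)
trend = np.sin(2 * np.pi * t / 100)  # slow underlying cycle
noisy = trend + rng.normal(scale=0.5, size=t.size)

smooth = fourier_smooth(noisy, keep=5)
```

A smaller `keep` gives a smoother curve but discards more detail. Note that this filter uses the whole series at once, so in a backtest it would need to be applied only to data available up to each day to avoid look-ahead bias.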
#data-science #machine-learning