2020 has been quite a year, to say the least. One of the reasons I feel that way has to do with politics. I know a lot of folks don't necessarily like to talk about it (fine), but I wanted to do some (political) data science!

I wanted to use machine learning to predict the winners of a given election, but I soon realized there were not many prepackaged datasets with this information available on the internet. This meant that, in order to do what I wanted, I would need to _make my own data set_ through web scraping. This process was a significant portion of my overall project, so I thought I would share a bit about what I learned along the way.

Scraping Wikipedia

As I mentioned, it took longer than expected just to scrape and organize the data into a form that would be usable for exploratory data analysis (EDA) and/or machine learning. So, hopefully, my experience will save those of you reading this some time and grief during your own data collection!
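To give a flavor of what "scrape and organize" can look like in practice, here is a minimal sketch that turns an HTML table into a list of records. It is not the code from my project: the sample table, the candidate names, and the helper names (`TableParser`, `table_to_records`) are all made up for illustration, and it uses Python's built-in `html.parser` so it runs without installing anything. A library such as BeautifulSoup would make the parsing step shorter.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a results table on a scraped page (made-up data).
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>Candidate</th><th>Party</th><th>Votes</th></tr>
  <tr><td>Alice Example</td><td>Party A</td><td>1,234</td></tr>
  <tr><td>Bob Sample</td><td>Party B</td><td>987</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being built, if any
        self._cell = None  # text fragments of the current cell, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = table_to_records(SAMPLE_HTML)

# Scraped numbers arrive as strings like "1,234"; convert them before analysis.
for record in records:
    record["Votes"] = int(record["Votes"].replace(",", ""))

print(records)
```

Most of the "organizing" effort tends to be in that last step: real pages mix in footnote markers, merged cells, and inconsistent number formats, so expect to add more cleanup than this toy table needs.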

