Web Scraping Cricket ODI Data from HowStat and Preprocessing

Cricket is one of my favorite sports (although I can't play it to save my life). With the World Cup drawing close, I am sure millions of cricket fans are trying to predict who is going to take the cricket glory home. I previously built a predictor for the Indian Twenty20 cricket league, the IPL, which is available at iplpredictormatches.pythonanywhere.com. I am hoping to build a similar predictor for the World Cup, too. This article covers the first step: the data gathering phase.

Data gathering can take up 70 to 80% of the total time dedicated to a project. To gather the data, I am going to use web scraping, since all major cricket data is available on the web and is easy to access that way. [HowStat](http://howstat.com/) is an excellent, well-structured cricket statistics site that I will be using in this article. Another great option is [espncricinfo.com](http://www.espncricinfo.com/).

For this article, I will carry out only two tasks:

  1. Finding all the players that have ever played an ODI match and
  2. Finding the scores of all the players in each year and how many matches they played in that year.

Let’s start with the first task. For web scraping, we will need the following basic libraries, which we import first:

Filename : scrapping.py

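```python
import pandas as pd  # tabular data handling and file output
from bs4 import BeautifulSoup as soup  # HTML parsing (scraping tool)
from urllib.request import urlopen as ureq  # requesting data from a link
import numpy as np  # numerical helpers
import re  # regular expressions for cleaning text
```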

Next, we will write the web scraping code using Beautiful Soup.

For the URL, I go to the HowStat website. It lists players in separate groups according to the first letter of their name, so for simplicity we start with the players under the letter A.

Hence, the URL is http://howstat.com/cricket/Statistics/Players/PlayerList.asp?Country=ALL&Group=A. Open this link and press Ctrl+Shift+I to inspect the HTML code. This shows where the data we need sits in the HTML, which is important because we will be scraping the HTML itself. Next, since we need the data of all the players listed, we have two options.

  1. Take each data item individually.
  2. Take the whole table.

Obviously, the second idea is more appealing and requires fewer lines of code.
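As a rough illustration of the "take the whole table" option, here is a minimal sketch. It rests on several assumptions: the `Group` query parameter accepts any single letter from A to Z in the same way as `Group=A`, the player list sits in a plain (non-nested) HTML table whose first row holds the column names, and the helper names `player_list_url`, `fetch_html`, and `table_to_frame` are hypothetical. Check the real page in the inspector and adjust `table_index` before relying on it.

```python
import string
from urllib.request import urlopen

import pandas as pd
from bs4 import BeautifulSoup

# Base address taken from the player-list URL for group A
BASE_URL = "http://howstat.com/cricket/Statistics/Players/PlayerList.asp"


def player_list_url(letter, country="ALL"):
    """Build the player-list URL for one starting letter (assumed A to Z)."""
    return f"{BASE_URL}?Country={country}&Group={letter.upper()}"


def all_player_list_urls():
    """One URL per letter; assumes the site groups players under A to Z."""
    return [player_list_url(letter) for letter in string.ascii_uppercase]


def fetch_html(url):
    """Download the raw HTML of a page."""
    with urlopen(url) as response:
        return response.read()


def table_to_frame(html, table_index=0):
    """Turn one whole HTML table into a DataFrame.

    Assumes the first row of the chosen table is the header row.
    Rows whose cell count differs from the header are skipped.
    """
    page = BeautifulSoup(html, "html.parser")
    table = page.find_all("table")[table_index]

    rows = []
    for tr in table.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)

    header = rows[0]
    body = [row for row in rows[1:] if len(row) == len(header)]
    return pd.DataFrame(body, columns=header)
```

With these helpers, `table_to_frame(fetch_html(player_list_url("A")))` would return the players listed under A, and looping over `all_player_list_urls()` and joining the results with `pd.concat` would cover every letter. If the page nests tables inside one another, the row loop will also pick up the inner rows, so the table index and row filter would need tuning.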
