Web scraping with Scrapy: Theoretical Understanding

This is the first part of a four-part tutorial series on web scraping using Scrapy and Selenium. This tutorial and the ones that follow focus on data collection through web scraping with Scrapy. Scrapy is an application framework for crawling websites and extracting structured data, which can be used for a wide range of applications such as data mining, information processing, or historical archival.

In this knowledge era, data is everything. It drives our day-to-day activities, whether implicitly or explicitly. In a typical data science project, data collection and data cleaning are commonly estimated to account for approximately 80% of the total work.

Scrapy has many advantages, some of which are:

  • Reportedly up to 20 times faster than other web scraping tools
  • Well suited to developing complex web crawlers and scrapers
  • Consumes less RAM and uses minimal CPU resources

Despite its advantages, Scrapy has a reputation for a steep learning curve and for not being beginner-friendly. Once mastered, however, it becomes a go-to tool for web scraping. This tutorial is my humble attempt at making it somewhat more beginner-friendly. My goal is to help you understand how Scrapy works and to give you the confidence to use it with Python as the programming language. To work confidently with Scrapy, it is essential to first understand how it works.
