Joseph Norton


Crawling the Web with Python and Scrapy


Web scraping, often called web crawling or web spidering, or “programmatically going over a collection of web pages and extracting data,” is a powerful tool for working with data on the web.

With a web scraper, you can mine data about a set of products, get a large corpus of text or quantitative data to play around with, get data from a site without an official API, or just satisfy your own personal curiosity.

In this tutorial, you’ll learn about the fundamentals of the scraping and spidering process as you explore a playful data set. We’ll use Brickset, a community-run site that contains information about LEGO sets. By the end of this tutorial, you’ll have a fully functional Python web scraper that walks through a series of pages on Brickset and extracts data about LEGO sets from each page, displaying the data on your screen.

The scraper will be easily expandable so you can tinker around with it and use it as a foundation for your own projects scraping data from the web.

Step 1 — Creating a Basic Scraper

Scraping is a two-step process:

  1. You systematically find and download web pages.
  2. You take those web pages and extract information from them.

Both of those steps can be implemented in a number of ways in many languages.
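As a rough illustration of the two steps, here is a minimal sketch that uses only Python's standard library. The names download, HeadingParser, and extract_headings are made up for this example, and the sample HTML is a hand-written stand-in rather than a real Brickset page. The download function is defined but not called, so the sketch runs without network access.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def download(url):
    """Step 1: fetch a page and return its HTML as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


class HeadingParser(HTMLParser):
    """Step 2: collect the text found inside every <h1> tag."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        # Only keep text that appears while we are inside an <h1>.
        if self._in_h1:
            self.headings[-1] += data


def extract_headings(html):
    parser = HeadingParser()
    parser.feed(html)
    return [heading.strip() for heading in parser.headings]


# Hand-written sample so the sketch runs without touching the network.
SAMPLE_HTML = "<article><h1>Brick Bank</h1></article><article><h1>Big Ben</h1></article>"
print(extract_headings(SAMPLE_HTML))
```

Even this small version has to track parser state by hand, which hints at why a dedicated library becomes attractive as a scraper grows.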

You can build a scraper from scratch using modules or libraries provided by your programming language, but then you have to deal with some potential headaches as your scraper grows more complex. For example, you’ll need to handle concurrency so you can crawl more than one page at a time. You’ll probably want to figure out how to transform your scraped data into different formats like CSV, XML, or JSON. And you’ll sometimes have to deal with sites that require specific settings and access patterns.
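To give a sense of the format conversion mentioned above, the following sketch turns a list of scraped records into JSON and CSV with the standard library. The items list and the helper names to_json and to_csv are hypothetical; a real scraper would produce its own fields.

```python
import csv
import io
import json

# Hypothetical scraped items, shaped like dictionaries a scraper might produce.
items = [
    {"name": "Brick Bank", "pieces": "2380"},
    {"name": "Big Ben", "pieces": "4163"},
]


def to_json(rows):
    """Serialize the rows as a JSON array."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "pieces"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_json(items))
print(to_csv(items))
```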

You’ll have better luck if you build your scraper on top of an existing library that handles those issues for you. For this tutorial, we’re going to use Python and Scrapy to build our scraper.

Scrapy is one of the most popular and powerful Python scraping libraries; it takes a “batteries included” approach to scraping, meaning that it handles a lot of the common functionality that all scrapers need so developers don’t have to reinvent the wheel each time. It makes scraping a quick and fun process!

Scrapy, like most Python packages, is available on PyPI and can be installed with pip. PyPI, the Python Package Index, is a community-owned repository of published Python software.

If you have a Python installation like the one outlined in the prerequisite for this tutorial, you already have pip installed on your machine, so you can install Scrapy with the following command:

pip install scrapy

If you run into any issues with the installation, or you want to install Scrapy without using pip, check out the official installation docs.
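If you want a quick way to confirm that the installation worked, one option is to ask Python whether the package can be found. This is only a sketch; is_installed is a helper name invented for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be imported."""
    return importlib.util.find_spec(package_name) is not None


print("scrapy installed:", is_installed("scrapy"))
```

If this prints False, the package is not visible to the Python interpreter you are running, which often means pip installed it into a different environment.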

With Scrapy installed, let’s create a new folder for our project. You can do this in the terminal by running:

mkdir brickset-scraper

Now, navigate into the new directory you just created:

cd brickset-scraper

Then create a new Python file for our scraper. We’ll place all of our code in this file for this tutorial. You can create the file in the terminal with the touch command, or you can create it using your text editor or graphical file manager.

We’ll start by making a very basic scraper that uses Scrapy as its foundation. To do that, we’ll create a Python class that subclasses scrapy.Spider, a basic spider class provided by Scrapy. This class will have two required attributes:

  • name — just a name for the spider.
  • start_urls — a list of URLs that you start to crawl from. We’ll start with one URL.

Open the file in your text editor and add this code to create the basic spider:

import scrapy

class BrickSetSpider(scrapy.Spider):
    name = "brickset_spider"
    start_urls = ['']

Let’s break this down line by line:

First, we import scrapy so that we can use the classes that the package provides.

Next, we take the Spider class provided by Scrapy and make a subclass out of it called BrickSetSpider. Think of a subclass as a more specialized form of its parent class. The Spider class has methods and behaviors that define how to follow URLs and extract data from the pages it finds, but it doesn’t know where to look or what data to look for. By subclassing it, we can give it that information.

Then we give the spider the name brickset_spider.

Finally, we give our scraper a single URL to start from. If you open that URL in your browser, it will take you to a search results page, showing the first of many pages containing LEGO sets.

Now let’s test out the scraper. You typically run a Python file by passing its path to the python command. However, Scrapy comes with its own command line interface to streamline the process of starting a scraper. Start your scraper with the runspider command, followed by the name of your scraper file:

scrapy runspider

You’ll see something like this:

Output
2016-09-22 23:37:45 [scrapy] INFO: Scrapy 1.1.2 started (bot: scrapybot)
2016-09-22 23:37:45 [scrapy] INFO: Overridden settings: {}
2016-09-22 23:37:45 [scrapy] INFO: Enabled extensions:
2016-09-22 23:37:45 [scrapy] INFO: Enabled downloader middlewares:
2016-09-22 23:37:45 [scrapy] INFO: Enabled spider middlewares:
2016-09-22 23:37:45 [scrapy] INFO: Enabled item pipelines:
2016-09-22 23:37:45 [scrapy] INFO: Spider opened
2016-09-22 23:37:45 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-09-22 23:37:45 [scrapy] DEBUG: Telnet console listening on
2016-09-22 23:37:47 [scrapy] DEBUG: Crawled (200) <GET> (referer: None)
2016-09-22 23:37:47 [scrapy] INFO: Closing spider (finished)
2016-09-22 23:37:47 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 224,
 'downloader/request_count': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2016, 9, 23, 6, 37, 45, 995167)}
2016-09-22 23:37:47 [scrapy] INFO: Spider closed (finished)

That’s a lot of output, so let’s break it down.

  • The scraper initialized and loaded additional components and extensions it needed to handle reading data from URLs.
  • It used the URL we provided in the start_urls list and grabbed the HTML, just like your web browser would do.
  • It passed that HTML to the parse method, which doesn’t do anything by default. Since we never wrote our own parse method, the spider just finishes without doing any work.
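The flow described above can be pictured with a toy model in plain Python. This is not how Scrapy is implemented; MiniSpider, its fetch and run methods, and the placeholder URL are all invented for illustration. It only shows the division of labor: the base class walks the start URLs, and the subclass is expected to supply the parsing.

```python
class MiniSpider:
    """Toy base class: knows how to walk start_urls, not what to extract."""

    name = None
    start_urls = []

    def fetch(self, url):
        # Stand-in for a real download; returns canned HTML.
        return "<html><body>page at {}</body></html>".format(url)

    def parse(self, html):
        # No extraction logic by default, so nothing is produced.
        return []

    def run(self):
        results = []
        for url in self.start_urls:
            html = self.fetch(url)
            results.extend(self.parse(html))
        return results


class ExampleSpider(MiniSpider):
    name = "example_spider"
    start_urls = ["http://example.com/sets"]  # placeholder URL


# The subclass never overrides parse, so the run yields no data.
print(ExampleSpider().run())
```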

Now let’s pull some data from the page.

Step 2 — Extracting Data from a Page

We’ve created a very basic program that pulls down a page, but it doesn’t do any scraping or spidering yet. Let’s give it some data to extract.

If you look at the page we want to scrape, you’ll see it has the following structure:

  • There’s a header that’s present on every page.
  • There’s some top-level search data, including the number of matches, what we’re searching for, and the breadcrumbs for the site.
  • Then there are the sets themselves, displayed in what looks like a table or ordered list. Each set has a similar format.

When writing a scraper, it’s a good idea to look at the source of the HTML file and familiarize yourself with the structure. So here it is, with some things removed for readability:

<body>
  <section class="setlist">
    <article class='set'>
      <a href="" 
      class="highslide plain mainimg" onclick="return hs.expand(this)"><img 
      src="" title="10251-1: 
      Brick Bank" onError="this.src='/assets/images/spacer.png'" /></a>
      <div class="highslide-caption">
        <h1>Brick Bank</h1><div class='tags floatleft'><a href='/sets/10251-1/Brick- 
        Bank'>10251-1</a> <a href='/sets/theme-Creator-Expert'>Creator Expert</a> <a 
        class='subtheme' href='/sets/theme-Creator-Expert/subtheme-Modular- 
        Buildings'>Modular Buildings</a> <a class='year' href='/sets/theme-Creator- 
        Expert/year-2016'>2016</a> </div><div class='floatright'>&copy;2016 LEGO 
          <div class="pn">
            <a href="#" onclick="return hs.previous(this)" title="Previous (left arrow 
            key)">&#171; Previous</a>
            <a href="#" onclick="return" title="Next (right arrow key)">Next 



Scraping this page is a two-step process:

  1. First, grab each LEGO set by looking for the parts of the page that have the data we want.
  2. Then, for each set, grab the data we want from it by pulling the data out of the HTML tags.

Scrapy grabs data based on selectors that you provide. Selectors are patterns we can use to find one or more elements on a page so we can then work with the data within the element. Scrapy supports both CSS selectors and XPath selectors.

We’ll use CSS selectors for now since CSS is the easier option and a perfect fit for finding all the sets on the page. If you look at the HTML for the page, you’ll see that each set is specified with the class set. Since we’re looking for a class, we’d use .set for our CSS selector. All we have to do is pass that selector into the response object, like this:

class BrickSetSpider(scrapy.Spider):
    name = "brickset_spider"
    start_urls = ['']

    def parse(self, response):
        SET_SELECTOR = '.set'
        for brickset in response.css(SET_SELECTOR):
            # Placeholder so the loop is valid; the extraction comes next.
            pass

This code grabs all the sets on the page and loops over them to extract the data. Now let’s extract the data from those sets so we can display it.

Another look at the source of the page we’re parsing tells us that the name of each set is stored within an h1 tag for each set:

<h1>Brick Bank</h1><div class='tags floatleft'><a href='/sets/10251-1/Brick-Bank'>10251-1</a>

The brickset object we’re looping over has its own css method, so we can pass in a selector to locate child elements. Modify your code as follows to locate the name of the set and display it:

class BrickSetSpider(scrapy.Spider):
    name = "brickset_spider"
    start_urls = ['']

    def parse(self, response):
        SET_SELECTOR = '.set'
        for brickset in response.css(SET_SELECTOR):

            NAME_SELECTOR = 'h1 ::text'
            yield {
                'name': brickset.css(NAME_SELECTOR).extract_first(),
            }

Note: The trailing comma after extract_first() isn’t a typo. We’re going to add more to this section soon, so we’ve left the comma there to make adding to this section easier later.

You’ll notice two things going on in this code:

  • We append ::text to our selector for the name. That’s a Scrapy-specific CSS pseudo-element that selects the text inside a tag rather than the tag itself.
  • We call extract_first() on the object returned by brickset.css(NAME_SELECTOR) because we just want the first element that matches the selector. This gives us a string, rather than a list of elements.

Save the file and run the scraper again:

scrapy runspider

This time you’ll see the names of the sets appear in the output:

[scrapy] DEBUG: Scraped from <200>
{'name': 'Brick Bank'}
[scrapy] DEBUG: Scraped from <200>
{'name': 'Volkswagen Beetle'}
[scrapy] DEBUG: Scraped from <200>
{'name': 'Big Ben'}
[scrapy] DEBUG: Scraped from <200>
{'name': 'Winter Holiday Train'}

Let’s keep expanding on this by adding new selectors for images, pieces, and miniature figures, or minifigs, that come with a set.

Take another look at the HTML for a specific set:

<article class="set">
  <a class="highslide plain mainimg" href="" onclick="return hs.expand(this)">
    <img src="" title="10251-1: Brick Bank"></a>
  <div class="meta">
    <h1><a href="/sets/10251-1/Brick-Bank"><span>10251:</span> Brick Bank</a> </h1>
    <div class="col">
        <dd><a class="plain" href="/inventories/10251-1">2380</a></dd>
        <dd><a class="plain" href="/minifigs/inset-10251-1">5</a></dd>

We can see a few things by examining this code:

  • The image for the set is stored in the src attribute of an img tag inside an a tag at the start of the set. We can use another CSS selector to fetch this value just like we did when we grabbed the name of each set.
  • Getting the number of pieces is a little trickier. There’s a dt tag that contains the text Pieces, and then a dd tag that follows it which contains the actual number of pieces. We’ll use XPath, a query language for traversing XML, to grab this, because it’s too complex to be represented using CSS selectors.
  • Getting the number of minifigs in a set is similar to getting the number of pieces. There’s a dt tag that contains the text Minifigs, followed by a dd tag right after that with the number.
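The label-then-value layout described above can be explored outside Scrapy with a small standard-library sketch. It uses xml.etree.ElementTree as a stand-in for Scrapy's selectors and pairs each dt label with the dd that follows it. The dt tags do not appear in the trimmed HTML shown earlier, so the dl/dt/dd structure in SET_HTML is an assumption based on the description, and read_definition_list is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written, well-formed fragment; the <dl>/<dt> layout is assumed.
SET_HTML = """
<article class="set">
  <div class="meta">
    <h1><a href="/sets/10251-1/Brick-Bank"><span>10251:</span> Brick Bank</a></h1>
    <div class="col">
      <dl>
        <dt>Pieces</dt>
        <dd><a class="plain" href="/inventories/10251-1">2380</a></dd>
        <dt>Minifigs</dt>
        <dd><a class="plain" href="/minifigs/inset-10251-1">5</a></dd>
      </dl>
    </div>
  </div>
</article>
"""


def read_definition_list(article):
    """Map each dt label to the text of the dd that follows it."""
    fields = {}
    definition_list = article.find(".//dl")
    if definition_list is None:
        return fields
    label = None
    for child in definition_list:
        if child.tag == "dt":
            label = (child.text or "").strip()
        elif child.tag == "dd" and label is not None:
            fields[label] = "".join(child.itertext()).strip()
            label = None
    return fields


article = ET.fromstring(SET_HTML)
fields = read_definition_list(article)
print(fields)
```

Pairing labels with values this way avoids relying on the position of a dd, which can shift when a set is missing one of the fields.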

So, let’s modify the scraper to get this new information:

class BrickSetSpider(scrapy.Spider):
    name = 'brick_spider'
    start_urls = ['']

    def parse(self, response):
        SET_SELECTOR = '.set'
        for brickset in response.css(SET_SELECTOR):

            NAME_SELECTOR = 'h1 ::text'
            PIECES_SELECTOR = './/dl[dt/text() = "Pieces"]/dd/a/text()'
            MINIFIGS_SELECTOR = './/dl[dt/text() = "Minifigs"]/dd[2]/a/text()'
            IMAGE_SELECTOR = 'img ::attr(src)'
            yield {
                'name': brickset.css(NAME_SELECTOR).extract_first(),
                'pieces': brickset.xpath(PIECES_SELECTOR).extract_first(),
                'minifigs': brickset.xpath(MINIFIGS_SELECTOR).extract_first(),
                'image': brickset.css(IMAGE_SELECTOR).extract_first(),
            }

Save your changes and run the scraper again:

scrapy runspider

Now you’ll see that new data in the program’s output:

Output
2016-09-22 23:52:37 [scrapy] DEBUG: Scraped from <200>
{'minifigs': '5', 'pieces': '2380', 'name': 'Brick Bank', 'image': ''}
2016-09-22 23:52:37 [scrapy] DEBUG: Scraped from <200>
{'minifigs': None, 'pieces': '1167', 'name': 'Volkswagen Beetle', 'image': ''}
2016-09-22 23:52:37 [scrapy] DEBUG: Scraped from <200>
{'minifigs': None, 'pieces': '4163', 'name': 'Big Ben', 'image': ''}
2016-09-22 23:52:37 [scrapy] DEBUG: Scraped from <200>
{'minifigs': None, 'pieces': None, 'name': 'Winter Holiday Train', 'image': ''}
2016-09-22 23:52:37 [scrapy] DEBUG: Scraped from <200>
{'minifigs': None, 'pieces': None, 'name': 'XL Creative Brick Box', 'image': '/assets/images/misc/blankbox.gif'}
2016-09-22 23:52:37 [scrapy] DEBUG: Scraped from <200>
{'minifigs': None, 'pieces': '583', 'name': 'Creative Building Set', 'image': ''}
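Notice that the counts come back as strings and that missing values appear as None. If you want numbers to work with, a small cleanup step along these lines could help. The to_int helper is hypothetical, and the two sample records are copied from the output above with the image field left out.

```python
def to_int(value):
    """Convert a scraped string such as '2380' to an int; pass None through."""
    if value is None:
        return None
    return int(value.strip())


# Items copied from the sample output, trimmed to the fields we clean.
scraped = [
    {"name": "Brick Bank", "pieces": "2380", "minifigs": "5"},
    {"name": "Winter Holiday Train", "pieces": None, "minifigs": None},
]

cleaned = [
    {**item, "pieces": to_int(item["pieces"]), "minifigs": to_int(item["minifigs"])}
    for item in scraped
]
print(cleaned)
```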

Now let’s turn this scraper into a spider that follows links.

Step 3 — Crawling Multiple Pages

We’ve successfully extracted data from that initial page, but we’re not progressing past it to see the rest of the results. The whole point of a spider is to detect and traverse links to other pages and grab data from those pages too.

You’ll notice that the top and bottom of each page have a little right caret (>) that links to the next page of results. Here’s the HTML for that:

<ul class="pagelength">


  <li class="next">
    <a href="">&#8250;</a>
  <li class="last">
    <a href="">&#187;</a>

As you can see, there’s an li tag with the class of next, and inside that tag, there’s an a tag with a link to the next page. All we have to do is tell the scraper to follow that link if it exists.

Modify your code as follows:

class BrickSetSpider(scrapy.Spider):
    name = 'brick_spider'
    start_urls = ['']

    def parse(self, response):
        SET_SELECTOR = '.set'
        for brickset in response.css(SET_SELECTOR):

            NAME_SELECTOR = 'h1 ::text'
            PIECES_SELECTOR = './/dl[dt/text() = "Pieces"]/dd/a/text()'
            MINIFIGS_SELECTOR = './/dl[dt/text() = "Minifigs"]/dd[2]/a/text()'
            IMAGE_SELECTOR = 'img ::attr(src)'
            yield {
                'name': brickset.css(NAME_SELECTOR).extract_first(),
                'pieces': brickset.xpath(PIECES_SELECTOR).extract_first(),
                'minifigs': brickset.xpath(MINIFIGS_SELECTOR).extract_first(),
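                'image': brickset.css(IMAGE_SELECTOR).extract_first(),
            }

        NEXT_PAGE_SELECTOR = '.next a ::attr(href)'
        next_page = response.css(NEXT_PAGE_SELECTOR).extract_first()
        if next_page:
            # Resolve the link against the current page and parse it with this same method.
            yield scrapy.Request(
                response.urljoin(next_page),
                callback=self.parse
            )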

First, we define a selector for the “next page” link, extract the first match, and check if it exists. The scrapy.Request is a value that we return saying “Hey, crawl this page,” and callback=self.parse says “once you’ve gotten the HTML from this page, pass it back to this method so we can parse it, extract the data, and find the next page.”

This means that once we go to the next page, we’ll look for a link to the next page there, and on that page we’ll look for a link to the next page, and so on, until we don’t find a link for the next page. This is the key piece of web scraping: finding and following links. In this example, it’s very linear; one page has a link to the next page until we’ve hit the last page. But you could follow links to tags, or other search results, or any other URL you’d like.

Now, if you save your code and run the spider again you’ll see that it doesn’t just stop once it iterates through the first page of sets. It keeps on going through all 779 matches on 23 pages! In the grand scheme of things it’s not a huge chunk of data, but now you know the process by which you automatically find new pages to scrape.
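The follow-the-next-link pattern can also be sketched without Scrapy or a network connection. In the example below, FAKE_SITE is a hypothetical three-page site held in a dictionary, the example.com URLs are placeholders, and crawl is a name invented for this sketch. It uses urljoin from the standard library to turn a relative link into a full URL before following it.

```python
from urllib.parse import urljoin

# Hypothetical site: each URL maps to (items on the page, href of the "next" link).
FAKE_SITE = {
    "http://example.com/sets/page-1": (["Brick Bank", "Big Ben"], "/sets/page-2"),
    "http://example.com/sets/page-2": (["Volkswagen Beetle"], "page-3"),
    "http://example.com/sets/page-3": (["Winter Holiday Train"], None),
}


def crawl(start_url):
    """Yield every item, following "next" links until a page has none."""
    url = start_url
    visited = set()
    while url is not None and url not in visited:
        visited.add(url)
        items, next_href = FAKE_SITE[url]
        yield from items
        # Resolve the (possibly relative) href against the current page's URL.
        url = urljoin(url, next_href) if next_href else None


print(list(crawl("http://example.com/sets/page-1")))
```

The visited set is a small safeguard against pages that link back to each other, which would otherwise keep the loop running indefinitely.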

Here’s our completed code for this tutorial, using Python-specific highlighting:

import scrapy

class BrickSetSpider(scrapy.Spider):
    name = 'brick_spider'
    start_urls = ['']

    def parse(self, response):
        SET_SELECTOR = '.set'
        for brickset in response.css(SET_SELECTOR):

            NAME_SELECTOR = 'h1 ::text'
            PIECES_SELECTOR = './/dl[dt/text() = "Pieces"]/dd/a/text()'
            MINIFIGS_SELECTOR = './/dl[dt/text() = "Minifigs"]/dd[2]/a/text()'
            IMAGE_SELECTOR = 'img ::attr(src)'
            yield {
                'name': brickset.css(NAME_SELECTOR).extract_first(),
                'pieces': brickset.xpath(PIECES_SELECTOR).extract_first(),
                'minifigs': brickset.xpath(MINIFIGS_SELECTOR).extract_first(),
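                'image': brickset.css(IMAGE_SELECTOR).extract_first(),
            }

        NEXT_PAGE_SELECTOR = '.next a ::attr(href)'
        next_page = response.css(NEXT_PAGE_SELECTOR).extract_first()
        if next_page:
            # Resolve the link against the current page and parse it with this same method.
            yield scrapy.Request(
                response.urljoin(next_page),
                callback=self.parse
            )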


In this tutorial you built a fully functional spider that extracts data from web pages in fewer than thirty lines of code. That’s a great start, but there are a lot of fun things you can do with this spider. Here are some ways you could expand the code you’ve written. They’ll give you some practice scraping data.

  1. Right now we’re only parsing results from 2016, as you might have guessed from the 2016 part of the start URL. How would you crawl results from other years?
  2. There’s a retail price included on most sets. How do you extract the data from that cell? How would you get a raw number out of it? Hint: you’ll find the data in a dt just like the number of pieces and minifigs.
  3. Most of the results have tags that specify semantic data about the sets or their context. How do we crawl these, given that there are multiple tags for a single set?
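For the second exercise, one possible starting point for turning price text into a raw number is a small regular expression. This is only a sketch: the exact format of the price text on the page is an assumption, the sample strings are invented, and parse_price is a hypothetical helper.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first number found in text as a Decimal, or None."""
    if not text:
        return None
    # Drop thousands separators, then look for digits with an optional fraction.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return Decimal(match.group()) if match else None


print(parse_price("$169.99"))
print(parse_price("n/a"))
```

Decimal is used instead of float so that currency amounts keep their exact written value.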

Originally published by Justin Duke.





