Easy Access to the World’s Largest Data Source

In data science, data matters long before we start building state-of-the-art algorithms. Without enough suitable data, we cannot train models well enough to get satisfying results.

Wikipedia, being the largest encyclopedia in the world, can serve as a great data source for many projects. There are many web scraping tools and frameworks for getting data from Wikipedia. However, the Wikipedia API for Python might be the simplest one to use.

In this post, we will see how to use the Wikipedia API to:

  • Access the content of a particular page
  • Search for pages

The library is published under the package name wikipedia, so it is easy to install and import. I will be using Google Colab, so here is how it is done in Colab:

pip install wikipedia
import wikipedia

A page can be retrieved with the page function, which takes the title of the page as an argument. The following code returns the Support vector machine page as a WikipediaPage object.

page_svm = wikipedia.page("Support vector machine")

type(page_svm)
wikipedia.wikipedia.WikipediaPage

This object holds the URL of the page, which can be accessed with the url attribute.

page_svm.url

https://en.wikipedia.org/wiki/Support_vector_machine

We can access the text of the page with the content attribute, which returns it as a single string.

svm_content = page_svm.content

type(svm_content)
str
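Since the content comes back as one long string, it can help to break it into sections before using it as data. The sketch below assumes that section headings in the string sit on their own line wrapped in equals signs (for example "== Motivation =="), which is how the plain text I have seen from this library is laid out. The split_sections helper and the sample string are hypothetical and only for illustration; in practice you would pass svm_content instead of sample.

```python
import re

# Assumption: a heading is a full line such as "== Title ==" or "=== Title ===".
HEADING = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def split_sections(content):
    """Return a list of (heading, text) pairs.

    The text before the first heading is labeled "Introduction".
    """
    sections = []
    heading = "Introduction"
    start = 0
    for match in HEADING.finditer(content):
        # Close the section that ends where this heading begins.
        sections.append((heading, content[start:match.start()].strip()))
        heading = match.group(2)
        start = match.end()
    # Whatever follows the last heading belongs to the last section.
    sections.append((heading, content[start:].strip()))
    return sections


# Hypothetical sample that imitates the layout of a page's content string.
sample = (
    "A support vector machine is a supervised learning model.\n\n\n"
    "== Motivation ==\nClassifying data is a common task.\n\n\n"
    "=== Linear SVM ===\nA linear classifier separates the classes.\n"
)

for heading, text in split_sections(sample):
    print(f"{heading}: {len(text)} characters")
```

Subsections are kept in one flat list here. If the nesting matters for your project, the number of equals signs in match.group(1) gives the heading level.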
