Ruby Web Scraping Using Nokogiri

Ruby has an amazing web scraping gem called Nokogiri. Among other features, it allows you to search HTML documents by CSS selectors. That means if we know the ids, classes, or even types of elements where the data is stored in the DOM, we're able to pluck it out.

Installation

To add Nokogiri to your application, run the following command:

gem install nokogiri

After installation, make sure you require both nokogiri and open-uri at the top of your script:

require 'nokogiri'
require 'open-uri'

Scraping from Website

To scrape a website, you need the URL of the page you want to scrape. Pass the URL to the URI.open method to fetch the HTML. Then pass the result to the Nokogiri::HTML method, which returns a parsed document whose nodes you can search using Nokogiri.

url = 'https://www.101cookbooks.com/ingredient.html'
html = URI.open(url)
doc = Nokogiri::HTML(html)

Process Data

Scraping data from the website is a bit complicated. You need to figure out where the data you want to read sits in the DOM. One way to do this is to inspect the page in your browser and hover over the element in the HTML view. The pop-up shows the CSS selector of that element, which you can use with Nokogiri.

content = doc.css("div.maincontent.fullarchives.ingredients.col-lg-8.col-xl-8")

In this case we want to read all the ingredients on this website. The ingredients are grouped alphabetically, but all the groups sit inside a single div container. The css method returns a set of matching nodes, and that container is the first node in the set, so we take the first index and go through its children.

If we inspect each group of ingredients, we see that each one is contained in a div with the classes “archives” and “flex-wrap”, which we can check for. After that we need to look at the children of each group.
