Joshua Rowe

1574136022

Getting Started with Puppeteer and Nodejs

Browser developer tools provide an amazing array of options for delving under the hood of websites and web apps. These capabilities can be further enhanced and automated by third-party tools. In this article, we’ll look at Puppeteer, a Node-based library for use with Chrome/Chromium.

The Puppeteer website describes Puppeteer as:

a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

Puppeteer is made by the team behind Google Chrome, so you can be pretty sure it will be well maintained. It lets us perform common actions on the Chromium browser, programmatically through JavaScript, via a simple and easy-to-use API.

With Puppeteer, you can:

  • scrape websites
  • generate screenshots of websites including SVG and Canvas
  • create PDFs of websites
  • crawl an SPA (single-page application)
  • access web pages and extract information using the standard DOM API
  • generate pre-rendered content — that is, server-side rendering
  • automate form submission
  • automate performance analysis
  • automate UI testing, as Cypress does
  • test Chrome extensions

Puppeteer does nothing that Selenium, PhantomJS (which is now deprecated), and similar tools can’t do, but it provides a simple, easy-to-use API that abstracts away the nitty-gritty details of driving a browser.

It’s also actively maintained, so we get new ECMAScript features as soon as Chromium supports them.

Prerequisites

For this tutorial, you need a basic knowledge of JavaScript, ES6+ and Node.js.

You also need a recent version of Node.js installed.

We’ll be using yarn throughout this tutorial. If you don’t have yarn installed yet, follow the installation instructions on the Yarn website.

To make sure we’re on the same page, these are the versions used in this tutorial:

  • Node 12.12.0
  • yarn 1.19.1
  • puppeteer 2.0.0

Installation

To use Puppeteer in your project, run the following command in the terminal:

$ yarn add puppeteer

Note: when you install Puppeteer, it downloads a recent version of Chromium (~170MB macOS, ~282MB Linux, ~280MB Win) that is guaranteed to work with the API. To skip the download, see Environment variables.

If you don’t need to download Chromium, then you can install puppeteer-core:

$ yarn add puppeteer-core

puppeteer-core is intended to be a lightweight version of Puppeteer for launching an existing browser installation or for connecting to a remote one. Be sure that the version of puppeteer-core you install is compatible with the browser you intend to connect to.

Note: puppeteer-core has only been published since version 1.7.0.
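As a rough sketch of how puppeteer-core might be pointed at an existing browser: the launch call accepts an executablePath option. The buildLaunchOptions helper and the CHROME_PATH environment variable below are names made up for this illustration, and the path has to match a browser that is actually installed on your machine.

```javascript
// Hypothetical helper: builds launch options for an already-installed browser.
// puppeteer-core does not download Chromium, so an executablePath is needed.
function buildLaunchOptions(executablePath, headless = true) {
  if (!executablePath) {
    throw new Error('Set a path to a local Chrome/Chromium binary')
  }
  return { executablePath, headless }
}

// Launches the local browser. puppeteer-core is required lazily so the
// helper above can be used even when the package is not installed.
async function launchLocalBrowser(env = process.env) {
  const puppeteer = require('puppeteer-core')
  // CHROME_PATH is an assumed variable name; Puppeteer does not read it itself.
  return puppeteer.launch(buildLaunchOptions(env.CHROME_PATH))
}
```

Calling launchLocalBrowser() returns the same kind of browser object as puppeteer.launch(), so the examples in the rest of this tutorial should work unchanged with it.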

Usage

Puppeteer versions before 1.18.1 required at least Node v6.4.0, and later versions follow the latest maintenance LTS release of Node. We’re going to use async/await, which is only supported in Node v7.6.0 or greater, so make sure your Node.js is up to date.
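If you want a script to check this for itself, a minimal sketch might look like the following. The function names are made up for this example; process.versions.node is the standard way to read the running Node version.

```javascript
// Returns true when `version` (e.g. "12.12.0") is at least `minimum`.
// Compares major, minor and patch numerically, so "7.10.0" > "7.6.0".
function isAtLeast(version, minimum) {
  const a = version.split('.').map(Number)
  const b = minimum.split('.').map(Number)
  for (let i = 0; i < 3; i++) {
    const left = a[i] || 0
    const right = b[i] || 0
    if (left > right) return true
    if (left < right) return false
  }
  return true // all three parts are equal
}

// process.versions.node holds the running version without the "v" prefix.
function supportsAsyncAwait(nodeVersion = process.versions.node) {
  return isAtLeast(nodeVersion, '7.6.0')
}
```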

Let’s dive into some practical examples using Puppeteer. In this tutorial, we’ll be:

  1. generating a screenshot of Unsplash using Puppeteer
  2. creating a PDF of Hacker News using Puppeteer
  3. signing in to Facebook using Puppeteer

1. Generate a Screenshot of Unsplash using Puppeteer

It’s really easy to do this with Puppeteer. Go ahead and create a screenshot.js file in the root of your project. Then paste in the following code:

const puppeteer = require('puppeteer')

const main = async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://unsplash.com')
  await page.screenshot({ path: 'unsplash.png' })

  await browser.close()
}

main()

First, we require the puppeteer package. Then we call its launch method, which starts a browser. This method is asynchronous and returns a Promise, so we await it to get the browser instance.

Next, we call newPage on the browser, navigate to Unsplash, take a screenshot, and save it as unsplash.png.
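One thing the script above leaves out is cleanup on failure: if goto or screenshot throws, browser.close() is never reached and the Chromium process can be left running. A possible way to handle this, sketched with a made-up helper named withPage, is a try/finally wrapper:

```javascript
// Runs `task(page)` against a fresh page and always closes the browser,
// even when the task throws. `launch` is any function that returns a
// browser, for example () => require('puppeteer').launch().
async function withPage(launch, task) {
  const browser = await launch()
  try {
    const page = await browser.newPage()
    return await task(page)
  } finally {
    await browser.close()
  }
}

// Example use (not executed here):
// withPage(() => require('puppeteer').launch(), async page => {
//   await page.goto('https://unsplash.com')
//   await page.screenshot({ path: 'unsplash.png' })
// }).catch(console.error)
```

Passing the launcher in as a function is just a design choice for this sketch; it keeps the wrapper independent of whether puppeteer or puppeteer-core is used.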

Now go ahead and run the above code in the terminal by typing:

$ node screenshot

Unsplash - 800px x 600px resolution

After 5–10 seconds, you’ll see an unsplash.png file in your project containing the screenshot of Unsplash. Notice that the viewport is 800px x 600px: Puppeteer uses this as the initial page size, which defines the screenshot size. The page size can be customized with page.setViewport().

Let’s change the viewport to be 1920px x 1080px. Insert the following code before the goto method:

await page.setViewport({
  width: 1920,
  height: 1080,
  deviceScaleFactor: 1,
})

Now go ahead and also change the filename from unsplash.png to unsplash2.png in the screenshot method like so:

await page.screenshot({ path: 'unsplash2.png' })

The whole screenshot.js file should now look like this:

const puppeteer = require('puppeteer')

const main = async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.setViewport({
    width: 1920,
    height: 1080,
    deviceScaleFactor: 1,
  })
  await page.goto('https://unsplash.com')
  await page.screenshot({ path: 'unsplash2.png' })

  await browser.close()
}

main()

Unsplash - 1920px x 1080px
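The deviceScaleFactor option is worth a closer look. Assuming a regular (not full-page) screenshot, the saved image should measure the viewport size multiplied by the scale factor, so a factor of 2 gives a "retina" image. The helper below is a made-up illustration of that arithmetic, not part of Puppeteer:

```javascript
// Estimates the pixel dimensions of a viewport-sized screenshot.
// The viewport is given in CSS pixels; deviceScaleFactor multiplies it.
function screenshotPixelSize({ width, height, deviceScaleFactor = 1 }) {
  return {
    width: Math.round(width * deviceScaleFactor),
    height: Math.round(height * deviceScaleFactor),
  }
}

// The viewport used in this tutorial, and a 2x variant of it.
const standard = screenshotPixelSize({ width: 1920, height: 1080, deviceScaleFactor: 1 })
const retina = screenshotPixelSize({ width: 1920, height: 1080, deviceScaleFactor: 2 })
console.log(standard) // { width: 1920, height: 1080 }
console.log(retina) // { width: 3840, height: 2160 }
```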

2. Create a PDF of Hacker News Using Puppeteer

Now create a file named pdf.js and paste the following code into it:

const puppeteer = require('puppeteer')

const main = async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://news.ycombinator.com', { waitUntil: 'networkidle2' })
  await page.pdf({ path: 'hn.pdf', format: 'A4' })

  await browser.close()
}

main()

We’ve only changed two lines from the screenshot code.

Firstly, we’ve replaced the URL with Hacker News and then added networkidle2:

await page.goto('https://news.ycombinator.com', { waitUntil: 'networkidle2' })

networkidle2 comes in handy for pages that do long polling or other background activity. With it, navigation is considered finished when there have been no more than two network connections for at least 500ms.
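For comparison, page.goto accepts four waitUntil values. The sketch below lists them with a short description of each and wraps them in a small validating helper; the WAIT_UNTIL table and the gotoOptions function are made up for this example, while the option names and the 30-second default timeout come from the Puppeteer API.

```javascript
// The waitUntil values accepted by page.goto, and what each one waits for.
const WAIT_UNTIL = {
  load: 'the load event has fired (the default)',
  domcontentloaded: 'the DOMContentLoaded event has fired',
  networkidle0: 'no network connections for at least 500ms',
  networkidle2: 'no more than 2 network connections for at least 500ms',
}

// Builds the options object for page.goto and rejects unknown values early.
function gotoOptions(waitUntil = 'load', timeout = 30000) {
  if (!Object.prototype.hasOwnProperty.call(WAIT_UNTIL, waitUntil)) {
    throw new Error(`Unknown waitUntil value: ${waitUntil}`)
  }
  return { waitUntil, timeout } // timeout is in milliseconds
}

// Example use (not executed here):
// await page.goto('https://news.ycombinator.com', gotoOptions('networkidle2'))
```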

Then we called the pdf method to create a PDF named hn.pdf, formatted to A4 size:

await page.pdf({ path: 'hn.pdf', format: 'A4' })

That’s it. We can now run the file to generate a PDF of Hacker News. Let’s go ahead and run the following command in the terminal:

$ node pdf

This will generate a PDF file called hn.pdf in the root directory of the project in A4 size.
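Besides path and format, page.pdf takes further options such as landscape, printBackground and margin. The helper below is a hypothetical convenience wrapper showing how they might be combined; the 1cm default margin is an arbitrary choice for the example. Note also that, in this version of Puppeteer, PDF generation is only supported in headless mode.

```javascript
// Hypothetical helper: builds an options object for page.pdf.
function buildPdfOptions(path, { landscape = false, margin = '1cm' } = {}) {
  return {
    path, // where the PDF is written
    format: 'A4', // paper size, as used in this tutorial
    landscape, // false means portrait orientation
    printBackground: true, // include CSS background colors and images
    margin: { top: margin, right: margin, bottom: margin, left: margin },
  }
}

// Example use (not executed here):
// await page.pdf(buildPdfOptions('hn.pdf', { landscape: true }))
```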

3. Sign In to Facebook Using Puppeteer

Create a new file called signin.js with the following code:

const puppeteer = require('puppeteer')

const SECRET_EMAIL = 'example@gmail.com'
const SECRET_PASSWORD = 'secretpass123'

const main = async () => {
  const browser = await puppeteer.launch({
    headless: false,
  })
  const page = await browser.newPage()
  await page.goto('https://facebook.com', { waitUntil: 'networkidle2' })
  await page.waitForSelector('#login_form')
  await page.type('input#email', SECRET_EMAIL)
  await page.type('input#pass', SECRET_PASSWORD)
  await page.click('#loginbutton')
  // await browser.close()
}

main()

We’ve created two variables, SECRET_EMAIL and SECRET_PASSWORD, which you should replace with your own Facebook email and password.
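Hard-coding credentials in a source file makes them easy to commit by accident. One common alternative, sketched here under the assumption that you export two environment variables named FB_EMAIL and FB_PASSWORD (names invented for this example), is to read them at startup:

```javascript
// Reads the login credentials from environment variables.
// FB_EMAIL and FB_PASSWORD are assumed names chosen for this sketch.
function readCredentials(env = process.env) {
  const email = env.FB_EMAIL
  const password = env.FB_PASSWORD
  if (!email || !password) {
    throw new Error('Set FB_EMAIL and FB_PASSWORD before running the script')
  }
  return { email, password }
}

// Example use (not executed here):
// const { email, password } = readCredentials()
// await page.type('input#email', email)
// await page.type('input#pass', password)
```

With this in place, the two constants at the top of signin.js would no longer be needed, and the variables would be set in the shell before running the script.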

We then launch the browser with headless set to false, which opens a full Chromium window.

Then we go to Facebook and wait until the network is nearly idle (networkidle2).

On Facebook’s login page, the login form matches the #login_form selector, which you can find by inspecting the page in DevTools. We wait for it to appear using the waitForSelector method.

Then we have to type our email and password, so we grab the selectors input#email and input#pass from DevTools and pass in our SECRET_EMAIL and SECRET_PASSWORD.

After that, we click the #loginbutton to log in to Facebook.
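If the script needs to do anything after logging in, it has to wait for the navigation that the click triggers. A pattern that is often used for this is to start page.waitForNavigation before clicking and await both together. The helper name below is made up, and using networkidle2 here is simply carried over from the goto call above:

```javascript
// Clicks `selector` and resolves once the resulting navigation has finished.
// waitForNavigation is started before the click so that a fast navigation
// cannot complete before we begin listening for it.
async function clickAndWaitForNavigation(page, selector) {
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil: 'networkidle2' }),
    page.click(selector),
  ])
  return response
}

// Example use (not executed here):
// await clickAndWaitForNavigation(page, '#loginbutton')
```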

The browser.close() call at the end of main is commented out so that we can watch the whole process of typing the email and password and clicking the login button.

Go ahead and run the code by typing the following in the terminal:

$ node signin

This will launch a whole Chromium browser and then log in to Facebook.

Conclusion

In this tutorial, we made a project that creates a screenshot of any given page within a specified viewport. We also built a project that creates a PDF of any website. Finally, we signed in to Facebook programmatically.

Puppeteer recently released version 2, and it’s a nice piece of software to automate trivial tasks with a simple and easy-to-use API.

You can learn more about Puppeteer on its official website. The docs are very good, with tons of examples, and everything is well documented.

Now go ahead and automate boring tasks in your day-to-day life with Puppeteer.

#nodejs #Puppeteer


