Joshua Rowe

1574136022

Getting Started with Puppeteer and Nodejs

Browser developer tools provide an amazing array of options for delving under the hood of websites and web apps. These capabilities can be further enhanced and automated by third-party tools. In this article, we’ll look at Puppeteer, a Node-based library for use with Chrome/Chromium.

The Puppeteer website describes Puppeteer as:

a Node library which provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default, but can be configured to run full (non-headless) Chrome or Chromium.

Puppeteer is made by the team behind Google Chrome, so you can be pretty sure it will be well maintained. It lets us perform common actions on the Chromium browser, programmatically through JavaScript, via a simple and easy-to-use API.

With Puppeteer, you can:

  • scrape websites
  • generate screenshots of websites including SVG and Canvas
  • create PDFs of websites
  • crawl an SPA (single-page application)
  • access web pages and extract information using the standard DOM API
  • generate pre-rendered content — that is, server-side rendering
  • automate form submission
  • automate performance analysis
  • automate UI testing like Cypress
  • test Chrome extensions

Puppeteer does nothing that Selenium, PhantomJS (which is now deprecated), and the like can’t already do, but it provides a simple, easy-to-use API and a great abstraction, so we don’t have to worry about the nitty-gritty details when dealing with it.

It’s also actively maintained, so we get the new ECMAScript features as Chromium supports them.

Prerequisites

For this tutorial, you need a basic knowledge of JavaScript, ES6+ and Node.js.

You must also have installed the latest version of Node.js.

We’ll be using yarn throughout this tutorial. If you don’t already have yarn installed, install it before continuing.

To make sure we’re on the same page, these are the versions used in this tutorial:

  • Node 12.12.0
  • yarn 1.19.1
  • puppeteer 2.0.0

Installation

To use Puppeteer in your project, run the following command in the terminal:

$ yarn add puppeteer

Note: when you install Puppeteer, it downloads a recent version of Chromium (~170MB macOS, ~282MB Linux, ~280MB Win) that is guaranteed to work with the API. To skip the download, set the PUPPETEER_SKIP_CHROMIUM_DOWNLOAD environment variable; see Environment variables in the Puppeteer documentation.

If you don’t need to download Chromium, then you can install puppeteer-core:

$ yarn add puppeteer-core

puppeteer-core is intended to be a lightweight version of Puppeteer for launching an existing browser installation or for connecting to a remote one. Be sure that the version of puppeteer-core you install is compatible with the browser you intend to connect to.

Note: puppeteer-core is only published from version 1.7.0.

Usage

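Puppeteer 2.0.0, the version used here, requires Node v8.9.0 or later (releases before v1.18.1 ran on Node v6.4.0). We’re going to use async/await, which Node has supported since v7.6.0, so any Node version that can run Puppeteer 2 can run these examples. Updating Node.js to the latest version still gets you all the goodies.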

Let’s dive into some practical examples using Puppeteer. In this tutorial, we’ll be:

  1. generating a screenshot of Unsplash using Puppeteer
  2. creating a PDF of Hacker News using Puppeteer
  3. signing in to Facebook using Puppeteer

1. Generate a Screenshot of Unsplash using Puppeteer

It’s really easy to do this with Puppeteer. Go ahead and create a screenshot.js file in the root of your project. Then paste in the following code:

const puppeteer = require('puppeteer')

const main = async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://unsplash.com')
  await page.screenshot({ path: 'unsplash.png' })

  await browser.close()
}

main()

First, we require the puppeteer package. Then we call its launch method, which starts a browser instance. This method is asynchronous and returns a Promise, so we await it to get the browser instance.

Then we call newPage on the browser, navigate to Unsplash, take a screenshot, and save it as unsplash.png.

Now go ahead and run the above code in the terminal by typing:

$ node screenshot

Unsplash - 800px x 600px resolution

After 5–10 seconds, you’ll see an unsplash.png file in your project that contains the screenshot of Unsplash. Notice that the screenshot is 800px x 600px: Puppeteer sets this as the initial page size, which defines the screenshot size. The page size can be customized with page.setViewport().

Let’s change the viewport to be 1920px x 1080px. Insert the following code before the goto method:

await page.setViewport({
  width: 1920,
  height: 1080,
  deviceScaleFactor: 1,
})

Now go ahead and also change the filename from unsplash.png to unsplash2.png in the screenshot method like so:

await page.screenshot({ path: 'unsplash2.png' })

The whole screenshot.js file should now look like this:

const puppeteer = require('puppeteer')

const main = async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.setViewport({
    width: 1920,
    height: 1080,
    deviceScaleFactor: 1,
  })
  await page.goto('https://unsplash.com')
  await page.screenshot({ path: 'unsplash2.png' })

  await browser.close()
}

main()

Unsplash - 1920px x 1080px
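
One thing the script above does not handle is failure: if page.goto or page.screenshot rejects, browser.close() is never reached and the Chromium process may keep running. A minimal sketch of one way to guard against that is a try/finally wrapper. The helper name withBrowser and the stand-in fakeLaunch are hypothetical and are not part of Puppeteer; the stand-in only exists so the sketch runs without downloading Chromium.

```javascript
// Sketch: run a task against a browser and always close it afterwards.
// `launch` is any function that returns a Promise of an object with close(),
// for example () => puppeteer.launch(). withBrowser is a hypothetical helper.
async function withBrowser(launch, task) {
  const browser = await launch()
  try {
    // Whatever the task returns (or throws) is passed through to the caller.
    return await task(browser)
  } finally {
    // Runs on success and on failure, so the browser never lingers.
    await browser.close()
  }
}

// Stand-in for puppeteer.launch (an assumption for illustration only).
const fakeLaunch = async () => ({
  closed: false,
  async close() {
    this.closed = true
  },
})

// Example run with the stand-in browser.
withBrowser(fakeLaunch, async (browser) => 'screenshot taken').then((result) => {
  console.log(result)
})
```

With Puppeteer installed, the same wrapper could be called as withBrowser(() => puppeteer.launch(), async (browser) => { ... }), with the newPage, goto and screenshot calls from screenshot.js moved into the task function.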

2. Create PDF of Hacker News using Puppeteer

Now create a file named pdf.js and paste the following code into it:

const puppeteer = require('puppeteer')

const main = async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://news.ycombinator.com', { waitUntil: 'networkidle2' })
  await page.pdf({ path: 'hn.pdf', format: 'A4' })

  await browser.close()
}

main()

We’ve only changed two lines from the screenshot code.

Firstly, we’ve replaced the URL with Hacker News and then added networkidle2:

await page.goto('https://news.ycombinator.com', { waitUntil: 'networkidle2' })

networkidle2 considers navigation to be finished when there have been no more than two network connections for at least 500ms. It comes in handy for pages that do long polling or other background activity, where the network never goes completely quiet.

Then we called the pdf method to create a PDF, named it hn.pdf, and formatted it to be A4 size:

await page.pdf({ path: 'hn.pdf', format: 'A4' })

That’s it. We can now run the file to generate a PDF of Hacker News. Let’s go ahead and run the following command in the terminal:

$ node pdf

This will generate a PDF file called hn.pdf in the root directory of the project in A4 size.
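
To make the networkidle2 rule more concrete, here is a small sketch of the idea "no more than two connections for at least 500ms" applied to a made-up timeline of requests. This is not Puppeteer's implementation; the function name firstIdleTime and the event format are assumptions chosen for illustration, and the timeline is assumed to start at t = 0 with no open connections.

```javascript
// Sketch of the "at most two connections for at least 500 ms" rule.
// NOT Puppeteer's code: firstIdleTime and the event shape are hypothetical.
// Each event is { t: time in ms, delta: +1 (request starts) or -1 (request ends) },
// and the events must be sorted by t.
function firstIdleTime(events, maxInflight = 2, idleMs = 500) {
  let inflight = 0
  // Start of the current stretch with inflight <= maxInflight, or null when busy.
  let quietSince = 0
  for (const { t, delta } of events) {
    // If the stretch before this event was already long enough, we are done.
    if (quietSince !== null && t - quietSince >= idleMs) {
      return quietSince + idleMs
    }
    inflight += delta
    if (inflight > maxInflight) {
      quietSince = null // too many connections, no quiet stretch in progress
    } else if (quietSince === null) {
      quietSince = t // just dropped back to the threshold
    }
  }
  // After the last event nothing changes, so a quiet stretch runs to completion.
  return quietSince === null ? null : quietSince + idleMs
}

// Three requests start early; two of them finish at 300 ms and 400 ms.
const timeline = [
  { t: 0, delta: 1 },
  { t: 10, delta: 1 },
  { t: 20, delta: 1 },
  { t: 300, delta: -1 },
  { t: 400, delta: -1 },
]
console.log(firstIdleTime(timeline)) // 800
```

In this timeline the connection count drops back to two at 300ms, so the 500ms quiet window completes at 800ms, even though one request is still open.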

3. Sign In to Facebook Using Puppeteer

Create a new file called signin.js with the following code:

const puppeteer = require('puppeteer')

const SECRET_EMAIL = 'example@gmail.com'
const SECRET_PASSWORD = 'secretpass123'

const main = async () => {
  const browser = await puppeteer.launch({
    headless: false,
  })
  const page = await browser.newPage()
  await page.goto('https://facebook.com', { waitUntil: 'networkidle2' })
  await page.waitForSelector('#login_form')
  await page.type('input#email', SECRET_EMAIL)
  await page.type('input#pass', SECRET_PASSWORD)
  await page.click('#loginbutton')
  // await browser.close()
}

main()

We’ve created two variables, SECRET_EMAIL and SECRET_PASSWORD, which should be replaced by your Facebook email and password.

We then launch the browser and set headless mode to false to launch a full version of Chromium browser.

Then we go to Facebook and wait until everything is loaded.

On Facebook, there’s a #login_form selector that can be found via DevTools. This selector matches the login form, so we wait for it using the waitForSelector method.

Then we have to type our email and password, so we grab the selectors input#email and input#pass from DevTools and pass in our SECRET_EMAIL and SECRET_PASSWORD.

After that, we click the #loginbutton to log in to Facebook.

The last line is commented out so that we see the whole process of typing email and password and clicking the login button.

Go ahead and run the code by typing the following in the terminal:

$ node signin

This will launch a whole Chromium browser and then log in to Facebook.
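
Hardcoding an email and password in signin.js makes it easy to commit them by accident. One common alternative, sketched here, is to read them from environment variables. The function name readCredentials and the variable names FB_EMAIL and FB_PASSWORD are hypothetical choices for this example.

```javascript
// Sketch: read the login details from environment variables instead of
// hardcoding them. FB_EMAIL and FB_PASSWORD are hypothetical variable names.
function readCredentials(env = process.env) {
  const email = env.FB_EMAIL
  const password = env.FB_PASSWORD
  if (!email || !password) {
    // Fail early with a clear message rather than typing "undefined" into the form.
    throw new Error('Set FB_EMAIL and FB_PASSWORD before running signin.js')
  }
  return { email, password }
}

// Passing an explicit object keeps the example independent of the real environment.
const credentials = readCredentials({ FB_EMAIL: 'example@gmail.com', FB_PASSWORD: 'secretpass123' })
console.log(credentials.email) // example@gmail.com
```

In signin.js, the returned email and password would then be passed to page.type in place of SECRET_EMAIL and SECRET_PASSWORD.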

Conclusion

In this tutorial, we made a project that creates a screenshot of any given page within a specified viewport. We also built a project where we can create a PDF of any website. We then programmatically managed to sign in to Facebook.

Puppeteer recently released version 2, and it’s a nice piece of software to automate trivial tasks with a simple and easy-to-use API.

You can learn more about Puppeteer on its official website. The docs are very good, with tons of examples, and everything is well documented.

Now go ahead and automate boring tasks in your day-to-day life with Puppeteer.

#nodejs #Puppeteer

Shubham Ankit

1657081614

How to Automate Excel with Python | Python Excel Tutorial (OpenPyXL)

How to Automate Excel with Python

In this article, we will show how to use Python to automate Excel. We will do this with Openpyxl, a useful Python library for Excel automation.

What is OPENPYXL

Openpyxl is a Python library that is used to read from an Excel file or write to an Excel file. Data scientists use Openpyxl for data analysis, data copying, data mining, drawing charts, styling sheets, adding formulas, and more.

Workbook: A spreadsheet is represented as a workbook in openpyxl. A workbook consists of one or more sheets.

Sheet: A sheet is a single page composed of cells for organizing data.

Cell: The intersection of a row and a column is called a cell. Usually represented by A1, B5, etc.

Row: A row is a horizontal line represented by a number (1,2, etc.).

Column: A column is a vertical line represented by a capital letter (A, B, etc.).

Openpyxl can be installed using the pip command and it is recommended to install it in a virtual environment.

pip install openpyxl

CREATE A NEW WORKBOOK

We start by creating a new spreadsheet, which is called a workbook in Openpyxl. We import the Workbook class from openpyxl and call Workbook() to create a new workbook.

from openpyxl import Workbook

#creates a new workbook
wb = Workbook()
#Gets the first active worksheet
ws = wb.active

#creating new worksheets by using the create_sheet method
ws1 = wb.create_sheet("sheet1", 0) #inserts at first position
ws2 = wb.create_sheet("sheet2") #inserts at last position
ws3 = wb.create_sheet("sheet3", -1) #inserts at penultimate position

#Renaming the sheet
ws.title = "Example"

#save the workbook
wb.save(filename = "example.xlsx")

READING DATA FROM WORKBOOK

We load the file using the function load_workbook(), which takes the filename as an argument. A bare filename such as "example.xlsx" is looked up in the current working directory.

import openpyxl

#loading a workbook
wb = openpyxl.load_workbook("example.xlsx")

 

GETTING SHEETS FROM THE LOADED WORKBOOK

 

#getting sheet names
wb.sheetnames
# ['sheet1', 'Example', 'sheet3', 'sheet2']

#getting a particular sheet
sheet1 = wb["sheet2"]

#getting sheet title
sheet1.title
# 'sheet2'

#Getting the active sheet
sheetactive = wb.active
sheetactive.title
# 'sheet1'

 

ACCESSING CELLS AND CELL VALUES

 

#get a cell from the sheet
sheet1["A1"]
# <Cell 'Sheet1'.A1>

#get the cell value
ws["A1"].value
# 'Segment'

#accessing cell using row and column and assigning a value
d = ws.cell(row = 4, column = 2, value = 10)
d.value
# 10

 

ITERATING THROUGH ROWS AND COLUMNS

 

#looping through each row and column
for x in range(1, 5):
    for y in range(1, 5):
        print(x, y, ws.cell(row = x, column = y).value)

#getting the highest row number
ws.max_row
# 701

#getting the highest column number
ws.max_column
# 19

There are two methods for iterating through rows and columns:

iter_rows() => returns the rows
iter_cols() => returns the columns

Both accept keyword arguments such as min_row = 4, max_row = 5, min_col = 2, max_col = 5, which can be used to set the boundaries for any iteration.

Example:

#iterating rows
for row in ws.iter_rows(min_row = 2, max_col = 3, max_row = 3):
    for cell in row:
        print(cell)
# <Cell 'Sheet1'.A2>
# <Cell 'Sheet1'.B2>
# <Cell 'Sheet1'.C2>
# <Cell 'Sheet1'.A3>
# <Cell 'Sheet1'.B3>
# <Cell 'Sheet1'.C3>

#iterating columns
for col in ws.iter_cols(min_row = 2, max_col = 3, max_row = 3):
    for cell in col:
        print(cell)
# <Cell 'Sheet1'.A2>
# <Cell 'Sheet1'.A3>
# <Cell 'Sheet1'.B2>
# <Cell 'Sheet1'.B3>
# <Cell 'Sheet1'.C2>
# <Cell 'Sheet1'.C3>

To get all the rows of the worksheet we use the property worksheet.rows, and to get all the columns we use the property worksheet.columns. Similarly, to iterate only through the values we use the property worksheet.values.


Example:

for row in ws.values:
    for value in row:
        print(value)

 

WRITING DATA TO AN EXCEL FILE

Writing to a workbook can be done in many ways such as adding a formula, adding charts, images, updating cell values, inserting rows and columns, etc… We will discuss each of these with an example.

 

CREATING AND SAVING A NEW WORKBOOK

 

import openpyxl

#creates a new workbook
wb = openpyxl.Workbook()

#saving the workbook
wb.save("new.xlsx")

 

ADDING AND REMOVING SHEETS

 

#creating a new sheet
ws1 = wb.create_sheet(title = "sheet 2")

#creating a new sheet at index 0
ws2 = wb.create_sheet(index = 0, title = "sheet 0")

#checking the sheet names
wb.sheetnames
# ['sheet 0', 'Sheet', 'sheet 2']

#deleting a sheet
del wb['sheet 0']

#checking sheetnames
wb.sheetnames
# ['Sheet', 'sheet 2']

 

ADDING CELL VALUES

 

#checking the cell value (an empty cell holds None)
ws['B2'].value
# None

#adding value to cell
ws['B2'] = 367

#checking value
ws['B2'].value
# 367

 

ADDING FORMULAS

 

We often require formulas to be included in our Excel datasheet. We can easily add formulas using the Openpyxl module just like you add values to a cell.
 

For example:

import openpyxl

wb = openpyxl.load_workbook("new1.xlsx")
ws = wb['Sheet']

ws['A9'] = '=SUM(A2:A8)'

wb.save("new2.xlsx")

The above program adds the formula =SUM(A2:A8) to cell A9. openpyxl stores the formula as text and does not evaluate it; Excel calculates the result when the file is opened.

 

MERGE/UNMERGE CELLS

Two or more cells can be merged to a rectangular area using the method merge_cells(), and similarly, they can be unmerged using the method unmerge_cells().

For example:
Merge cells

#merge cells B2 to C9
ws.merge_cells('B2:C9')
ws['B2'] = "Merged cells"

Adding the above code to the previous example merges the cells from B2 to C9 into a single cell showing the text "Merged cells".

UNMERGE CELLS

 

#unmerge cells B2 to C9
ws.unmerge_cells('B2:C9')

The above code will unmerge cells from B2 to C9.

INSERTING AN IMAGE

To insert an image we import the Image class from the module openpyxl.drawing.image (this requires the Pillow library to be installed). We then load our image and add it to the worksheet, anchored at a cell, as shown in the below example.

Example:

import openpyxl
from openpyxl.drawing.image import Image

wb = openpyxl.load_workbook("new1.xlsx")
ws = wb['Sheet']
#loading the image(should be in same folder)
img = Image('logo.png')
ws['A1'] = "Adding image"
#adjusting size
img.height = 130
img.width = 200
#adding img to cell A3

ws.add_image(img, 'A3')

wb.save("new2.xlsx")


CREATING CHARTS

Charts are essential to show a visualization of data. We can create charts from Excel data using the Openpyxl module chart. Different forms of charts such as line charts, bar charts, 3D line charts, etc., can be created. We need to create a reference that contains the data to be used for the chart, which is nothing but a selection of cells (rows and columns). I am using sample data to create a 3D bar chart in the below example:

Example

import openpyxl
from openpyxl.chart import BarChart3D, Reference

wb = openpyxl.load_workbook("example.xlsx")
ws = wb.active

values = Reference(ws, min_col = 3, min_row = 2, max_col = 3, max_row = 40)
chart = BarChart3D()
chart.add_data(values)
ws.add_chart(chart, "E3")
wb.save("MyChart.xlsx")



How to Automate Excel with Python with Video Tutorial

Welcome to another video! In this video, we will cover how we can use Python to automate Excel. I'll be going over everything from creating workbooks to accessing individual cells and styling cells. There are a ton of things that you can do with Excel, but I'll just be covering the core things in OpenPyXL.

⭐️ Timestamps ⭐️
00:00 | Introduction
02:14 | Installing openpyxl
03:19 | Testing Installation
04:25 | Loading an Existing Workbook
06:46 | Accessing Worksheets
07:37 | Accessing Cell Values
08:58 | Saving Workbooks
09:52 | Creating, Listing and Changing Sheets
11:50 | Creating a New Workbook
12:39 | Adding/Appending Rows
14:26 | Accessing Multiple Cells
20:46 | Merging Cells
22:27 | Inserting and Deleting Rows
23:35 | Inserting and Deleting Columns
24:48 | Copying and Moving Cells
26:06 | Practical Example, Formulas & Cell Styling

📄 Resources 📄
OpenPyXL Docs: https://openpyxl.readthedocs.io/en/stable/ 
Code Written in This Tutorial: https://github.com/techwithtim/ExcelPythonTutorial 
Subscribe: https://www.youtube.com/c/TechWithTim/featured 

#python 

Monty Boehm

1659453850

Twitter.jl: Julia Package to Access Twitter API

Twitter.jl

A Julia package for interacting with the Twitter API.

Twitter.jl is a Julia package to work with the Twitter API v1.1. Currently, only the REST API methods are supported; streaming API endpoints aren't implemented at this time.

All functions have required arguments for those parameters required by Twitter, plus an options keyword argument that takes a Dict{String, String} of optional parameters (see the Twitter API documentation). Most function calls will return either a Dict or an Array <: TwitterType. Bad requests will return the response code from the API (403, 404, etc).

DataFrame methods are defined for functions returning composite types: Tweets, Places, Lists, and Users.

Authentication

Before you can make use of this package, you must create an application on Twitter's Developer Platform.

Once your application is approved, you can access your dashboard/portal to grab your authentication credentials from the "Details" tab of the application.

Note that you will also want to ensure that your App has Read / Write OAuth access in order to post tweets. You can find out more about this on Stack Overflow.

Installation

To install this package, enter ] on the REPL to bring up Julia's package manager. Then add the package:

julia> ]
(v1.7) pkg> add Twitter

Tip: Press Ctrl+C to return to the julia> prompt.

Usage

To run Twitter.jl, enter the following command in your Julia REPL

julia> using Twitter

Then a global variable has to be declared with the twitterauth function. This function takes the consumer_key (API Key), consumer_secret (API Key Secret), oauth_token (Access Token), and oauth_secret (Access Token Secret), in that order.

twitterauth("6nOtpXmf...", # API Key
            "sES5Zlj096S...", # API Key Secret
            "98689850-Hj...", # Access Token
            "UroqCVpWKIt...") # Access Token Secret
  • Ensure you put your credentials in an env file to avoid pushing your secrets to the public 🙀.

Note: This package does not currently support OAuth authentication.

Code examples

See runtests.jl for example function calls.

using Twitter, Test
using JSON, OAuth

# set debugging
ENV["JULIA_DEBUG"]=Twitter

twitterauth(ENV["CONSUMER_KEY"], ENV["CONSUMER_SECRET"], ENV["ACCESS_TOKEN"], ENV["ACCESS_TOKEN_SECRET"])

#get_mentions_timeline
mentions_timeline_default = get_mentions_timeline()
tw = mentions_timeline_default[1]
tw_df = DataFrame(mentions_timeline_default)
@test 0 <= length(mentions_timeline_default) <= 20
@test typeof(mentions_timeline_default) == Vector{Tweets}
@test typeof(tw) == Tweets
@test size(tw_df)[2] == 30

#get_user_timeline
user_timeline_default = get_user_timeline(screen_name = "randyzwitch")
@test typeof(user_timeline_default) == Vector{Tweets}

#get_home_timeline
home_timeline_default = get_home_timeline()
@test typeof(home_timeline_default) == Vector{Tweets}

#get_single_tweet_id
get_tweet_by_id = get_single_tweet_id(id = "434685122671939584")
@test typeof(get_tweet_by_id) == Tweets

#get_search_tweets
duke_tweets = get_search_tweets(q = "#Duke", count = 200)
@test typeof(duke_tweets) <: Dict

#test sending/deleting direct messages
#commenting out because Twitter API changed. Come back to fix
# send_dm = post_direct_messages_send(text = "Testing from Julia, this might disappear later $(time())", screen_name = "randyzwitch")
# get_single_dm = get_direct_messages_show(id = send_dm.id)
# destroy = post_direct_messages_destroy(id = send_dm.id)
# @test typeof(send_dm) == Tweets
# @test typeof(get_single_dm) == Tweets
# @test typeof(destroy) == Tweets

#creating/destroying friendships
add_friend = post_friendships_create(screen_name = "kyrieirving")

unfollow = post_friendships_destroy(screen_name = "kyrieirving")
unfollow_df = DataFrame(unfollow)
@test typeof(add_friend) == Users
@test typeof(unfollow) == Users
@test size(unfollow_df)[2] == 40

# create a cursor for follower ids
follow_cursor_test = get_followers_ids(screen_name = "twitter", count = 10_000)
@test length(follow_cursor_test["ids"]) == 10_000

# create a cursor for friend ids - use barackobama because he follows a lot of accounts!
friend_cursor_test = get_friends_ids(screen_name = "BarackObama", count = 10_000)
@test length(friend_cursor_test["ids"]) == 10_000

# create a test for home timelines
home_t = get_home_timeline(count = 2)
@test length(home_t) > 1

# TEST of cursoring functionality on user timelines
user_t = get_user_timeline(screen_name = "stefanjwojcik", count = 400)
@test length(user_t) == 400
# get the minimum ID of the tweets returned (the earliest)
minid = minimum(x.id for x in user_t);

# now iterate until you hit that tweet: should return 399
# WARNING: current versions of julia cannot use keywords in macros? read here: https://github.com/JuliaLang/julia/pull/29261
# eventually replace since_id = minid
tweets_since = get_user_timeline(screen_name = "stefanjwojcik", count = 400, since_id = 1001808621053898752, include_rts=1)

@test length(tweets_since)>=399

# testing get_mentions_timeline
mentions = get_mentions_timeline(screen_name = "stefanjwojcik", count = 300) 
@test length(mentions) >= 50 #sometimes API doesn't return number requested (twitter API specifies count is the max returned, may be much lower)
@test Tweets<:typeof(mentions[1])

# testing retweets_of_me
my_rts = get_retweets_of_me(count = 300)
@test Tweets<:typeof(my_rts[1])

Want to contribute?

Contributions are welcome! Kindly refer to the contribution guidelines.

Author: Randyzwitch
Source Code: https://github.com/randyzwitch/Twitter.jl 
License: View license

#julia #api #twitter 

Build a GraphQL app in Node.js with TypeScript and graphql-request

In this article, you will build a full-stack app using GraphQL and Node.js in the backend. Meanwhile, our frontend will use the graphql-request library to perform network operations on our backend.

Why use graphql-request and TypeScript?

Whenever developers build a GraphQL server using Apollo, the library generates a “frontend” which looks like so:

Frontend Developed By GraphQL And Apollo

This interface allows users to make query or mutation requests to the server via code. However, let’s address the elephant in the room: it doesn’t look very user friendly. Since the frontend doesn’t feature any buttons or any helpful interface elements, it might be hard for many users to navigate around your app. Consequently, this shrinks your user base. So how do we solve this problem?

This is where graphql-request comes in. It is an open source library which lets users perform queries on a GraphQL server. It boasts the following features:

  • Lightweight — This library is just over 21 kilobytes minified, which ensures your app stays performant
  • Promise-based API — This brings in support for asynchronous applications
  • TypeScript support — graphql-request is one of many libraries that support TypeScript. One major advantage of TypeScript is that it allows for stable and predictable code

For example, look at the following program:

//Plain JavaScript:
let myNumber = 9; //here, myNumber is a number
myNumber = 'hello'; //now it is a string.
myNumber = myNumber + 10; //even though we are adding a number to a string,
//JavaScript won't throw an error. It silently produces 'hello10', which might bring unexpected outputs.
//However, in TypeScript, we can tell the compiler
//what data types we need to choose.
let myTypedNumber: number = 39; //tell TS that this variable must hold a number.
myTypedNumber = 9 + 'hello'; //compile-time error. Therefore, it's easier to debug the program
//this promises stability and predictability.

In this article, we will build a full-stack app using GraphQL and TypeScript. Here, we will use the apollo-server-express package to build a backend server. Furthermore, for the frontend, we will use Next and graphql-request to consume our GraphQL API.

Building our server

Project initialization

To initialize a blank Node.js project, run these terminal commands:

mkdir graphql-ts-tutorial #create project folder 
cd graphql-ts-tutorial 
npm init -y #initialize the app

When that’s done, we now have to tell Node that we need to use TypeScript in our codebase:

#configure our Typescript:
npx tsc --init --rootDir app --outDir dist --esModuleInterop --resolveJsonModule --lib es6 --module commonjs --allowJs true --noImplicitAny true
mkdir app #our main code folder
mkdir dist #TypeScript will write the compiled output to this folder.

Next, install these dependencies:

#development dependencies. These let us write and run TypeScript in the project
npm install -D ts-node @types/node typescript @types/express nodemon
#Installing Apollo Server and its associated modules. Will help us build our GraphQL
#server
npm install apollo-server-express apollo-server-core express graphql

After this step, navigate to your app folder. Here, create the following files:

  • index.ts: Our main file. This will execute and run our Express GraphQL server
  • dataset.ts: This will serve as our database, which will be served to the client
  • Resolvers.ts: This module will handle user commands. We will learn about resolvers later in this article
  • Schema.ts: As the name suggests, this file will store the schematics needed to send data to the client

In the end, your folder structure should look like so:

Folder Structure

Creating our database

In this section, we will create a dummy database which will be used to send requested data. To do so, go to app/dataset.ts and write the following code:

let people: { id: number; name: string }[] = [
  { id: 1, name: "Cassie" },
  { id: 2, name: "Rue" },
  { id: 3, name: "Lexi" },
];
export default people;

  • First, we created an array of objects called people
  • Each object in this array has two fields: id of type number, and name of type string

Defining our schema

Here, we will now create a schema for our GraphQL server.

To put it simply, a GraphQL schema is a description of the data that clients can request from an API. This concept is similar to schemas in the Mongoose library.

To build a schema, navigate to the app/Schema.ts file. There, write the following code:

import { gql } from "apollo-server-express"; //will create a schema
const Schema = gql`
  type Person {
    id: ID!
    name: String
  }
  #handle user commands
  type Query {
    getAllPeople: [Person] #will return multiple Person instances
    getPerson(id: Int): Person #has an argument 'id' of type Int.
  }
`;
export default Schema; 
//export this Schema so we can use it in our project

Let’s break down this code piece by piece:

  • The Schema variable contains our GraphQL schema
  • First, we created a Person type. It will have two fields: id of type ID and name of type String
  • Later on, we instructed GraphQL that if the client runs the getAllPeople command, the server will return an array of Person objects
  • Furthermore, if the user uses the getPerson command, GraphQL will return a single Person instance

Creating resolvers

Now that we have coded our schema, our next step is to define our resolvers.

In simple terms, a resolver is a function that generates the response for a field in a GraphQL query. In other words, resolvers serve as GraphQL query handlers.

In Resolvers.ts, write the following code:

import people from "./dataset"; //get all of the available data from our database.
const Resolvers = {
  Query: {
    getAllPeople: () => people, //if the user runs the getAllPeople command
    //if the user runs the getPerson command:
    getPerson: (_: any, args: any) => { 
      console.log(args);
      //get the object that contains the specified ID.
      return people.find((person) => person.id === args.id);
    },
  },
};
export default Resolvers;

  • Here, we created a Query object that handles all the incoming queries going to the server
  • If the user executes the getAllPeople command, the program will return all the objects present in our database
  • Moreover, the getPerson command requires an argument id. This will return a Person instance with the matching ID
  • In the end, we exported our resolver so that it could be linked with our app
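
Because resolvers are plain functions, they can be typed without any and called directly, which is a quick way to check them before wiring up the server. Here is a minimal sketch that repeats the dataset inline so it runs on its own. The Person and GetPersonArgs interfaces are names introduced here for illustration.

```typescript
// Named types for the dataset entries and for the getPerson arguments.
interface Person {
  id: number;
  name: string;
}

interface GetPersonArgs {
  id: number;
}

// Same data as app/dataset.ts, repeated so the sketch is self-contained.
const people: Person[] = [
  { id: 1, name: "Cassie" },
  { id: 2, name: "Rue" },
  { id: 3, name: "Lexi" },
];

const Resolvers = {
  Query: {
    // Returns every entry in the dataset.
    getAllPeople: (): Person[] => people,
    // The first resolver argument (the parent value) is unused here.
    getPerson: (_parent: unknown, args: GetPersonArgs): Person | undefined =>
      people.find((person) => person.id === args.id),
  },
};

// Calling a resolver directly, as the server would for getPerson(id: 2).
const rue = Resolvers.Query.getPerson(undefined, { id: 2 });
```

When no entry matches the id, getPerson returns undefined, which GraphQL reports to the client as null because the field is nullable in the schema.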

Configuring our server

We’re almost done! Now that we have built both our schema and resolver, our next step is to link them together.

In index.ts, write this block of code:

import { ApolloServer } from "apollo-server-express";
import Schema from "./Schema";
import Resolvers from "./Resolvers";
import express from "express";
import { ApolloServerPluginDrainHttpServer } from "apollo-server-core";
import http from "http";

async function startApolloServer(schema: any, resolvers: any) {
  const app = express();
  const httpServer = http.createServer(app);
  const server = new ApolloServer({
    typeDefs: schema,
    resolvers,
    //tell Express to attach GraphQL functionality to the server
    plugins: [ApolloServerPluginDrainHttpServer({ httpServer })],
  }) as any;
  await server.start(); //start the GraphQL server.
  server.applyMiddleware({ app });
  await new Promise<void>((resolve) =>
    httpServer.listen({ port: 4000 }, resolve) //run the server on port 4000
  );
  console.log(`Server ready at http://localhost:4000${server.graphqlPath}`);
}
//in the end, run the server and pass in our Schema and Resolver.
startApolloServer(Schema, Resolvers);

Let’s test it out! To run the code, use this Bash command:

npx nodemon app/index.ts 

This will create a server at the localhost:4000/graphql URL.

Here, you can see your available schemas within the UI:

Available Schemas Within The UI

This means that our code works!

All of our GraphQL queries will go within the Operation panel. To see it in action, type this snippet within this box:

#make a query:
query {
  #get all of the people available in the server
  getAllPeople {
    #procure their IDs and names.
    id
    name
  }
}

To see the result, click on the Run button:

Run Button For Results

We can even search for a specific entity via the getPerson query:

query {
  getPerson(id: 1) { #the 'id' argument is of type Int
    #get the person with the ID of 1
    name
    id
  }
}

Getperson Query

Creating mutations

In the GraphQL world, mutations are commands that perform side effects on the database. Common examples of this include:

  • Adding a user to the database — When a client signs up for a website, the user performs a mutation to save their data in the database
  • Editing or deleting an object — If a user modifies or removes data from a database, they are essentially creating a mutation on the server

To handle mutations, go to your Schema.ts module. Here, within the Schema variable, add the following lines of code:

const Schema = gql`
  #other code..
  type Mutation {
    #the addPerson command will accept an argument of type String.
    #it will return a 'Person' instance. 
    addPerson(name: String): Person
  }
`;

Our next step is to create a resolver to handle this mutation. To do so, within the Resolvers.ts file, add this block of code:

const Resolvers = {
  Query: {
    //..further code..
  },
  //code to add:
  //all our mutations go here.
  Mutation: {
    //create our mutation:
    addPerson: (_: any, args: any) => {
      const newPerson = {
        id: people.length + 1, //id field
        name: args.name, //name field
      };
      people.push(newPerson);
      return newPerson; //return the new object's result
    },
  },
};

  • The addPerson mutation accepts a name argument
  • When a name is passed, the program will create a new object with a matching name key
  • Next, it will use the push method to add this object to the people dataset
  • Finally, it will return the new object’s properties to the client
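
One detail to keep in mind: an id computed as people.length + 1 is unique only while entries are never removed. The sketch below is a variation, not the tutorial's resolver. It derives the next id from the highest existing id instead, and the nextId helper and the AddPersonArgs interface are names introduced for illustration.

```typescript
interface Person {
  id: number;
  name: string;
}

interface AddPersonArgs {
  name: string;
}

// Same data as app/dataset.ts, repeated so the sketch is self-contained.
const people: Person[] = [
  { id: 1, name: "Cassie" },
  { id: 2, name: "Rue" },
  { id: 3, name: "Lexi" },
];

// Hypothetical helper: one more than the highest id currently in the list,
// so ids stay unique even after an entry has been deleted.
function nextId(list: Person[]): number {
  return list.reduce((highest, person) => Math.max(highest, person.id), 0) + 1;
}

const Mutation = {
  // Creates the new entry, stores it, and returns it to the caller.
  addPerson: (_parent: unknown, args: AddPersonArgs): Person => {
    const newPerson: Person = { id: nextId(people), name: args.name };
    people.push(newPerson);
    return newPerson;
  },
};
```

For the three-entry dataset both approaches give the same id for the first new entry. They only differ once something has been removed from the array.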

That’s it! To test it out, run this code within the Operations window:

#perform a mutation on the server
mutation {
  addPerson(name: "Hussain") { #add a new person with the name "Hussain"
    #if the execution succeeds, return its 'id' and 'name' to the user.
    id
    name
  }
}

Addperson

Let’s verify if GraphQL has added the new entry to the database:

query {
  getAllPeople { #get all the results within the 'people' database.
    #return only their names
    name
  }
}

Verify That GraphQL Added A New Entry

Building our client

We have successfully built our server. In this section, we will build a client app using Next that will query the server and render the data in the UI.

As a first step, initialize a blank Next.js app like so:

npx create-next-app@latest graphql-client --ts
cd graphql-client #move into the new project folder
touch constants.tsx #our query variables go here.

To perform GraphQL operations, we will use the graphql-request library. This is a minimal, open source module that will help us make mutations and queries on our server:

npm install graphql-request graphql
npm install react-hook-form #to capture user input

Creating query variables

In this section, we will code our queries and mutations to help us make GraphQL operations. To do so, go to constants.tsx and add the following code:

import { gql } from "graphql-request";
//create our query
const getAllPeopleQuery = gql`
  query {
    getAllPeople { #run the getAllPeople command
      id
      name
    }
  }
`;
//Next, declare a mutation
const addPersonMutation = gql`
  mutation addPeople($name: String!) {
    addPerson(name: $name) { #add a new entry. Argument will be 'name'
      id
      name
    }
  }
`;
export { getAllPeopleQuery, addPersonMutation };

  • In the first part, we created the getAllPeopleQuery variable. When the user runs this query, the program will instruct the server to get all the entries present in the database
  • Next, the addPersonMutation variable tells GraphQL to add a new entry with the given name
  • In the end, we used the export keyword to make both variables available to the rest of the project

Performing queries

In pages/index.tsx, write the following code:

import type { NextPage, GetStaticProps, InferGetStaticPropsType } from "next";
import { request } from "graphql-request"; //allows us to perform a request on our server
import { getAllPeopleQuery } from "../constants"; 
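import Link from "next/link";
import styles from "../styles/Home.module.css"; //stylesheet generated by create-next-app; adjust the path if yours differs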
const Home: NextPage = ({
  result, //extract the 'result' prop 
}: InferGetStaticPropsType<typeof getStaticProps>) => {
  return (
    <div className={styles.container}>
      {result.map((item: any) => { //render the 'result' array to the UI 
        return <p key={item.id}>{item.name}</p>;
      })}
    <Link href="/addpage">Add a new entry </Link>
    </div>
  );
};
//fetch data from the server
export const getStaticProps: GetStaticProps = async () => {
  //the first argument is the URL of our GraphQL server
  const res = await request("http://localhost:4000/graphql", getAllPeopleQuery);
  const result = res.getAllPeople;
  return {
    props: {
      result,
    }, // will be passed to the page component as props
  };
};
export default Home;

Here is a breakdown of this code piece by piece:

  • In the getStaticProps method, we instructed Next to run the getAllPeople command on our GraphQL server
  • Later on, we returned its response to the Home functional component. This means that we can now render the result to the UI
  • Next, the program used the map method to render all of the results of the getAllPeople command to the UI. Each paragraph element will display the name fields of each entry
  • Furthermore, we also used a Link component to redirect the user to the addpage route. This will allow the user to add a new Person instance to the table
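
The item: any annotation in the map callback can be replaced by a type that describes the response. Here is a minimal sketch, assuming the response has the shape requested by getAllPeopleQuery. The interface names and the sample data are introduced here for illustration. Note that GraphQL serializes ID values as strings, so id arrives as a string on the client even though the dataset stores a number.

```typescript
// One entry as the client receives it. ID values are serialized as strings.
interface PersonResult {
  id: string;
  name: string;
}

// Shape of the data returned for the getAllPeople query.
interface GetAllPeopleResponse {
  getAllPeople: PersonResult[];
}

// Sample data standing in for a real server response.
const sampleResponse: GetAllPeopleResponse = {
  getAllPeople: [
    { id: "1", name: "Cassie" },
    { id: "2", name: "Rue" },
  ],
};

// Collects the names, similar to what the page renders in each paragraph.
function listNames(response: GetAllPeopleResponse): string[] {
  return response.getAllPeople.map((person) => person.name);
}

const names = listNames(sampleResponse);
```

With a type like this in place, the compiler can flag a misspelled field name in the component instead of letting it fail silently at runtime.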

To test out the code, run the following terminal command:

npm run dev

This will be the result:

Addpage Route

Because Next re-runs getStaticProps on every request in development mode, entries added on the server show up after a page refresh:

GraphQL Updating In Real Time

Performing mutations

Now that we have successfully performed a query, we can even perform mutations via the graphql-request library.

Within your pages folder, create a new file called addpage.tsx. As the name suggests, this component will allow the user to add a new entry to the database. Here, start by writing the following block of code:

import type { NextPage, GetStaticProps, InferGetStaticPropsType } from "next";
import { request } from "graphql-request";
import { addPersonMutation } from "../constants";
const AddPage: NextPage = () => {
  return (
    <div>
      <p>We will add a new entry here. </p>
    </div>
  );
};
export default AddPage;

In this piece of code, we are creating a blank page with a piece of text. We are doing this to check that our URL routing works.

Creating A Blank Page To Ensure URL Routing Works

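This means that our routing works! Next, replace the contents of your addpage.tsx file with this code, which adds a form to the component:

import type { NextPage } from "next";
import { request } from "graphql-request"; //used in the next step
import { addPersonMutation } from "../constants"; //used in the next step
import { useForm } from "react-hook-form";
const AddPage: NextPage = () => {
  //hooks must be called inside the component body
  const { register, handleSubmit } = useForm();
  //if the user submits the form, then the program will output the value of their input.
  const onSubmit = (data: any) => console.log(data);
  return (
    <div>
      <form onSubmit={handleSubmit(onSubmit)}> {/*Bind our handler to this form.*/}
        {/* The user's input will be saved within the 'name' property */}
        <input defaultValue="test" {...register("name")} />
        <input type="submit" />
      </form>
    </div>
  );
};
export default AddPage;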

This will be the output:

 Output

Now that we have successfully captured the user’s input, our last step is to add their entry to the server.

To do so, change the onSubmit handler located in pages/addpage.tsx file like so:

const onSubmit = async (data: any) => {
  const response = await request(
    "http://localhost:4000/graphql",
    addPersonMutation,
    data
  );
  console.log(response);
};

  • Here, we’re performing a mutation request to our GraphQL server via the request function
  • Furthermore, we also passed the addPersonMutation document and the form data to the request function. The form data becomes the mutation’s variables, which tells GraphQL to run the addPerson mutation with the submitted name
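
As a rough picture of what travels over the wire, a GraphQL client typically sends an HTTP POST whose JSON body carries the operation text under query and the arguments under variables. The sketch below only builds that body, so it does not need a running server. buildPayload is a hypothetical helper and not part of graphql-request.

```typescript
// The JSON body of a GraphQL-over-HTTP request.
interface GraphQLPayload {
  query: string;
  variables?: Record<string, unknown>;
}

// Hypothetical helper: serializes an operation and its variables.
function buildPayload(query: string, variables?: Record<string, unknown>): string {
  const payload: GraphQLPayload = variables ? { query, variables } : { query };
  return JSON.stringify(payload);
}

// Same operation text as addPersonMutation in constants.tsx.
const addPersonOperation = `
  mutation addPeople($name: String!) {
    addPerson(name: $name) {
      id
      name
    }
  }
`;

// The form data from react-hook-form has the shape { name: "..." },
// which matches the $name variable declared by the mutation.
const body = buildPayload(addPersonOperation, { name: "Hussain" });

// In an environment that provides fetch (an assumption), the body could be sent as:
// fetch("http://localhost:4000/graphql", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body,
// });
```

This is also why the form field is registered as name: the key in the submitted data has to match the variable name in the mutation.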

This will be the result:

Result Of Addmutation Action

And we’re done!

Conclusion

Here is the full source code of this project.

In this article, you learned how to create a full-stack app using GraphQL and TypeScript. Both are valuable skills that are in high demand.

If you encountered any difficulty in this code, I advise you to deconstruct the code and play with it so that you can fully grasp this concept.

Thank you so much for reading! Happy coding!

This story was originally published at https://blog.logrocket.com/build-graphql-app-node-js-typescript-graphql-request/

#graphql #typescript #nodejs