Sometimes the data you need is available online, but not through a public API. Web scraping can help in these situations, but only when the data is present in a page's static markup. Fortunately for developers everywhere, most things that you can do manually in the browser can be done using Puppeteer, a Node library that provides a high-level API to control Chrome or Chromium over the DevTools protocol.
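To make the idea of a "high-level API" concrete, here is a minimal sketch that opens a page in headless Chrome and reads its title. It assumes Puppeteer has been installed with `npm install puppeteer`; the helper names `getPageTitle` and `runTitleExample` and the `https://example.com` URL are placeholders chosen for illustration.

```javascript
// Minimal sketch: open a URL in a browser tab and return the page title.
// `browser` is a Puppeteer Browser instance (see runTitleExample below).
async function getPageTitle(browser, url) {
  const page = await browser.newPage();
  try {
    // Wait until network activity has mostly settled before reading the page.
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await page.title();
  } finally {
    // Always close the tab, even if navigation fails.
    await page.close();
  }
}

// Hypothetical entry point; assumes `npm install puppeteer` has been run.
async function runTitleExample() {
  const puppeteer = require('puppeteer'); // loaded here so the helper stays easy to stub
  const browser = await puppeteer.launch(); // headless by default
  try {
    console.log(await getPageTitle(browser, 'https://example.com'));
  } finally {
    await browser.close();
  }
}

// To try it against a real browser, uncomment:
// runTitleExample().catch(console.error);
```

Keeping the browser launch separate from the page logic is a design choice rather than a requirement: it lets the same helper be reused across several pages in one browser session.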

Let’s walk through how to use Puppeteer to write scripts that interact with web pages programmatically. In this example we’ll use the Native Land Digital tool, an awesome project built to help people learn more about their local Indigenous history. In this case, an API does exist, but it only accepts locations as geographic coordinates rather than as a more user-friendly address. We’ll write code to programmatically type in an address and figure out which Native land corresponds to that location.
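The type-an-address-and-read-the-result flow described above might look roughly like the sketch below. The two CSS selectors are hypothetical stand-ins, not the site's actual markup, so inspect the live page in DevTools and substitute the real ones. The `https://native-land.ca` URL, the sample address, and the function names are likewise assumptions made for illustration.

```javascript
// HYPOTHETICAL selectors: replace with the real ones found via DevTools.
const SEARCH_INPUT = 'input[type="search"]';
const RESULT_ITEMS = '.results li';

// Type an address into the search box, submit it, and collect the result text.
// `page` is a Puppeteer Page that has already navigated to the map tool.
async function lookUpTerritories(page, address) {
  await page.waitForSelector(SEARCH_INPUT);
  await page.type(SEARCH_INPUT, address);
  await page.keyboard.press('Enter');
  // Results are rendered asynchronously, so wait for them to appear.
  await page.waitForSelector(RESULT_ITEMS);
  // Runs inside the browser: map each result element to its trimmed text.
  return page.$$eval(RESULT_ITEMS, (items) =>
    items.map((item) => item.textContent.trim()).filter(Boolean)
  );
}

// Hypothetical entry point; the URL and address are assumptions.
async function runLookupExample() {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto('https://native-land.ca', { waitUntil: 'networkidle2' });
    console.log(await lookUpTerritories(page, '350 5th Ave, New York, NY'));
  } finally {
    await browser.close();
  }
}

// To try it against the live site, uncomment:
// runLookupExample().catch(console.error);
```

Waiting on selectors rather than sleeping for a fixed time keeps the script from racing the page when the search results load slowly.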

#node #puppeteer

Automated Headless Browser scripts in Node.js with Puppeteer