If you manage a data and analytics pipeline in Google Cloud, you may want to monitor it and obtain a comprehensive view of the end-to-end analytics process in order to react quickly when something breaks.

This article shows you how you can capture Cloud Dataprep jobs status via APIs leveraging Cloud Functions. We then input the statuses to a Google Sheet for an easy way to check the statuses of the jobs. Using the same principle, you can combine other Google Cloud service statuses in Google Sheets to obtain a comprehensive view of your data pipeline.

To illustrate this concept, we will assume you want to monitor a daily scheduled Dataprep job with a quick look at a Google Sheet to get an overview of potential failure. The icing on the cake is that you will also be able to check the recipe name and jobs profile results in Google Sheets.

This article is a step-by-step guide to the process of triggering Cloud Functions when a Cloud Dataprep job is finished and publishing the job results, status, and direct links into a Google Sheet.

Here is an example of a Google Sheet with jobs results and links published.

Image for post

Fig. 1 — Google Sheet Dataprep job results with access to job profile PDF

Image for post

Fig. 2 — High-level process to trigger a Cloud Function based on a Cloud Dataprep job execution

1. Getting Started

To make this guide practical, we are sharing it here in Github, the Node.js code for the Cloud Function.

You need a valid Google account and access to Cloud Dataprep and Cloud Functions to try it out. You can start from the Google Console https://console.cloud.google.com/ to activate the services.

REMARK: To call APIs, one needs an Access Token. One must be a Google Cloud project owner to generate this Access Token. If you are not a Google Cloud project owner, you can try it out by using a personal Gmail account.

Image for post

#google-cloud-functions #google-cloud #data-preparation #function

Leverage Cloud Functions
1.25 GEEK