See how to use tagging and templates inside Data Catalog, Google Cloud’s metadata management service that covers operational and business metadata.
Google Cloud Data Catalog is a fully managed and scalable metadata management service. Data Catalog helps your organization quickly discover, understand, and manage all your data from one simple interface, letting you gain valuable business insights from your data investments. One of Data Catalog’s core concepts, called tag templates, helps you organize complex metadata while making it searchable under Cloud Identity and Access Management (Cloud IAM) control. In this post, we’ll offer some best practices and useful tag templates (referred to as templates from here on) to help you start your journey.
A tag template is a collection of related fields that represent your vocabulary for classifying data assets. Each field has a name and a type. The type can be a string, double, boolean, enum, or datetime. When the type is an enum, the template also stores the possible values for this field. The fields are stored as an unordered set in the template, and each field is treated as optional unless marked as required. A required field must be assigned a value each time the template is used. An optional field can be left out when an instance of the template is created.
You’ll create instances of templates when tagging data resources, such as BigQuery tables and views. Tagging means associating a tag template with a specific resource and assigning values to the template fields to describe the resource. We refer to these tags as structured tags because the fields in these tags are typed as instances of the template. Typed fields let you avoid common misspellings and other inconsistencies, a known pitfall with simple key-value pairs.
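The relationship between a template and a structured tag can be sketched in plain Python. This is a minimal local model, not the Data Catalog client library: the `TemplateField` and `TagTemplate` classes, the type names, the `data_domain` field, and the resource string are all hypothetical and exist only for illustration. It shows how required fields, enum values, and typed fields catch mistakes that free-form key-value pairs would let through.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Tuple

# Hypothetical local model; these classes are NOT the Data Catalog client API.
# Assumed mapping from template type names to Python types.
FIELD_TYPES = {"string": str, "double": float, "boolean": bool, "datetime": datetime}


@dataclass(frozen=True)
class TemplateField:
    name: str
    type: str                          # "string", "double", "boolean", "enum", or "datetime"
    required: bool = False             # fields are optional unless marked required
    enum_values: Tuple[str, ...] = ()  # allowed values, used only when type == "enum"


@dataclass
class TagTemplate:
    name: str
    fields: Dict[str, TemplateField]   # keyed by field name; order carries no meaning

    def create_tag(self, resource: str, values: Dict[str, Any]) -> Dict[str, Any]:
        """Validate values against the template and return a tag for the resource."""
        unknown = set(values) - set(self.fields)
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        for f in self.fields.values():
            if f.name not in values:
                if f.required:
                    raise ValueError(f"missing required field: {f.name}")
                continue  # optional fields may be left out
            value = values[f.name]
            if f.type == "enum":
                if value not in f.enum_values:
                    raise ValueError(f"{f.name}: {value!r} is not one of {f.enum_values}")
            elif not isinstance(value, FIELD_TYPES[f.type]):
                raise TypeError(f"{f.name}: expected {f.type}, got {type(value).__name__}")
        return {"template": self.name, "resource": resource, "fields": dict(values)}


# Example template with one required and one optional field (names are illustrative).
template = TagTemplate(
    name="data_discovery",
    fields={
        f.name: f
        for f in (
            TemplateField("creation_date", "datetime", required=True),
            TemplateField("data_domain", "enum", enum_values=("FINANCE", "MARKETING")),
        )
    },
)

# Tagging: attach the template to a resource and assign values to its fields.
tag = template.create_tag(
    "my_dataset.my_table",
    {"creation_date": datetime(2020, 1, 15), "data_domain": "FINANCE"},
)
```

In this sketch, a misspelled enum value such as `"FINANCEE"` or a missing `creation_date` raises an error at tagging time instead of silently producing an inconsistent tag.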
Two common questions we hear about Data Catalog templates are: What kind of fields should go into a template and how should templates be organized? The answer to the first question really depends on what kind of metadata your organization wants to keep track of and how that metadata will be used. There are various metadata use cases, ranging from data discovery to data governance, and the requirements for each one should drive the contents of the templates.
Let’s look at a simple example of how you might organize your templates. Suppose the goal is to make it easier for analysts to discover data assets in a data lake because they spend a lot of time searching for the right assets. In that case, create a Data Discovery template, which would categorize the assets along the dimensions that the analysts want to search. This would include fields such as creation_date. If the data governance team wants to categorize the assets for data compliance purposes, you can create a separate template with governance-specific fields, such as storage_location. In other words, we recommend creating each template to represent a single concept, rather than placing multiple concepts into one template. This avoids confusing the people who use the templates and helps template administrators maintain them over time.
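As a rough illustration of the one-concept-per-template recommendation, the sketch below lays out a discovery template and a governance template as plain dictionaries and flags any field that shows up in more than one of them. Only `creation_date` and `storage_location` come from the text above; the other field names, the type labels, and the `overlapping_fields` helper are assumptions made for this example, and a shared field name is used here only as a simple proxy for mixed concepts.

```python
from typing import Dict, List

# Hypothetical template layouts: each template covers a single concept.
TEMPLATES: Dict[str, Dict[str, str]] = {
    "data_discovery": {"creation_date": "datetime", "data_domain": "enum"},
    "data_governance": {"storage_location": "string", "retention_days": "double"},
}


def overlapping_fields(templates: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Return field names defined in more than one template, with the templates that define them."""
    owners: Dict[str, List[str]] = {}
    for template_name, fields in templates.items():
        for field_name in fields:
            owners.setdefault(field_name, []).append(template_name)
    return {name: names for name, names in owners.items() if len(names) > 1}


# An empty result suggests the templates do not duplicate each other's fields.
overlaps = overlapping_fields(TEMPLATES)
```

A check like this could run whenever an administrator proposes a new template, so that a catch-all template mixing discovery and governance fields is noticed early.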