Exploring Dataset Metadata Between Projects with Data Catalog
Data Catalog is a fully managed, scalable metadata management service in Google Cloud's Data Analytics family of products.
Managing data assets can be time-consuming and expensive without the right tools. Data Catalog provides a centralized place where organizations can find, curate, and describe their data assets.
Using Data Catalog
There are two main ways you interact with Data Catalog:
Searching for data assets that you have access to
Tagging assets with metadata
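The search interaction above can be sketched in Python. This is a minimal illustration, not part of the lab: the helper names `build_search_query` and `search_assets` are hypothetical, and the sketch assumes the `google-cloud-datacatalog` client library is installed and that credentials are available in the environment. The query uses Data Catalog's qualified predicates such as `type=` and `column:`.

```python
from typing import Iterable, List, Optional


def build_search_query(
    keywords: Iterable[str] = (),
    asset_type: Optional[str] = None,
    column: Optional[str] = None,
) -> str:
    """Compose a Data Catalog search string (hypothetical helper).

    Terms separated by spaces are combined with an implicit AND.
    """
    terms: List[str] = list(keywords)
    if asset_type:
        terms.append(f"type={asset_type}")   # e.g. type=table or type=dataset
    if column:
        terms.append(f"column:{column}")     # match on a column name
    return " ".join(terms)


def search_assets(project_ids: Iterable[str], query: str) -> List[str]:
    """Run the query across the given projects (hypothetical helper).

    Assumption: google-cloud-datacatalog is installed and the caller is
    authenticated. Only assets the caller can access are returned.
    """
    from google.cloud import datacatalog_v1  # imported lazily on purpose

    client = datacatalog_v1.DataCatalogClient()
    scope = datacatalog_v1.SearchCatalogRequest.Scope(
        include_project_ids=list(project_ids)
    )
    results = client.search_catalog(request={"scope": scope, "query": query})
    return [result.linked_resource for result in results]


if __name__ == "__main__":
    # Build a query only; calling search_assets needs real project IDs.
    print(build_search_query(asset_type="table", column="email"))
```

Passing both lab project IDs to `search_assets` would search across the two projects in a single call, which mirrors the cross-project search performed later in the UI.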
What you will learn
In this lab, you will learn how to:
Explore a simulated enterprise environment of 2 projects, 2 datasets, and 2 user accounts.
Navigate through a BigQuery table manually in the UI.
Run queries to better understand the sensitive data columns that you will tag later.
Use Data Catalog to search for existing datasets across projects.
Use Data Catalog tag templates to tag assets with rich metadata.
Why is this useful?
View data assets across multiple projects in your organization.
Create reusable tag templates to add rich data descriptions for your teams.
Quickly highlight which datasets have PII (Personally Identifiable Information).
Metadata access control is inherited from the logged-in user's permissions, so no separate Data Catalog ACLs are needed.
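To make the idea of a reusable tag template for PII more concrete, here is a small sketch that builds a template definition as a plain Python dictionary in the shape of the Data Catalog REST `TagTemplate` resource. The template name, the field IDs `has_pii` and `pii_type`, and the enum values are all hypothetical choices for illustration; the lab may use different ones. Nothing is sent to the API here.

```python
import json
from typing import List, Optional


def make_field(
    display_name: str,
    primitive_type: Optional[str] = None,
    enum_values: Optional[List[str]] = None,
    required: bool = False,
) -> dict:
    """Build one template field (hypothetical helper).

    Exactly one of primitive_type (e.g. "BOOL", "STRING", "DOUBLE")
    or enum_values must be given.
    """
    if (primitive_type is None) == (enum_values is None):
        raise ValueError("give exactly one of primitive_type or enum_values")
    if primitive_type is not None:
        field_type = {"primitiveType": primitive_type}
    else:
        field_type = {
            "enumType": {
                "allowedValues": [{"displayName": v} for v in enum_values]
            }
        }
    return {"displayName": display_name, "type": field_type, "isRequired": required}


def make_pii_template() -> dict:
    """Assemble an illustrative PII template; all names are assumptions."""
    return {
        "displayName": "PII classification",
        "fields": {
            "has_pii": make_field("Has PII", primitive_type="BOOL", required=True),
            "pii_type": make_field(
                "PII type", enum_values=["EMAIL", "PHONE", "NONE"]
            ),
        },
    }


if __name__ == "__main__":
    print(json.dumps(make_pii_template(), indent=2))
```

Because the template is defined once and attached to many tables or columns, teams can mark PII consistently without redefining the fields for each dataset.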
Very Important: Before starting this lab, log out of your personal or corporate Gmail account, or run this lab in an Incognito window. This prevents sign-in confusion while the lab is running.