Exploring Dataset Metadata Between Projects with Data Catalog

1 hour 30 minutes, 1 credit

GSP789

Google Cloud Self-Paced Labs

Overview

Data Catalog is a fully managed, scalable metadata management service in Google Cloud's Data Analytics family of products.

Without the right tools, managing data assets can be time-consuming and expensive. Data Catalog provides a centralized place where organizations can find, curate, and describe their data assets.

Using Data Catalog

There are two main ways you interact with Data Catalog:

  • Searching for data assets that you have access to

  • Tagging assets with metadata
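
As a rough illustration of the first interaction, searching, here is a minimal sketch that uses the google-cloud-datacatalog Python client. It is not part of the lab itself. The helper names build_query and search_assets are hypothetical, and the sketch assumes the client library is installed and application credentials are configured. The column: qualifier comes from the Data Catalog search syntax.

```python
from typing import List, Sequence


def build_query(keyword: str, column: str = "") -> str:
    """Join a free-text keyword with an optional column-name qualifier."""
    terms = [keyword]
    if column:
        # "column:" restricts matches to assets that have such a column name.
        terms.append(f"column:{column}")
    return " ".join(terms)


def search_assets(project_ids: Sequence[str], query: str) -> List[str]:
    """Return the linked resource names of assets that match the query.

    Hypothetical helper: requires the google-cloud-datacatalog package and
    valid credentials. Only assets the caller can access are returned.
    """
    # Imported lazily so build_query stays usable without the client library.
    from google.cloud import datacatalog_v1

    client = datacatalog_v1.DataCatalogClient()
    scope = datacatalog_v1.SearchCatalogRequest.Scope(
        include_project_ids=list(project_ids)
    )
    results = client.search_catalog(request={"scope": scope, "query": query})
    return [result.linked_resource for result in results]
```

A call such as search_assets(["my-project-a", "my-project-b"], build_query("customers", column="email")) would search both projects at once; the project IDs and search terms here are placeholders, not values from the lab.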

What you will learn

In this lab, you will learn how to:

  • Explore a simulated enterprise environment of two projects, two datasets, and two user accounts.

  • Navigate through a BigQuery table manually in the UI.

  • Run queries to better understand the sensitive data columns that you will tag later.

  • Use Data Catalog to search for existing datasets across projects.

  • Use Data Catalog tag templates to tag assets with rich metadata.

Why is this useful?

  • View data assets across multiple projects in your organization.

  • Create reusable tag templates to add rich data descriptions for your teams.

  • Quickly highlight which datasets have PII (Personally Identifiable Information).

  • Metadata access control follows the permissions of the logged-in user, so no separate Data Catalog ACLs are needed.

Prerequisites

Very Important: Before starting this lab, log out of your personal or corporate Gmail account, or run this lab in an Incognito window. This prevents sign-in confusion while the lab is running.
