Exploring the Lineage of Data with Cloud Data Fusion

Checkpoints

Create a Cloud Data Fusion instance

Add Cloud Data Fusion API Service Agent role to service account

Import, Deploy, and Run the Shipment Data Cleansing pipeline

Import, Deploy, and Run the Delayed Shipments data pipeline

1 hour 30 minutes, 7 credits

GSP812

Google Cloud Self-Paced Labs

Overview

This lab shows how to use Cloud Data Fusion to explore data lineage: the data's origins and its movement over time.

Cloud Data Fusion data lineage helps you:

  • Detect the root cause of bad data events
  • Perform an impact analysis prior to making data changes

Cloud Data Fusion provides lineage at both the dataset level and the field level. Lineage is time-bound, so you can see how it changes across a selected time interval.

  • Dataset-level lineage shows the relationships between datasets and pipelines within a selected time interval.

  • Field-level lineage shows the operations that were performed on a set of fields in the source dataset to produce a different set of fields in the target dataset.
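
To make the two levels concrete, here is a minimal in-memory sketch in Python. It is not the Cloud Data Fusion API and does not reflect how the product stores lineage. The class names, dataset names, field names, and timestamps are all hypothetical, chosen only to illustrate the idea: dataset-level lineage is a time-filtered set of "dataset, pipeline, dataset" links, and field-level lineage is a backward walk over the operations that produced a field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineRun:
    """One pipeline run that read one dataset and wrote another (hypothetical model)."""
    pipeline: str
    reads: str
    writes: str
    run_time: int  # epoch seconds


@dataclass(frozen=True)
class Operation:
    """One field-level operation: input fields -> output fields (hypothetical model)."""
    name: str
    inputs: tuple
    outputs: tuple


def dataset_lineage(runs, start, end):
    """Dataset-level view: (source, pipeline, target) links for runs in [start, end]."""
    links = {(r.reads, r.pipeline, r.writes) for r in runs if start <= r.run_time <= end}
    return sorted(links)


def trace_field(target_field, operations):
    """Field-level view: walk backwards from a target field.

    Returns (operation names in discovery order, set of original source fields).
    """
    op_names = []
    source_fields = set()
    seen = set()
    frontier = [target_field]
    while frontier:
        field = frontier.pop()
        if field in seen:
            continue
        seen.add(field)
        producers = [op for op in operations if field in op.outputs]
        if not producers:
            # No operation produced this field, so it came from the source dataset.
            source_fields.add(field)
            continue
        for op in producers:
            if op.name not in op_names:
                op_names.append(op.name)
            frontier.extend(op.inputs)
    return op_names, source_fields


# Hypothetical example data, loosely modeled on the two pipelines used in this lab.
RUNS = [
    PipelineRun("Shipment Data Cleansing", "raw_shipments", "cleaned_shipments", 100),
    PipelineRun("Delayed Shipments", "cleaned_shipments", "delayed_shipments", 200),
]
OPERATIONS = [
    Operation("parse_timestamp", ("raw_date", "raw_time"), ("ship_timestamp",)),
    Operation("compute_delay", ("ship_timestamp", "promised_timestamp"), ("delay_hours",)),
]

if __name__ == "__main__":
    print(dataset_lineage(RUNS, start=0, end=150))
    print(trace_field("delay_hours", OPERATIONS))
```

Narrowing the interval passed to `dataset_lineage` drops links from runs outside it, which mirrors the time-bound behavior described above. `trace_field` is the root-cause direction (from a suspect output field back to its sources). An impact analysis would walk the same operations forward instead.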

In this lab, you will use two pipelines that demonstrate a typical scenario: raw data is cleaned and then sent on for downstream processing. You can explore this data trail, from the raw data to the cleaned shipment data to the analytic output, using the Cloud Data Fusion lineage feature.

Note: Currently, the Cloud Data Fusion Lineage feature is only available with the Cloud Data Fusion Enterprise Edition.
