Transform and Clean your Data with Dataprep by Trifacta on Google Cloud

Advanced 5 steps 7 hours 29 credits

Dataprep is Google's self-service data preparation tool built in collaboration with Trifacta. Learn the basics of cleaning and preparing data for analysis and visualization, all in the Google ecosystem. In this quest, you will learn how to connect Dataprep to your data in Cloud Storage and BigQuery, clean data using the interactive UI, profile the data, and publish your results back into the Google ecosystem. You will learn the basics of data transformation, including filtering values, reshaping the data, combining multiple datasets, deriving new values, and aggregating your dataset.

Objectives:

  • Understand what data preparation means for modern analytics
  • Understand how Dataprep is positioned in the Google Cloud Smart Analytics Suite
  • Assess data quality with Dataprep Profiling
  • Transform, filter, reshape, combine, calculate, and aggregate data visually
  • Schedule Dataprep jobs to transform data for BigQuery

Quest Outline


Using Google Cloud Dataprep

Cloud Dataprep is Google's self-service data preparation tool. In this lab, you will learn how to clean and enrich multiple datasets in Cloud Dataprep, using a fictional use case that includes customer information and purchase history.

Preparing and Aggregating Data for Visualizations using Cloud Dataprep

Dataprep by Trifacta is Google's self-service data preparation tool built in collaboration with Trifacta. In this lab you will learn some more advanced techniques with Dataprep.


Creating Advanced Data Transformations using Cloud Dataprep

In this lab, you will build upon a previous flow and learn some advanced tactics for preparing data.


Automating your BigQuery Data Pipeline with Cloud Dataprep

In this lab, you will examine how Dataprep can be used on complicated data structures in BigQuery.


Streaming IoT Core Data to Dataprep

Configure Cloud IoT Core and Cloud Pub/Sub by creating a Pub/Sub topic and a device registry on GCP. Using a simulated device, stream data to Google Cloud Storage, then design a Dataprep flow to analyze the data.


