Troubleshooting and Solving Data Join Pitfalls

1 hour. Credits: 5


Google Cloud Self-Paced Labs


BigQuery is Google's fully managed, NoOps, low-cost analytics database. With BigQuery you can query terabytes of data without managing any infrastructure or needing a database administrator. BigQuery uses SQL and offers a pay-as-you-go pricing model, so you can focus on analyzing data to find meaningful insights.

Joining data tables can provide meaningful insight into your dataset. However, when you join your data, there are common pitfalls that can corrupt your results. This lab focuses on avoiding those pitfalls. Types of joins:

  • Cross join: combines each row of the first dataset with each row of the second dataset, so every combination is represented in the output.
  • Inner join: requires that key values exist in both tables. Records appear in the results only if both tables contain a match for the key values.
  • Left join: each row in the left table appears in the results, regardless of whether there are matches in the right table. Where there is no match, the right table's columns are NULL.
  • Right join: the reverse of a left join. Each row in the right table appears in the results, regardless of whether there are matches in the left table.
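
The four join types above can be tried on a small example. The sketch below is illustrative only: it uses Python's built-in sqlite3 module rather than BigQuery, and the `products` and `inventory` tables and their contents are hypothetical, not part of the lab's ecommerce dataset. The right join is written as a left join with the table order swapped, because SQLite versions before 3.39 do not support the RIGHT JOIN keyword.

```python
import sqlite3

# Hypothetical toy tables (not the lab's ecommerce dataset).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (sku TEXT, name TEXT);
    CREATE TABLE inventory (sku TEXT, stock INTEGER);
    INSERT INTO products VALUES ('A1', 'Mug'), ('B2', 'Cap'), ('C3', 'Pen');
    INSERT INTO inventory VALUES ('A1', 10), ('B2', 0), ('D4', 7);
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Cross join: every product row paired with every inventory row (3 x 3).
cross_rows = run("SELECT p.sku, i.sku FROM products p CROSS JOIN inventory i")

# Inner join: only SKUs present in both tables.
inner_rows = run("""
    SELECT p.sku, p.name, i.stock
    FROM products p INNER JOIN inventory i ON p.sku = i.sku
    ORDER BY p.sku
""")

# Left join: every product; stock is NULL (None) when no inventory row matches.
left_rows = run("""
    SELECT p.sku, p.name, i.stock
    FROM products p LEFT JOIN inventory i ON p.sku = i.sku
    ORDER BY p.sku
""")

# Right join, emulated by swapping the tables in a left join:
# every inventory row; name is NULL (None) when no product matches.
right_rows = run("""
    SELECT i.sku, p.name, i.stock
    FROM inventory i LEFT JOIN products p ON p.sku = i.sku
    ORDER BY i.sku
""")

print(len(cross_rows))  # 9
print(inner_rows)       # [('A1', 'Mug', 10), ('B2', 'Cap', 0)]
print(left_rows)        # [('A1', 'Mug', 10), ('B2', 'Cap', 0), ('C3', 'Pen', None)]
print(right_rows)       # [('A1', 'Mug', 10), ('B2', 'Cap', 0), ('D4', None, 7)]
```

Note how 'C3' survives only in the left join and 'D4' only in the right join, while the inner join drops both.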

For more information about joins, see Join Page.

The dataset you'll use is an ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery. You have a copy of that dataset for this lab and will explore the available fields and rows for insights.

For syntax information to help you follow and update the queries, see Standard SQL Query Syntax.

What you'll do

In this lab, you perform these tasks:

  • Use BigQuery to explore a dataset

  • Troubleshoot duplicate rows in a dataset

  • Create joins between data tables

  • Understand each join type
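
As a rough illustration of why duplicate rows matter before joining, the sketch below shows one way a repeated join key can inflate an aggregate. It is not the lab's own procedure: it runs locally with Python's built-in sqlite3 module, and the `orders` and `catalog` tables are hypothetical stand-ins for real data.

```python
import sqlite3

# Hypothetical toy tables (not the lab's ecommerce dataset).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (sku TEXT, qty INTEGER);
    CREATE TABLE catalog (sku TEXT, name TEXT);
    INSERT INTO orders VALUES ('A1', 2), ('B2', 5);
    -- 'A1' appears twice under slightly different names: a duplicate key.
    INSERT INTO catalog VALUES ('A1', 'Mug'), ('A1', 'Mug (blue)'), ('B2', 'Cap');
""")

# Step 1: find join keys that occur more than once.
dup_keys = conn.execute("""
    SELECT sku, COUNT(*) AS n
    FROM catalog
    GROUP BY sku
    HAVING COUNT(*) > 1
""").fetchall()

# Step 2: a naive join matches the 'A1' order twice, so its qty is counted twice.
naive_total = conn.execute("""
    SELECT SUM(o.qty)
    FROM orders o JOIN catalog c ON o.sku = c.sku
""").fetchone()[0]

# Step 3: reduce the catalog to one row per key before joining.
dedup_total = conn.execute("""
    SELECT SUM(o.qty)
    FROM orders o
    JOIN (SELECT DISTINCT sku FROM catalog) c ON o.sku = c.sku
""").fetchone()[0]

print(dup_keys)     # [('A1', 2)]
print(naive_total)  # 9
print(dedup_total)  # 7
```

The true quantity ordered is 7; the naive join reports 9 because the duplicated key fans out the matching order row. Checking key uniqueness with GROUP BY and HAVING before a join is one simple way to catch this.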
