Troubleshooting and Solving Data Join Pitfalls





1 hour 5 Credits


Google Cloud Self-Paced Labs


BigQuery is Google's fully managed, NoOps, low-cost analytics database. With BigQuery, you can query terabytes of data without managing any infrastructure or needing a database administrator. BigQuery uses SQL and offers a pay-as-you-go pricing model, so you can focus on analyzing data to find meaningful insights.

Joining data tables can provide meaningful insight into your dataset. However, when you join your data, there are common pitfalls that could corrupt your results. This lab focuses on avoiding those pitfalls.

Types of joins:

  • Cross join: combines each row of the first dataset with each row of the second dataset, so every combination is represented in the output.
  • Inner join: requires that key values exist in both tables. Records appear in the results only if both tables contain a match for the key values.
  • Left join: each row in the left table appears in the results, regardless of whether there are matches in the right table.
  • Right join: the reverse of a left join. Each row in the right table appears in the results, regardless of whether there are matches in the left table.
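
The four join types above can be compared side by side on a pair of tiny tables. The following is a minimal sketch, not part of the lab itself: it uses Python's built-in sqlite3 module rather than BigQuery so it runs locally, and the `products` and `inventory` tables and their rows are hypothetical. Because older SQLite versions lack RIGHT JOIN, the right join is written here as a left join with the tables swapped, which returns the same rows.

```python
import sqlite3

# Hypothetical tables for illustration only; not the lab's ecommerce dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (sku TEXT, name TEXT);
    CREATE TABLE inventory (sku TEXT, stock INTEGER);
    INSERT INTO products VALUES ('A1', 'Mug'), ('B2', 'Pen'), ('C3', 'Cap');
    INSERT INTO inventory VALUES ('A1', 10), ('B2', 0), ('D4', 7);
""")


def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Cross join: every product paired with every inventory row (3 x 3 rows).
cross_rows = run(
    "SELECT p.sku, i.sku FROM products AS p CROSS JOIN inventory AS i"
)

# Inner join: only SKUs present in both tables.
inner_rows = run("""
    SELECT p.sku, i.stock
    FROM products AS p
    INNER JOIN inventory AS i ON p.sku = i.sku
    ORDER BY p.sku
""")

# Left join: every product, with NULL stock when inventory has no match.
left_rows = run("""
    SELECT p.sku, i.stock
    FROM products AS p
    LEFT JOIN inventory AS i ON p.sku = i.sku
    ORDER BY p.sku
""")

# Right join (products RIGHT JOIN inventory), written as a left join with
# the tables swapped: every inventory row, with NULL name when unmatched.
right_rows = run("""
    SELECT i.sku, p.name
    FROM inventory AS i
    LEFT JOIN products AS p ON p.sku = i.sku
    ORDER BY i.sku
""")

print(len(cross_rows), inner_rows, left_rows, right_rows)
```

In this sketch, 'C3' survives only in the left join and 'D4' only in the right join, while the inner join keeps just the two SKUs that both tables share.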

For more information about joins, see Join Page.

The dataset you'll use is an ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery. You have a copy of that dataset for this lab and will explore the available fields and rows for insights.

For syntax information to help you follow and update the queries, see Standard SQL Query Syntax.

What you'll do

In this lab, you perform these tasks:

  • Use BigQuery to explore a dataset

  • Troubleshoot duplicate rows in a dataset

  • Create joins between data tables

  • Understand each join type
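
One way duplicate rows can corrupt a join result is when the join key is not unique in one of the tables. The sketch below illustrates the idea under stated assumptions: it uses Python's built-in sqlite3 module instead of BigQuery, and the `sales` and `catalog` tables are hypothetical stand-ins, not the lab's data or its official solution.

```python
import sqlite3

# Hypothetical tables for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (sku TEXT, qty INTEGER);
    CREATE TABLE catalog (sku TEXT, name TEXT);
    INSERT INTO sales VALUES ('A1', 2), ('B2', 5);
    -- 'A1' is listed twice under different names, so sku is not unique.
    INSERT INTO catalog VALUES
        ('A1', 'Mug'), ('A1', 'Mug (blue)'), ('B2', 'Pen');
""")

# Step 1: find keys that occur more than once in the lookup table.
duplicate_keys = conn.execute("""
    SELECT sku, COUNT(*) AS n
    FROM catalog
    GROUP BY sku
    HAVING COUNT(*) > 1
""").fetchall()

# Step 2: a naive join repeats the 'A1' sale once per matching catalog row,
# which inflates the total quantity.
naive_total = conn.execute("""
    SELECT SUM(s.qty)
    FROM sales AS s
    JOIN catalog AS c ON s.sku = c.sku
""").fetchone()[0]

# Step 3: joining against the distinct keys keeps one row per sale.
deduped_total = conn.execute("""
    SELECT SUM(s.qty)
    FROM sales AS s
    JOIN (SELECT DISTINCT sku FROM catalog) AS c ON s.sku = c.sku
""").fetchone()[0]

print(duplicate_keys, naive_total, deduped_total)
```

Checking for repeated keys with GROUP BY and HAVING before joining is a cheap way to catch this kind of fan-out early; deduplicating on the key is only one possible remedy and may not suit every dataset.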
