Data Pipeline: Process Stream Data and Visualize Real Time Geospatial Data

40 minutes Credits: 7

GSP439

Google Cloud Self-Paced Labs

Overview

In this lab you will learn how to use Google Cloud Dataflow to process streaming data that is simulated in real time from a real-world historical data set, store the results in Google BigQuery, and then use Google Data Studio to visualize the geospatial data in real time.

Cloud Dataflow is a fully managed service for transforming and enriching data in stream (real-time) and batch (historical) modes via Java and Python APIs with the Apache Beam SDK. Cloud Dataflow provides a serverless architecture that can be used to shard and process very large batch data sets, or high-volume live streams of data, in parallel.
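To make the stream and batch distinction concrete, here is a minimal sketch in plain Python. It does not use the Apache Beam SDK or Cloud Dataflow; it only illustrates the idea that one enrichment function can be applied to a bounded (batch) source and to an unbounded-style (stream) source. The names `AIRPORT_TIMEZONES`, `enrich`, `run`, and `live_source`, and the sample records, are assumptions made for this illustration and are not taken from the lab.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical lookup table, used only for this illustration.
AIRPORT_TIMEZONES = {"ATL": "America/New_York", "LAX": "America/Los_Angeles"}


def enrich(event: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the event with a time zone field added."""
    enriched = dict(event)
    enriched["timezone"] = AIRPORT_TIMEZONES.get(event["airport"], "unknown")
    return enriched


def run(events: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Apply the same transform lazily to any iterable source."""
    for event in events:
        yield enrich(event)


# Batch mode: the source is a bounded list that already exists.
batch = [{"airport": "ATL"}, {"airport": "LAX"}]
batch_result = list(run(batch))


# Stream mode: the source is a generator that produces events one at a time.
def live_source() -> Iterator[Dict[str, str]]:
    yield {"airport": "LAX"}
    yield {"airport": "XYZ"}  # not in the lookup table


stream_result = list(run(live_source()))
```

In a managed service the sharding and parallel execution are handled for you; this sketch runs sequentially and only shows that the transform itself does not need to know which kind of source it is reading.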

Google BigQuery is a RESTful web service that enables interactive analysis of very large datasets, working in conjunction with Google Cloud Storage.

The data set used in this lab provides historical information about domestic flights in the United States, retrieved from the US Bureau of Transportation Statistics website.

Objectives

  • Create a Google Dataflow processing job for streaming data.

  • Generate real-time streaming data using Python.

  • Analyze streaming data in Google BigQuery.

  • Create a real-time geospatial dashboard for streaming data.
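One common way to generate real-time streaming data from a historical data set is to replay the records in timestamp order, waiting between events for a scaled-down version of the original gap. The sketch below shows only that scheduling step, using the Python standard library. It is not the lab's simulation script: the function `replay_delays`, the `speedup` parameter, and the sample event names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Iterator, Tuple

Event = Tuple[datetime, str]  # (historical timestamp, payload)


def replay_delays(events: Iterable[Event], speedup: float) -> Iterator[Tuple[float, str]]:
    """Yield (seconds_to_wait, payload) for each event in timestamp order.

    A speedup of 60.0 means one historical minute is replayed in one second.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    previous = None
    for timestamp, payload in sorted(events, key=lambda e: e[0]):
        # The first event is sent immediately; later ones wait for the scaled gap.
        gap = 0.0 if previous is None else (timestamp - previous).total_seconds()
        previous = timestamp
        yield gap / speedup, payload


# Hypothetical sample records, deliberately out of order.
start = datetime(2015, 1, 1, 8, 0, 0)
historical = [
    (start + timedelta(minutes=30), "arrived"),
    (start, "departed"),
    (start + timedelta(minutes=10), "wheelsoff"),
]

schedule = list(replay_delays(historical, speedup=60.0))
# schedule == [(0.0, "departed"), (10.0, "wheelsoff"), (20.0, "arrived")]
```

A publisher built on this would call `time.sleep(delay)` before sending each payload to a messaging service; that part is omitted here because it depends on the service the lab uses.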

Score (out of 100)

  • Run the simulation script: 30 points

  • Deploy the Google Dataflow Job to Process Stream Data: 20 points

  • Inspect the data in BigQuery: 20 points

  • Create a BigQuery view for Data Studio visualization: 30 points