ETL Processing on GCP Using Dataflow and BigQuery


2293 reviews

Alexis T. · Reviewed about 1 hour ago

A little too much copy-paste; I would like more learning material.

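Michael G. · Reviewed about 1 hour ago

Federico F. · Reviewed about 3 hours ago


Carlos L. · Reviewed about 6 hours ago

Huzaen J. · Reviewed about 13 hours ago

Muhammad A. · Reviewed about 14 hours ago

Puneet C. · Reviewed about 14 hours ago

Wei Lun K. · Reviewed about 15 hours ago

Ankit G. · Reviewed about 15 hours ago

Puneet C. · Reviewed about 15 hours ago

Chun Yu C. · Reviewed about 15 hours ago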

Good, but a little slow.

Eli K. · Reviewed about 16 hours ago

Sridhar E. · Reviewed about 16 hours ago

The lab needs more time. There is just enough time to give the Python code files a quick read; an additional 30 minutes would be good. The explanations in Step 5 and Step 6 (see below) just repeat the one from Step 2. Adding a block diagram like the ones in the Dataflow charts would be good. "You will now build a Dataflow pipeline with a TextIO source and a BigQueryIO destination to ingest data into BigQuery. More specifically, it will: ingest the files from GCS; filter out the header row in the files; convert the lines read to dictionary objects; output the rows to BigQuery."
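For readers following this review, the four steps quoted from the lab could be sketched roughly as below. This is only an illustration, not the lab's actual code: the column names in `HEADER`, the all-`STRING` schema, and the function names `is_not_header`, `line_to_row`, and `run` are assumptions made for the example. The Beam part needs the `apache-beam[gcp]` package and is imported lazily, so the two helper functions can be tried on their own.

```python
import csv

# Hypothetical column names; the real lab files may use a different header.
HEADER = "state,gender,year,name,number"
FIELDS = HEADER.split(",")


def is_not_header(line: str) -> bool:
    """Return True for data lines and False for the header row."""
    return line.strip() != HEADER


def line_to_row(line: str) -> dict:
    """Convert one CSV line into a dictionary keyed by column name."""
    values = next(csv.reader([line]))
    return dict(zip(FIELDS, values))


def run(input_pattern: str, output_table: str, argv=None) -> None:
    """Wire the steps into a Beam pipeline (requires apache-beam[gcp])."""
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # Assumption: every column is loaded as STRING to keep the sketch short.
    schema = ",".join(f"{name}:STRING" for name in FIELDS)

    with beam.Pipeline(options=PipelineOptions(argv)) as p:
        (
            p
            | "Read from GCS" >> beam.io.ReadFromText(input_pattern)
            | "Drop header" >> beam.Filter(is_not_header)
            | "To dict" >> beam.Map(line_to_row)
            | "Write to BigQuery" >> beam.io.WriteToBigQuery(
                output_table,
                schema=schema,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
            )
        )
```

Calling `line_to_row("KS,F,1923,Dorothy,654")` on its own shows the dictionary shape that the last step would send to BigQuery, which may help when reading the lab's Python files under time pressure.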

Sreedevi G. · Reviewed about 19 hours ago

Punyanuch S. · Reviewed about 20 hours ago

Andrew Z. · Reviewed about 22 hours ago

Dataflow takes a long time to shut down its workers. It can take longer than 3 minutes, which works out to about 50% of the whole process. That is very inefficient.

Thatchapoom T. · Reviewed about 23 hours ago

Graciela G. · Reviewed 1 day ago

Jorge A. · Reviewed 1 day ago

Nicholas F. · Reviewed 1 day ago

Dmytro R. · Reviewed 1 day ago

Cliff L. · Reviewed 1 day ago

atmi t. · Reviewed 2 days ago

Murali K. · Reviewed 2 days ago

Dataflow execution is too slow. It seems I need more practice.

Salvador L. · Reviewed 2 days ago