Run a Big Data Text Processing Pipeline in Cloud Dataflow
Dataflow is a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation. Because Dataflow is a managed service, it can allocate resources on demand to minimize latency while maintaining high utilization efficiency.
The Dataflow model combines batch and stream processing so developers don't have to make tradeoffs between correctness, cost, and processing time. In this lab, you'll learn how to run a Dataflow pipeline that counts the occurrences of unique words in a text file.
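To make the word-count idea concrete, here is a minimal sketch in plain Java of the logic such a pipeline computes. It does not use the Cloud Dataflow SDK and runs on a single machine. The class name `WordCountSketch`, the method `countWords`, and the choice to split on runs of non-letter characters and to count case-sensitively are all assumptions made for illustration, not details taken from the lab.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-machine illustration of counting unique words.
// It is not the lab's pipeline and does not use the Dataflow SDK.
public class WordCountSketch {

    // Splits each line on runs of non-letter characters (an assumed
    // tokenization rule) and tallies how often each token appears.
    public static Map<String, Long> countWords(Iterable<String> lines) {
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split("[^\\p{L}]+")) {
                // split() can yield an empty token at the start of a line.
                if (!word.isEmpty()) {
                    counts.merge(word, 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or not to be", "that is the question");
        countWords(lines).forEach((word, count) -> System.out.println(word + ": " + count));
    }
}
```

A managed pipeline would perform the same two steps, tokenizing and then summing per word, but distributed across workers rather than inside one loop.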
What you'll learn
- How to create a Maven project with the Cloud Dataflow SDK
- How to run an example pipeline using the Cloud Console
- How to delete the associated Cloud Storage bucket and its contents