Evaluating a Data Model
In this lab you will learn how to partition a data set into two separate parts: a training set that is used to develop a model, and a test set that is used to evaluate the accuracy of that model. Keeping the two sets separate lets you evaluate predictive models independently and in a repeatable manner. You will then re-create the model developed in a previous lab in this quest using the training data set and evaluate it against the test data set. The data is stored in Google BigQuery and the analysis is performed using Google Cloud Datalab.
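As a rough illustration of a repeatable partition, the sketch below assigns each record to the training or test set by hashing a key field, so the same record always lands in the same set on every run. It is not the lab's own code: the field names (date, dep_delay, arr_delay), the 70/30 proportion, the 15-minute threshold rule, and the sample records are all assumptions made for the example.

```python
import hashlib


def in_training_set(key, train_percent=70):
    """Return True if the key hashes into the training portion.

    Hashing (rather than random sampling) makes the split repeatable:
    the same key always maps to the same set.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < train_percent


def split_rows(rows, key_field="date", train_percent=70):
    """Partition rows into (train, test) lists based on a hash of key_field."""
    train, test = [], []
    for row in rows:
        target = train if in_training_set(row[key_field], train_percent) else test
        target.append(row)
    return train, test


def accuracy(rows, dep_delay_threshold=15, late_minutes=15):
    """Fraction of rows where a simple (hypothetical) rule is correct.

    Rule: predict a late arrival when the departure delay is at least
    dep_delay_threshold minutes; "late" means arr_delay >= late_minutes.
    """
    if not rows:
        return None
    correct = sum(
        (row["dep_delay"] >= dep_delay_threshold) == (row["arr_delay"] >= late_minutes)
        for row in rows
    )
    return correct / len(rows)


# Hypothetical sample records; real data would come from the flights table.
flights = [
    {"date": "2015-01-%02d" % day, "dep_delay": day % 40, "arr_delay": (day * 3) % 40}
    for day in range(1, 31)
]

train, test = split_rows(flights)
print("training rows:", len(train), "test rows:", len(test))
print("accuracy on test rows:", accuracy(test))
```

Splitting on a hash of the date, rather than on each row, keeps all flights from the same day in the same set. That is one reasonable design choice when records from the same day are likely to be correlated; other keys or proportions may suit a different data set better.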
The data set provides historic information about domestic flights in the United States, retrieved from the US Bureau of Transportation Statistics website. It can be used to demonstrate a wide range of data science concepts and techniques and is used in all of the other labs in the Data Science on GCP quest.
Cloud Datalab is a powerful interactive tool created to explore, analyze, transform and visualize data and build machine learning models on Google Cloud Platform. It runs on Google Compute Engine and connects to multiple cloud services such as Google BigQuery, Cloud SQL or simple text data stored on Google Cloud Storage so you can focus on your data science tasks.
Google BigQuery is a RESTful web service that enables interactive analysis of very large datasets, working in conjunction with Google Cloud Storage.