Optimizing Applications Using Cloud Profiler

Lab · 1 hour 30 minutes · 1 Credit · Introductory

GSP976

Google Cloud self-paced labs logo

Introduction

In this lab, you will deploy an inefficient Go application that is configured to collect profile data. You will learn how to use Cloud Profiler to view the application's profile data and identify potential optimizations. Finally, you will modify the application, redeploy it, and evaluate the effect of the modifications.

Objectives

In this lab, you will learn how to:

  1. Use Cloud Profiler to understand CPU cycle times of an inefficient application
  2. Maximize the number of queries per second a server can process
  3. Reduce the memory usage of an application by eliminating unnecessary memory allocations

Setup

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents any conflicts between your personal account and the Student account, which could cause extra charges to be incurred on your personal account.
  • Time to complete the lab---remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to the Google Cloud console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is the Lab Details panel with the following:

    • The Open Google Cloud console button
    • Time remaining
    • The temporary credentials that you must use for this lab
    • Other information, if needed, to step through this lab
  2. Click Open Google Cloud console (or right-click and select Open Link in Incognito Window if you are running the Chrome browser).

    The lab spins up resources, and then opens another tab that shows the Sign in page.

    Tip: Arrange the tabs in separate windows, side-by-side.

    Note: If you see the Choose an account dialog, click Use Another Account.
  3. If necessary, copy the Username below and paste it into the Sign in dialog.

    {{{user_0.username | "Username"}}}

    You can also find the Username in the Lab Details panel.

  4. Click Next.

  5. Copy the Password below and paste it into the Welcome dialog.

    {{{user_0.password | "Password"}}}

    You can also find the Password in the Lab Details panel.

  6. Click Next.

    Important: You must use the credentials the lab provides you. Do not use your Google Cloud account credentials. Note: Using your own Google Cloud account for this lab may incur extra charges.
  7. Click through the subsequent pages:

    • Accept the terms and conditions.
    • Do not add recovery options or two-factor authentication (because this is a temporary account).
    • Do not sign up for free trials.

After a few moments, the Google Cloud console opens in this tab.

Note: To view a menu with a list of Google Cloud products and services, click the Navigation menu (Navigation menu icon) at the top-left.

Activate Cloud Shell

Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on the Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.

  1. Click Activate Cloud Shell (Activate Cloud Shell icon) at the top of the Google Cloud console.

When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. The output contains a line that declares the PROJECT_ID for this session:

Your Cloud Platform project in this session is set to {{{project_0.project_id | "PROJECT_ID"}}}

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  2. (Optional) You can list the active account name with this command:
gcloud auth list
  3. Click Authorize.

Output:

ACTIVE: *
ACCOUNT: {{{user_0.username | "ACCOUNT"}}}

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
  4. (Optional) You can list the project ID with this command:
gcloud config list project

Output:

[core]
project = {{{project_0.project_id | "PROJECT_ID"}}}

Note: For full documentation of gcloud in Google Cloud, refer to the gcloud CLI overview guide.

Scenario

In this scenario, the server uses a gRPC framework, receives a word or phrase, and returns the number of times the word or phrase appears in the works of Shakespeare.

The average number of queries per second that the server can handle is determined by load testing the server. For each round of tests, a client simulator is called and instructed to issue 20 sequential queries. At the completion of a round, the number of queries sent by the client simulator, the elapsed time, and the average number of queries per second are displayed. The server code is inefficient (by design) for improvements to be made.

Task 1. Download and run the sample Application

In this lab, the sample application's execution time is just long enough to collect profile data. In practice, it's desirable to have at least 10 profiles before analyzing profile data.

Note: Profiler must be enabled in the project.
  1. In Cloud Shell, run the following commands:
gcloud services enable cloudprofiler.googleapis.com
git clone https://github.com/GoogleCloudPlatform/golang-samples.git
cd golang-samples/profiler/shakesapp
  2. Run the application with the version set to 1 and the number of rounds set to 15:
go run . -version 1 -num_rounds 15

If prompted, click Authorize on the modal dialog.

  3. Use the search bar at the top to enter Profiler and select the corresponding service from the drop-down when it appears.
Note: Wait until the application has run all 15 rounds before viewing profile information in Cloud Profiler.
  4. Once the application run completes, you will see a flame graph similar to the figure below displaying the application's profile data. Flame graphs make efficient use of screen real estate by representing a large amount of information in a compact, readable format. You can read more about how Cloud Profiler creates flame graphs on the Flame graphs documentation page.

Flame Graph

Notice that the Profile type is set to CPU time. This indicates that CPU usage data is displayed on the flame graph presented.

  5. The output after the application run in Cloud Shell should look similar to the following:
go run . -version 1 -num_rounds 15
2020/08/27 17:27:34 Simulating client requests, round 1
2020/08/27 17:27:34 Stackdriver Profiler Go Agent version: 20200618
2020/08/27 17:27:34 profiler has started
2020/08/27 17:27:34 creating a new profile via profiler service
2020/08/27 17:27:51 Simulated 20 requests in 17.3s, rate of 1.156069 reqs / sec
2020/08/27 17:27:51 Simulating client requests, round 2
2020/08/27 17:28:10 Simulated 20 requests in 19.02s, rate of 1.051525 reqs / sec
2020/08/27 17:28:10 Simulating client requests, round 3
2020/08/27 17:28:29 Simulated 20 requests in 18.71s, rate of 1.068947 reqs / sec
...
2020/08/27 17:44:32 Simulating client requests, round 14
2020/08/27 17:46:04 Simulated 20 requests in 1m32.23s, rate of 0.216849 reqs / sec
2020/08/27 17:46:04 Simulating client requests, round 15
2020/08/27 17:47:52 Simulated 20 requests in 1m48.03s, rate of 0.185134 reqs / sec

The output from Cloud Shell displays the elapsed time for each iteration and the average request rate.

When the application is started, the entry "Simulated 20 requests in 17.3s, rate of 1.156069 reqs / sec" indicates that the server is executing about 1 request per second.

By the last round, the entry "Simulated 20 requests in 1m48.03s, rate of 0.185134 reqs / sec" indicates that the server is executing about 1 request every 5 seconds.

Click Check my progress to verify the objective. Enable the Cloud Profiler API.

Task 2. Using CPU time profiles to maximize queries per second

One approach to maximizing the number of queries per second is to identify CPU intensive methods and optimize their implementations. In this section, you use CPU time profiles to identify a CPU intensive method on the server.

Identifying CPU time usage

The root frame of the flame graph lists the total CPU time used by the application over the collection interval of 10 seconds:

Root Frame

In the example above, the service used 11.3 s of CPU time over the 10-second collection interval. When the system runs on a single core, that corresponds to 113% utilization of the core! For more information on profiling, see the types of profiling available.

Task 3. Modifying the application

Step 1: Which function is CPU time intensive?

One way you can identify code that might need to be optimized is to view the table of functions and identify greedy functions:

  1. To view the table, click Focus function list.

Focus Function Button

  2. Sort the table by Total. The column labeled Total shows the CPU time usage of a function and its children.

In this example, GetMatchCount is the first shakesapp/server.go function listed. That function used 7.54 s, or 67%, of the application's total CPU time. This function handles the gRPC requests.

Focus Functions

The flame graph shows that the shakesapp/server.go function GetMatchCount calls MatchString, which in turn is spending most of its time calling Compile:

Flame Graph Profile

Step 2: How can you use what you've learned?

  1. Rely on your language expertise. MatchString is a regular-expression method. In general, regular-expression processing is very flexible, but not necessarily the most efficient solution for every problem.

  2. Rely on your application expertise. The client is generating a word or phrase, and the server is searching for this phrase.

  3. Search the implementation of the shakesapp/server.go method GetMatchCount for uses of MatchString, and then determine if a simpler, more efficient function could replace that call.

Step 3: How can you change the application?

In the file shakesapp/server.go, the existing code contains one call to MatchString:

isMatch, err := regexp.MatchString(query, line)
if err != nil {
	return resp, err
}
if isMatch {
	resp.MatchCount++
}

One option is to replace the MatchString logic with equivalent logic that uses strings.Contains.

Click the Open Editor button at the top of Cloud Shell and, when the editor opens, open the file server.go at the path /profiler/shakesapp/shakesapp/server.go. Replace the MatchString logic shown above with the snippet below, then save the file.

if strings.Contains(line, query) {
	resp.MatchCount++
}

Note: Be sure to REMOVE the import statement for the regexp package at the top of the server.go file.

Task 4. Evaluate the change

To evaluate the change, do the following:

  1. Run the following command in Cloud Shell to set the application version to 2:
go run . -version 2 -num_rounds 40
  2. Wait for the application to complete, and then view the profile data for this version of the application:
  • Click NOW to load the most recent profile data. For more information, see Range of time.
  • In the Version menu, select 2.

One example flame graph is shown below:

version 2

In this figure, the root frame shows a value of 8.96 s. As a result of changing the string-match function, the CPU time used by the application decreased from 11.3 seconds to 8.96 seconds, or the application went from using 113% of a CPU core to using 89.6% of a CPU core.

The frame width is a proportional measure of the CPU time usage. In this example, the width of the frame for GetMatchCount indicates that function uses about 29.5% of all CPU time used by the application. In the original flame graph, this same frame was about 67% of the width of the graph. To view the exact CPU time usage, you can use the frame tooltip or you can use the Focus function list:

2023/03/27 23:38:35 Simulating client requests, round 33
2023/03/27 23:38:38 Simulated 20 requests in 3.13s, rate of 6.389776 reqs / sec
2023/03/27 23:38:38 Simulating client requests, round 34
2023/03/27 23:38:41 Simulated 20 requests in 3.18s, rate of 6.289308 reqs / sec
2023/03/27 23:38:41 Simulating client requests, round 35
2023/03/27 23:38:44 Simulated 20 requests in 3.16s, rate of 6.329114 reqs / sec
2023/03/27 23:38:44 Simulating client requests, round 36
2023/03/27 23:38:47 Simulated 20 requests in 3.02s, rate of 6.622517 reqs / sec
2023/03/27 23:38:47 Simulating client requests, round 37
2023/03/27 23:38:50 Simulated 20 requests in 3.09s, rate of 6.472492 reqs / sec
2023/03/27 23:38:50 Simulating client requests, round 38
2023/03/27 23:38:54 Simulated 20 requests in 3.17s, rate of 6.309148 reqs / sec
2023/03/27 23:38:54 Simulating client requests, round 39
2023/03/27 23:38:57 Simulated 20 requests in 3.26s, rate of 6.134969 reqs / sec
2023/03/27 23:38:57 Simulating client requests, round 40
Cloud Profiler: 2023/03/27 23:38:58 successfully created profile CPU
2023/03/27 23:39:00 Simulated 20 requests in 3.13s, rate of 6.389776 reqs / sec

The small change to the application had two different effects:

  • The number of requests per second increased from less than 1 per second to greater than 5 per second.
  • The CPU time per request, computed by dividing the CPU utilization by the number of requests per second, decreased.

Task 5. Using allocated heap profiles to improve resource usage

In this task, you will learn how to use the heap and allocated heap profiles to identify an allocation-intensive method in the application.

  • Heap profiles show the amount of memory allocated in the program's heap at the instant the profile is collected.
  • Allocated heap profiles show the total amount of memory that was allocated in the program's heap during the interval in which the profile was collected. By dividing these values by 10 seconds, the profile collection interval, you can interpret these as allocation rates.

Enabling heap profile collection

  1. Back in Cloud Shell, run the application with the application version set to 3 and enable the collection of heap and allocated heap profiles.
cd ~/golang-samples/profiler/shakesapp/
go run . -version 3 -num_rounds 40 -heap -heap_alloc
  2. Wait for the application to complete, and then view the profile data for this version of the application:
  • Click NOW to load the most recent profile data.
  • In the Version menu, select 3.
  • In the Profiler type menu, select Allocated heap.

One example flame graph is shown below:

Allocated Heap

Identifying the heap allocation rate

The root frame displays the total amount of heap that was allocated during the 10 seconds when a profile was collected, averaged over all profiles. In this example, the root frame shows that, on average, 2.303 GiB of memory was allocated.

Task 6. Modifying the application

Step 1: Is it worth minimizing the rate of heap allocation?

The CPU time usage of the Go background garbage collection function, runtime.gcBgMarkWorker.*, can be used to determine if it's worth the effort to optimize an application to reduce garbage collection costs:

  • Skip optimization if the CPU time usage is less than 5%.
  • Optimize if the CPU time usage is at least 25%.

For this example, the CPU time usage of the background garbage collector is 15.8%. This value is high enough that it's worth attempting to optimize shakesapp/server.go:

GC Worker

Step 2: Which function allocates a lot of memory?

The file shakesapp/server.go contains two functions that might be targets for optimization: GetMatchCount and readFiles. To determine the rate of memory allocation for these functions, set the Profile type to Allocated heap, and then use the Focus function list.

In this example, the total heap allocation for readFiles.func1 during the 10 second profile collection is, on average, 1.71 GiB or 74% of the total allocated memory. The self heap allocation during the 10 second profile collection is 336.6 MiB:

allocated heap focus

In this example, the Go method io.ReadAll allocated 1.361 GiB during the 10 second profile collection, on average. The simplest way to reduce these allocations is to reduce calls to io.ReadAll. The function readFiles calls io.ReadAll through a library method.

Step 3: How can you change the application?

One option is to modify the application to read the files once and then reuse that content. For example, you could make the following changes. As in the previous task, open the Cloud Shell Editor and modify the server.go file, located at golang-samples/profiler/shakesapp/shakesapp:

  1. Define a global variable files to store the results of the initial file read:
var files []string

Place this under the package imports at the top of the server.go file.

import (
	"context"
	"fmt"
	"io/ioutil"
	"strings"

	"cloud.google.com/go/storage"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

// PLACE THE FILES VARIABLE HERE
var files []string
  2. Modify readFiles to return early when files is defined:
func readFiles(ctx context.Context, bucketName, prefix string) ([]string, error) {
	// return if defined
	if files != nil {
		return files, nil
	}

Replace the logic below:

ret := make([]string, len(paths))
for i := 0; i < len(paths); i++ {
	r := <-resps
	if r.err != nil {
		err = r.err
	}
	ret[i] = r.s
}
return ret, err

With the following snippet:

// Save the result in the variable files
files = make([]string, len(paths))
for i := 0; i < len(paths); i++ {
	r := <-resps
	if r.err != nil {
		return nil, r.err
	}
	files[i] = r.s
}
return files, nil

After making these changes, the readFiles function should look like this:

func readFiles(ctx context.Context, bucketName, prefix string) ([]string, error) {
	// return if defined
	if files != nil {
		return files, nil
	}

	type resp struct {
		s   string
		err error
	}

	client, err := storage.NewClient(ctx, option.WithoutAuthentication())
	if err != nil {
		return []string{}, fmt.Errorf("failed to create storage client: %s", err)
	}
	defer client.Close()

	bucket := client.Bucket(bucketName)

	var paths []string
	it := bucket.Objects(ctx, &storage.Query{Prefix: bucketPrefix})
	for {
		attrs, err := it.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return []string{}, fmt.Errorf("failed to iterate over files in %s starting with %s: %w", bucketName, prefix, err)
		}
		if attrs.Name != "" {
			paths = append(paths, attrs.Name)
		}
	}

	resps := make(chan resp)
	for _, path := range paths {
		go func(path string) {
			obj := bucket.Object(path)
			r, err := obj.NewReader(ctx)
			if err != nil {
				resps <- resp{"", err}
			}
			defer r.Close()
			data, err := ioutil.ReadAll(r)
			resps <- resp{string(data), err}
		}(path)
	}

	files = make([]string, len(paths))
	for i := 0; i < len(paths); i++ {
		r := <-resps
		if r.err != nil {
			return nil, r.err
		}
		files[i] = r.s
	}
	return files, nil
}

Save the file and click Open Terminal to go back to Cloud Shell.

Task 7. Evaluating the change

To evaluate the change, do the following:

  1. Run the application with the application version set to 4:
go run . -version 4 -num_rounds 60 -heap -heap_alloc
  2. Wait for the application to complete, and then view the profile data for this version of the application:
  • Click NOW to load the most recent profile data.
  • In the Version menu, select 4.
  • In the Profiler type menu, select Allocated heap.
  3. To quantify the effect of changing readFiles on the heap allocation rate, compare the allocated heap profiles for version 4 with those collected for version 3:

heap compare

  4. In this example, the root frame's tooltip shows that with version 4, the average amount of memory allocated during profile collection decreased by 1.301 GiB compared to version 3. The tooltip for readFiles.func1 shows a decrease of 1.045 GiB.

  5. To determine whether the change affects the number of requests per second handled by the application, view the output in Cloud Shell.

In the example below, version 4 completes up to 15 requests per second, which is substantially higher than the ~5 requests per second of version 3:

2020/08/27 21:51:42 Stackdriver Profiler Go Agent version: 20200618
2020/08/27 21:51:42 profiler has started
2020/08/27 21:51:42 creating a new profile via profiler service
2020/08/27 21:51:44 Simulated 20 requests in 1.47s, rate of 13.605442 reqs / sec
2020/08/27 21:51:44 Simulating client requests, round 2
2020/08/27 21:51:45 Simulated 20 requests in 1.3s, rate of 15.384615 reqs / sec
2020/08/27 21:51:45 Simulating client requests, round 3
2020/08/27 21:51:46 Simulated 20 requests in 1.31s, rate of 15.267176 reqs / sec

The increase in queries per second served by the application might be due to less time being spent on garbage collection.

Congratulations!

In this lab, you explored Cloud Profiler, part of the Google Cloud operations suite, which allows Site Reliability Engineers (SREs) to investigate and diagnose issues with deployed workloads.

CPU time and allocated heap profiles were used to identify potential optimizations to an application. The goals were to maximize the number of requests per second and to eliminate unnecessary allocations.

By using CPU time profiles, a CPU intensive function was identified. After applying a simple change, the server's request rate increased.

By using allocated heap profiles, the shakesapp/server.go function readFiles was identified as having a high allocation rate. After optimizing readFiles, the server's request rate increased to 15 requests per second.

Finish your quest

This self-paced lab is part of the Cloud Architecture and DevOps Essentials quests. A quest is a series of related labs that form a learning path. Completing a quest earns you a badge to recognize your achievement. You can make your badge or badges public and link to them in your online resume or social media account. Enroll in any quest that contains this lab and get immediate completion credit. Refer to the Google Cloud Skills Boost catalog for all available quests.

Take your next lab

Continue your quest or check out these suggested materials:

Next steps

Google Cloud training and certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Manual Last Updated: August 16, 2023

Lab Last Tested: August 16, 2023

End your lab

When you have completed your lab, click End Lab. Your account and the resources you've used are removed from the lab platform.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2024 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.