Google Cloud Storage Performance
Google Cloud Storage (GCS) is billed as “a unified object storage solution” … or in layman’s terms, it’s a place in the cloud to host and serve files.
GCS has a wide range of uses. For example, some people host their static websites there, some ship game content from there, others use it to power their backup services, and let’s not forget that it can host and serve content just like a CDN.
That’s all fine and dandy, but what I’m concerned with is: how would you test its performance?
Direct test — Read performance
As we looked at in previous posts, it’s really easy to just throw a curl command around to see how GCS performs when fetching an asset.
First, place an asset on a server, say, in North America (maybe near a BBQ restaurant), and then fetch that asset from a bunch of machines located around the world.
for ((i=0;i<500;i++)); do (time curl -s -o /dev/null http://examplecdn.cdn/asset.png) 2>&1; done | grep real
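If you'd rather get a summary than 500 raw `real` lines, here's a minimal Python sketch of the same idea. It's only an illustration, not the script behind the charts in this post: the helper names (`fetch`, `time_calls`, `summarize`) are made up for this example, and the URL is the same placeholder used in the curl loop.

```python
import statistics
import time
import urllib.request


def fetch(url):
    """Download the whole response body and throw it away."""
    with urllib.request.urlopen(url) as response:
        response.read()


def time_calls(action, count):
    """Run `action` `count` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(count):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return min, median, 95th percentile (nearest rank) and max."""
    ordered = sorted(durations)
    # Nearest-rank p95: ceil(0.95 * n) - 1, done in integer math.
    p95_index = (len(ordered) * 95 + 99) // 100 - 1
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


# Example (placeholder URL, as in the curl loop above):
#   url = "http://examplecdn.cdn/asset.png"
#   print(summarize(time_calls(lambda: fetch(url), 500)))
```

Looking at the median and p95 together tends to be more useful than a single average, since a handful of slow fetches can hide behind it.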
The graph below shows our asset being fetched from machines in Asia, Australia and Europe.
Note: Caching was not enabled for these assets, so the numbers are higher than what you’d see in the CDN article. Also, this bucket was intentionally set to Regional (instead of Multi-Regional). We can talk about the perf differences, but that’s a topic for another article.
Now, this bucket was set up to be Regional, which means it isn’t optimized for being accessed across regions, so you can expect that the farther away the request comes from, the slower the response will be.
Direct test — Write performance
We’ve seen read performance be really strong with GCS in the past; write performance, however, is much trickier to get right. In some following articles, I’ll highlight a few gotchas that some developer friends ran into, but for now, let’s just upload the same file, under different names, to a bucket in us-central1 from various places in the world.
Now, if you run this test yourself, be careful about how you do your timings. There are two main ways to upload data to GCS: via `gsutil cp` and via a language-specific API. I’ve charted the performance of both below.
Note: These tests were done on a bucket that was intentionally set to Regional (instead of Multi-Regional). We can talk about the perf differences, but that’s another article.
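For the language-specific API side, a rough sketch of how the timing could be done in Python looks like this. It assumes the `google-cloud-storage` client library and is not the exact script used for the charts here; the `timed_uploads` helper, the `perf-test/asset` prefix, and the bucket name in the usage comment are all hypothetical.

```python
import time


def timed_uploads(bucket, local_path, count, prefix="perf-test/asset"):
    """Upload one local file `count` times under different object names.

    `bucket` is expected to behave like a google.cloud.storage Bucket:
    bucket.blob(name) returns an object with upload_from_filename(path).
    Only the upload call itself is timed, not client or bucket setup.
    """
    durations = []
    for i in range(count):
        blob = bucket.blob(f"{prefix}-{i}.png")
        start = time.perf_counter()
        blob.upload_from_filename(local_path)
        durations.append(time.perf_counter() - start)
    return durations


# Usage against a real bucket (needs credentials and
# `pip install google-cloud-storage`; the bucket name is a placeholder):
#   from google.cloud import storage
#   bucket = storage.Client().bucket("your-bucket-name")
#   print(timed_uploads(bucket, "asset.png", 100))
```

The client and bucket objects are created once and reused for every upload, so each measurement covers just the transfer and not any one-time setup.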
We can see that write performance for a 600k file is really good over time; however, we see a significant difference between the gsutil and Python versions. The reason is that our test script invokes gsutil once per uploaded asset, which creates a new process every time. That isn’t ideal in terms of processor usage, and may not directly be gsutil’s fault.
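To get a feel for how much a process-per-asset loop can cost, here's a small sketch that compares spawning a fresh Python interpreter with making a plain in-process call. gsutil is itself a Python program, so each invocation pays at least an interpreter startup; this sketch measures only that bare startup, not gsutil's own imports or authentication, so treat it as a lower bound and not as a gsutil benchmark. The function names are made up for this example.

```python
import subprocess
import sys
import time


def average_seconds(action, count):
    """Average wall-clock time of calling `action` `count` times."""
    start = time.perf_counter()
    for _ in range(count):
        action()
    return (time.perf_counter() - start) / count


def spawn_interpreter():
    """Start a new Python process that does nothing and wait for it to exit."""
    subprocess.run([sys.executable, "-c", "pass"], check=True)


def in_process_noop():
    """Stand-in for a call made inside an already-running process."""
    pass


def compare(count=20):
    """Return the average per-call cost of both approaches, in seconds."""
    return {
        "spawn": average_seconds(spawn_interpreter, count),
        "in_process": average_seconds(in_process_noop, count),
    }


# Example:
#   print(compare())
```

Whatever the spawn number is on your machine, it gets added to every single upload when a script shells out once per file.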
Note: There’s also a multithreaded upload flag, but we’ll talk about that in a future article.
Direct read/write performance is easy to test. But in all honesty, there are lots of variables that influence performance, and writing proper benchmarks for all of them is a pain. As we will talk about later, performance can vary greatly depending on the size of your objects, or on whether you’re creating, deleting, uploading or downloading. Thankfully, you don’t have to write all those tests yourself. The gsutil perfdiag command runs a suite of diagnostic tests for a given Google Cloud Storage bucket:
- Read/Write throughput tests (1 MiB file(s))
- Latency tests for Delete, Download, Metadata & Upload (for 0 B, 1 KiB and 100 KiB)
Simply running `gsutil perfdiag -o test.json gs://<your bucket name>` can get you something that looks like this:
The perfdiag command is provided so that customers can run a known measurement suite when troubleshooting performance problems.
Now that we have a clear picture of the proper way to test GCS read/write performance, we can get on with helping some of our developers and their performance problems. Stay tuned to Google Cloud Performance Atlas (YT Video, Medium Blog), and subscribe / follow to get more great content!