Analyze your Google Cloud Platform Bills w/BigQuery & Data Studio
The current detailed billing transaction view for Google Cloud Platform (GCP) is, to say the least, in its infancy. It does little to help you understand where your money is going, and this how-to shows you how to fix that.
We will build this (or something a little nicer):
If you have read my previous posts on GCP (here and here) you will notice that I have started to spend some small-order dollars on their services. If you choose to do the same (or spend large-order dollars), you will likely want more detail than the bill you receive, or the current interface, can provide. This how-to walks you through setting up Data Studio to analyze your bills using the data exported to BigQuery.
Before we get into BigQuery and Data Studio, a quick word on cost. The cost of Data Studio is currently free, sweet! The cost of BigQuery is a bit more complicated (see here).
As of writing, BigQuery is $0.02/GB stored, $0.05/GB streamed in, and $5/TB queried. My current GCP bill is on the order of MB/month, and this will cost me pennies per month (if not nothing). Even if you are running thousands of instances and utilizing several of GCP's offerings, your bill will still likely only be low-order GB/month (but do the math!). In that case it will still be pretty cheap to store your data, retrieve your data, and even query your data (assuming you are not querying every second and are just viewing reports daily or weekly).
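To sanity-check your own numbers, here is a back-of-envelope estimate using the prices quoted above (the usage figures below are made-up examples; plug in your own):

```python
# Rough BigQuery cost estimate using the prices quoted above:
# $0.02/GB stored, $0.05/GB streamed in, $5/TB queried.
STORAGE_PER_GB_MONTH = 0.02
STREAM_PER_GB = 0.05
QUERY_PER_TB = 5.00

def monthly_cost(stored_gb, streamed_gb, queried_tb):
    """Return the estimated monthly BigQuery cost in dollars."""
    return (stored_gb * STORAGE_PER_GB_MONTH
            + streamed_gb * STREAM_PER_GB
            + queried_tb * QUERY_PER_TB)

# Example: ~50 MB stored, ~50 MB streamed in, ~1 GB scanned by daily reports.
print(round(monthly_cost(0.05, 0.05, 0.001), 4))  # well under a penny
```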
Under Billing > Transactions you will have access to the type of resource utilized, the time period aggregate (first of month until now), the amount used as a total, and the total cost. You can immediately see how limiting this view of your bill is just from the below screenshot.
For example, if you run multiple App Engine applications you will not be able to break it down any further.
The most detailed bill you will be able to see in the GCP interface is the Detailed transaction view:
So what to do from here? There are a couple of options. Most folks will want to enable the Billing export so that they get a daily detailed bill (a file) that they can analyze in more detail in MS Excel or whatever tools they are familiar with. To enable it, just jump down to the Billing export item in the Billing menu and click through the dialogs.
There is also a BigQuery export, which is where we will focus, because it is simpler, more sustainable, and, with Data Studio, a logical fit.
For the File export, it will take a day for the file to appear in the bucket you specify (you must create the bucket ahead of time). Once it is available, you can download it for review.
For the BigQuery export you need to create a dataset first. Follow the dialogs and once in the BigQuery UI, click the small down arrow to the right of your project name > Create new dataset. Give it a name, location, and data expiration.
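If you prefer the command line, you can sketch the same dataset creation with the bq tool (the project, dataset name, location, and expiration below are all example placeholders; check `bq mk --help` for your version's exact flags):

```shell
# Create a "billing" dataset for the export; tables expire after ~1 year.
# YOUR_PROJECT and the dataset name "billing" are placeholders.
bq mk --dataset \
  --data_location US \
  --default_table_expiration 31536000 \
  YOUR_PROJECT:billing
```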
Once this is done, head back to the Billing interface and Enable BigQuery export.
There are obviously costs associated with GCS (storage) and BigQuery when you enable either of these options and you may want to review them prior to enabling as discussed in the cost section above.
Within a couple of hours you will start to see the BigQuery table, while for the File export you will likely need to wait overnight. There is no backfill for either export; this is strictly move-forward. If you want to analyze previous costs, unfortunately there isn't much you can do.
Analyzing Costs: BigQuery w/Data Studio
There is a lot you could do with your cost data in BigQuery, and I am sure large companies have a set of views they refer to at some interval. As for my use case, I would like to get visibility into two items:
- What is my day/day cost?
- What is the per-GAE application cost?
The subtext to most of these questions is, "Where can you save money? What can you shut down?" I will not be covering that in this how-to.
While looking at tools to answer these questions (and to make sure I was not reinventing the wheel), I ran into Google Data Studio (link). It is a beta offering with a BigQuery connector for visualizing data. The beginning of this tutorial is also well documented in Google's example (link) if you prefer to follow it there.
After you click the sign up button, you are dropped into a somewhat familiar Google Drive looking interface albeit with a few more options.
To create our first report, click the + button in the bottom right and answer the getting started prompts. You will have to click it a second time once complete.
In the bottom right: Create new data source > BigQuery > Authorize > My Projects > YOUR_PROJECT > YOUR_DATASET (billing) > gcp_billing_export_FOO > Connect.
Once you select Connect, the field names should show up with the proper types and aggregations. Select Add to Report in the top right; you should not need to make any changes.
You will need to authorize Google Drive and then it will let you continue.
At this point you will have a blank canvas in front of you, how exciting!
Let’s start by doing three things:
- Give the report a name: GCP Cost Summary
- Change the Theme to Simple Dark
- Create your first element, a Cost by Day Time series
For the name, click on Untitled Report in the top-left and update it to GCP Cost Summary.
For the theme, on the right-bar, select THEME, and change it from Simple to Simple Dark.
For the first element, click the Time series button in the menu bar and then click in the top left of the report canvas and drag to mid-center right to create a chart on the top half of the canvas.
If you are a bit like me, your brain may have just exploded a little bit [insert annoying GIF]. As you can see, a lot of magic just happened: somehow, the correct Dimension and Metric were selected to draw this element. This is clearly the future. Wow. Impressed. </endpraise>
In the right-hand bar, notice the Time Dimension and Cost metric / fields have been selected for you!
There isn’t much that needs to change here since you now have a Cost by Day Time Series. You can go through the options on the right-hand bar to adjust the date range to custom or a previous period, add additional filtering, and on the Style tab you can go a bit crazy if you like. For me, I will leave it as-is since this is good enough(tm) for me at this point.
Now that we have our first goal achieved, let's add a cost-per-service graph to make it easy to visualize how much we are spending on GAE vs. GCE. To do this, we will select a Bar chart, since I think that will look best, and click/drag to create it in the bottom-left quarter panel.
Whoa, what just happened? Did it select the Dimension and Metric exactly as I intended, without my having to select them manually? Magic. Absolute magic. </endpraise>
Again, you could use the Style tab to tweak this further, but it is fine for me at this time.
Our canvas/report is looking pretty fantastic at this point:
But now we see that we are spending 5x as much on GAE as on GCE, so we need a bit more information about GAE to feel warm and fuzzy about where our money is going. That will require one more element.
Filtering by GAE Application
If you run multiple GAE Applications under a single project and single billing agreement you will eventually want to split them out on your bill. This is probably easiest done by using separate projects and/or billing agreements but at the end of the day, moving them to separate projects just may not make sense or may be too difficult due to various reasons (networking configuration, ease of use by developers, et cetera).
To be able to filter by application, we need to look at what data is available for filtering. This requires taking a peek at the data in the fields that are being exported auto-magically.
Head over to the BigQuery interface (it looks like 2010, don’t be scared): https://bigquery.cloud.google.com.
Select your billing table and dataset on the left:
And at the top of the page, select Preview:
The raw table you are now looking at is your billing data. You will see entries for each service in one-hour start/end time increments. Since we are only going to focus on App Engine, let's write a query to filter for just those entries. The field name, as you may notice, is product.
In the top-right, select Query Table and try a query like this:
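Something like the following should do (legacy SQL, as the old BigQuery UI defaults to; the table name gcp_billing_export_FOO is a placeholder for yours):

```sql
SELECT product, resource_type, start_time, end_time, cost
FROM [YOUR_PROJECT:billing.gcp_billing_export_FOO]
WHERE product = 'App Engine'
```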
It may be a bit hard to read since the cost and end times are not both ordered, so let’s add a little more ordering:
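An ORDER BY takes care of that; for example, newest first and most expensive first within each hour (table name again a placeholder):

```sql
SELECT product, resource_type, start_time, end_time, cost
FROM [YOUR_PROJECT:billing.gcp_billing_export_FOO]
WHERE product = 'App Engine'
ORDER BY end_time DESC, cost DESC
```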
Very cool: you can now view the raw billing data in a somewhat digestible manner. For me, and possibly for others, this is a bit enlightening.
My most expensive line item is Flex Instance Core Hours for a single application, as indicated in the labels_key column. The cost halved because of the post I did here, where I scaled the application down from two instances to one, since that was all it needed.
If you scroll ahead in your bill using the Next button at the bottom, and you run Standard Environment instances, you will notice something interesting: there is no labels_key value. This seems like a problem. I have raised it with Google Cloud (Tweet). We won't address it at this time.
Even without Standard Environment labels the bulk of our cost is in flexible instance core hours so it doesn’t matter much (for me).
Now that we have an understanding of the columns, we can head back to our canvas to build one more graph that will give per-GAE Flexible Environment Application cost. Since we haven’t used an Area chart yet, let’s use that (it may be helpful later too when we want to stack the costs per App).
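For reference, the aggregation this chart will do is roughly equivalent to the following BigQuery query (a legacy-SQL sketch; the table name is a placeholder):

```sql
SELECT labels.key AS app, SUM(cost) AS total_cost
FROM [YOUR_PROJECT:billing.gcp_billing_export_FOO]
WHERE product = 'App Engine'
GROUP BY app
ORDER BY total_cost DESC
```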
Draw the chart in the bottom-right in the remaining space but leave ~1" under the chart because we will be adding one more control.
Let's start by adding a Filter to show only App Engine costs. With the chart selected, select the Add a filter button in the right-hand bar.
Click the Create a Filter button at the bottom. Give the filter a name (App Engine Only) and set it to Include, product, Equal to, App Engine.
Once completed, click Save in the bottom right. It will automatically apply the filter to the chart and you will now only see App Engine costs per day.
Now what about per-GAE Application viewing? For this, we will add a Filter control. Select it from the menu bar (all the way to the right) and put it in the bottom right of the canvas.
Select the Filter control and unselect Show Values from the bar on the right and change the Dimension from product to labels.key. In addition, in the Style tab, select Expandable.
It should look something like this at this point:
At this point, you may notice that you have been in the Edit view the whole time. We are going to switch to the non-Edit view to see how this control works. In the top right, select View.
If you click on the Expandable Filter Control in the bottom right you can select/de-select your Flexible Environment GAE Applications.
But you will quickly notice a problem: it is changing all the graphs, not just the bottom-right one. To fix this, we need to set the control to *only* control the bottom-right graph. To do so, head back to the Edit view.
Select both the Filter control and the bottom right Area chart. Go to the Arrange menu in the menu bar and select Group.
Now head back to the View mode and notice that it now functions as expected *only* controlling the bottom-right Area Chart.
You will need to Ungroup/Group if you want to make any style changes to the Area chart, which is slightly annoying. Make your style choices up front to prevent this annoyance.
You can pretty easily, quickly, and cheaply go from the Detailed transaction view in your GCP bill to something more useful and valuable for you personally or for your business. With a couple of clicks and a couple of pennies per month, you can create a report that folks can view. I hope this was interesting, insightful, and/or helpful; feel free to let me know on Twitter.
<opinionated commentary on billing in the public cloud>
It is absolutely appalling to me how little time is spent by Public Cloud providers on their billing interfaces, visibility, and usability of the data. It is obviously not their core product, and there is clearly a perverse incentive (think: evil), but it would be a bit more moral to provide better visibility into where your money is going, how, and why.
When you buy a physical server, there is a clear cost. There is a clear cost for shipping, racking/stacking, and the equipment that it plugs into. In addition, the power, HVAC, et cetera also send you a bill.
AWS Cost Explorer launched in ~2014 IIRC after years of nothing useful and unfortunately GCP has continued the trend with no initial cost exploring tools.
You could set up a separate billing agreement and account for every project, but that doesn't make sense in a microservice architecture where you have hundreds of node.js and mongo instances everywhere. And where a single person (or team) owns several services that make up a single stack and wants to view them in one place, per-project billing becomes near impossible.
</opinionated commentary on billing in the public cloud>