BigQuery and Its Big Benefit

Inne Prinusantari
Published in GITS Apps Insight
4 min read, Jun 24, 2019
Source: Google Images

BigQuery is one of Google Cloud's data analytics products. As the Google Cloud Platform (GCP) site puts it, with BigQuery you can focus on streaming and processing your data without having to think about IT infrastructure or database administration. Just run one query and voilà, you get the data you want. Unfortunately, to use this product you have to be on the Blaze plan; in short, it is not free. You must sign up for GCP, but do not worry: new users get $300 in free credits, valid for 12 months. Big thanks to Google. :)

Why Use BigQuery

Still wondering when to use BigQuery? If you only need to run simple queries that process less than 1 TB of data, using BigQuery would be overkill. There are many reasons to use Google BigQuery, but I will point out three of them:

  1. You can get results from complex queries in just a few seconds. In my experience, querying many rows and tables to generate a single report for a Hospital Management Information System could take a very long time. To work around it, we recommended that users upgrade the server and run the report at midnight to prevent app crashes and server timeouts. I know that is not a good solution. Compared with that, BigQuery was a drastic change. That is why BigQuery is part of Google's data analytics products.
  2. You can use datasets from BigQuery's list of public datasets, such as weather data, hospital data, or data from Stack Overflow and GitHub. The good thing is that you can combine the public datasets with your own data from Cloud Storage, or store and stream your data directly in BigQuery.
  3. It can process data from other services or clients such as Firebase and/or Fluentd (a log collector). If you have a bunch of event logs from your Firebase project or Fluentd, you can link them to BigQuery. For example, if you need to trace every user movement inside your Android app, such as which feature they view the most, you can log it with Google Analytics for Firebase. By analyzing that log data with BigQuery, we can learn which feature of our app is their favorite, which feature should be improved, and so on. This is useful for business intelligence and/or data science.
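To make the third point a bit more concrete, here is a minimal local sketch of the kind of aggregation such an analysis boils down to: counting how often each feature is viewed. The event records and their field names (`event_name`, `screen`) are made up for illustration and are not the actual Firebase export schema. In BigQuery, the same idea would be a grouped count over the exported event data.

```python
from collections import Counter

# Hypothetical event log; the field names are illustrative only
# and do not reflect the real Firebase export schema.
EVENTS = [
    {"user": "u1", "event_name": "screen_view", "screen": "search"},
    {"user": "u1", "event_name": "screen_view", "screen": "cart"},
    {"user": "u2", "event_name": "screen_view", "screen": "search"},
    {"user": "u2", "event_name": "purchase", "screen": "cart"},
    {"user": "u3", "event_name": "screen_view", "screen": "search"},
]


def rank_features(events):
    """Count screen_view events per screen, most viewed first."""
    views = Counter(
        e["screen"] for e in events if e["event_name"] == "screen_view"
    )
    return views.most_common()


print(rank_features(EVENTS))
```

The most viewed screen comes first in the returned list, which is roughly the "favorite feature" question from the example above.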

How to Try BigQuery with Public Datasets

Using the BigQuery web UI, we will query data from the public datasets provided by BigQuery.

  • First, go to the Google Cloud Platform console and look for the BigQuery menu in the Big Data section.
GCP List of Menu
  • Next, in the BigQuery UI, click Add Data and choose Explore public datasets.
BigQuery: Explore Public Datasets
  • You will see the list of available public datasets from many sources, such as the National Oceanic and Atmospheric Administration (NOAA), The World Bank, GitHub, Stack Overflow, and Google itself. You can filter them by category.
Marketplace Datasets: List of available datasets
  • For example, let’s choose the GitHub Activity Data public dataset and click the View Datasets button.
BigQuery: GitHub Activity Data

You will be redirected to the UI with the github_repos dataset selected.

BigQuery: github_repos dataset
  • As you can see, the dataset contains nine tables. You can start by running your own query in the Query Editor, or try the sample queries that come with the GitHub dataset; three are provided. Let’s run the ‘What are the most commonly used Java packages?’ sample query.
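To get a feel for what a query like ‘What are the most commonly used Java packages?’ has to do, here is a small local sketch in Python. It is not the sample query itself (that one is written in SQL and runs inside BigQuery); it only mimics the idea on a few made-up Java snippets: find the import lines, pull out the package name, and count. The regular expression, the sample files, and the choice to count each package at most once per file are all assumptions of this sketch.

```python
import re
from collections import Counter

# Captures the lowercase, dot-separated package prefix of a Java import,
# e.g. "java.util" from "import java.util.List;".
IMPORT_RE = re.compile(
    r"^[ \t]*import\s+(?:static\s+)?([a-z0-9_]+(?:\.[a-z0-9_]+)*)\.",
    re.MULTILINE,
)

# Made-up file contents standing in for rows of a source-code table.
SAMPLE_FILES = [
    "package demo;\nimport java.util.List;\nimport java.util.Map;\nimport java.io.File;\n",
    "import java.util.ArrayList;\nimport org.junit.Test;\n",
    "import static org.junit.Assert.assertEquals;\nimport java.util.Set;\n",
]


def top_packages(files, limit=3):
    """Return the most imported packages, counting each once per file."""
    counts = Counter()
    for source in files:
        # A set ensures a package imported twice in one file counts once.
        counts.update({m.group(1) for m in IMPORT_RE.finditer(source)})
    # Sort by count (descending), then by name, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


for package, count in top_packages(SAMPLE_FILES):
    print(package, count)
```

The point of BigQuery is that it does this kind of extract, group, and count over a very large number of files in seconds, which would be impractical on a single machine.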
BigQuery: JSON Result from the query sample
  • After the query has executed successfully, you can view the result as rows or JSON, or save it as a CSV file. The job information tab shows the query processing time and, importantly, the bytes billed for that query, so you can estimate the cost and choose the right pricing plan.
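If you want to turn the bytes figure into a rough cost, the arithmetic can be sketched as below. The rate of 5.0 USD per TiB is only an assumed placeholder, so check the current BigQuery pricing page before relying on it. The second function is an untested sketch of how a dry run with the google-cloud-bigquery Python client could report the bytes a query would process without running it; it needs that package and GCP credentials, so it is only defined here, not called.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def estimate_cost_usd(bytes_billed, usd_per_tib=5.0):
    """Rough on-demand cost; usd_per_tib is an assumed rate, not official."""
    return bytes_billed / TIB * usd_per_tib


def dry_run_bytes(sql):
    """Ask BigQuery how many bytes a query would process, without running it.

    Assumes the google-cloud-bigquery package is installed and credentials
    are configured; imported lazily so the rest of the sketch runs without it.
    """
    from google.cloud import bigquery

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


# Example: a query that bills 250 GiB at the assumed rate.
print(f"{estimate_cost_usd(250 * 2 ** 30):.2f} USD")
```

Keeping the rate as a parameter makes it easy to plug in whatever price applies to your region and plan.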

That is it! You can try writing your own queries while exploring the other github_repos tables. You can also try other public datasets and their sample queries. Happy exploring and querying!

Where to Learn

If you want to learn more about BigQuery, there are many articles and tutorials worth reading. Here are some of them:

  1. The official BigQuery documentation from GCP, which is quite complete.
  2. Medium article “What Google BigQuery is and isn’t?”.
  3. Try BigQuery with Node.js from Google Codelabs.
  4. Tutorial: How to process your Google Analytics data with BigQuery.

Inne Prinusantari, cloud explorer and non-stop learner on how to humanize human by becoming a product owner.
