Machine Learning in Google Cloud with BigQuery

BigQuery ML : Machine learning with Standard SQL

Abonia Sojasingarayar
IBM Data Science in Practice
5 min readJun 25, 2022

--

BigQuery ML in Google Cloud

BigQuery ML lets you create and execute machine learning models in igQuery using standard SQL queries. BigQuery ML democratizes machine learning by letting SQL practitioners build models using existing SQL tools and skills. BigQuery ML increases development speed by eliminating the need to move data.

If you are new/want to explore GCP for first time, then use google sandbox account which is free and login with your account information and can access GCP toolkit with limited resource.

I would like to dive directly into the model development in BigQuery ML , as we discuss the basic points in the previous article on Quickstarts : BigQuery and BigQuery ML.

In this article we will be discussing:

  1. Steps to Create Models using Google BigQuery ML
  2. Sample models in BigQuery ML
  3. Benefits of BigQuery ML

1. Steps to Create Models using Google BigQuery ML :

So as we do in traditional ML , we can do all the below steps in BigQuery ML:

  • Create and Explore the dataset
  • Create/Train the model
  • Evaluate the model
  • Predict using the model
Ref: Image from Online

1.1 Create Dataset

Create the dataset in cloud console in order to use it for machine learning.I have explained this part in my previous article or more info can be found here.

1.2 Create / Train Model

As shown below write the query starts with ‘CREATE MODEL’ and in ‘OPTIONS’ we can set up the parameters of model such as model_type,learning rate etc.

Create the model in BigQuery console

1.3 Evaluate Model

Use ‘ML.EVALUATE’ function to provide statistics about model to evaluate the model,the below example shows the query returns the error matrics such as mean-absolute-error, mean-squared-error etc

Evaluate the model in BigQuery console

1.4 Predict using Model

Use ‘ML.PREDICT’ to predict on given data from the table in BigQuery.

Predict with model in BigQuery console

BigQuery ML Official interactive tutorial Documentation

2. Sample models in BigQuery ML :

As mentioned in my last article we can solve Regression, Classification, clustering, time series problems with BigQuery ML

Here we will see linear regression and logistic regression with the code snippet/query sample.

2.1 Linear regression

The CREATE MODEL clause is used to create and train the model named bike_model.model

The OPTIONS(model_type='linear_reg', input_label_cols=['duration']) clause indicates that you are creating a linear regression model. A linear regression is a type of regression model that generates a continuous value from a linear combination of input features.

Sample Query : Create and train linear regression model in BQ

Below query will be used to predict with the model that we trained in the above step.

Use below query to evaluate your model:

BigQuery ML Linear Regression official documentation for more info.

2.2 Logistic regression

You can create and train a logistic regression model using the CREATE MODEL statement with the option 'LOGISTIC_REG'. The following query uses a CREATE MODEL statement to train a new logistic regression model

Sample Query : Create and train logistic regression model in BQ

Now the following code will return the evaluation metrics for the model. The metrics are “precision, “recall”, “accuracy”, “f1_score”, “log_loss”, “roc_auc”. The code is given below.

Use below query to predict using the final model:

BigQuery ML Logistic Regression official documentation for more info.

3. Benefits of Google BigQuery ML :

Google BigQuery ML is fast to create, evaluate and execute various Machine Learning models easily using standard SQL queries. It offers many benefits over other Cloud Data Warehouses. Some of the benefits are listed below.

  • You need not to know Python or any other language for managing Machine Learning models. Analyst should be expertise in SQL can train models and make predictions.
  • The data export involves many steps, and it’s a time-consuming process. Google BigQuery ML saves time and resources by letting users use Machine Learning models in Google BigQuery.
  • It allows users to run Machine Learning models on large datasets within minutes as it uses computation resources of Google BigQuery Data Warehouse.
  • It features some automated Machine Learning models that reduce the workload to manipulate data manually. It saves time and allows users to quickly train and test models on the dataset

Conclusion :

So in this article we discussed about how can we do machine learning in BigQuery using standard SQL. it can be easily integrated in the SQL query. It eliminates the restriction Data Analysts face for using Machine Learning models in their analysis. Companies can deploy models directly on their datasets to make better business decisions.We also discuss about the regression and classification problem and how can we train, predict and evaluate the model in BQ ML.With BigQuery ML we can also train machine learning model for time series , clusttering, Recommentation Engine.

Thanks for Reading

Connect with me on Linkedin

Find me on Github

Visit my technical channel on Youtube

Support: Buy me a Cofee/Chai

If you find this article helpful, please give your support by a clap and stay tuned to read more interesting articles in datasciene and machine learning discipline. See you all soon in my next post.

--

--

Abonia Sojasingarayar
IBM Data Science in Practice

Principal Research Scientist | Machine Learning & Ops Engineer | Data Scientist | NLP Engineer | Computer Vision Engineer | AI Analyst