Hyperparameter tuning BigQuery ML

BigQuery ML can use Vertex AI to tune common model parameters

What is BigQuery ML?

BigQuery ML allows you to quickly train ML models on data in BigQuery. For example, suppose you want to train a linear ML model to predict the duration of a bicycle rental, you can do that using:

CREATE OR REPLACE MODEL ch09eu.bicycle_model_linear
OPTIONS(
model_type='linear_reg', input_label_cols=['duration']
)
AS
SELECT
start_station_name,
CAST(EXTRACT(DAYOFWEEK FROM start_date) AS STRING) AS dayofweek,
CAST(EXTRACT HOUR FROM start_date) AS STRING) AS hourofday,
duration
FROM `bigquery-public-data.london_bicycles.cycle_hire`

BigQuery ML also supports feature engineering that will be automatically repeated during inference. A slightly more sophisticated deep neural network model might be:

CREATE OR REPLACE MODEL ch09eu.bicycle_model_dnn
TRANSFORM(
start_station_name,
CAST(EXTRACT(DAYOFWEEK FROM start_date) AS STRING) AS dayofweek,
ML.BUCKETIZE(EXTRACT(HOUR FROM start_date), [0, 6, 12, 18, 24]) AS hourofday,
duration
)

OPTIONS(
model_type='dnn_regressor', input_label_cols=['duration'],
hidden_units=[32, 8],
learn_rate=0.1,
dropout=0.25,
)
AS
SELECT
start_station_name, start_date, duration
FROM `bigquery-public-data.london_bicycles.cycle_hire`

Hyperparameter tuning

But what if you want to experiment with different different learning rates? This is called hyperparameter tuning and you can specify ranges or candidate values for the model parameters:

CREATE OR REPLACE MODEL ch09eu.bicycle_model_dnn_hparam
TRANSFORM(
start_station_name,
CAST(EXTRACT(DAYOFWEEK FROM start_date) AS STRING) AS dayofweek,
ML.BUCKETIZE(EXTRACT(HOUR FROM start_date), [0, 6, 12, 18, 24]) AS hourofday,
duration
)
OPTIONS(
model_type='dnn_regressor', input_label_cols=['duration'],
num_trials=10,
hidden_units=hparam_candidates([struct([32,8]), struct([32]), struct([64,32,4])]),
learn_rate=hparam_range(0, 0.5),
dropout=hparam_candidates([0, 0.1, 0.25, 0.4])

)
AS
SELECT
start_station_name, start_date, duration
FROM `bigquery-public-data.london_bicycles.cycle_hire`

BigQuery ML will use Vertex AI’s hyperparameter tuning service to choose the best parameters within the budget (here 10 trials) that you specify.

Note that each model type only allows a few of the parameters to be tuned. For example, for DNN models, you can tune the batch_size, dropout, learn_rate, optimizer, etc.

What if you want to change the bucket intervals? Or try with and without individual features? In that case, you should follow the approach in this article and build a Python scaffold. Such a Python scaffold will give you a lot of flexibility.

For tuning basic model parameters, however, use the built-in integration to Vertex AI.

Invoking best model

Once the training is complete, you can look up the evaluation metrics for each of the trials:

SELECT * FROM ML.EVALUATE(MODEL ch09eu.bicycle_model_dnn_hparam) 
ORDER BY mean_absolute_error ASC

and invoke the best model using:

SELECT * FROM ML.PREDICT(MODEL ch09eu.bicycle_model_dnn_hparam,
(
SELECT
CURRENT_TIMESTAMP() AS start_date,
'Waterloo Station 1, Waterloo' AS start_station_name
)
)

--

--

--

A collection of technical articles and blogs published or curated by Google Cloud Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.

Recommended from Medium

Build an Image Duplicate Finder System: A Guide

A seamless way of tracking ML experiments: Weights & Biases

Machine Learning. Linear Models. Part 1.

Stemming In Natural Language Processing NLP — Data Science

Stemming In Natural Language Processing NLP — Data Science

Unravel Policy Gradients and REINFORCE

CNN Feature Extraction: Between ReLU, Tanh and Sigmoid.

Image Classification with EfficientNet: Better performance with computational efficiency

Deep Learning with OpenCV: Scaling Up Model Serving

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Lak Lakshmanan

Lak Lakshmanan

Operating Executive at a technology investment firm; articles are personal observations and not investment advice.

More from Medium

Using Explainable AI in BigQuery ML

End-to-End BigQuery Machine Learning

Building a Demand Forecasting Model with Google Cloud Vertex AI and BigQuery ML

BigQuery ML models deployment with Vertex AI and Kubeflow