Table functions — A hidden gem in Google’s BigQuery

Christian Silva
Google Cloud - Community
3 min readMar 7, 2023

One of the multiple benefits of moving to the cloud is simplification.

The Challenge

Recently, I’ve been working on some use cases in BigQuery, where a specific set of rules and logic must be executed before the query execution for validation purposes. The complete list of potential requirements is as follows:

  1. The end user must pass input parameters to filter the query results. These validation logics should execute before any WHERE clauses or SQL predicates.
  2. The results of this query need to be JOINED or used as CTE tables to support complex query logic from the end user.
  3. The execution must support any 3rd party tool visualisation tools using Standard SQL.
  4. Users won’t have access to the underlying data where the filter logic is executed for security purposes.

Based on the above, my data and software engineering hats came into place with solutions such as creating an app to handle the requests, developing stored procedures, or even moving to a fully managed API approach to abstract all the logic.

Even though these solutions will meet the initial requirements (somehow), I was missing the point that I started this blog with — simplicity.

Table Functions to the rescue!

A Table Function is a user-defined function that behaves similarly to views, with the main difference being that a table function can take parameters! The result is a table, meaning you can invoke it within your SQL queries or via any 3rd party app.

Table functions flow

And by configuring this routine as an Authorised Function, you can share the query results with particular users or groups without giving those users or group access to the underlying tables.

Creating a Table Function

In this example, we will create a table function based on the public Google Trends Public dataset. In our use case, we want the end user to send the refreshing day as a parameter to ensure that the query will retrieve the top Google Search terms from 1 day only to prevent users from retrieving months of data and avoid unnecessary table scans:

CREATE OR REPLACE TABLE FUNCTION mydataset.google_trends_tf(refreshDate DATE) 
AS
(
SELECT
refresh_date AS Day,
term AS Top_Term, -- These search terms are in the top 25 in the US each day.
rank,
FROM `bigquery-public-data.google_trends.top_terms`
WHERE rank = 1 -- Choose only the top term each day.
AND refresh_date = refreshDate -- Filter on the parameter day
GROUP BY Day, Top_Term, rank
ORDER BY Day DESC
);

Notice that this Function received one parameter of type DATE, but you can specify more parameters of different data types as you see fit.

You can now call your Table Function as you would query any other table with the extra that you need to pass the parameters at the end of the query.

SELECT * FROM sample.google_trends_tf('2023-02-05');

And because the result that you get from your Function is a Table. You can use it as part of your JOINS or CTE tables:

WITH tmp_trend AS (
SELECT * FROM sample.google_trends_tf('2023-02-05')
)
SELECT *
FROM `bigquery-public-data.google_trends.top_rising_terms` AS r
JOIN tmp_trend t ON r.refresh_date = t.day
order by week desc;

Finally, you can Authorise your Routine to prevent users from accessing the underlying data of your Table Function.

Happy Coding!

--

--

Christian Silva
Google Cloud - Community

Customer Engineer @Google Cloud & Co-Host @The Data Coffee Break Podcast. I love helping people and organisations to access and get insights from their data!