Snowflake’s ML Functions just got easier (and better!)

Hello, Snowflake community!

We are thrilled to share improvements to our Snowflake Machine Learning Functions. These enhancements include:

  • Easier data preparation for the Classification function: You can now use timestamps and more diverse categorical data as inputs to the Classification function.
  • Easier data referencing: You can now use TABLE instead of that pesky SYSTEM$REFERENCE.
  • Easier results handling: No more SQLID or result_scan. You can now store results from the Forecasting and Anomaly Detection functions much more easily.
  • Better results: Forecast quality should improve with the algorithm upgrade we released this spring, particularly when your data contains trends.
  • Easier interpretations: You can now use show_evaluation_metrics with the Anomaly Detection function to evaluate your models. You can also use show_evaluation_metrics to assess your model relative to data that arrived after training completed.

Let’s dive in.

Easier data preparation 🏗

You can now train your Classification models on a broader set of input types: timestamps (specifically, timestamp_ntz for now) and more diverse categorical data. By “more diverse categorical data,” we mean text that has a large but finite number of values, such as fruits or job titles.

With these two types of data available as inputs to your Classification models — for training or prediction — the world is your oyster! The only thing we don’t handle yet is long text fields…we’re working on it!
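As a rough sketch of what this can look like, the example below trains a Classification model on a table that mixes a timestamp_ntz column with a high-cardinality text column. The object names (job_applications, new_applications, hiring_model) and the column names are hypothetical and chosen only for illustration. The tables are assumed to already exist and be populated.

```sql
-- Assumed (hypothetical) columns in job_applications:
--   applied_at TIMESTAMP_NTZ  -- timestamp feature
--   job_title  VARCHAR        -- categorical text with many distinct values
--   years_exp  NUMBER         -- numeric feature
--   outcome    VARCHAR        -- label to predict

CREATE OR REPLACE SNOWFLAKE.ML.CLASSIFICATION hiring_model(
  INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'job_applications'),
  TARGET_COLNAME => 'outcome'
);

-- Score new rows; new_applications is assumed to have the same
-- feature columns as the training table, minus the label.
SELECT
  applied_at,
  job_title,
  hiring_model!PREDICT(INPUT_DATA => OBJECT_CONSTRUCT(*)) AS prediction
FROM new_applications;
```

The timestamp and text columns go in as they are, with no manual encoding step beforehand.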

Easier data referencing 📖

You can now use the TABLE keyword to get a reference to a table, view, secure view, or query, instead of using SYSTEM$REFERENCE or SYSTEM$QUERY_REFERENCE. For example, instead of writing:

CREATE SNOWFLAKE.ML.FORECAST my_model(
INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'my_data'),
…);

you can use the TABLE keyword:

CREATE SNOWFLAKE.ML.FORECAST my_model(
INPUT_DATA => TABLE('my_data'),
…);

Easier results handling 📈

You can now call the Forecast and Detect Anomalies ML Functions directly in the FROM clause of a SELECT statement.

You can use this to simplify how you save results to a table. For example, rather than using the SQLID variable with result_scan to create a table containing these results:

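BEGIN
  -- Generate the forecast, then capture the query ID of that call.
  CALL model!FORECAST(FORECASTING_PERIODS => 7);
  LET x := SQLID;
  -- Read the forecast back by query ID and save it to a table.
  CREATE TABLE my_forecasts AS
    SELECT * FROM TABLE(RESULT_SCAN(:x));
END;
SELECT * FROM my_forecasts;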

you can use a query that directly selects from the results of calling the methods:

CREATE TABLE my_forecasts AS
  SELECT * FROM TABLE(model!FORECAST(FORECASTING_PERIODS => 7));

We hope this makes your life easier — and solves any concern you might have about whether your results are getting stored to the right table!
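The same pattern should apply to Anomaly Detection. The sketch below assumes a trained model named anomaly_model and a table new_readings with columns ts and reading; all of these names are hypothetical. The selected output columns (ts, y, forecast, is_anomaly) are our understanding of what DETECT_ANOMALIES returns, so check them against your own results.

```sql
-- Hypothetical names: anomaly_model, new_readings, ts, reading.
-- Keep only the rows the model flagged as anomalous.
CREATE TABLE my_anomalies AS
  SELECT ts, y, forecast, is_anomaly
  FROM TABLE(anomaly_model!DETECT_ANOMALIES(
    INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'new_readings'),
    TIMESTAMP_COLNAME => 'ts',
    TARGET_COLNAME => 'reading'
  ))
  WHERE is_anomaly;
```

Because the call sits in the FROM clause, you can filter, join, or project the results in the same statement. The TABLE keyword described earlier could be used here in place of SYSTEM$REFERENCE.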

Note: This change applies not only to prediction, but also to other methods such as show_evaluation_metrics.

Better results 📊

With no changes from your side, you will now benefit from improved forecasts, thanks to algorithm updates we’ve made. This improvement primarily enhances the algorithm’s ability to forecast trends in your data.

Easier interpretations 👓

You can now use show_evaluation_metrics to evaluate your Anomaly Detection models. You can also use it to assess a model against data that arrived after training completed.

These changes help you answer the questions, “How good is my model, now that it’s trained?” and “How good are my model’s predictions relative to actual observations for the same timestamps?”
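Here is a minimal sketch of both uses. The model name my_model and the table recent_actuals (with columns ts and y) are hypothetical. The argument names in the second call are assumed to mirror the ones used at training time, so treat them as an assumption rather than a confirmed signature.

```sql
-- "How good is my model, now that it's trained?"
-- With no arguments, this returns the metrics computed during training.
CALL my_model!SHOW_EVALUATION_METRICS();

-- "How good are my predictions relative to actual observations?"
-- Assumed arguments: the newer data plus its timestamp and target columns.
CALL my_model!SHOW_EVALUATION_METRICS(
  INPUT_DATA => SYSTEM$REFERENCE('TABLE', 'recent_actuals'),
  TIMESTAMP_COLNAME => 'ts',
  TARGET_COLNAME => 'y'
);
```

Comparing the two result sets is one way to spot a model whose accuracy has drifted since it was trained.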

Conclusion

These changes are a testament to our ongoing commitment to improving and refining our Machine Learning Functions — to better serve your teams. We are still in the early innings of this work.

We can’t wait to see what you’ll build with these new capabilities. As always, we welcome your feedback and look forward to continuing to serve you with the best in cloud-based data warehousing and analytics.

Stay tuned for more exciting updates. ✨
