❄️Snowflake in a Nutshell — LLMs and Generative AI🤖

❄️Snowflake in a Nutshell — LLMs and Generative AI by Dan Galavan

Since the launch of ChatGPT on 30th November 2022, the industry has been awash with Generative AI activity. Things are happening so fast that if we don’t check Twitter at least a few times a day, we are sure to miss one big development or another.

The purpose of this post is to take a step back and give you a bird’s-eye view of LLMs and Generative AI in the context of the Snowflake Data Cloud.

But first things first - definitions:

LLM : “Large Language Models (LLMs) are artificial intelligence tools that can read, summarize and translate texts and predict future words in a sentence letting them generate sentences similar to how humans talk and write.” — Professor Shobita Parthasarathy, University of Michigan

Generative AI : “Generative artificial intelligence (AI) describes algorithms (such as ChatGPT) that can be used to create new content, including audio, code, images, text, simulations, and videos” — McKinsey & Co

In other words, LLMs specialize in text, while Generative AI is a broader category that also covers other media such as audio, images, and video.

Relevant Snowflake acquisitions

Snowflake acquisition of Neeva. Credit: Neeva

There have been a number of Snowflake acquisitions over the last year or so which are relevant to the Generative AI conversation:

  • Neeva, acquired on May 24th, 2023, specializes in Generative AI search at scale. While information is scant at present, it is clear that Generative AI and Snowflake are moving closer together. This could apply to querying data, deriving insights, or searching the Snowflake Marketplace in a prompt-driven manner.
  • Applica is an AI platform for unstructured data understanding (e.g. documents, emails, web pages, images). Applica includes something called a TILT (Text-Image-Layout-Transformer) model, which is an LLM for document intelligence. This model is being integrated with Snowflake as we speak.
  • Streamlit can be used as an interactive front end for LLMs. The high-profile GPTZero service, which aims to detect whether text was written by a human or generated by AI, uses Streamlit.

The Snowflake Summit sessions

Snowflake Summit 2023

With a plethora of Generative AI themed sessions at this year’s Snowflake Summit in Las Vegas (26th to 29th June), there is plenty to choose from, for example:

  • Generative AI’s Impact on Data Innovation in the Enterprise
  • Turbocharging enterprise value with Generative AI
  • Unleashing the Power of Large Language Models with Snowflake
  • AI Shaping the Future: Navigating Ethics, Collaboration, and Inclusion
  • GPTZero: Idea to Iteration Powered by Streamlit and Snowflake
  • Running Open-Source LLMs on Snowflake

Using and extending current Snowflake functionality

Integrating Generative AI with Snowflake’s External Functions. Credit: Dash Desai

In a recent blog, Torsten Grabs, Snowflake Senior Director of Product Management, discusses current Snowflake components along with what’s in the pipeline, for example:

  • Integrations with web-hosted LLM APIs using Snowflake external functions. External functions call executable code that is developed, maintained, stored, and executed outside of Snowflake
  • Code autocomplete, text-to-code, and text-to-visualisations for both SQL and Python
  • LLM-powered search experiences across all of Snowflake, from documentation to Snowflake Marketplace
  • As mentioned earlier, Streamlit as an interactive front end for LLM-powered apps
  • Protecting sensitive or proprietary data, such as source code, PII, internal documents, wikis, and code bases, using Snowflake Governance functionality
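
To make the external function integration a little more concrete, here is a minimal Python sketch of the code that could sit on the remote side. It relies on the batch format Snowflake external functions use: a JSON body with a "data" array in which each row starts with its row number, and a response of the same shape carrying one result per row. The names handle_batch and fake_llm are hypothetical, and fake_llm is only a placeholder for a real web-hosted LLM API call, so treat this as an illustration rather than a reference implementation.

```python
import json
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a web-hosted LLM API."""
    return f"[summary of {len(prompt)} characters]"


def handle_batch(request_body: str, llm: Callable[[str], str] = fake_llm) -> str:
    """Process one batch of rows sent by a Snowflake external function.

    Each incoming row is [row_number, argument]; each outgoing row is
    [row_number, result], so Snowflake can match results back to rows.
    """
    rows = json.loads(request_body)["data"]
    results = []
    for row in rows:
        row_number, prompt = row[0], row[1]
        results.append([row_number, llm(str(prompt))])
    return json.dumps({"data": results})


if __name__ == "__main__":
    sample = json.dumps({"data": [[0, "Hello Snowflake"], [1, "Generative AI"]]})
    print(handle_batch(sample))
```

In practice this handler would run behind an HTTPS endpoint that Snowflake reaches through an API integration, and the placeholder would be swapped for a real LLM client. The governance point above still applies: whatever is passed as the argument leaves Snowflake.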

Last but not least 🎤🧑‍🏫….

…at this year’s Snowflake Summit, I’ll be discussing the following during a live stream:

  • Generative AI benefits & tips in the context of Snowflake
  • Data governance considerations
  • What this means for the future of data professionals such as the data modeler

This follows on from my recent presentation on Generative AI augmented Data Modeling at the Knowledge Gap Data Modeling and Data Architecture conference.

Title slide on Generative AI augmented Data Modeling from my presentation at the Knowledge Gap conference 2023 (credit: Dan Galavan)

I’m looking forward to the live stream discussion in Las Vegas with Snowflake community manager Howard Lio, taking place on Thursday 29th June at 9:10 PT, 17:10 Irish Standard Time / BST.

Conclusion

We have just had a bird’s-eye view of LLMs and Generative AI in the context of the Snowflake Data Cloud.

From acquisitions such as Neeva, to LLM integrations using Snowflake external functions, to the plethora of AI-related sessions at this year’s Snowflake Summit, Generative AI and Snowflake are becoming more intertwined by the day.

Hopefully this blog has given you a chance to take a breather and to catch up.

And of course, do tune in to my Snowflake Summit Generative AI live stream discussion on Thursday 29th June at 9:10 PT, 17:10 Irish Standard Time / BST (link to follow shortly)😊

© 2023 Dan Galavan, galavan.com


Dan Galavan
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

Independent Data Architecture Consultant, 24 years experience. Snowflake DSH (1 of 72 worldwide). www.galavan.com. Author: Snowflake in a Nutshell series ©