How to Train Visual AI Models & Process Images Stored in Snowflake using LandingLens

Leverage LandingAI’s visual AI solutions to process images and videos at scale, all within the secure, governed boundary of the Snowflake Data Cloud.

Yong (David) Park

Published in Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

7 min read · Jun 7, 2024

LandingAI + Snowflake Partnership Announcement

Introduction — Who is LandingAI?

LandingAI, led by globally recognized AI leader Dr. Andrew Ng, builds cutting-edge vision products. LandingAI is at the forefront of creating domain-specific large vision models (LVMs) and applying large multi-modal models (LMMs) to practical vision tasks. Visual Prompting technology takes the ideas of text prompting and adapts them to vision, allowing vision models to be built in seconds rather than hours. Moreover, LandingAI’s proprietary semantic vision layer enables businesses of all sizes to apply machine learning with minimal labeled data, turning images and videos into actionable insights. With user-friendly tools, encapsulated in our LandingLens platform, LandingAI makes visual AI accessible to developers and vision practitioners worldwide.

Business Problem Statement & Solution

Cameron Wasilewsky, one of the co-writers, has seen many Snowflake customers facing challenges with complex vision-based classification and prediction problems, such as defect identification and predictive maintenance. On the flip side, at LandingAI, I often worked with customers who struggled to store and manage their images, labels, metadata, and inference results within their machine learning operations (MLOps) process while adhering to corporate data governance policies, a combination that often led to data silos and technical red tape.

These are some of the main reasons our customers are excited about the LandingAI and Snowflake partnership. Now, customers like Opto22, a leading provider of industrial automation and control solutions that manufactures all of its products in the USA, can process images to train a deep learning-based visual model, iterate to improve the model’s performance, and deploy it into production, all within their Snowflake ecosystem, which keeps the workflow easy to use and the data secure.

Opto22 faced the challenge of manually inspecting the conformal coating of printed circuit boards (PCBs) during manufacturing. Given the high-mix, low-volume nature of their product portfolio, automating quality inspection proved difficult. Working with one of our hardware partners, Seedit Vision, we installed an image acquisition system that captures images during the UV treatment of the PCBs, then used those images to train and deploy a high-performing model (100% precision & recall) that determines whether the coating is undersprayed, oversprayed, or sufficient. LandingLens automatically applies best practices such as data augmentation, train/dev/test data splits, regularization, early stopping, transfer learning, and other methods to mitigate overfitting. With this effort, Opto22 was able to fully automate their quality inspection process, freeing their engineer from inspecting boards manually one by one.

In addition, now that the model has been deployed, they can set up an inference pipeline with a full feedback loop to ingest inference results and incoming images back into Snowflake for real-time monitoring and analytics using Streamlit. This allows for continuous model improvement over time. Opto22 is planning on expanding to other vision use cases throughout their manufacturing sites.

LandingAI + Snowflake Partnership & Integration

LandingLens, LandingAI’s Snowflake Native App Architecture

LandingAI announced the most sophisticated Native App in the Snowflake Marketplace during the Snowflake Data Cloud Summit 24: it is the first (Visual AI) Native App that is fully hosted on Snowpark Container Services. Snowflake became an investor in LandingAI earlier this year to deepen the partnership and boost Visual AI in the Data Cloud.

Customers will now be able to leverage powerful Snowflake compute with configurable GPU resources to train deep learning and visual AI models directly in their Snowflake environment using LandingAI’s flagship product, LandingLens, with no data ever leaving the platform.

Let’s walk through a typical visual AI workflow.

LandingLens, a Snowflake Native App — A Step-by-Step Guide

Step 1: Visit the Snowflake Marketplace to download and launch LandingLens

Step 2: Determine the project type: classification, object detection, or segmentation.

Classification models classify images in their entirety, e.g., separating defective parts from good parts in manufacturing
Object detection models identify objects of interest, e.g., defects on a PCB or a car on a highway
— They output the general size and location of each object of interest as a bounding box with its coordinates
Segmentation models are similar to object detection models but much more pixel-precise and sensitive
— They output pixel “masks” for each class, which suits use cases that require higher precision, e.g., detecting scratches on phone screens or paint defects on cars
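
To make the difference between the three project types concrete, here is a minimal sketch of what each kind of prediction might look like as data. The class names and fields (`ClassificationResult`, `BoundingBox`, `SegmentationMask`) are hypothetical illustrations, not the actual LandingLens output schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClassificationResult:
    """Classification: one label for the whole image."""
    label: str
    score: float


@dataclass
class BoundingBox:
    """Object detection: a labeled rectangle in pixel coordinates."""
    label: str
    score: float
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        width = max(0, self.x_max - self.x_min)
        height = max(0, self.y_max - self.y_min)
        return width * height


@dataclass
class SegmentationMask:
    """Segmentation: a per-pixel 0/1 mask for one class."""
    label: str
    mask: List[List[int]]  # rows of 0/1 values, same size as the image

    def area(self) -> int:
        return sum(sum(row) for row in self.mask)


# A thin diagonal scratch on a tiny 4x4 image (illustrative values only).
scratch = SegmentationMask(
    label="scratch",
    mask=[
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ],
)
# The tightest bounding box around the same scratch covers the whole image.
box = BoundingBox(label="scratch", score=0.9, x_min=0, y_min=0, x_max=4, y_max=4)

print(box.area())      # 16 pixels fall inside the box
print(scratch.area())  # 4 pixels actually belong to the scratch
```

The gap between the two areas is why thin or irregular defects such as scratches tend to be a better fit for segmentation than for object detection.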

Step 3: Connect the project to a schema to set up a data pipeline and ingest images directly from Snowflake

Run these Snowflake SQL statements to grant the application read permission on your stage:

GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO APPLICATION <THIS APPLICATION NAME>;
GRANT USAGE ON DATABASE YOUR_DB TO APPLICATION <THIS APPLICATION NAME>;
GRANT USAGE ON SCHEMA YOUR_DB.YOUR_SCHEMA TO APPLICATION <THIS APPLICATION NAME>;
GRANT READ ON STAGE YOUR_DB.YOUR_SCHEMA.YOUR_STAGE TO APPLICATION <THIS APPLICATION NAME>;
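
If you set this up for several stages or environments, it may help to generate the four statements instead of editing placeholders by hand. The sketch below is one way to do that. `render_grants` is a hypothetical helper, and the example names (`LANDINGLENS_APP`, `VISION_DB`, `RAW`, `PCB_IMAGES`) are placeholders for your own objects. It only accepts unquoted identifiers, which is an assumption that keeps the check simple.

```python
import re
from typing import List

# Unquoted Snowflake identifiers: start with a letter or underscore,
# then letters, digits, underscores, or dollar signs.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def _check(name: str) -> str:
    """Reject anything that is not a plain unquoted identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain unquoted identifier: {name!r}")
    return name


def render_grants(app_name: str, database: str, schema: str, stage: str) -> List[str]:
    """Return the four GRANT statements from this step with names filled in."""
    app = _check(app_name)
    db = _check(database)
    sch = _check(schema)
    stg = _check(stage)
    return [
        f"GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO APPLICATION {app};",
        f"GRANT USAGE ON DATABASE {db} TO APPLICATION {app};",
        f"GRANT USAGE ON SCHEMA {db}.{sch} TO APPLICATION {app};",
        f"GRANT READ ON STAGE {db}.{sch}.{stg} TO APPLICATION {app};",
    ]


# Placeholder names for illustration only.
for statement in render_grants("LANDINGLENS_APP", "VISION_DB", "RAW", "PCB_IMAGES"):
    print(statement)
```

You can paste the printed statements into a Snowflake worksheet and run them with a role that is allowed to grant these privileges.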

Step 4: Leverage Smart Labeling tools provided by LandingAI to label at least ten images

Step 5: Click on the “Train” button to trigger a model training job — once the job has been triggered, LandingLens will tap into Snowflake GPU compute to train a deep learning-based visual AI model using your data and labels

Step 6: Evaluate the model performance using precision* and recall* metrics as well as the confusion matrix on the right side
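
For readers who want to see how these two metrics fall out of a confusion matrix, here is a minimal sketch. The counts are made up for illustration (they are not Opto22's results), and the convention that rows hold the ground truth and columns hold the predictions is an assumption of this example.

```python
from typing import Dict, List


def precision_recall(matrix: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-class precision and recall.

    Rows are ground-truth classes, columns are predicted classes.
    """
    metrics: Dict[str, Dict[str, float]] = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        predicted_total = sum(row[i] for row in matrix)  # column sum: TP + FP
        actual_total = sum(matrix[i])                    # row sum: TP + FN
        metrics[label] = {
            "precision": true_positives / predicted_total if predicted_total else 0.0,
            "recall": true_positives / actual_total if actual_total else 0.0,
        }
    return metrics


labels = ["undersprayed", "oversprayed", "sufficient"]
# Made-up counts for illustration only.
matrix = [
    [8, 0, 2],   # 10 images that are truly undersprayed
    [0, 9, 1],   # 10 images that are truly oversprayed
    [2, 1, 17],  # 20 images that are truly sufficient
]

for label, scores in precision_recall(matrix, labels).items():
    print(f"{label}: precision={scores['precision']:.2f} recall={scores['recall']:.2f}")
```

A class with high precision but low recall rarely raises false alarms yet misses real cases. The reverse pattern means the model catches most cases but flags too many good parts.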

Step 7: Perform visual error analysis using the confusion matrix. Click into incorrect predictions to filter the data to look at False Positives and False Negatives. This will help you find potential gaps within the model and ways to improve its performance!
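
LandingLens does this filtering for you in the UI. If you export predictions and want to do the same kind of error analysis offline, the idea can be sketched as below. The record layout (`image`, `truth`, `predicted`) and the file names are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def split_errors(records: List[Record], positive: str) -> Tuple[List[Record], List[Record]]:
    """Return (false positives, false negatives) for one class of interest."""
    false_positives = [
        r for r in records if r["predicted"] == positive and r["truth"] != positive
    ]
    false_negatives = [
        r for r in records if r["truth"] == positive and r["predicted"] != positive
    ]
    return false_positives, false_negatives


# Hypothetical records for illustration only.
records = [
    {"image": "pcb_001.png", "truth": "oversprayed", "predicted": "oversprayed"},
    {"image": "pcb_002.png", "truth": "sufficient", "predicted": "oversprayed"},
    {"image": "pcb_003.png", "truth": "oversprayed", "predicted": "sufficient"},
    {"image": "pcb_004.png", "truth": "sufficient", "predicted": "sufficient"},
]

fps, fns = split_errors(records, positive="oversprayed")
print([r["image"] for r in fps])  # ['pcb_002.png']
print([r["image"] for r in fns])  # ['pcb_003.png']
```

Reviewing the images in each list side by side often reveals whether the problem is ambiguous labels, missing examples of a rare defect, or lighting differences.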

Step 8: Deploy the model into production using various deployment options:

Self-Hosted Deployment Option 1, LandingEdge: LandingAI’s edge computing software, a popular option for manufacturing and industrial applications because it offers:
— An edge-first deployment strategy that supports data security and alleviates internet connectivity issues
— Support for high-throughput use cases (30+ inferences per second)
— Integration with industry-leading PLCs (Programmable Logic Controllers) and GenICam-compliant cameras
Self-Hosted Deployment Option 2, LandingAI Docker: package and orchestrate the models in production using orchestration tools such as Kubernetes
Cloud Deployment (REST API): surface the model as an API endpoint to interact with the model programmatically and build an application wrapped around the model
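
To give a feel for what calling a model over REST involves, here is a minimal sketch that builds an image upload request using only the Python standard library. Everything specific in it is an assumption: the URL is a placeholder, and the `Authorization` header and `file` field name are generic choices rather than the documented LandingAI API. Check your deployment page for the real endpoint and authentication details. The Python library listed in the resources below is likely the more convenient route in practice.

```python
import urllib.request
import uuid


def build_inference_request(
    url: str, api_key: str, image_bytes: bytes, filename: str
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST carrying one image."""
    boundary = uuid.uuid4().hex
    body = b"".join(
        [
            f"--{boundary}\r\n".encode(),
            # "file" is an assumed form field name.
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
            b"Content-Type: application/octet-stream\r\n\r\n",
            image_bytes,
            f"\r\n--{boundary}--\r\n".encode(),
        ]
    )
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder URL, key, and image bytes for illustration only.
request = build_inference_request(
    "https://example.com/v1/predict", "YOUR_API_KEY", b"<image bytes>", "pcb_001.png"
)
print(request.get_method(), request.full_url)

# To actually send it (requires a real endpoint):
# with urllib.request.urlopen(request) as response:
#     print(response.read())
```

Keeping request construction separate from sending makes the call easy to inspect and test before pointing it at a production endpoint.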

Step 9: Set up a feedback loop to send results back to Snowflake for data enrichment and further analysis
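
One simple way to picture the feedback loop is to flatten each inference result into rows that match a results table. The sketch below is an assumption-heavy illustration: the table `INFERENCE_RESULTS`, its columns, and the shape of the prediction dictionaries are all hypothetical, so adapt them to your own schema.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple

# Hypothetical target table and columns.
INSERT_SQL = (
    "INSERT INTO INFERENCE_RESULTS (IMAGE_NAME, LABEL, SCORE, INFERRED_AT) "
    "VALUES (%s, %s, %s, %s)"
)

Row = Tuple[str, str, float, str]


def to_rows(image_name: str, predictions: List[Dict], inferred_at: datetime) -> List[Row]:
    """Flatten one image's predictions into one row per prediction."""
    timestamp = inferred_at.isoformat()
    return [
        (image_name, str(p["label"]), float(p["score"]), timestamp)
        for p in predictions
    ]


# Hypothetical predictions for illustration only.
rows = to_rows(
    "pcb_001.png",
    [{"label": "oversprayed", "score": 0.97}],
    datetime(2024, 6, 7, tzinfo=timezone.utc),
)
print(rows)

# With the Snowflake Connector for Python, rows like these could be written
# using an open cursor (connection setup not shown):
# cursor.executemany(INSERT_SQL, rows)
```

Storing one row per prediction keeps the table easy to join against image metadata and to aggregate for monitoring.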

Step 10: Build an application on top of the model using Streamlit and Snowflake Cortex for real-time monitoring & analytics to interrogate incoming data using Generative AI
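
Before wiring up a dashboard, it helps to decide which numbers you want to watch. As a minimal sketch, the snippet below turns a batch of predicted labels into per-class shares, the kind of summary a Streamlit chart could display. The function name `label_shares` and the sample labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def label_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of predictions that fall into each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Made-up batch of predictions for illustration only.
recent = ["sufficient"] * 8 + ["undersprayed"] + ["oversprayed"]
for label, share in sorted(label_shares(recent).items()):
    print(f"{label}: {share:.0%}")
```

A sudden jump in the share of one defect class is often the first visible sign of a process drift worth investigating.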

Conclusion & Roadmap — What’s Coming Next?

Processing and gaining insights from images and videos is now easier than ever on Snowflake through LandingLens. Visual AI is also transforming the way structured data is enhanced and enriched. By extracting insights from images and videos, customers can now generate metadata, classify content, and even detect patterns that are not discernible to the human eye. This capability allows businesses to augment their structured datasets with deep, contextual insights that were previously inaccessible.

Follow this step-by-step guide to train your own vision model. Log in to your Snowflake account, search for LandingLens or LandingAI on the Snowflake Marketplace, try it for yourself, and let us know what you think!

What’s coming next? LandingAI is also planning on integrating its Generative AI capabilities into Snowflake in the future. With our latest innovation around Large Vision Models (LVMs), Large Multi-Modal Models (LMMs) as well as vision agents, customers will be able to unlock new avenues for data analytics by bridging the gap between unstructured (visual, text, etc.), semi-structured (JSON, documents, etc.) and structured datasets, making data more comprehensive, insightful, and valuable across various industries.

You can also check out the resources below to stay updated on the latest developments!

LandingAI Resources:

Try LandingLens for free: https://app.landing.ai/signup
GitHub Repo for Python Library: https://github.com/landing-ai/landingai-python
GitHub Repo for Vision Agents: https://github.com/landing-ai/vision-agent
Join our Discord to stay updated on vision agents & LMMs: https://t.co/D7W3Z9lWd2
Join LandingAI’s community to collaborate and ask questions: https://community.landing.ai/home

Snowflake Resources:

Try Snowflake for free: https://signup.snowflake.com/
Learn more about Snowpark Container Services: https://docs.snowflake.com/en/developer-guide/snowpark-container-services/overview
Learn more about Native Apps: https://docs.snowflake.com/en/developer-guide/native-apps/native-apps-about

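*Precision: Also known as positive predictive value. It is the share of positive predictions that are actually correct, TP / (TP + FP), and so describes the purity of the positive detections relative to the ground truth.

*Recall: Also known as sensitivity. It is the share of ground-truth positives that the model finds, TP / (TP + FN), and so describes the completeness of the positive predictions relative to the ground truth. Here TP, FP, and FN are the counts of true positives, false positives, and false negatives.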