Snowflake Summit 2024 Highlights: Game-Changing Feature Announcements

How Snowflake’s Latest Innovations Can Transform Your Company

The recent Snowflake Summit showcased major advancements in data processing, enterprise AI, and cloud services. As a Sales Engineer at Snowflake, I supported partners and customers through some of these exciting developments. This post summarizes the major announcements from Summit, highlighting key features like Cortex AI, Snowpark enhancements, improved app development capabilities, and new governance tools. We will explore how these innovations can affect your business by improving productivity, security, and operational efficiency.

Snowflake features move through three release stages: private preview (PrPr), where only some accounts have access; public preview (PuPr), where all accounts in the region have access but the feature is not production-ready; and Generally Available (GA), where all accounts have access and production-grade deployments are supported.

TL;DR

  1. Cortex: No-code AI development for enterprise innovation. The Cortex platform is currently GA, and many new features are going into PuPr soon, including Cortex Analyst, Cortex Search, Cortex Fine-Tuning, AI & ML Studio for ML functions, and Cortex Guard. AI & ML Studio for LLM functions is in PrPr. Why care? Accelerates AI adoption, enhances security, and reduces costs.
  2. Simplified End-to-End Development: Streamlined tools for enhanced productivity. Notebooks (PuPr), an improved Snowflake CLI (GA soon), a new Python API (GA soon), and Snowflake Trail. Why care? Efficient AI/ML workflows, improved collaboration, and faster development cycles.
  3. Expanding Range of Apps: Broader application support within Snowflake. Native Apps with Containers are in PuPr on AWS; Container Services are now GA on AWS and in PuPr on Azure; and Native Apps are now GA on all three clouds. Why care? Comprehensive ecosystem, enhanced performance, and flexible deployment.
  4. Snowflake Horizon: Advanced governance and data discovery tools. New features are coming to PuPr soon, including complete lineage, ML asset monitoring, synthetic data generation, NA software scans, auto-tagging, and Internal Marketplace. Why care? Improved data quality, compliance, and operational efficiency.
  5. Polaris Catalog: An open-source catalog for Apache Iceberg, enhancing interoperability across data engines. PuPr is coming soon, and the open-source release is expected within 90 days. Why care? Reduced complexity, lower costs, and robust governance.
  6. Innovations for a Simplified Data Foundation: Robust and user-friendly data management. Private previews are coming for Parquet Direct, Delta Direct, AI-Powered Object Descriptions, Sensitive Data Auto-Classification, PostgreSQL and MySQL connectors, and per-query cost attribution. Why care? Enhanced governance, integration, and cost optimization.
  7. Snowpark for Python with Pandas API: Scale data science workflows with familiar tools. PuPr is available now! Why care? Enhanced productivity, cost efficiency, and data security.

Snowflake Cortex AI Advances Enterprise AI with No-Code Development

Announcement:

Snowflake has introduced Cortex AI, a fully managed service to enable enterprise AI with no-code development, serverless fine-tuning, and managed services for building AI applications like chat-with-data interfaces.

Key Features and Benefits:

  1. Cortex Analyst: Business users can interact with structured data using natural language (public preview soon).
  2. Cortex Search: Provides efficient and accurate document search capabilities (public preview soon).
  3. Cortex Fine-Tuning: Enables secure and effortless customization of large language models (public preview).
  4. AI & ML Studio: Facilitates no-code AI development for users of all technical levels (private preview).
  5. Cortex Guard: Ensures model safety by filtering harmful content (generally available soon).
  6. REST APIs: Allows programmatic access to LLMs from any application (coming soon).
  7. AI21 Labs models: Cortex will soon serve models developed by AI21 Labs, allowing for reduced cost and improved inference options (coming soon).
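
For teams that do want programmatic access today, the Cortex LLM functions can already be called from SQL. The following is a minimal sketch, assuming a DB-API style cursor such as the one the Snowflake Python connector provides. The helper names, the stub cursor, the prompt, and the choice of the mistral-large model are illustrative assumptions, not part of the announcement, and model availability varies by region.

```python
def build_complete_call(model, prompt):
    """Return a parameterized SQL statement and its bind values.

    SNOWFLAKE.CORTEX.COMPLETE is the SQL LLM function. Binding the prompt
    instead of formatting it into the string avoids quoting problems.
    """
    sql = "SELECT SNOWFLAKE.CORTEX.COMPLETE(%s, %s) AS response"
    return sql, (model, prompt)


def run_complete(cursor, model, prompt):
    """Execute the call on a DB-API style cursor and return the response text."""
    sql, params = build_complete_call(model, prompt)
    cursor.execute(sql, params)
    return cursor.fetchone()[0]


class StubCursor:
    """Hypothetical stand-in for a real Snowflake cursor, for local runs only."""

    def execute(self, sql, params):
        # Record the call so it can be inspected without a live connection.
        self.last_call = (sql, params)

    def fetchone(self):
        return ("(model response would appear here)",)


cursor = StubCursor()
answer = run_complete(cursor, "mistral-large", "Summarize our Q2 support tickets.")
print(cursor.last_call[0])
print(answer)
```

In a real account, the stub would be replaced by a cursor from an authenticated connection; the rest of the sketch stays the same.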

Business Impact:

  • Increased Accessibility: Makes AI tools accessible to all users, regardless of technical expertise.
  • Enhanced Productivity: Accelerates AI development and deployment, reducing time and costs.
  • Improved Security and Compliance: Provides robust data security and governance, ensuring compliance with industry standards. No data leaves Snowflake, and no models are trained on user input.
  • Scalability and Efficiency: Optimizes AI models for specific tasks, reducing latency and costs.

Value to You:

  • Ease of Use: No-code and serverless capabilities simplify AI adoption.
  • Security: Ensures data and model safety with advanced governance and privacy features.
  • Flexibility: Supports various AI applications and use cases, enhancing operational efficiency and decision-making.

You can check out the full blog post on Snowflake’s website for more details.

Simplified End-to-End Development

Announcement:

Snowflake has introduced many new features to simplify end-to-end development, mainly focusing on AI/ML workflows, streaming capabilities, and enhancing the developer experience.

Key Features and Benefits:

New Features:

  • Git Integration: This tool provides an easy way to integrate and manage application code within Snowflake, supporting collaborative development workflows. (PuPr)
  • Improved Snowflake CLI: An open-source command-line interface designed for managing applications and workloads in Snowflake, enhancing developers’ ease of use.
  • Observability Tools: Features like logging and tracing with event tables (in PuPr) improve the debuggability of applications within Snowflake.
  • New Python API: This API simplifies data pipelines and allows direct interaction with Snowflake objects. It supports Tasks/DAG, Snowpark Container Services, Tables, Warehouse, Schema, and Databases (GA soon).

Existing Features of Note:

  • Snowpark ML Modeling API: Scales out feature engineering and simplifies AI/ML training using familiar libraries like scikit-learn and XGBoost. This API supports seamless integration and data processing within Snowflake.
  • Snowpark Model Registry: Provides a centralized repository for managing and deploying ML models, enhancing collaboration and operational efficiency.
  • Streamlit Integration: This feature is in GA and allows data scientists to create interactive applications combining data and models with the ease of Python’s Streamlit library.
  • Dynamic Tables: This feature simplifies continuous data pipelines by incrementally maintaining query results as updated tables. In GA, it automates data transformations, bridging the gap between batch and streaming data.
  • Snowpipe Streaming: Facilitates the seamless integration of streaming data into Snowflake, simplifying the setup and management of real-time data pipelines.
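
To make "incrementally maintaining query results" more concrete, here is a toy sketch of the idea behind Dynamic Tables: rather than recomputing a grouped total from scratch, only the rows that changed are applied to the stored result. This is a conceptual model in plain Python, not Snowflake's implementation, and the class and method names are hypothetical.

```python
class IncrementalSum:
    """Toy model of an incrementally maintained
    SELECT key, SUM(value) ... GROUP BY key result."""

    def __init__(self):
        self.totals = {}  # key -> running sum
        self.counts = {}  # key -> number of contributing rows

    def apply_changes(self, inserted=(), deleted=()):
        """Apply a batch of row changes and return the refreshed result."""
        for key, value in inserted:
            self.totals[key] = self.totals.get(key, 0) + value
            self.counts[key] = self.counts.get(key, 0) + 1
        for key, value in deleted:
            self.totals[key] -= value
            self.counts[key] -= 1
            if self.counts[key] == 0:
                # No rows left for this group, so it leaves the result.
                del self.totals[key]
                del self.counts[key]
        return dict(self.totals)


view = IncrementalSum()
first = view.apply_changes(inserted=[("a", 10), ("b", 5)])
second = view.apply_changes(inserted=[("a", 2)], deleted=[("b", 5)])
print(first)
print(second)
```

The second refresh touches three rows instead of rescanning the whole source, which is the property that lets one mechanism serve both batch and near-real-time pipelines.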

Business Impact:

  • Efficiency and Productivity: By providing integrated tools and APIs, Snowflake simplifies the development and deployment of AI/ML models and streaming data pipelines, reducing the time and effort required for these tasks.
  • Scalability and Flexibility: The new features support scalable ML and real-time data processing, enabling businesses to handle growing data volumes and complex analytics easily.
  • Improved Collaboration: Centralized repositories and collaborative tools like Git integration and the Snowpark Model Registry enhance teamwork and streamline development.
  • Enhanced Security and Governance: Snowflake’s robust governance and security features ensure that data remains secure and compliant throughout the development lifecycle.

Value to You:

  • Comprehensive Toolset: The end-to-end capabilities offered by Snowflake cover the entire development lifecycle, from data ingestion and transformation to model deployment and monitoring.
  • Seamless Integration: Developers can leverage familiar tools and libraries within Snowflake, minimizing the learning curve and maximizing productivity.
  • Operational Efficiency: Automated and streamlined processes reduce operational overhead, allowing teams to focus on innovation and value creation.
  • Real-time Capabilities: Support for streaming data and dynamic tables enables businesses to act on real-time insights, driving faster and more informed decision-making.

You can check out the full blog post on Snowflake’s website for more details.

Expanding the Range of Apps You Can Deploy and Distribute on Snowflake

Announcement:

Snowflake has significantly expanded the types of applications that can be built, deployed, and distributed on its platform, enhancing capabilities for AI, search, containerized workloads, and more.

Key Features and Benefits:

  1. Containerized Workloads: Snowpark Container Services (GA on AWS, public preview on Azure) enable efficient AI app development with low latency processing.
  2. Containerized Native Applications: Native Applications can now be built using containers, allowing more diverse applications to leverage GPUs and advanced UIs (PuPr on AWS).
  3. Search Capabilities: Full-text search and Cortex Search for high-volume data search and natural language queries (PuPr soon).
  4. Application Distribution: The Snowflake Native App Framework (GA on AWS, Azure and GCP) allows deployment across major clouds, enhancing operational efficiency and monetization options.
  5. Hybrid Tables: Transactional tables that support OLTP workloads on Snowflake, now integrated with Snowpark Container Services (SPCS) (PuPr on AWS).

Business Impact:

  • Streamlined Development: Comprehensive tools for building, deploying, and managing AI and data apps.
  • Operational Efficiency: Simplified distribution and management of applications across multiple clouds.
  • Scalability: High-performance, low-latency processing supports sophisticated AI models and applications.
  • Fewer Barriers to Distribution: Native applications shorten sales and deployment cycles.

Value to You:

  • Comprehensive Ecosystem: Integrated tools for end-to-end application development.
  • Enhanced Performance: Low-latency processing and efficient data handling.
  • Flexible Deployment: Deploy and monetize applications across the major cloud platforms seamlessly.

You can check out the full blog post on Snowflake’s website for more details.

Snowflake Horizon — Leading Governance and Data Discovery

Announcement:

Snowflake Horizon introduces advanced governance features, internal marketplaces, and AI innovations to enhance data security, privacy, and collaboration.

Key Features and Benefits:

  • Internal Marketplace: Secure collaboration with a single directory of data products curated within organizations. (Private preview)
  • AI-powered Object Descriptions: Automatically generates relevant descriptions for tables and views. (Private preview soon)
  • Sensitive Data Classification Interface: Simplifies data classification with auto-tagging and auto-classification. (Generally available soon)
  • Universal Search: Uses AI for natural language searches across multiple data types and sources. (Generally available)
  • Trust Center: Centralized monitoring for security and compliance risks across clouds. (Generally available soon)
  • Synthetic Data Generation: Creates row-level reproductions of sensitive data that retain referential integrity and are helpful for testing and development. (Private preview soon)
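
To illustrate what "retain referential integrity" means for masked or synthetic data, here is a small sketch. It uses a swapped-in technique, deterministic salted hashing of key values, purely to show the property; it is not how Snowflake's synthetic data generation works. The table contents, salt, and function names are hypothetical.

```python
import hashlib

SALT = "demo-salt"  # hypothetical secret; the same salt must be used for every table


def surrogate_key(value, salt):
    """Map a sensitive key to a stable, non-identifying surrogate."""
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).hexdigest()
    return "C" + digest[:12]


def mask_rows(rows, key, salt):
    """Return copies of the rows with the key column replaced by its surrogate."""
    return [{**row, key: surrogate_key(row[key], salt)} for row in rows]


customers = [
    {"customer_id": "alice@example.com", "tier": "gold"},
    {"customer_id": "bob@example.com", "tier": "silver"},
]
orders = [
    {"order_id": 1, "customer_id": "alice@example.com"},
    {"order_id": 2, "customer_id": "bob@example.com"},
    {"order_id": 3, "customer_id": "alice@example.com"},
]

safe_customers = mask_rows(customers, "customer_id", SALT)
safe_orders = mask_rows(orders, "customer_id", SALT)

# Every masked order should still join to exactly one masked customer.
known_ids = {row["customer_id"] for row in safe_customers}
orphans = [row for row in safe_orders if row["customer_id"] not in known_ids]
print(len(orphans))
```

Because the same input always maps to the same surrogate, joins between the masked tables behave as they did on the originals, which is what makes such data usable for testing and development.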

Business Impact:

  • Improved Collaboration: Enhanced discoverability and secure sharing of organizations’ data, apps, and models.
  • Advanced Security: Centralized capabilities to comply with security standards and protect sensitive data.
  • Operational Efficiency: Automated tools and AI-driven insights streamline data management and reduce manual efforts.

Value to You:

  • Seamless Integration: Access and easily manage diverse data sources, leveraging AI and automated tools.
  • Enhanced Security and Compliance: Protect sensitive information and meet regulatory requirements effectively.
  • Boosted Productivity: Streamline collaboration and data governance, enabling faster insights and decision-making.

For more details, see the full blog post on Snowflake’s website or the YouTube playlist.

Introducing Polaris Catalog

Announcement:

Snowflake has introduced Polaris Catalog, an open-source catalog for Apache Iceberg designed to enhance interoperability across various data processing engines and reduce vendor lock-in. Polaris Catalog can be hosted on Snowflake’s AI Data Cloud or self-hosted, offering flexibility and robust governance features.

Key Features and Benefits:

  • Cross-engine Interoperability: It supports multiple data engines, such as Apache Flink, Spark, and Dremio, using a standardized REST API.
  • Flexibility: Deployable on Snowflake infrastructure or self-hosted.
  • Governance Integration: Integrates with Snowflake Horizon to enhance governance capabilities.
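
The "standardized REST API" here refers to the Apache Iceberg REST catalog protocol, which is what lets different engines address the same tables. As a rough sketch of what that looks like on the wire, the helper below builds the table endpoint path laid out in the Iceberg REST specification. The base URL, prefix, and table names are hypothetical placeholders, and the exact base path of a Polaris deployment is an assumption to check against its documentation.

```python
from urllib.parse import quote


def table_endpoint(base_url, prefix, namespace, table):
    """Build the Iceberg REST catalog URL for loading one table.

    Multi-level namespaces are joined with the unit separator character,
    which is percent-encoded as %1F in the path.
    """
    encoded_namespace = quote("\x1f".join(namespace), safe="")
    return (
        f"{base_url.rstrip('/')}/v1/{quote(prefix, safe='')}"
        f"/namespaces/{encoded_namespace}/tables/{quote(table, safe='')}"
    )


url = table_endpoint(
    "https://catalog.example.com/api/catalog",  # hypothetical catalog base URL
    "analytics",                                # hypothetical catalog prefix
    ["sales", "emea"],
    "orders",
)
print(url)
```

Any engine that speaks this protocol resolves a table through the same kind of URL, which is why a single copy of the data can be shared across engines.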

Relevant Timelines:

  • Public preview on Snowflake infrastructure: Soon
  • Open-source release: Within 90 days

Business Impact:

  • Reduced Complexity and Costs: Enables using multiple data engines on a single data copy, minimizing data movement and storage costs.
  • Improved Flexibility and Control: Offers choice and control over data management and processing environments.
  • Enhanced Governance: Extends Snowflake’s governance features to Iceberg tables, ensuring secure and compliant data management.

Value to You:

  • Enhanced Interoperability: Facilitates seamless integration of various data engines, improving flexibility and reducing vendor dependency.
  • Cost Efficiency: Reduces the need for data duplication and movement, lowering storage and compute costs.
  • Robust Governance: Provides comprehensive governance features, ensuring secure and compliant data management across different environments.

You can check out the full blog post on Snowflake’s website for more details.

Innovation for a Simplified Data Foundation

Announcement:

Snowflake has introduced several enhancements to simplify data management and extend governance capabilities. These innovations help organizations build a robust data foundation, which is crucial for leveraging advanced AI technologies and ensuring efficient data operations.

Key Features and Benefits:

  1. Parquet Direct and Delta Direct: Seamless integration of Parquet and Delta Lake files with Iceberg tables. (Private preview)
  2. Enhanced Cost Management: New capabilities for better visibility and optimization of Snowflake spend. (Public preview soon; GA features coming soon)
  3. Improved Data Ingestion: Performance improvements for loading JSON and Parquet files, plus new connectors for PostgreSQL and MySQL. (Public preview soon)
  4. Advanced Analytics: New functions for time series analysis and full-text search. (Various stages of availability)
  5. Iceberg Tables: Now generally available, supporting open architectures like data lakehouse and data mesh.

Business Impact:

  • Improved Data Quality and Compliance: Enhanced monitoring and governance features ensure high data quality and compliance with industry standards.
  • Operational Efficiency: New data lineage and privacy tools make data management more efficient and secure.
  • Cost Optimization: The new cost management interface helps organizations control and optimize their Snowflake expenses.

Value to You:

  • Robust Governance: Strong data governance capabilities ensure data integrity and security, which are crucial for regulatory compliance and operational efficiency.
  • Enhanced Integration: Support for Iceberg Tables and other data formats provides flexibility and reduces data silos.
  • Cost Efficiency: Optimized cost management tools help organizations maximize their investment in Snowflake.

You can check out the full blog post on Snowflake’s website for more details.

Snowpark for Python with Pandas API

Announcement:

Snowflake has introduced support for the Pandas API in Snowpark for Python, allowing data scientists to scale their workflows using familiar tools. This integration means users can now leverage the power of Pandas within the Snowflake environment, enabling efficient data manipulation and analysis at scale.

Key Features and Benefits:

  1. Scalability: Snowpark’s Pandas API allows large-scale data processing, leveraging Snowflake’s elastic compute power to handle massive datasets without moving data between environments.
  2. Familiarity: Data scientists can use familiar Pandas functions and data structures, making it easier to transition existing codebases and workflows to Snowpark.
  3. Performance: Snowflake’s backend optimizations ensure that Pandas operations are executed efficiently, often outperforming traditional methods. Snowpark DataFrames are significantly faster, as much as 8X, compared to local Pandas DataFrames in various benchmarks.

Relevant Timelines:

The Snowpark Pandas API is currently available as PuPr, and users can start integrating it into their workflows immediately.
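
To give a sense of what "familiar tools" means in practice, here is a small aggregation written against plain pandas with hypothetical sample data. As I understand the preview, the same DataFrame code is meant to run against Snowflake tables by swapping the import for the Snowpark pandas one and creating a Snowpark session; treat that swap as an assumption and check the current documentation for the exact import lines.

```python
import pandas as pd  # plain pandas here; the Snowpark variant swaps this import

# Hypothetical sample data standing in for a Snowflake table.
orders = pd.DataFrame(
    {
        "region": ["EMEA", "AMER", "EMEA", "APAC"],
        "amount": [120.0, 80.0, 30.0, 50.0],
    }
)


def revenue_by_region(df):
    """Total order amount per region, largest first."""
    return (
        df.groupby("region", as_index=False)["amount"]
        .sum()
        .sort_values("amount", ascending=False)
        .reset_index(drop=True)
    )


summary = revenue_by_region(orders)
print(summary)
```

The appeal of the Pandas API is that a function like revenue_by_region would not need to be rewritten; only where the DataFrame comes from, and where the computation runs, would change.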

Business Impact:

  • Enhanced Productivity: Using a familiar toolset (Pandas), data scientists can quickly adopt Snowpark without a steep learning curve, thus improving productivity.
  • Cost Efficiency: Snowpark allows users to perform complex data transformations and analyses directly within Snowflake, reducing the need for additional infrastructure and associated costs.
  • Data Governance and Security: Operations are performed within the secure Snowflake environment, ensuring that data governance and security policies are maintained.
  • Flexibility: The ability to process data in place without moving it between environments simplifies workflows and reduces data latency, making real-time analytics more feasible.

Value to You:

  • Seamless Integration: Companies already using Pandas for data manipulation can integrate their workflows into Snowflake effortlessly.
  • Performance Gains: Significant improvements in data processing times can lead to faster insights and decision-making.
  • Scalable Infrastructure: Leveraging Snowflake’s scalable infrastructure ensures that companies can handle growing data volumes without performance degradation.
  • Reduced Complexity: Keeping data operations within a single environment minimizes the complexity and potential errors associated with moving data between systems.

You can check out the full blog post on Snowflake’s website for more details.

Harnessing Snowflake’s Innovations for Business Growth

The announcements from the Snowflake Summit mark a pivotal moment in our journey to empower businesses with cutting-edge technology. Whether you want to leverage no-code development with Cortex AI, streamline AI/ML workflows, or enhance data governance, Snowflake’s latest features offer robust solutions tailored to your needs. These innovations are designed to drive efficiency, security, and scalability, providing a solid foundation for your enterprise to thrive in a competitive landscape. By integrating these advancements into your operations, you can unlock new opportunities for growth and success. Join us in embracing the future of enterprise data and AI with Snowflake.
