Unleashing AI Power

Applying Jamba-Instruct to Long Context Use Cases in Snowflake Cortex AI

Discover how AI21 Labs’ Jamba-Instruct enhances long-context AI tasks seamlessly in the Snowflake ecosystem.

Written by Robbin Jang, Alliances Manager at AI21, Chen Wang, Lead Alliances Solution Architect at AI21, and Cameron Wasilewsky, Senior Sales Engineer at Snowflake

Snowflake recently announced that AI21’s foundation model, Jamba-Instruct, is available for serverless inference in Snowflake Cortex AI. Snowflake customers can now take advantage of Jamba-Instruct’s 256k context window, large enough to analyze eight years’ worth of 10-K filings or summarize 25 hours of clinical trial patient interviews. All you need to do is call Jamba-Instruct with a simple Snowflake Cortex SQL COMPLETE statement. You can start delivering on the promise of GenAI while getting the benefits of Snowflake’s security, governance, and cost-effective scaling built into Cortex AI.

Jamba-Instruct’s 256k context window is made possible by its unique hybrid SSM (also known as Mamba) plus transformer architecture, which requires less memory than a traditional transformer-only architecture. Thanks to this hybrid design, Jamba-Instruct maintains output quality and accuracy as context length increases, unlike more memory-intensive, transformer-only models.

The RULER benchmark evaluates long-context models on four categories of complex, multi-step reasoning tasks (retrieval, multi-hop tracing, aggregation, and question answering); a score of 85% is considered a passing grade.

A long context window reduces the need for the complicated techniques often required to make LLMs performant and accelerates no-code AI development. For many use cases and tasks, such as long document summarization, call transcript analysis, or building a chatbot on top of a large knowledge base, long-context capabilities accelerate development and enhance performance.

For developers and analysts looking to try out long context use cases with Jamba-Instruct, we’ve provided a walkthrough of a use case to help get you started. We can’t wait to see how you use this new model, and what you build in Snowflake!

How to Use Snowflake Cortex AI with Jamba

The syntax for calling an LLM with the Snowflake Cortex COMPLETE function is straightforward:

SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct', <prompt_or_history> [ , <options> ] )
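
For example, a call that passes an options object might look like the sketch below. The option names (temperature, max_tokens) and the role/content prompt format follow the Cortex documentation but should be verified for your account; when options are supplied, the prompt is given as an array of message objects and the response is returned as JSON rather than plain text.

-- Sketch: COMPLETE with an options object (option names assumed from the Cortex docs).
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'jamba-instruct',
    [{'role': 'user', 'content': 'Explain what a 10-K filing is in two sentences.'}],
    {'temperature': 0.2, 'max_tokens': 300}
) AS RESPONSE;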

A more detailed explanation of the Cortex COMPLETE function can be found in Snowflake’s documentation. Below is a common use case that you can follow along with.

Form 10-K analysis

Annual company reports like 10-Ks are often very long (~100 pages) but contain crucial information that financial analysts dedicate significant time and effort to extract. In this example, you can dramatically accelerate the time to insight by using Jamba-Instruct on Cortex to accurately and quickly perform data extraction, summarization, and Q&A on a 10-K filing.

Note that performing these tasks over such a long document typically requires setting up a Retrieval Augmented Generation (RAG) workflow to handle text segmentation and retrieval, chunking the 10-K filing into segments small enough to fit into a standard LLM’s context window. Jamba-Instruct’s 256k context window, the largest context window available on Cortex AI, lets you feed the model the entire document in a single prompt, removing the need to set up a search service that scans and chunks the document. Instead, analysis over the entire 10-K is possible using the simple SQL outlined in the steps below:

Step 1

We will be using the SEC Filings dataset provided by Cybersyn from the Snowflake Marketplace. The SEC Filings documents are conveniently written into a column in a Snowflake table. For this example, we will select all the 10-K filings from the company whose Central Index Key (CIK) is 0001045810. The column VALUE contains the entire text for each SEC filing document.

-- Pull all 10-K filings for CIK 0001045810; the VALUE column holds the full filing text.
CREATE OR REPLACE TABLE SEC_FILINGS AS
SELECT CIK, VARIABLE, PERIOD_END_DATE, VALUE
FROM "SEC_FILINGS"."CYBERSYN".SEC_REPORT_TEXT_ATTRIBUTES
WHERE CIK = '0001045810' AND VARIABLE = '10-K_Filing_Text';
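
If you are building along, a quick sanity check like the following (a sketch that uses only the columns selected above) confirms which filings were loaded and roughly how long each document is:

SELECT PERIOD_END_DATE,
       LENGTH(VALUE) AS CHARACTER_COUNT   -- approximate document size in characters
FROM SEC_FILINGS
ORDER BY PERIOD_END_DATE DESC;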

Step 2

We can use Jamba to extract information from these documents, such as the company name.

SELECT SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct',
    CONCAT('Extract the company name from the following SEC filing document. ',
           'The output should only contain the full company name with no other preambles. ',
           'Document: ', VALUE)) as "COMPANY_NAME"
FROM SEC_FILINGS;

In this case, Jamba correctly identifies NVIDIA Corporation as the company behind these 10-K filings.

Step 3

Next, we ask Jamba to summarize 3 key financial metrics from the 10-K.

SELECT SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct',
    CONCAT('Summarize 3 key metrics about the financial performance ',
           'of the company based on the following SEC filing document ',
           'for a financial analyst concisely: ', VALUE)) as "SUMMARY"
FROM SEC_FILINGS;

Here is the output from Jamba based on the most recent NVIDIA 10-K:

**NVDA Financial Performance Summary:**

Based on the SEC filing document for NVIDIA Corporation, here are the 3 key metrics regarding their financial performance:

**1. Revenue:** In the fiscal year ending on January 28, 2024, NVIDIA Corporation reported a revenue of $60.922 billion. This represents a significant increase of 126% compared to the previous fiscal year.

**2. Net Income:** The company’s net income for the fiscal year 2024 was $29.760 billion, marking a substantial rise of 581% from the prior year. This translates to a net income per diluted share of $11.93, a considerable increase from $1.74 in the previous year.

**3. Gross Margin:** NVIDIA’s gross margin improved to 72.7% in the fiscal year 2024, up from 56.9% in the prior year. This increase is primarily attributed to robust Data Center revenue growth and lower net inventory provisions as a proportion of revenue.

Step 4

For the last step in our example, we will ask Jamba to identify the top 3 key business risks disclosed in the 10-K.

SELECT SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct',
    CONCAT('What are the 3 key business risks for the ',
           'company identified in the following document? ', VALUE)) as "QnA"
FROM SEC_FILINGS;

Here is the output from Jamba based on the most recent NVIDIA 10-K:

The three key business risks for NVIDIA Corporation, as identified in the document, are:

1. **Evolving Industry Needs:** Failure to meet the evolving needs and standards of the industry can adversely impact NVIDIA’s financial results. This risk involves anticipating and adapting to changes quickly, developing new products, and maintaining existing ones.

2. **Competition:** High levels of competition present a significant risk. Being outcompeted by other companies, potentially due to factors like more competitive pricing, better functionality, or innovation, could lead to a decrease in NVIDIA’s market share and financial performance.

3. **Regulatory and Legal Issues:** Complex and changing laws, especially those related to export controls, data privacy, security breaches, and responsible use of AI, pose a risk to NVIDIA’s operations and competitiveness. Changes in these regulations could lead to financial, reputational, or operational harm.

You can combine all of these Cortex AI function calls and write the outputs to a Snowflake table for downstream analytics.
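
As an illustrative sketch (the table name SEC_FILINGS_INSIGHTS is ours; the prompts are the same ones used in the steps above), the three calls can be materialized in a single statement:

-- Sketch: materialize extraction, summarization, and Q&A results in one table.
CREATE OR REPLACE TABLE SEC_FILINGS_INSIGHTS AS
SELECT
    CIK,
    PERIOD_END_DATE,
    SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct',
        CONCAT('Extract the company name from the following SEC filing document. ',
               'The output should only contain the full company name with no other preambles. ',
               'Document: ', VALUE)) as "COMPANY_NAME",
    SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct',
        CONCAT('Summarize 3 key metrics about the financial performance ',
               'of the company based on the following SEC filing document ',
               'for a financial analyst concisely: ', VALUE)) as "SUMMARY",
    SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct',
        CONCAT('What are the 3 key business risks for the ',
               'company identified in the following document? ', VALUE)) as "QnA"
FROM SEC_FILINGS;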

You can also create a Streamlit application to help end-users summarize and understand their documents and extract useful information with a user-friendly front-end interface. Such an application would allow business users to interact with and ask questions of the data without having to write any code. Here is an example of such an application, and the GitHub repo for the application is here if you’d like to modify it.

Visualization of the application running and the expected outputs

Additional tips and tricks for prompting Jamba

If you want to build out a use case using your own data, here are the top 10 prompting tips to keep in mind.

  1. Understand the task: Clearly define what you want the model to do. Prompt engineering is about creating a detailed instruction that aligns with the task.
  2. Use clear and simple language: Ensure the prompt is understandable, free from grammatical errors, and well-organized.
  3. Avoid negative prompts: Minimize the use of ‘do not’ or ‘avoid x’ in your prompt. Focus on what you want to do rather than what you don’t want.
  4. Provide examples: This is also called few-shot prompting. Examples can help guide the model and improve the quality of the output. Be sure the examples are clearly delimited and follow the same structure.
  5. Specify a persona: Define the role of the LLM (e.g., a friendly travel agent, a meticulous research assistant). This can guide the tone, language, and level of expertise used in the response.
  6. Put data before the question: When asking a question about information given in the prompt, provide the data first, then the question. Other techniques that can improve results include placing instructions both before and after the context and separating different sections of the prompt with XML-style tags (see the example after this list).
  7. Evaluate and refine your prompt: Test your prompt against examples and be prepared to adjust it based on the results.
  8. Request structured output: If you need a specific format (like JSON or XML), provide examples in that format. Jamba-Instruct may not always produce perfectly valid structured output.
  9. Limit output length: If you have a length goal or limit, specify it in the prompt.
  10. Maintain output quality in production: Log your prompts and responses, periodically check the output quality, and provide a feedback mechanism to improve prompts based on user input.
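
Putting several of these tips together, here is an illustrative prompt (the wording and tag names are ours, not a required format) that sets a persona, separates sections with XML-style tags, places the data before the question, and requests structured output:

-- Sketch: persona + XML-style separators + data before the question + structured output.
SELECT SNOWFLAKE.CORTEX.COMPLETE('jamba-instruct',
    CONCAT('You are a meticulous financial research assistant. ',
           '<document>', VALUE, '</document> ',
           '<question>What are the 3 key business risks for the company? ',
           'Respond as a JSON array of strings.</question>')) as "RISKS_JSON"
FROM SEC_FILINGS;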

Try it today

Jamba-Instruct in Snowflake Cortex AI is generally available to all Snowflake customers today. This blog is meant to serve as a helpful starting point for anyone who wants to build with Jamba-Instruct.

If you’d like to learn more and discuss use cases for your own enterprise data, let’s talk.
