How to use Mixtral 8x7B for Text-to-SQL

Mo Pourreza
Published in Dataherald · 4 min read · Mar 18, 2024

[Image generated by DALL·E 3]

Background

Text-to-SQL is the process of translating a natural language question into a SQL query for a given database schema. This is a hard task even for the most advanced proprietary Large Language Models (LLMs) like GPT-4, as demonstrated by the wide gap between human performance and the state-of-the-art text-to-SQL methods on benchmarks such as BIRD. Several techniques, including the methods outlined in the MAC-SQL and DIN-SQL papers, adopt a multi-step decomposition strategy to enhance SQL generation performance.

In building the Dataherald agent, we employed a similar decomposition and tool-usage approach. Specifically, we drew inspiration from concepts like ReAct, reflection, and Microsoft AutoGen.

Most production deployments of Dataherald currently use OpenAI models, given their efficiency in tool utilization and function invocation. However, one of the top requests from the community has been support for open-source LLMs, due to their improving performance, lower cost, and data-privacy advantages. With the v01.0.0 release of the Dataherald engine, users can now use open-source LLMs with the Dataherald agent.

This post will explore how to use a Mixtral model within your private cloud infrastructure to establish a secure text-to-SQL engine for your database. Let’s jump right in! 🚀

Deploy Mixtral with vLLM

vLLM is an open-source library designed for fast inference and serving of LLMs. At its core is PagedAttention, an attention algorithm that optimizes the management of attention keys and values. With PagedAttention, vLLM achieves up to a 24-fold increase in throughput over HuggingFace Transformers, without requiring any modifications to the model's architecture.

To deploy Mixtral for this tutorial, we will use two A100 GPUs from Runpod.

We will first customize the deployment. After setting the overrides for the customized deployment, we can connect to the web terminal.

First, we install and start tmux so the model server can keep running in the background.

apt update
apt install tmux
tmux

Then we deploy the model on the GPUs by running the following commands:

# install the vLLM package
pip install vllm

python -u -m vllm.entrypoints.openai.api_server \
--host 0.0.0.0 --model mistralai/Mixtral-8x7B-Instruct-v0.1 \
--load-format safetensors --tensor-parallel-size 2

If the deployment is successful, you should see a startup message from the HTTP service on port 8000.
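vLLM's api_server exposes an OpenAI-compatible REST API, so one way to sanity-check the deployment is to send a chat completion request to its /v1/chat/completions route. Below is a minimal sketch using only the Python standard library; the base URL matches the local deployment above, and the helper function name is our own:

```python
import json
import urllib.request

MODEL = "mistralai/Mixtral-8x7B-Instruct-v0.1"

def build_chat_request(base_url: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the vLLM server."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("http://0.0.0.0:8000", "Write a SQL query that selects 1.")
print(req.full_url)
# Against a live server:
# resp = json.load(urllib.request.urlopen(req))
# print(resp["choices"][0]["message"]["content"])
```

If the request returns a well-formed completion, the server is ready to be wired into the Dataherald engine.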

Generating SQL queries

Once Mixtral is successfully deployed, using it with the Dataherald engine is quite simple. You simply call the endpoints, specifying the deployed model's URL in the api_base field and changing the llm_name in the llm_config as follows:

{
  "llm_config": {
    "llm_name": "mistralai/Mixtral-8x7B-Instruct-v0.1",
    "api_base": "https://gcttz85utfqq1q-8000.proxy.runpod.net/v1"
  },
  "evaluate": false,
  "prompt": {
    "text": "What is the average rent price in LA?",
    "db_connection_id": "65f1ad7f96252bd60de52f3e"
  }
}
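The same request body can be assembled programmatically before sending it to the engine. The sketch below only builds and serializes the JSON; the engine host is a placeholder, and you should take the exact SQL-generation endpoint path from the Dataherald API reference for your engine version:

```python
import json

# Placeholder for your own Dataherald engine deployment (assumption, not real)
ENGINE_URL = "http://localhost"

body = {
    "llm_config": {
        "llm_name": "mistralai/Mixtral-8x7B-Instruct-v0.1",
        "api_base": "https://gcttz85utfqq1q-8000.proxy.runpod.net/v1",
    },
    "evaluate": False,
    "prompt": {
        "text": "What is the average rent price in LA?",
        "db_connection_id": "65f1ad7f96252bd60de52f3e",
    },
}
payload = json.dumps(body)
# POST `payload` to the engine's SQL-generation endpoint with
# urllib.request or requests, using Content-Type: application/json.
print(len(payload) > 0)
```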

For this post, the engine was connected to a database of U.S. real estate data, and we got the following response from the endpoint:

SELECT metric_value
FROM renthub_average_rent
WHERE geo_type = 'city'
AND dh_state_name = 'California'
AND dh_place_name = 'Los Angeles'
AND metric_value IS NOT NULL;
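A generated query like this can be validated by executing it against the target database. As a self-contained illustration, the sketch below runs the same query against an in-memory SQLite table that mimics the renthub_average_rent schema implied above; the rows are made-up sample values, not the real dataset:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE renthub_average_rent (
           geo_type TEXT, dh_state_name TEXT,
           dh_place_name TEXT, metric_value REAL)"""
)
# Made-up sample rows for illustration only
conn.executemany(
    "INSERT INTO renthub_average_rent VALUES (?, ?, ?, ?)",
    [
        ("city", "California", "Los Angeles", 2850.0),
        ("city", "California", "San Diego", 2400.0),
        ("city", "California", "Los Angeles", None),
    ],
)

generated_sql = """
SELECT metric_value
FROM renthub_average_rent
WHERE geo_type = 'city'
  AND dh_state_name = 'California'
  AND dh_place_name = 'Los Angeles'
  AND metric_value IS NOT NULL;
"""
rows = conn.execute(generated_sql).fetchall()
print(rows)  # only the non-NULL Los Angeles row survives the filter
```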

The reasoning and self-reflection steps can be seen in the Dataherald engine logs, shown in the GIF below for our sample deployment.

Conclusion

This post delved into how you can deploy an open-source Large Language Model (LLM) with Dataherald to build a natural language interface to your relational database. While we used the Mixtral 8x7B model for this tutorial, any other open-source language model can be used, as long as it is capable of tool usage and function calling. In our next post, we will cover the fine-tuning process for open-source models.

About Dataherald

  • Sign up for free and use the hosted version of Dataherald
  • Our open-source engine is available on Github.
  • Join our Discord server to learn more about the project.
