Gemma and Gemini-Pro AI Models in AlloyDB Omni

Gleb Otochkin
Google Cloud - Community
7 min read · May 17, 2024

In one of my recent posts I discussed AlloyDB Omni and how to connect securely to a deployed instance. In that post I mentioned that AlloyDB and AlloyDB Omni are packed with improvements that make the development and management experience better. In this blog I would like to talk about AlloyDB Omni integration with AI, and in particular about AI model registration. By design, AlloyDB can work directly with AI using a familiar SQL interface: you can generate embeddings, call predictions, or generate content by executing SQL functions. And now, with the new version of the AI integration, it can work with different AI models deployed either in Google Vertex AI or on other platforms. In this blog I will show you how to make predictions using Google Gemini Pro and the Gemma model deployed in Vertex AI. Most of the steps are based on the AlloyDB documentation, and you can always read more there.

Please be aware that at the time I am writing this blog the service is still in preview, and some things may change in the future.

For my tests I am using a Google Cloud project where I've enabled the Vertex AI API and deployed AlloyDB Omni on a GCE virtual machine. Here is a diagram of the deployed environment.

To use AlloyDB AI, AlloyDB Omni has to be deployed with the integration enabled. The entire process is described in detail in the documentation, and most of the following steps are taken directly from there.

Before starting my AlloyDB Omni instance I created a service account vertex-ai-connect, which is used to provide access to Vertex AI. I granted the Vertex AI User role to the service account and then created a key for the account in JSON format, storing the key on the AlloyDB Omni host machine as vertex-ai-connect.json.

gcloud iam service-accounts create vertex-ai-connect
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:vertex-ai-connect@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
gcloud iam service-accounts keys create vertex-ai-connect.json \
  --iam-account=vertex-ai-connect@$PROJECT_ID.iam.gserviceaccount.com \
  --key-file-type=json
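
As a quick sanity check (not required for the setup), you can list the roles granted to the new service account on the project:

gcloud projects get-iam-policy $PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:vertex-ai-connect@$PROJECT_ID.iam.gserviceaccount.com" \
  --format="value(bindings.role)"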

I copied the key to /etc/postgres/vertex-ai-connect.json, which we will use later when we start our AlloyDB Omni instance.
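
The exact commands depend on where you created the key, but staging it for the container can look roughly like this (the paths match the docker run command used later; adjust ownership and permissions to your own security requirements):

# stage the service account key on the host for the AlloyDB Omni container
sudo mkdir -p /etc/postgres
sudo cp vertex-ai-connect.json /etc/postgres/vertex-ai-connect.json
sudo chmod 644 /etc/postgres/vertex-ai-connect.json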

The next step is to update the postgres.conf parameters file for our future (or an existing) instance and add a couple of parameters. The easiest way to do that is to copy the file to the local file system before creating the instance container and use it later. Why is it the easiest way? Because the container by default doesn't have any good text editor, and it is much easier to modify the parameter file in the host environment. We can get the default configuration file using the following docker command.

docker run -i --rm google/alloydbomni cat /usr/share/postgresql/postgresql.conf.sample > my-postgres.conf

Then we add a couple of parameters to the configuration file.

echo "omni_enable_ml_agent_process = 'on'" >> my-postgres.conf
echo "omni_google_cloud_private_key_file_path = '/etc/postgresql/private-key.json'" >> my-postgres.conf

Then we create the AlloyDB Omni container, mounting both the my-postgres.conf configuration file and the service account key stored as /etc/postgres/vertex-ai-connect.json. I also put my PostgreSQL user password in the PGPASSWORD environment variable before executing the command.

export PGPASSWORD=....
docker run --name my-omni \
  -e POSTGRES_PASSWORD=$PGPASSWORD \
  -p 5432:5432 \
  -v "$PWD/my-postgres.conf":/etc/postgresql/postgresql.conf \
  -v /etc/postgres/vertex-ai-connect.json:/etc/postgresql/private-key.json \
  -d google/alloydbomni \
  -c 'config_file=/etc/postgresql/postgresql.conf'
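
Once the container is up, one way to confirm that the instance started with the AI agent parameter enabled is to run psql inside the container (this assumes the psql client is available in the AlloyDB Omni image):

# check that the ML agent parameter from my-postgres.conf is active
docker exec -it my-omni psql -U postgres -c "SHOW omni_enable_ml_agent_process;"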

The next time we need to change any parameters in postgres.conf, we can do it by editing our mounted my-postgres.conf and bouncing the container.
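
For example, adding or changing a parameter boils down to appending it to the mounted file and restarting the container:

echo "some_parameter = 'some_value'" >> my-postgres.conf   # hypothetical parameter change
docker restart my-omni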

Now we have the my-omni container with everything prepared for the Vertex AI integration. Let's create a sample database and enable the new version of the google_ml_integration extension. Remember, we need version 1.3+ of the extension. To connect to my database I am using psql installed on the host machine. You can install it by simply running "sudo apt-get -y install postgresql-client" on the host machine.

gleb@gleb-dbtest-omni-dockerhub-01:~$ psql -h localhost -U postgres
psql (15.6 (Debian 15.6-0+deb12u1), server 15.5)
Type "help" for help.

postgres=# create database ai_demo;
CREATE DATABASE
postgres=# \c ai_demo
psql (15.6 (Debian 15.6-0+deb12u1), server 15.5)
You are now connected to database "ai_demo" as user "postgres".
ai_demo=# CREATE EXTENSION google_ml_integration VERSION '1.3';
CREATE EXTENSION
ai_demo=#

Now we can verify that our credentials for Vertex AI work correctly. To do so we call the embedding function with a simple argument. I am wrapping the output inside the array_dims function for clarity; the 768-dimensional vector would be too long to print.

ai_demo=# SELECT array_dims(embedding(model_id=>'textembedding-gecko',content=>'AlloyDB can run everywhere'));
array_dims
------------
[1:768]
(1 row)

We can see a vector with 768 dimensions, which is exactly what we expect as the return value from the textembedding-gecko model. With the default options the integration can work with Vertex AI embedding and text LLM models, such as bison, and with custom models in Vertex AI that have registered endpoints. But, for example, it doesn't work with the gemini-pro model. To make it work we need to enable model registration by updating the google_ml_integration.enable_model_support parameter.

ai_demo=# ALTER SYSTEM SET google_ml_integration.enable_model_support=on;
ALTER SYSTEM
ai_demo=# SELECT pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

The old syntax still works, but now we can also register new models. Before making any changes, let's take a glance at what is registered by default right after enabling the new functionality.

ai_demo=# select model_id, model_request_url from google_ml.model_info_view;
model_id | model_request_url
-------------------------+--------------------------------------------------
textembedding-gecko | publishers/google/models/textembedding-gecko
textembedding-gecko@001 | publishers/google/models/textembedding-gecko@001
(2 rows)

We have the default (latest) textembedding-gecko version and version 001 already in the list.

Now it is time to try it with a new model. We can register the Vertex AI gemini-pro model using the following code.

CALL google_ml.create_model(
  model_id => 'gemini-1.0-pro',
  model_request_url => 'https://us-central1-aiplatform.googleapis.com/v1/projects/<your project id>/locations/us-central1/publishers/google/models/gemini-1.0-pro:streamGenerateContent',
  model_provider => 'google',
  model_auth_type => 'alloydb_service_agent_iam');

You need to replace <your project id> with your project ID, and maybe specify your home region instead of us-central1. The model will use the same Vertex AI authentication through the service account we set up before. Let's try a simple query to the model to get some generated text.

SELECT google_ml.predict_row( model_id =>'gemini-1.0-pro', request_body =>'{ "contents": { "role": "user", "parts": { "text": "What is AlloyDB Omni?" }  }}');

The response is a JSON array that follows the gemini-pro output specification and can be parsed either directly in SQL or later in your application.
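
For a quick look at the generated text directly in SQL, here is a rough sketch; it assumes the streamGenerateContent endpoint returns a JSON array of chunks, each following the standard Gemini response format with a candidates list.

-- a sketch: concatenate the text parts from every streamed chunk of the gemini-1.0-pro response
-- (assumes the response is a JSON array of chunks in the standard Gemini format)
SELECT string_agg(chunk -> 'candidates' -> 0 -> 'content' -> 'parts' -> 0 ->> 'text', '') AS generated_text
FROM jsonb_array_elements(
       google_ml.predict_row(
         model_id => 'gemini-1.0-pro',
         request_body => '{ "contents": { "role": "user", "parts": { "text": "What is AlloyDB Omni?" } } }'
       )::jsonb
     ) AS chunk;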

That approach works for other models hosted on Vertex AI. To test it we can deploy Google's Gemma 7B model in Vertex AI. You can read more about the Gemma family in the Google blog. The deployment itself is quite easy: you go to the Cloud console, open the Vertex AI Model Garden and click on Gemma.

Then you click the "Deploy" button, adjust any parameters or accept the defaults, and click "Deploy" again.

The deployment takes some time, and at the end you will get an endpoint in Vertex AI. You can see it either in the Cloud console or using the gcloud command.

gleb@cloudshell:~ (gleb-test-001)$ gcloud ai endpoints list --region us-central1
Using endpoint [https://us-central1-aiplatform.googleapis.com/]
ENDPOINT_ID: 6289723281343774720
DISPLAY_NAME: google_gemma-1_1-7b-it-mg-one-click-deploy

The prediction URL for the endpoint has the format https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict, where PROJECT_ID and ENDPOINT_ID should be replaced by your project ID and the endpoint ID from the previous command.
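
For convenience you can assemble the full URL from those two values; here the endpoint ID is taken from the gcloud output above:

export PROJECT_ID=gleb-test-001
export ENDPOINT_ID=6289723281343774720
echo "https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict"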

Having the endpoint URL for the model, we can register it in our AlloyDB Omni ai_demo database using the following call in psql:

CALL google_ml.create_model(
  model_id => 'google_gemma-1_1-7b-it',
  model_request_url => 'https://us-central1-aiplatform.googleapis.com/v1/projects/gleb-test-001/locations/us-central1/endpoints/6289723281343774720:predict',
  model_provider => 'google',
  model_auth_type => 'alloydb_service_agent_iam');

We are still using the same Vertex AI credentials that we used for Gemini Pro access. The procedure also allows us to specify handler functions for the input and output to provide proper formatting and parsing, but we've skipped that for these models.
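
We can quickly confirm that both newly registered models are now visible in the same view we queried earlier:

select model_id, model_request_url from google_ml.model_info_view
where model_id in ('gemini-1.0-pro', 'google_gemma-1_1-7b-it');

With the Gemma model registered, we can send it a prompt the same way as before.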

SELECT google_ml.predict_row( model_id =>'google_gemma-1_1-7b-it', request_body =>'{ "instances": [{ "prompt": "What is AlloyDB Omni?"}]}');

You can see I’ve modified the JSON input to match expected input format for the model and the output JSON is different from what we’ve had from Gemini-pro.

We can try to parse the output using the standard operators for the JSON data type.

SELECT google_ml.predict_row( model_id =>'google_gemma-1_1-7b-it', request_body =>'{ "instances": [{ "prompt": "What is AlloyDB Omni?"}]}')->'predictions';
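
The exact shape of each element in the predictions array depends on the serving container chosen during deployment, but for a quick check you can pull the first prediction out as text:

-- take the first element of the predictions array as text;
-- the inner structure depends on the serving container used for the Gemma deployment
SELECT google_ml.predict_row(
         model_id => 'google_gemma-1_1-7b-it',
         request_body => '{ "instances": [{ "prompt": "What is AlloyDB Omni?"}]}'
       )::jsonb -> 'predictions' ->> 0;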

The quality of the response can differ quite a bit depending on the model and the parameters. A query that uses parameters other than the defaults can potentially produce better (or worse) output; for example, the result of the following query might differ from the previous one.

SELECT google_ml.predict_row( model_id =>'google_gemma-1_1-7b-it', request_body =>'{ "instances": [{ "prompt": "What is AlloyDB Omni?"}], "parameters":{"temperature": 0.2, "max_output_tokens": 256}}')->'predictions';

That is a simple example of how you can register and use foundation and custom models in AlloyDB Omni. So far we've explored only models deployed in Google Vertex AI. In a following blog post I will explain how you can use models deployed outside of the Vertex AI infrastructure.

You can try it in AlloyDB Omni right now using the instructions provided on the AlloyDB documentation page, and let us know how it works for you. Any feedback is very welcome. Happy testing!
