Gemma is born!

Riccardo Carlesso
Google Cloud - Community
3 min read · Feb 26, 2024

As a rubyist, I love gems.

So imagine my delight and surprise when last week Google announced Gemma, a family of open models available directly on Kaggle: google/gemma. Plus, it’s free!


But I know nothing about ML and Kaggle!

If you’re like me, a CLI kind of person, fear no more! I found a way to call Gemma with cURL (“if you can curl it, you can call it in any language!” — said a wise man in Dodgeball — did he?).

Don’t believe me? Check out my ugly bash code here: https://github.com/palladius/genai-googlecloud-scripts/tree/main/10-gemma-is-born/

How do I install Gemma locally?

I haven’t tried that one yet, I’m sorry. It’s on my to-do list.

How do I install Gemma in the Cloud?

As Barney would say, “Glad you asked!” I got hooked by this great video by Mark Ryan, and it got me started.

Please note that in this case you need access to Google Cloud with billing enabled. Specifically, you need:
1. A Gmail account (Google Account).
2. A Google Cloud project with billing enabled.

If this doesn’t scare you off, the easiest way is to:

  1. Click on https://www.kaggle.com/models/google/gemma
  2. Click on “Gemma on Vertex Model Garden” on that page.
  3. Follow instructions.

Once done, you should have something like this:

Gemma has been installed

Ok, so what now? How do I interact with Gemma?

Glad you asked!

  • Click on the model (blue line above).
  • Under the “Deploy and Test” tab you should see an awesome “Test your model” section. It accepts a JSON payload.

It took me one hour of digging, but I can tell you the JSON can be as simple as this:

{
  "instances": [{
    "prompt": "What's the difference between a city bike and a racing bike?",
    "max_tokens": 1000
  }]
}
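If you’d rather not hand-write the JSON, you can generate the request file with jq (a sketch: the prompt and max_tokens values are just examples, and the file name matches the one the article uses):

```shell
# Build the request JSON with jq instead of typing it by hand.
# --arg passes a string, --argjson passes a raw JSON value (here, a number).
jq -n \
  --arg prompt "What's the difference between a city bike and a racing bike?" \
  --argjson max_tokens 1000 \
  '{instances: [{prompt: $prompt, max_tokens: $max_tokens}]}' \
  > gemma-input-hello.json

cat gemma-input-hello.json
```

This guarantees the file is valid JSON and makes it trivial to script over many prompts.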

The output will look like this:

{
  "predictions": [
    ".. GEMMA_ANSWER_HERE ..."
  ],
  "deployedModelId": "6852805176160419840",
  "model": "projects/980606839737/locations/us-central1/models/google_gemma-7b-it-1708849093570",
  "modelDisplayName": "google_gemma-7b-it-1708849093570",
  "modelVersionId": "1"
}
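To pull just the answer out of a response like that, jq’s `-r` flag strips the JSON string quoting. A minimal sketch, using a stand-in response file that mimics the structure above:

```shell
# Write a sample response that mimics the structure shown above,
# then extract only the model's answer with jq -r (raw output).
cat > gemma-output-sample.json <<'EOF'
{
  "predictions": [".. GEMMA_ANSWER_HERE .."],
  "deployedModelId": "6852805176160419840",
  "modelVersionId": "1"
}
EOF

jq -r '.predictions[0]' gemma-output-sample.json
# prints: .. GEMMA_ANSWER_HERE ..
```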

So writing a shell wrapper is as easy as a cURL plus a jq (latest code here):

# You just change these to your model_id and your project_id
ENDPOINT_ID="6294864597715255296"
PROJECT_NUMBER="980606839737"
# this contains the question
INPUT_DATA_FILE="gemma-input-hello.json"
# this contains the answer (not really needed but good for training purposes)
OUTPUT_DATA_FILE="gemma-output-hello.json"

curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/${PROJECT_NUMBER}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict \
-d "@${INPUT_DATA_FILE}" | tee "$OUTPUT_DATA_FILE"

PREDICTION=$(jq -r '.predictions[0]' "$OUTPUT_DATA_FILE")

echo -en "$PREDICTION\n"
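One small hardening idea (my own sketch, not in the repo’s script): validate the request file with `jq empty` before POSTing, so a typo in the JSON fails fast locally instead of producing a confusing API error:

```shell
# Sketch: sanity-check the request file before curling the endpoint.
# The file name matches the one used in the wrapper above; the contents
# here are a minimal example request.
INPUT_DATA_FILE="gemma-input-hello.json"
printf '%s\n' '{"instances":[{"prompt":"hi","max_tokens":10}]}' > "$INPUT_DATA_FILE"

# `jq empty` parses the file and outputs nothing; it exits non-zero on bad JSON.
if jq empty "$INPUT_DATA_FILE" 2>/dev/null; then
  echo "OK: valid JSON, safe to curl"
else
  echo "ERROR: invalid JSON in $INPUT_DATA_FILE" >&2
fi
```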

Conclusions

Deploying Gemma to Vertex AI and calling the model via curl is a breeze.

And you, what are you waiting for?


Riccardo Carlesso
Google Cloud - Community

Father, pianist, Rubyist, Googler, linguist, ironman. Calls Zurich / Dublin / Bologna his home.