Exploring Gemma with KerasNLP: A Guide to Lightweight AI Models

vishal singh

This tutorial is your gateway into Gemma through KerasNLP. Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology as the Gemini models. KerasNLP is a collection of natural language processing (NLP) models implemented in Keras and runnable on JAX, PyTorch, and TensorFlow.

In this tutorial, you’ll use Gemma to generate text responses to several prompts. If you’re new to Keras, a quick browse through “Getting Started with Keras” beforehand may help, but it isn’t mandatory; you’ll naturally deepen your understanding of Keras as you work through the tutorial.

Gemma Setup

  • Get access to Gemma on kaggle.com.
  • Select a Colab runtime with sufficient resources to run the Gemma 2B model.
  • Generate and configure a Kaggle username and API key.

Set environment variables for KAGGLE_USERNAME and KAGGLE_KEY.

import os
from google.colab import userdata

# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env
# vars as appropriate for your system.
os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')

Install Keras and KerasNLP.

# Install Keras 3 and KerasNLP (in a Colab cell, prefix each command with "!")
pip install -q -U keras-nlp
# Quotes around "keras>=3" keep the shell from treating ">" as redirection.
pip install -q -U "keras>=3"

Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. Keras 3 lets you choose the backend: TensorFlow, JAX, or PyTorch.

import os

# The backend must be set before Keras is imported.
os.environ["KERAS_BACKEND"] = "jax"  # Or "tensorflow" or "torch".
# Pre-allocate 90% of accelerator memory to JAX to avoid memory fragmentation.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "0.9"

Import packages

import keras
import keras_nlp
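
Optionally, as a quick sanity check, you can confirm which backend Keras picked up (in Keras 3, keras.backend.backend() reports the active backend):

# Should print "jax" (or whichever backend you set above).
print(keras.backend.backend())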

KerasNLP provides implementations of many popular model architectures. In this tutorial, you’ll create a model using GemmaCausalLM, an end-to-end Gemma model for causal language modeling. A causal language model predicts the next token based on previous tokens.
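
To make “previous tokens” concrete, here is a minimal sketch of what a prompt looks like to the model, assuming the standalone GemmaTokenizer preset matches the model preset used below:

import keras_nlp

# Sketch: a prompt becomes a sequence of integer token IDs; generation
# repeatedly predicts the next ID from the IDs seen so far and appends it.
tokenizer = keras_nlp.models.GemmaTokenizer.from_preset("gemma_2b_en")
token_ids = tokenizer("What is the meaning of life?")
print(token_ids)                        # 1-D tensor of token IDs
print(tokenizer.detokenize(token_ids))  # maps the IDs back to text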

Create the model using the from_preset method, which instantiates it with a preset architecture and weights:

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

Use summary() to get more info about the model, including its layers and parameter counts:

gemma_lm.summary()

Generate Text

Now it’s time to generate some text! The model has a generate method that generates text based on a prompt. The optional max_length argument specifies the maximum length of the generated sequence.

gemma_lm.generate("What is the meaning of life?", max_length=64)

Model Response:

'How does the brain work?\n\nThe brain is the most complex organ in the human body. It is responsible for controlling all of the body’s functions, including breathing, heart rate, digestion, and more. The brain is also responsible for thinking, feeling, and making decisions.\n\nThe brain is made up'

You can also provide batched prompts using a list as input:

gemma_lm.generate(
    ["What is the meaning of life?",
     "How does the brain work?"],
    max_length=64)

Model Response:

['What is the meaning of life?\n\nThe question is one of the most important questions in the world.\n\nIt’s the question that has been asked by philosophers, theologians, and scientists for centuries.\n\nAnd it’s the question that has been asked by people who are looking for answers to their own lives',
'How does the brain work?\n\nThe brain is the most complex organ in the human body. It is responsible for controlling all of the body’s functions, including breathing, heart rate, digestion, and more. The brain is also responsible for thinking, feeling, and making decisions.\n\nThe brain is made up']

You can control the generation strategy for GemmaCausalLM by setting the sampler argument on compile(). By default, “greedy” sampling will be used.

As an experiment, try setting a “top_k” strategy:

gemma_lm.compile(sampler="top_k")
gemma_lm.generate("What is the meaning of life?", max_length=64)

Response:

‘What is the meaning of life? That was a question I asked myself as I was driving home from work one night in 2012. I was driving through the city of San Bernardino, and all I could think was, “What the heck am I doing?”\n\nMy life was completely different.’
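
You can also pass a configured sampler object instead of a string, which exposes the sampler’s parameters. A minimal sketch, assuming keras_nlp.samplers.TopKSampler (the k and seed values below are arbitrary):

import keras_nlp

# Sketch: explicit top-k sampling; a fixed seed makes runs repeatable.
gemma_lm.compile(sampler=keras_nlp.samplers.TopKSampler(k=5, seed=2))
gemma_lm.generate("What is the meaning of life?", max_length=64)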

Thank you! Follow me for more insightful blogs.
