Worksheet Generator app with Firebase AI Extension

Published in Firebase Developers · 8 min read · Feb 26, 2024

Worksheets are a staple of education, offering a structured way to learn and practice new skills. The Gemini API and the Firebase AI extensions make it much easier for developers to build multimodal features, such as worksheet generation, into Flutter applications. Together they let an app process text and images in a single workflow, which is well suited to educational tools.

This article takes a closer look at the Firebase AI Extension: what it can do and how to use it to add AI features to your applications. The goal is to give developers what they need to build more interactive, engaging, and personalized learning experiences.

Prerequisites:

Set up a Firebase project.
Set up Cloud Storage.
Upgrade the Firebase project to the Blaze (pay-as-you-go) plan.

Firebase extensions:

Developers can integrate Firebase Extensions into their applications, leveraging pre-built solutions that eliminate the need for extensive research, coding, or debugging. In this project, we’ve harnessed the power of Multimodal Tasks with the Gemini API to generate content using both text and image inputs.

Multimodal Tasks with the Gemini API Firebase Extension:

The Gemini API extension performs multimodal generative tasks, integrating text and image processing. Leveraging Gemini models and Firestore, it offers customizability through prompt engineering for diverse content creation.

Install the extension:

You can install the extension either through the Firebase console or through its command line interface.

Option 1: Using the Firebase CLI

To install the extension through the Firebase CLI:

firebase ext:install googlecloud/firestore-multimodal-genai --project=projectId

Option 2: Using the Firebase Console

Visit the extension homepage and click “Install in Firebase Console”.

Configuration Parameters:

When installing the “Multimodal Tasks with the Gemini API” Firebase extension, you’ll encounter several configuration parameters that are essential for tailoring the extension to your specific needs. Understanding these parameters and how to effectively set them is key to maximizing the capabilities of the Gemini API in your Flutter applications. Below, we detail each parameter and provide examples for clarity.

Gemini API Provider Selection:

When setting up the “Multimodal Tasks with the Gemini API” extension, you’re presented with two choices for the API provider: Google AI and Vertex AI. The choice affects which models are available and how requests are authenticated: Google AI uses an API key, while Vertex AI uses application default credentials.

Gemini model:

Select the specific Gemini model you wish to use, such as Gemini Pro or Gemini Pro Vision. Available models vary by provider and can be found in the documentation for Vertex AI and Google AI.

Google AI API Key:

Required when opting for Google AI as your provider. This key authenticates your requests. Vertex AI users utilize application default credentials instead.

Firestore Collection Path:

This parameter specifies the Firestore collection path where the extension should listen for new documents to process. The path points to the location in your Firestore database that will trigger the Gemini API to perform tasks based on the document data.

Prompt:

The prompt template is a structured text that guides the Gemini API on how to understand and generate responses based on the input. It can include placeholders for dynamic content extracted from Firestore documents.

Example: Please generate a summary for the following article: {{articleContent}}

Variable Fields:

List the document fields you want to use as variables in the prompt, separated by commas.

Example: articleContent, image
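
To make the relationship between the prompt and the variable fields concrete, here is a minimal sketch in plain Dart of how {{placeholder}} substitution can work. It only illustrates the idea; it is not the extension's own implementation, and the renderPrompt function is a hypothetical name introduced for this example.

```dart
/// Fills `{{name}}` placeholders in [template] with values from [fields].
/// Throws if a placeholder has no matching field, so a field missing from
/// the "Variable Fields" list is noticed early.
/// NOTE: hypothetical helper, not part of the extension or any Firebase SDK.
String renderPrompt(String template, Map<String, Object?> fields) {
  final placeholder = RegExp(r'\{\{\s*(\w+)\s*\}\}');
  return template.replaceAllMapped(placeholder, (match) {
    final name = match.group(1)!;
    if (!fields.containsKey(name)) {
      throw ArgumentError('No document field for placeholder "$name"');
    }
    return '${fields[name]}';
  });
}

void main() {
  const template =
      'Please generate a summary for the following article: {{articleContent}}';
  final prompt = renderPrompt(template, {
    'articleContent': 'Worksheets give students structured practice.',
  });
  print(prompt);
}
```

Every placeholder used in the prompt should have a matching entry in the Variable Fields list, and the Firestore document should contain a field with that name.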

Image Field (Gemini Pro Vision):

For image-related tasks, specify a field containing an image’s Cloud Storage URL or a base64 image string. This feature is exclusive to the Gemini Pro Vision model.

Response Field:

The document field where the API’s response will be stored.

Cloud Functions Location:

Select the geographical location for deploying the extension’s Cloud Functions, guided by Google’s location selection advice.

How does it work?

The Multimodal Tasks with the Gemini API extension streamlines generative tasks by combining custom prompts with the Gemini API and Firestore’s data management capabilities. Here’s a simplified overview:

Setup: The extension is configured within Firebase, linking the Gemini API and Firestore for data interaction.
Data Capture and Storage: Input data (text, images, or both) is captured and stored in Firestore, ready for processing.
Custom Prompts: Users generate custom prompts tailored to guide the Gemini API in producing the desired output for the task at hand.
API Interaction: The extension sends these prompts to the Gemini API, along with any relevant input data from Firestore, initiating the generative process.
Generative Task Processing: The Gemini model processes the prompt and inputs and generates a response. With the Gemini Pro and Gemini Pro Vision models the response is text, even when the input includes an image.
Output Management: The generated content is then saved back into Firestore, where it can be accessed for further use, such as in applications or for content display.
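
The steps above can be sketched as a small plain-Dart simulation. A map stands in for the Firestore document and a stub function stands in for the Gemini call; in a real deployment the extension's Cloud Function does this work server-side. The names processDocument and Generate, and the field names text and output, are assumptions made for this illustration.

```dart
/// Stand-in for the model call. The app never calls Gemini directly in this
/// setup; the extension does it after a document is written.
typedef Generate = String Function(String prompt);

/// Simulates the life of one document: build the prompt from the document's
/// fields, "call" the model, and return the document with the response added.
/// NOTE: hypothetical helper for illustration only.
Map<String, Object?> processDocument(
  Map<String, Object?> doc, {
  required String promptTemplate,
  required String responseField,
  required Generate generate,
}) {
  var prompt = promptTemplate;
  doc.forEach((field, value) {
    prompt = prompt.replaceAll('{{$field}}', '$value');
  });
  return {...doc, responseField: generate(prompt)};
}

void main() {
  // 1. The app writes a document with the user's request.
  final written = {'text': 'ten two-digit addition problems'};

  // 2-3. The extension fills the prompt, calls the model, and writes back.
  final processed = processDocument(
    written,
    promptTemplate: 'Create a math worksheet with {{text}}.',
    responseField: 'output',
    generate: (prompt) => '[model response to: $prompt]',
  );

  // 4. The app reads the response field from the same document.
  print(processed['output']);
}
```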

Let’s Build the Worksheet Generator Flutter App

The app boasts dual functionalities for crafting math worksheets: a text input feature where users specify their desired worksheet parameters, prompting the app to generate a customized set, and an image upload feature that analyzes a sample worksheet and replicates its problem types to create a corresponding worksheet.

Step 1: Initialise Firebase

FirebaseOptions getDefaultFirebaseOptions() {
  return const FirebaseOptions(
    apiKey: "your-default-apiKey",
    authDomain: "your-default-authDomain",
    projectId: "your-default-projectId",
    storageBucket: "your-default-storageBucket",
    messagingSenderId: "your-default-messagingSenderId",
    appId: "your-default-appId",
  );
}
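
The options above only take effect once they are passed to Firebase.initializeApp. A minimal sketch of an entry point, assuming the firebase_core package is added to the project and using the same placeholder values (the home widget here is a stand-in, not the app's real UI):

```dart
import 'package:firebase_core/firebase_core.dart';
import 'package:flutter/material.dart';

// Same placeholder values as in the article; replace them with the values
// from your own Firebase project.
FirebaseOptions getDefaultFirebaseOptions() {
  return const FirebaseOptions(
    apiKey: "your-default-apiKey",
    authDomain: "your-default-authDomain",
    projectId: "your-default-projectId",
    storageBucket: "your-default-storageBucket",
    messagingSenderId: "your-default-messagingSenderId",
    appId: "your-default-appId",
  );
}

Future<void> main() async {
  // Required before calling platform plugins ahead of runApp.
  WidgetsFlutterBinding.ensureInitialized();

  // Initialize Firebase once, before any Firestore or Storage call.
  await Firebase.initializeApp(options: getDefaultFirebaseOptions());

  // Stand-in widget tree for this sketch.
  runApp(const MaterialApp(
    home: Scaffold(body: Center(child: Text('Worksheet Generator'))),
  ));
}
```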

Step 2: Create the UI for text input

TextField(
  keyboardType: TextInputType.multiline,
  minLines: 5, // Show at least five lines of text
  maxLines: 10, // Grow up to ten lines, then scroll
  style: TextStyle(color: Colors.black),
  controller: _textController,
  decoration: InputDecoration(
    border: InputBorder.none,
    hintText: 'Enter the sample example',
    contentPadding: EdgeInsets.only(top: 15, left: 10),
  ),
),

Step 3: Save the input to Firestore

The TextButton runs its onPressed handler when clicked. If _textController is empty, it shows a SnackBar asking the user to enter text. Otherwise, it adds the text to Firestore as a new document in the collection referenced by ref, then calls setState to store the new document’s ID in docID so the UI can start listening for the result.

TextButton(
  onPressed: () async {
    if (_textController.text.trim().isEmpty) {
      ScaffoldMessenger.of(context).showSnackBar(
        SnackBar(
          content: Text('Enter the Text'),
        ),
      );
    } else {
      var doc = await ref.add(DataModel(
        text: _textController.text,
      ));
      setState(() {
        docID = doc.id;
      });
    }
  },
  child: Text(
    "Generate",
    style: Theme.of(context).textTheme.titleMedium,
  )
)
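
The snippet above relies on a DataModel class and a typed collection reference ref that are not shown in this article. A minimal sketch of what such a model could look like, assuming the extension is configured with text as a variable field, image as the image field, and output as the response field (these field names and the collection name in the comment are assumptions, not taken from the full project):

```dart
/// Hypothetical model for the documents the app writes and reads.
class DataModel {
  const DataModel({this.text, this.image, this.output});

  final String? text; // user request, used as a prompt variable
  final String? image; // URL of the uploaded sample worksheet, if any
  final String? output; // response written back by the extension

  factory DataModel.fromJson(Map<String, dynamic> json) => DataModel(
        text: json['text'] as String?,
        image: json['image'] as String?,
        output: json['output'] as String?,
      );

  /// Only the input fields are written; `output` is left to the extension.
  Map<String, dynamic> toJson() => {
        if (text != null) 'text': text,
        if (image != null) 'image': image,
      };
}

// With cloud_firestore, a typed reference could then be created like this
// (the collection name 'generate' is an assumption and must match the
// collection path configured for the extension):
//
// final ref = FirebaseFirestore.instance
//     .collection('generate')
//     .withConverter<DataModel>(
//       fromFirestore: (snapshot, _) => DataModel.fromJson(snapshot.data()!),
//       toFirestore: (model, _) => model.toJson(),
//     );

void main() {
  final request = DataModel(text: 'ten fraction problems for grade 4');
  print(request.toJson());

  final reply = DataModel.fromJson({'text': 'x', 'output': '1) 1/2 + 1/4 = ?'});
  print(reply.output);
}
```

Leaving output out of toJson means the app never overwrites the field the extension is responsible for.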

Step 4: Create the UI for image input

The IconButton below is configured to trigger an action when pressed. Upon pressing, it asynchronously retrieves an image using ImagePickerWeb and stores it in the image variable. If an image is successfully retrieved, it updates the UI by setting webImage to the retrieved image and imageAvaliable to true. Afterwards, it calls the uploadImage function. Additionally, it displays a tooltip with the text ‘Select Image’ on long press.

IconButton(
  icon: Icon(
    Icons.attachment), // Choose an appropriate icon
  onPressed: () async {
    final image = await ImagePickerWeb.getImageAsBytes();
    if (image != null) {
      setState(() {
        webImage = image;
        imageAvaliable = true;
      });
      uploadImage();
    }
  },
  tooltip:
      'Select Image', // Optional: Tooltip text on long press
),

Step 5: Upload the image to Cloud Storage and save its URL to Firestore

Prepare Image and Storage Reference:

This step ensures that an image is available to upload.
It generates a unique file name for the image using the current timestamp to avoid naming conflicts.
It obtains a reference to the location in Cloud Storage where the image will be uploaded.

// Stop early if no image has been picked
if (webImage.isEmpty) return;

// Generate a unique file name for the image
String fileName = 'images/${DateTime.now().millisecondsSinceEpoch}.jpg';

// Get a reference to the location in Cloud Storage for uploading.
// It is declared at function scope so the upload step can use it.
Reference storageRef = firebaseStorage.ref().child(fileName);
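
The file name above always ends in .jpg, even if the picked image is a PNG. A small plain-Dart sketch of one way to choose the extension from the image bytes, assuming only PNG and JPEG matter for this app (imageExtension and buildImagePath are hypothetical helpers, not part of firebase_storage):

```dart
import 'dart:typed_data';

/// Guesses a file extension from the first bytes of the image.
/// Only the PNG signature is checked; anything else falls back to 'jpg',
/// which matches the behaviour of the original snippet.
String imageExtension(Uint8List bytes) {
  const pngSignature = [0x89, 0x50, 0x4E, 0x47];
  if (bytes.length < pngSignature.length) return 'jpg';
  for (var i = 0; i < pngSignature.length; i++) {
    if (bytes[i] != pngSignature[i]) return 'jpg';
  }
  return 'png';
}

/// Builds the Cloud Storage path from a timestamp and the detected extension.
String buildImagePath(Uint8List bytes, DateTime now) =>
    'images/${now.millisecondsSinceEpoch}.${imageExtension(bytes)}';

void main() {
  final pngHeader = Uint8List.fromList([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A]);
  print(buildImagePath(pngHeader, DateTime.now()));
}
```

A timestamp in milliseconds is unique enough for a single user; an app with many concurrent users may want to add a user ID or random suffix to the path.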

Upload Image to Cloud Storage:

This step initiates the upload task with the image data using the putData method.
It waits for the upload task to complete using await.
Any errors that occur during the upload process are caught and handled.

// Upload the image to Cloud Storage.
// The snapshot is declared outside the try block so the next step can use it.
TaskSnapshot snapshot;
try {
    // Initiate the upload task with the image data
    UploadTask uploadTask = storageRef.putData(webImage);

    // Wait for the upload task to complete
    snapshot = await uploadTask;
} catch (e) {
    // Handle any errors during upload
    print("Error uploading image: $e");
    return; // Nothing to store in Firestore if the upload failed
}

Store Image Data in Firestore:

This step retrieves the download URL of the uploaded image using the getDownloadURL method.
It then stores the image URL along with any additional data (e.g., text) to Firestore.
Finally, it updates the UI state with the Firestore document ID for reference.

// Retrieve the download URL of the uploaded image
String imageUrl = await snapshot.ref.getDownloadURL();

// Store the image URL and any additional data to Firestore
var doc = await ref.add(DataModel(
    image: imageUrl, // Store the image URL
    text: _textController.text, // Any other data you want to save
));

// Update the state with the Firestore document ID
setState(() {
    docID = doc.id;
});

Step 6: Retrieve the result from Firestore and display it in the app

UI widget:

This part defines a Container widget with a specific height, width, and decoration.
It calls the _buildContent method to display the content.

Container(
  height: 800,
  width: double.infinity,
  decoration: BoxDecoration(
    color: Color.fromARGB(255, 245, 245, 242),
    border: Border.all(
      color: Color.fromARGB(255, 223, 21, 21),
      width: 2,
    ),
    borderRadius: BorderRadius.all(Radius.circular(10)),
  ),
  child: _buildContent(),
)

Data Fetching:

_buildContent: It checks if docID is null and decides whether to fetch data or display a placeholder.
_fetchData: It returns a StreamBuilder widget to listen to changes on a Firestore document.
It handles waiting for data and errors.

Widget _buildContent() {
  if (docID != null) {
    return _fetchData(docID);
  } else {
    // No data available, display a placeholder
    return _buildPlaceholder();
  }
}

Widget _fetchData(String docID) {
  return StreamBuilder(
    stream: ref.doc(docID).snapshots(),
    builder: (context, AsyncSnapshot<DocumentSnapshot<DataModel>> snapshot) {
      if (snapshot.connectionState == ConnectionState.waiting) {
        // Show loading indicator while waiting for data
        return Center(child: CircularProgressIndicator());
      }

      if (snapshot.hasError) {
        // Handle any errors
        return Text('Error: ${snapshot.error}');
      }

      // Call the method to display the data
      return _displayData(snapshot);
    },
  );
}

Data Display:

_displayData: It builds the UI to display the content retrieved from Firestore.
It includes an IconButton to copy data to the clipboard and a Text widget to display the data.
_buildPlaceholder: It displays a placeholder text when no document ID is available.

Widget _displayData(AsyncSnapshot<DocumentSnapshot<DataModel>> snapshot) {
  if (snapshot.hasData) {
    // Data is available, display the content
    return Padding(
      padding: const EdgeInsets.all(8.0),
      child: SingleChildScrollView(
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.start,
          children: [
            IconButton(
              icon: Icon(Icons.copy, color: Color.fromARGB(255, 3, 138, 200)),
              onPressed: () async {
                await Clipboard.setData(ClipboardData(text: snapshot.data!.data()!.output ?? ""));
                ScaffoldMessenger.of(context).showSnackBar(
                  SnackBar(content: Text('Worksheet copied to your clipboard!')),
                );
              },
            ),
            Text(snapshot.data!.data()!.output ?? "loading the data"),
          ],
        ),
      ),
    );
  } else {
    // Default case, show an empty container if no data
    return Container();
  }
}

Widget _buildPlaceholder() {
  return Padding(
    padding: const EdgeInsets.all(8.0),
    child: Text("Your worksheet will appear here"),
  );
}
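
While the extension is still working, the document exists but its response field is empty, and a failed generation never fills it. A plain-Dart sketch of how the display logic could distinguish these cases is shown below. It assumes the response field is named output and that the extension records progress in a status map with state and error entries; check the extension's documentation for the exact field names before relying on them. The resultText function is a hypothetical helper.

```dart
/// Decides what the result panel should show for one document.
/// [data] is the raw document map, or null if the document does not exist.
String resultText(
  Map<String, dynamic>? data, {
  String responseField = 'output',
}) {
  if (data == null) return 'Document not found';

  // A non-empty response means the worksheet is ready.
  final output = data[responseField];
  if (output is String && output.isNotEmpty) return output;

  // ASSUMPTION: the extension writes a `status` map with `state` and `error`.
  final status = data['status'];
  if (status is Map && status['state'] == 'ERRORED') {
    return 'Generation failed: ${status['error'] ?? 'unknown error'}';
  }

  // Otherwise the request is still being processed.
  return 'Generating worksheet...';
}

void main() {
  print(resultText({'text': 'ten addition problems'}));
  print(resultText({'text': 'ten addition problems', 'output': '1) 12 + 7 = ?'}));
}
```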

Check the full code here: the Flutter project is hosted on zapp.run.

Conclusion: The Worksheet Generator Flutter app shows how the Gemini API and the Firebase AI extensions can simplify the creation of educational content, giving educators and students an intuitive way to generate customized math worksheets from either text or image input. The prompt is the key: change the prompt and the same setup can power many other features in your application. A tool like this saves educators time when preparing study materials and can enrich the learning experience for students, making it a useful addition to the digital classroom toolkit.