From Paper to Score: Create an AI-powered System for Automated Grading

Aug 4, 2024

Introduction and Overview

Problem Statement:

Evaluating student answer scripts manually is time-consuming and prone to inconsistencies. Leveraging AI for this task can streamline the process, ensure fairness, and provide detailed feedback.

Target Audience:

This blog is intended for educators, developers, and AI enthusiasts with a basic understanding of FastAPI, Google Cloud services, and OCR technologies.

Outcome:

By the end of this blog, readers will learn how to build a system that automates the evaluation of student answer scripts using Google’s Gemini Vision and Firebase.

Design

Rationale:

The design aims to integrate OCR capabilities with AI-based evaluation, providing an automated and scalable solution for grading student answer scripts.

Architecture:

OCR Extraction: Uses Google’s Gemini Vision to convert scanned images of answer scripts into text.
Evaluation Engine: Compares extracted text against a predefined marking scheme stored in Firebase.
Feedback System: Generates detailed feedback and overall scores for each answer.

Design Choice Explanation:

This design ensures high accuracy in text extraction and efficient evaluation through cloud-based services, reducing manual intervention.

Prerequisites

Software and Tools:

Python
FastAPI
Google Cloud services (Gemini Vision, Firestore)
pdf2image (requires Poppler to be installed), OpenCV, NumPy, Pillow (PIL)

Download Links:

Assumed Prior Knowledge:

Basic understanding of REST APIs
Familiarity with cloud services and AI concepts

Step-by-Step Instructions

1. Setting Up the Environment:

Install necessary libraries:

pip install fastapi uvicorn pdf2image pillow opencv-python numpy firebase-admin google-generativeai

Configure Google Cloud SDK and set up your project:

gcloud init
gcloud auth application-default login

2. Implementing the OCR Functionality:

Convert PDFs to images and preprocess them:

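# Imports needed at the top of the file
import os
from pdf2image import convert_from_bytes

# The rest runs inside the async upload endpoint, where answer_script is the uploaded file
        # Read the uploaded PDF file
        pdf_bytes = await answer_script.read()

        # Convert the PDF to a list of PIL images, one per page
        images = convert_from_bytes(pdf_bytes)

        # Preprocess each page and save it to disk for OCR
        os.makedirs("saved_images", exist_ok=True)
        image_file_list = []
        for i, image in enumerate(images):
            # Preprocess the image
            processed_image = preprocess_image(image)
            image_path = f"saved_images/page_{i + 1}.png"
            processed_image.save(image_path, format="PNG")
            # Keep the path so the OCR step can iterate over the pages
            image_file_list.append(image_path)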

Here, preprocess_image() is a helper built with OpenCV that aligns and crops each page image before it is passed to Gemini Vision.

3. Integrating with Gemini Vision:

Upload images and extract text:

import json

# Start with an empty JSON object; each page's OCR result extends it
question_answer_json = "{}"

# Process each saved page image with Gemini OCR
for image_file in image_file_list:
    # Pass the answers collected so far and take the updated JSON string back
    question_answer_json = ocr_with_gemini(image_file, question_answer_json)

# Convert the final JSON string to a Python dictionary
question_answer_json = json.loads(question_answer_json)

The ocr_with_gemini() function can be implemented with code taken from Google AI Studio: select the model and task, then enter a prompt.

After entering the prompt, click “Get code”.
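The code generated by AI Studio still has to be wrapped so that each page extends the JSON collected so far. The sketch below shows one possible shape for that wrapper. Several things in it are assumptions, not part of the original article: the prompt wording, the strip_code_fence helper, and the call_model parameter, which stands in for the generated Gemini call and takes a prompt and an image path and returns the model's text reply. The sketch also assumes the reply may arrive wrapped in a Markdown code fence.

```python
import json
from typing import Callable

FENCE = "`" * 3  # Markdown code fence marker

# Hypothetical prompt; adapt it to the prompt you tested in AI Studio
PROMPT_TEMPLATE = (
    "Extract the handwritten answers on this page. "
    "Return only a JSON object mapping question numbers to answer text. "
    "Answers from earlier pages are given below; extend them.\n{previous}"
)


def strip_code_fence(text: str) -> str:
    """Remove a surrounding Markdown code fence if the reply has one."""
    text = text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line, then everything from the closing fence on
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    return text.strip()


def ocr_with_gemini(image_file: str, question_answer_json: str,
                    call_model: Callable[[str, str], str]) -> str:
    """Return the updated question/answer JSON string for one page."""
    prompt = PROMPT_TEMPLATE.format(previous=question_answer_json)
    reply = strip_code_fence(call_model(prompt, image_file))
    # Fail early if the model did not return valid JSON
    json.loads(reply)
    return reply
```

To keep the two-argument call used in the loop above, call_model can be bound once with functools.partial, or the generated Gemini call can be placed directly inside the function.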

4. Evaluating Against the Marking Scheme:

Fetch marking scheme from Firebase and compare answers:

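import firebase_admin
from firebase_admin import credentials, firestore

# Initialize Firebase Admin SDK with the service account key file
cred = credentials.Certificate("Your_Project.json")
firebase_admin.initialize_app(cred)
db = firestore.client()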

The credentials file can be downloaded from the Firebase console or the Google Cloud console after creating a service account.

def fetch_question_paper_and_marking_scheme(classroom_code, test_code):
    doc_ref = db.collection('classrooms').document(classroom_code).collection('tests').document(test_code)
    doc = doc_ref.get()
    if doc.exists:
        data = doc.to_dict()
        question_paper = data.get('question_paper', {})
        marking_scheme = data.get('marking_scheme', {})
        return question_paper, marking_scheme
    else:
        raise ValueError("Classroom code or test code not found in Firebase.")

5. Generating Feedback and Scores:

Summarize feedback and calculate scores:
Google AI Studio can again be used to create a function in which Gemini evaluates the OCRed answers against the marking scheme.
The total score is then calculated by summing the individual scores.
At this step, Gemini can also be used to generate an overall summary of the feedback.
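The summing step could look like the sketch below. The shape of grades_feedback (one entry per question with "score" and "max_score" keys) and the name compute_total_score are assumptions for illustration; the article does not specify them. Clamping each score is an extra safeguard in case the model awards marks outside the allowed range.

```python
from typing import Any, Dict, Tuple


def compute_total_score(grades_feedback: Dict[str, Dict[str, Any]]) -> Tuple[float, float]:
    """Return (marks awarded, marks available) across all questions."""
    awarded = 0.0
    available = 0.0
    for grade in grades_feedback.values():
        max_score = float(grade["max_score"])
        # Keep each score between 0 and the maximum for that question
        score = min(max(float(grade["score"]), 0.0), max_score)
        awarded += score
        available += max_score
    return awarded, available


grades = {
    "1": {"score": 3, "max_score": 5, "feedback": "Missing one step."},
    "2": {"score": 7, "max_score": 4, "feedback": "Correct."},  # clamped to 4
}
print(compute_total_score(grades))  # (7.0, 9.0)
```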

6. Creating the FastAPI Endpoints:

Define endpoints for uploading answer scripts and adding test data:

from fastapi import FastAPI, UploadFile, File
app = FastAPI()

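@app.post("/uploadanswerscript/")
async def upload_answer_script(classroom_code: str, test_code: str, answer_script: UploadFile = File(...)):
    # Process the file and evaluate the answers with the code from steps 2 to 5,
    # which produces total_score, summary_feedback, question_answer_json and grades_feedback
    response = {
        "score": total_score,
        "feedback": summary_feedback,
        "ocr_result": question_answer_json,
        "grades_feedback": grades_feedback
    }
    return response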

@app.post("/add_test/")
async def add_test(classroom_code: str, test_code: str, question_paper: dict, marking_scheme: dict):
    # Add test data to Firestore
    return {"status": "Test data added successfully"}

Now fill in the route bodies with the functions and code developed in the previous steps.
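As one example of filling in a route body, the Firestore write behind /add_test/ could be sketched as follows. It assumes the same classrooms/{classroom_code}/tests/{test_code} layout that fetch_question_paper_and_marking_scheme() reads from. The helper name save_test is hypothetical, and db is passed in as an argument so the function can be tried without a live Firestore client.

```python
from typing import Any, Dict


def save_test(db: Any, classroom_code: str, test_code: str,
              question_paper: Dict[str, Any], marking_scheme: Dict[str, Any]) -> None:
    """Write a test document where the fetch function expects to find it."""
    doc_ref = (
        db.collection("classrooms")
        .document(classroom_code)
        .collection("tests")
        .document(test_code)
    )
    # set() creates the document or overwrites an existing one
    doc_ref.set({
        "question_paper": question_paper,
        "marking_scheme": marking_scheme,
    })
```

Inside add_test(), this would be called with the Firestore client created in step 4 before returning the status message.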

Result and Demo

Expected Outcome:

Upon successful implementation, the system returns the total score and detailed feedback for each answer.

Demo:

Testing using Swagger UI (FastAPI serves it at /docs by default)

What’s next?

Learning resources:

Using Gemini with Cloud Run: https://codelabs.developers.google.com/codelabs/how-to-deploy-gemini-powered-chat-app-cloud-run#0

Ideas for expanding on the project:

Model Explainability: Investigate methods to understand how the AI model arrives at its decisions, improving transparency.
Implement Plagiarism Detection: Integrate plagiarism detection tools to maintain academic integrity.
Integrate with Learning Management Systems (LMS): Connect your system with popular LMS platforms for seamless integration.

Challenges to take the skills further.

To continue your learning journey, check out these resources:

Google Cloud Documentation: https://cloud.google.com/docs
FastAPI Documentation: https://fastapi.tiangolo.com/learn/

Call to Action

To learn more about Google Cloud services and increase the impact of your work, take these steps:

Register for Code Vipassana sessions
Join the meetup group Datapreneur Social
Sign up to become a Google Cloud Innovator