5 Steps to Auto-grade your Jupyter Notebooks- nbgrader simplified

Hamza Liaqat
Published in Analytics Vidhya
6 min read · Jul 27, 2020

‘Say there are 200 students in your class and 10 tests were released, each with 10 questions. So, 20,000 questions need to be graded. You’d like a magic wand to generate a .csv file with all students’ grades in the blink of an eye. `nbgrader` has that wand’ (copied from `Step 5` of this article).

‘nbgrader’ facilitates creating and grading assignments in Jupyter notebooks. This tool is a must-have for instructors who use notebooks. Once you get used to ‘nbgrader’, you’ll never write tests without it. Yet, when it is not used right, it can increase your workload instead of decreasing it. Specifically,

  • For a beginner, nbgrader can quickly get unnecessarily complex - the official documentation is not very enlightening on getting started quickly. More on this in `Step 2` below.
  • Manually entering student IDs into nbgrader gets tedious if you have more than n students (for me, n=3).

If you’ve never heard of nbgrader before, or you’d like to know what it means to auto-grade a Jupyter notebook, you may find it useful to first watch the official video introducing the core functionality of nbgrader.

Here, I’ll focus more on getting started with nbgrader quickly and automating as much stuff as possible.

The following 5 steps worked best for me for a class of 150 students- best in terms of saving time and effort. But, I learned them the hard way.

Step 1

Install nbgrader (in a virtual environment preferably)

This assumes you already have Jupyter installed.

  • Using conda (recommended):

`conda install -c conda-forge nbgrader`

  • Or using pip:

`pip install nbgrader`

Once nbgrader is successfully installed, the `Formgrader` extension will become visible at the top when you launch any notebook from that virtual environment.
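If the extension does not show up, it may be worth confirming that nbgrader landed in the environment you are actually running. The following is only a minimal sketch; `is_installed` is a helper name introduced here for illustration and is not part of nbgrader.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if `package_name` can be imported from the active environment."""
    return importlib.util.find_spec(package_name) is not None


# Report on the two packages this article relies on.
for name in ("notebook", "nbgrader"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Run it with the same Python interpreter (i.e. the same virtual environment) that launches your notebooks.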

Step 2

Create your course

  1. Your current path in the terminal should be where you’d like to store your course material. Then, run this command in terminal/cmd:

`nbgrader quickstart course_id`

The command above will create a course dir called `course_id` for you with other necessary files.

  2. Now, change into the newly created directory `course_id`, then launch Jupyter notebook with:

`jupyter notebook`

It turns out that, for most of its functionality, nbgrader relies heavily on a certain hierarchical structure (see image below). Skipping step 1 above (e.g. launching the notebook from an arbitrary directory) violates that structure and brings unnecessary complexity: you’ll waste hours figuring out why, for example, the assignments you created using ‘Formgrader’ are not saved in the ‘course_id/source’ directory.

Achievement in `step 2`:

The following hierarchical structure is created automatically by doing step 2. It’s worth spending some time understanding this structure. Not all files in this structure concern us. More on this in the official documentation on nbgrader’s philosophy.
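Before launching the notebook, you can sanity-check that you are standing in the course directory. The sketch below assumes that `nbgrader quickstart` leaves an `nbgrader_config.py` file and a `source` folder in the course directory; `looks_like_course_dir` is a hypothetical helper, not an nbgrader function.

```python
from pathlib import Path


def looks_like_course_dir(path=".") -> bool:
    """Heuristic check: does `path` contain the files a course directory is assumed to have?"""
    root = Path(path)
    has_config = (root / "nbgrader_config.py").is_file()  # assumed to be created by quickstart
    has_source = (root / "source").is_dir()               # where Formgrader saves assignments
    return has_config and has_source


# Check the current working directory.
print(looks_like_course_dir("."))
```

If this prints `False`, change into `course_id` first and then launch the notebook from there.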

Step 3

Automatically add student IDs to the database

Populate the database with student IDs (and optionally names as well). It’s assumed that all your students’ IDs are in a .txt file, separated by commas and/or line breaks. Make sure the IDs in your .txt file are unique.

from nbgrader.api import Gradebook

# Create the connection to the database
with Gradebook('sqlite:///gradebook.db') as gb:
    path_of_file_holding_student_ids = '../student_ids.txt'
    # read the txt file
    with open(path_of_file_holding_student_ids) as f:
        # extract ids in a list (comma- and/or newline-separated, blanks skipped)
        student_ids = [word.strip() for line in f.readlines() for word in line.split(',') if word.strip()]
    # populate database with ids
    for _id in student_ids:
        gb.add_student(_id)

Custom file reading:

The following portion of the code (copied from above) just reads the student IDs from your .txt file into a list. You can modify it to fit your needs, for example if your file is in a different format/style.

with open(path_of_file_holding_student_ids) as f:
    # extract ids in a list.
    student_ids = [word.strip() for line in f.readlines() for word in line.split(',') if word.strip()]
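Since the IDs must be unique, one possible variation is to drop duplicates while reading. This is a minimal sketch under the same file-format assumption (commas and/or line breaks); `parse_student_ids` is a name made up for this example.

```python
def parse_student_ids(text):
    """Split text on commas and newlines; return unique, non-empty ids in first-seen order."""
    seen = set()
    ids = []
    for line in text.splitlines():
        for word in line.split(","):
            student_id = word.strip()
            if student_id and student_id not in seen:
                seen.add(student_id)
                ids.append(student_id)
    return ids


sample = "s001, s002\ns003\ns001\n"
print(parse_student_ids(sample))  # ['s001', 's002', 's003']
```

To use it on your file, pass `f.read()` instead of the sample string.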

Now, from your notebook, click ‘Formgrader’, then click `Manage Students`. Your newly added students should be there, as shown below:

Ignore “Overall Score” for now. At the bottom, notice “+ Add new student…”, which lets you add students manually - this will quickly get tedious if you have more than a few students.
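If you also want names in the database, a similar approach should work with a CSV file. The sketch below is an assumption-laden example: the column names (`student_id`, `first_name`, `last_name`), the file `../students.csv`, and the helper `add_students_with_names` are all hypothetical. It relies on `Gradebook.update_or_create_student`, so re-running it should update existing students rather than fail on duplicates.

```python
import csv


def add_students_with_names(gb, csv_path):
    """Add or update students from a CSV with columns student_id, first_name, last_name.

    `gb` is expected to be an open nbgrader Gradebook. Returns the number of rows processed.
    """
    count = 0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            student_id = (row.get("student_id") or "").strip()
            if not student_id:
                continue  # skip rows without an id
            gb.update_or_create_student(
                student_id,
                first_name=(row.get("first_name") or "").strip(),
                last_name=(row.get("last_name") or "").strip(),
            )
            count += 1
    return count


# Usage (run inside the course_id directory):
# from nbgrader.api import Gradebook
# with Gradebook('sqlite:///gradebook.db') as gb:
#     add_students_with_names(gb, '../students.csv')
```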

`All the code in this notebook should be executed in “course_id” directory.`

Step 4

Create an Assignment:

Upon installation, the `formgrader` extension becomes available. Click on it, then click `+ Add new assignment`. Name your assignment and create a new notebook in it. Once the notebook is open, click ‘View’ -> ‘Cell Toolbar’ -> ‘Create Assignment’ in the toolbar at the top (see the images below). I’ll not expand on this further, as the official video and documentation introducing the core functionality of ‘nbgrader’ do an excellent job of walking you through it with concrete examples.
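To give a rough idea of what ends up inside such a notebook, here is a sketch of an autograded answer cell followed by its test cell, shown together as one script. The function `squares` is a made-up exercise; the `### BEGIN SOLUTION` / `### END SOLUTION` and `### BEGIN HIDDEN TESTS` / `### END HIDDEN TESTS` markers are nbgrader's default delimiters, and the regions they enclose are stripped from the version released to students.

```python
# --- Cell 1: marked as "Autograded answer" in the Create Assignment toolbar ---
def squares(n):
    """Return the squares of the numbers 1..n as a list."""
    ### BEGIN SOLUTION
    return [i ** 2 for i in range(1, n + 1)]
    ### END SOLUTION


# --- Cell 2: marked as "Autograder tests" and given a point value ---
# Visible tests: students can see and run these.
assert squares(1) == [1]
assert squares(3) == [1, 4, 9]
### BEGIN HIDDEN TESTS
# Hidden tests: only run when the instructor autogrades.
assert squares(0) == []
### END HIDDEN TESTS
```

A failing assertion in the test cell is what makes the student lose the points attached to that cell.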

Step 5

Autograde assignments

Say there are 200 students in your class and 10 tests were released, each with 10 questions. So, 20,000 questions need to be graded. You’d like a magic wand to generate a .csv file with all students’ grades in the blink of an eye. nbgrader has that wand, and this is how you use it:

Put the students’ submissions in a dir called `submitted`:

import os
import shutil

# Move submissions to a directory called `submitted`, so nbgrader can recognize them for grading.

# Path of the directory initially containing the students' submissions (arbitrary dir)
arbit_dir = 'arbitrary/'
# First create a dir called `submitted` in the course_id dir (no error if it already exists)
subm_dir = 'submitted/'
os.makedirs(subm_dir, exist_ok=True)
# Get files in the arbitrary dir containing the students' submissions
files = os.listdir(arbit_dir)
# Move all files (submissions) from the arbitrary directory to the submitted dir
for f in files:
    shutil.move(os.path.join(arbit_dir, f), subm_dir)

Grade each assignment and generate a .csv file with all grades:

import subprocess
from nbgrader.apps import NbGraderAPI

# Grading the whole course work, i.e. all assignments released.
api = NbGraderAPI()
# Get names of all assignments (e.g. if 5 assignments were released so far, return 5 names)
all_assignments_names = api.get_source_assignments()
# Grade each assignment (arguments passed as a list, so names need no shell quoting)
for assignment_name in all_assignments_names:
    subprocess.run(["nbgrader", "autograde", assignment_name])
# Get grades in a csv file
subprocess.run(["nbgrader", "export"])

If nbgrader is not used right, you’re prone to waste a lot more energy than you have to.

(The following .csv file will be created ‘in the blink of an eye’. I didn’t write any ‘submission files’ to autograde so you mostly see 0s as score. But, it illustrates the point.)
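Once the .csv exists, you may want one total per student. The sketch below uses only the standard library and assumes the exported file has one row per student per assignment with `student_id` and `score` columns; check the header of your own file, as these column names are an assumption here, and `total_scores` is a helper invented for this example.

```python
import csv
from collections import defaultdict


def total_scores(csv_file):
    """Sum the `score` column per `student_id` from an open CSV file object."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        # Treat an empty score as 0 so ungraded rows do not break the sum.
        totals[row["student_id"]] += float(row["score"] or 0)
    return dict(totals)


# Usage, assuming the export was written to grades.csv in the course directory:
# with open("grades.csv", newline="") as f:
#     print(total_scores(f))
```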

More information on `submitted directory` and hierarchical structure:

You’ll collect the students’ submissions in an arbitrary directory (via an LMS, email, etc.), but nbgrader can’t find them there. Observe the hierarchical structure below:

nbgrader will only detect submissions, and eventually grade them, if they are in the `submitted` dir, which itself should be in the `course_id` dir. That's why the first snippet in Step 5 moves the files into the `submitted` dir. Moreover, it's assumed that each student submission is organized this way:

`{student_id}/{assignment_id}/{notebook_id}.ipynb`

where:

  • student_id (a directory) corresponds to the unique ID of a student, e.g. studentid1 (the same as in the database - Step 3).
  • assignment_id (a directory) corresponds to the unique name of an assignment e.g. Test1.
  • notebook_id (an .ipynb file) corresponds to the name of a notebook within an assignment e.g. logistic_reg.ipynb.
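If your submissions arrive as loose notebook files, the layout above can be built with a small helper. This is only a sketch: `place_submission` is a hypothetical function, and how you learn which student a file belongs to (file name, LMS export, email sender) is left to you.

```python
import shutil
from pathlib import Path


def place_submission(notebook_path, student_id, assignment_id, course_dir="."):
    """Copy one notebook to submitted/{student_id}/{assignment_id}/ inside the course dir."""
    src = Path(notebook_path)
    target_dir = Path(course_dir) / "submitted" / student_id / assignment_id
    target_dir.mkdir(parents=True, exist_ok=True)  # create missing folders, keep existing ones
    target = target_dir / src.name                 # keep the original notebook file name
    shutil.copy2(src, target)
    return target


# Usage, with made-up names:
# place_submission("downloads/logistic_reg.ipynb", "studentid1", "Test1")
```

The notebook file name should stay the same as the one in the released assignment, since that is how the notebook is matched during grading.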

That’s the bare minimum - batteries included - you need to enhance your learning and teaching experience using nbgrader. nbgrader has a lot more to offer, e.g. the option to do manual grading, guidance on how to write good tests, and JupyterHub integration (which I believe is redundant in this context if you use nbgrader with sufficient automation on your local machine).

Hamza Liaqat
Analytics Vidhya

Machine Learning Engineer/Researcher/Teacher. Currently, focusing on ‘AI for Finance’ using Deep Learning and Reinforcement Learning.