Student Validation: From MVP to Prime Time

Rebecca Borison
Codecademy Engineering
Nov 19, 2020
Figure: map and bar chart of student learner data. Students from around the world come to Codecademy to learn.

Covid-19 is impacting education in an unprecedented way. Back in March, we at Codecademy saw the growing number of students learning at home and formed a rapid response team to address these new challenges.

One of the initiatives we rolled out was a free 90-day pro membership for students. We wanted to get an MVP out for this as soon as possible, which led us to some quick approaches for validating whether or not a user was a student. We knew these were interim solutions, but we wanted to enable students to learn remotely as quickly as possible.

First take: simple string check

Our first idea was simply to check whether the user’s email domain contained the string 'edu' or 'k12' (we wanted to include primary and secondary schools as well as colleges and universities).

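# Inside a method that decides whether `user` looks like a student:
_, domain = user.email.split('@')
# `present?` comes from ActiveSupport; it is false for nil or blank strings.
return false unless domain.present?

domain.include?('edu') || domain.include?('k12')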

Definitely not the most complex check out there, but it let us quickly approve 5,000 students! We also added an application form and admin tool so that our customer support team could manually approve students who did not have an edu or k12 email.

Take two: open source libraries FTW

Our initial check was clearly limited, especially for international students, whose school email addresses often contain neither edu nor k12.
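To make that limitation concrete, here is a minimal plain-Ruby sketch of the substring check run against a few sample addresses. The helper name student_email_mvp? and most of the addresses are made up for illustration; only the 'edu' / 'k12' rule comes from our original code, and this version swaps the Rails-only present? for plain nil and empty checks.

```ruby
# Hypothetical helper mirroring the MVP rule in plain Ruby (no Rails needed).
def student_email_mvp?(email)
  # Everything after the first "@" is treated as the domain.
  _, domain = email.to_s.split('@', 2)
  return false if domain.nil? || domain.empty?

  domain = domain.downcase
  domain.include?('edu') || domain.include?('k12')
end

student_email_mvp?('ada@upenn.edu')          # => true
student_email_mvp?('sam@lincoln.k12.ca.us')  # => true
student_email_mvp?('lee@mu.oz.au')           # => false (a university domain the rule misses)
student_email_mvp?('pat@educationdeals.com') # => true  (not a school, but contains "edu")
student_email_mvp?('no-at-sign')             # => false
```

The last few cases show both sides of the problem: a substring match can miss real institutions and can also accept domains that merely happen to contain "edu".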

Someone on our team discovered that JetBrains uses an open source library that tracks the domains of colleges and universities. The repository stores each educational institution as a .txt file, nested in folders that spell out the institution’s domain in reverse order.

For example, the University of Pennsylvania would be found in the file domains/edu/upenn.txt and the University of Melbourne would be found in the file domains/au/oz/mu.txt.
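As a rough illustration of that layout, the sketch below turns a domain into the kind of reversed path described above. This is not Swot's actual implementation; swot_style_path is a hypothetical helper, and the domains/ prefix simply follows the two example paths.

```ruby
# Hypothetical helper: map a domain to a reversed, folder-per-label path.
def swot_style_path(domain)
  # "upenn.edu" -> ["upenn", "edu"] -> ["edu", "upenn"]
  labels = domain.strip.downcase.split('.').reverse
  File.join('domains', *labels) + '.txt'
end

swot_style_path('upenn.edu') # => "domains/edu/upenn.txt"
swot_style_path('mu.oz.au')  # => "domains/au/oz/mu.txt"
```

With a layout like this, checking a domain comes down to asking whether a file exists at the computed path.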

The JetBrains library is written in Kotlin, but it is a fork of a Ruby version, so we were able to use that one directly. Setup was as simple as installing the library with gem install swot or adding gem 'swot' to your Gemfile.

Next all we had to do was require the gem in our module and run a quick check to see if the user’s email domain could be found in the library:

Swot.is_academic?(user.email)

We were then set up to recognize many more email addresses as belonging to a student, especially outside of the U.S. — we saw 200 signups from Algeria within 8 hours of adding Swot!

Take three: all grown up

Swot is certainly not the most performant way to do this: every time a new user tries to sign up, we have to check whether a file corresponding to the user’s email domain exists.

It also doesn’t handle users who may still retain their college email address, but are no longer students, or non-students who may have a university email for various reasons.

Figure: form to apply for student verification. The SheerID verification form on our site.

And so we turned to SheerID. Unlike the free and open source Swot, SheerID is a paid service, but it gives us much stricter student validation by asking users to submit documents like class schedules or to log in to their school portals to verify their student status.

We took advantage of SheerID’s JavaScript API, which adds a SheerID-powered form to our page that collects user information and handles the various steps of the verification flow (for example, asking them to upload a document and verifying that document).

SheerID exposes a set of hooks so that we can get notified when a user’s student status changes. Our React components can attach the form to a ref and then set up our hook. It’ll look something like this:

const sheerIdContainer = useRef(null);
new sheerId.VerificationForm(sheerIdContainer.current, programId);
sheerId.addHook({
  name: 'ON_VERIFICATION_STEP_CHANGE',
  callback: async (verificationResponse) => {
    setStep(verificationResponse.currentStep);
    if (verificationResponse.currentStep === 'success') {
      onSuccess();
    }
  }
});

We’ve seen a number of benefits to using SheerID. As I mentioned earlier, it’s more performant than file lookups and ensures that the user is actually a currently enrolled student. Beyond that, it also saves our customer support team a significant amount of time. With our Swot solution, many universities and colleges were still missing from the repository, so we had to keep approving some users manually while making pull requests to the repo to get new institutions added.

Once we got SheerID up and running, our customer support team received fewer requests for manual approval and could focus on more meaningful initiatives.

Plus SheerID exposes a handy reporting dashboard, so we can get a better look at how users are going through the verification funnel.

For our initial Covid Rapid Response campaign back in the spring, we ended up giving away more than 100,000 free pro memberships to students (!!). We ultimately ended the free giveaway program, but we now offer a discounted student plan that leverages our earlier SheerID integration to verify student users and help us continue building the education the world needs.
