Data 101: Predictive Analytics and College Admissions

Humanlytics Team
Analytics for Humans
7 min read · Jan 29, 2018

It’s 1869. The Civil War’s been over for a few years, and you’re feeling like you want a degree in mathematics because, hey, why not? So you wander down to your local technological institute, take a quick entrance exam, and enroll.

Congratulations, you’re now the newest student at the Massachusetts Institute of Technology, class of 1873!

Not sure if we’d rather take the SAT or this… (From MIT Libraries)

Things have changed a bit since then. Higher education is now a massive industry, with 20 million American students enrolled in college today. 77% of colleges and universities spend over $100,000 per annum on brand strategy work, with 31% spending more than $200,000 per year.

This may not seem like a lot, but it’s a new trend in higher education. The overwhelming increase in students applying to and attending college means that higher education is a bona fide industry, even though most colleges and universities are non-profit. Many now employ professional marketers to reach out to and engage the millions of Americans who apply to college each year.

How does data fit into all of this?

We’ll come back to marketing and college admissions in a bit, but for now let’s delve into the largely hidden world of predictive analytics in college admissions. Almost every college in the United States has had to employ predictive analytics and data analysis as an integral aspect of their admissions process over the past few years.

Why?

Because of these guys.

Every year, US News & World Report (USNWR) ranks every college, university, and graduate school in America. Colleges live and die by these rankings. 2.6 million visitors generate 18.9 million page views on the day that rankings are published in September. A college’s placement on the rankings list has the ability to automatically induce thousands more applications and campus visits over years.

For graduate education, the stakes are even higher. In the legal world, most schools are overshadowed by the “T-14”, the schools that most frequently appear in the top 14 places of USNWR’s law school rankings. A degree from one of these schools assures applicants the best chances at prestigious federal clerkships, high-paying firm jobs, and federal employment, creating a huge incentive for schools to find a way to get themselves higher on USNWR’s rankings.

So how does all this tie into predictive analytics?

Turns out USNWR publishes its methodology, down to a tenth of a percent. While some factors are largely subjective (reputation is 22.5% of a school’s rank), others are highly quantifiable and potentially manipulable. For undergraduate institutions, the most significant ones are:

  • Graduation and retention rates (22.5%)
  • Class sizes (8%)
  • Acceptance rates (1.25%)
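To make the weighting concrete, here is a minimal Python sketch of a weighted composite that uses only the four weights quoted above. The 0-to-1 factor scores and the name `partial_score` are hypothetical, and the rest of USNWR's formula is left out, so this illustrates the arithmetic and is not USNWR's actual calculation.

```python
# Weights quoted in the article, expressed as fractions of a school's rank.
WEIGHTS = {
    "reputation": 0.225,
    "graduation_retention": 0.225,
    "class_size": 0.08,
    "acceptance_rate": 0.0125,
}


def partial_score(factor_scores):
    """Weighted sum over the quoted factors (scores assumed to be on a 0-1 scale)."""
    return sum(WEIGHTS[name] * factor_scores[name] for name in WEIGHTS)


# Hypothetical school: made-up normalized scores, for illustration only.
school = {
    "reputation": 0.70,
    "graduation_retention": 0.80,
    "class_size": 0.60,
    "acceptance_rate": 0.50,
}

baseline = partial_score(school)
improved = partial_score({**school, "graduation_retention": 0.90})

print(f"baseline partial score: {baseline:.4f}")
print(f"after a 0.10 gain in graduation/retention: {improved:.4f}")
```

Because graduation and retention carry 18 times the weight of acceptance rate, the same 0.10 improvement moves the composite far more when it lands on the heavier factor.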

Let’s go through these one-by-one to see how predictive analytics plays a role.

Graduation and Retention Rates

At first glance, this seems like an impossible problem to model. How do you predict how students are going to do at your college from their applications alone? How do you predict which students will stay at your school all four years of their degree before they even set foot on campus?

However, colleges have been capturing data that can help for years. Every student who passes through the doors of a university has an official transcript that registers every academic aspect of their experience, from the type of high school they attended and their SAT scores to their grades in each of their classes.

From this, colleges have been able to compile a fairly accurate statistical portrait of the kinds of students that succeed at their institutions. Southern Methodist University, for instance, can accurately predict from the time of application whether or not a student will graduate on time, or even at all. Administrators at Georgia State, similarly, have built a model that alerts counselors when a student gets a C in the first course in their major, a key factor in eventually dropping out.
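The Georgia State alert described above can be pictured as a simple rule. The sketch below is hypothetical: the record layout, the `needs_alert` function, the plain A-to-F scale, and the choice to flag anything at or below a C are assumptions made for illustration, not the university's actual system.

```python
# Letter grades ranked from best to worst (assumed scale, no plus/minus grades).
GRADE_ORDER = ["A", "B", "C", "D", "F"]


def needs_alert(first_major_course_grade, threshold="C"):
    """Return True when the grade is at or below the threshold grade."""
    return GRADE_ORDER.index(first_major_course_grade) >= GRADE_ORDER.index(threshold)


# Hypothetical student records.
students = [
    {"name": "Student 1", "first_major_course_grade": "A"},
    {"name": "Student 2", "first_major_course_grade": "C"},
    {"name": "Student 3", "first_major_course_grade": "D"},
]

# Names a counselor would be alerted about under this rule.
flagged = [s["name"] for s in students if needs_alert(s["first_major_course_grade"])]
print(flagged)
```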

Class Sizes and Acceptance Rates

We’re combining these two because of how much they flow into one another. To analyze them, however, we need to take a step back and understand how the college application funnel works (see, funnels! Just like a for-profit business!).

The typical college application funnel (Created with infogram)

Similar to a for-profit business, colleges generate a list of prospective applicants. Unlike a business, however, colleges have a captive audience of students who have taken the SAT or ACT that helps them generate a fairly accurate portrait of the entire college applicant pool for a particular year. From this, colleges have the ability to run simple searches of students based on SAT scores, GPAs, expected graduation dates, and many more statistics in hopes of predicting and capturing eventual applicants.
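A search of this kind amounts to filtering a list of test-takers. The sketch below assumes a made-up record layout and made-up thresholds; the `Prospect` class and `search` function are hypothetical and do not describe any real College Board product.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    sat: int        # composite SAT score
    gpa: float      # GPA on a 4.0 scale
    grad_year: int  # expected high school graduation year
    state: str      # two-letter state code


def search(prospects, min_sat, min_gpa, grad_year, states=None):
    """Return prospects meeting every criterion; states=None means any state."""
    return [
        p for p in prospects
        if p.sat >= min_sat
        and p.gpa >= min_gpa
        and p.grad_year == grad_year
        and (states is None or p.state in states)
    ]


# Hypothetical prospect pool.
pool = [
    Prospect("Prospect A", 1450, 3.9, 2018, "CA"),
    Prospect("Prospect B", 1200, 3.4, 2018, "NH"),
    Prospect("Prospect C", 1500, 3.7, 2019, "CA"),
    Prospect("Prospect D", 1380, 3.8, 2018, "OR"),
]

west_coast = search(pool, min_sat=1350, min_gpa=3.5, grad_year=2018,
                    states={"CA", "OR", "WA"})
print([p.name for p in west_coast])
```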

A typical week’s mail thanks to colleges trying to increase their applicant pool (Audrey Kletscher Helbling)

The main goal here is to get as many high school seniors as possible to apply. More applicants and a constant-size entering cohort mean a lower acceptance rate, which helps not only the rankings but also the reputation of the institution to begin with.
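The arithmetic behind that incentive is short. In the sketch below the applicant counts and the number of admits are hypothetical; only the relationship (acceptance rate equals admits divided by applicants) comes from the text.

```python
def acceptance_rate(admits, applicants):
    """Fraction of applicants who receive an offer of admission."""
    return admits / applicants


ADMITS = 3_000  # held constant because the entering class does not grow

for applicants in (10_000, 15_000, 20_000):  # hypothetical applicant pools
    rate = acceptance_rate(ADMITS, applicants)
    print(f"{applicants:>6} applicants -> {rate:.1%} acceptance rate")
```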

Of course, the biggest challenge here is identifying the right people and understanding how to incentivize them to send in an application (and pay the $75+ that an application costs). That’s where analytics comes in, and it’s really not even that complicated. As mentioned above, College Board is more than happy to sell this data to colleges and universities.

So…

Average SAT scores slipping in your applicant pool (another big factor in USNWR rankings)?

Send out some fee waivers to applicants with high SAT scores.

New England college with almost zero applicants from the West Coast?

Barrage Californian high school seniors with emails and flyers talking about how it really isn’t that cold in New Hampshire in December if you wear a thick jacket.

Starting to realize how much diversity your college lacks?

Send fee waivers to as many minority students as possible.

Here’s the thing though. That’s the easy part. Remember the beginning of this piece, where we were telling you about how much colleges spend on marketing and promotion?

Those dollars have a lot more bang for their buck because all they’re doing is convincing people who have already decided to apply to college to send another application their way. Safe thinking dictates that students should apply to “five to eight” institutions, so ultimately it’s just a matter of squeezing into a student’s list. And besides, those aren’t hard-and-fast numbers. At the LEAP Academy University Charter School in Camden, NJ, for instance, students sent out an average of 45 college applications!

So let’s go back to that funnel and think about the narrowing between “Acceptances” and “Deposits/Class Size.”

The typical college application funnel (Created with infogram)

In actuality, THAT’S where predictive analytics comes in. Most colleges and universities have a roughly 26–32% “yield” rate. That means that for every 100 or so people a college accepts, only about 30 end up showing up on the first day of classes, needing dorm rooms, meal plans, and a desk.

That’s vaguely terrifying, if you’re a college administrator.

What if, for instance, your school has a max class size of 1,000 first-years? Any more, and there’s no living space on campus. For years, your yield has fluctuated around 33%, so you accept 3,000 students.

Then, suddenly, something happens. A professor wins a Nobel Prize. A sitting President gives a commencement speech. Your school is announced as the set location for the next Fast and the Furious movie. Now, instead of a 33% yield, you have a 50% yield, which leaves you with about 500 students who have the legal right to attend your school (you let them in, after all) and no space to put them.

Or let’s look at it the other way. Anything smaller than 950 students, and professors have empty classrooms and your ability to place students into prestigious jobs tanks. Another outside event happens, and suddenly you have only 900 students showing up, and a 30% yield.

That’s a difference of only 3 percentage points, and it’s the difference between success and failure for a college. Plus, this hypothetical models a fairly small, 4,000-student college.
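The scenario above reduces to one multiplication. This sketch uses the article's own numbers (3,000 admits, a 1,000-student cap, a 950-student floor); the function and variable names are illustrative.

```python
ADMITS = 3_000
MAX_CLASS = 1_000  # no living space beyond this
MIN_CLASS = 950    # below this, classrooms sit empty


def enrolled(admits, yield_rate):
    """Expected number of admitted students who actually show up."""
    return round(admits * yield_rate)


for yield_rate in (0.30, 0.33, 0.50):
    n = enrolled(ADMITS, yield_rate)
    if n > MAX_CLASS:
        status = f"{n - MAX_CLASS} over capacity"
    elif n < MIN_CLASS:
        status = f"{MIN_CLASS - n} under the floor"
    else:
        status = "within range"
    print(f"yield {yield_rate:.0%}: {n} students, {status}")
```

Only the middle case lands inside the 950-to-1,000 window, which is how narrow the target is once the number of admits has been fixed.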

What happens as the numbers get bigger? Let’s say you’re O̶h̶i̶o̶ ̶S̶t̶a̶t̶e̶ ̶U̶n̶i̶v̶e̶r̶s̶i̶t̶y̶ THE Ohio State University, with an entering class of 7,136 students, over 52,000 applicants, and a 49% acceptance rate. That’s a yield of 28%, but a change of even 1 percentage point means a difference of 255 students.
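Those figures can be checked with a few lines of Python. The sketch treats "over 52,000" as exactly 52,000 applicants, which is an assumption, so the results are approximate.

```python
APPLICANTS = 52_000       # "over 52,000" in the text; treated as exact here
ACCEPTANCE_RATE = 0.49
ENTERING_CLASS = 7_136

admits = APPLICANTS * ACCEPTANCE_RATE
yield_rate = ENTERING_CLASS / admits
# Effect of a one-percentage-point change in yield on the entering class.
students_per_point = admits * 0.01

print(f"admits: {admits:,.0f}")
print(f"yield: {yield_rate:.1%}")
print(f"students per percentage point of yield: {students_per_point:.0f}")
```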

When the numbers are that big, colleges turn to a deep world of predictive analytics to help them lock down their yields for sure. Tune in on Wednesday, and we’ll cover the wide world of yield, and how universities make sure they get the right number of students to the right places!

This article was produced by Humanlytics. Looking for more content just like this? Check us out on Twitter and Medium, and join our Analytics for Humans Facebook community to discuss more ideas and topics like this!
