Can Machine Learning Improve the College Application Process?

20 Days of Ideas — Day 10

How did you decide where you applied to college? Location? Available majors? Recommendations from family and friends? The rankings? Those postcards and emails from schools based on your scores?

If you’re like me, it wasn’t a very scientifically selected group. I ended up loving where I went to school, but looking back I don’t think it was the best match for me. I think there are other schools, including some that admitted me, that could have better helped me further my interests and education.

If you’re an elite student you might narrow your list to the top 10–15 schools, but today there are more elite students than the top schools have spots. So students are throwing darts — applying to 15 colleges — in hopes their application will resonate with an admissions officer. But outside of the top schools, how can you begin to make your picks? Needle in a haystack.

And what about the admissions side of the game? The top colleges are in competition with one another for the “best” students. But at some level, “best” becomes completely arbitrary. Every year, there is a pool of students that meets or exceeds the GPA/SAT/ACT threshold as defined by the scores of the most recently matriculated class. They all are either broad in many outside activities or deep in one or two. They all have superlative recommendations. And any school should be thrilled to have any of them.

And so it comes down to fit. The schools want to construct a diverse class of students that will thrive and contribute. But what they also want is for the kids they accept to matriculate. And graduate. But with the zillions of applications they receive (some tout a 4 or 5 percent acceptance rate), we’re back to the proverbial needle and haystack.

A Machine Learning Solution

Can we use machine learning to solve or at least alleviate the pain points on both sides of the equation?

There are obscene amounts of data to work with on both the college side and the applicant side. Locations, proximity to a major metropolitan area, walkability, test profiles, application essays, available majors, housing options, food options, number of women professors, popularity of majors, career aspirations of applicants and actual careers of graduates. I’m inclined to leave the financial aspect out for now as that’s another huge layer of complexity, but that could go in as well. I’m just throwing darts, but you get the picture.

Students would get a list of maybe 15–20 good matches. Colleges would get a list of the applicants most likely to matriculate and graduate. Otherwise the process remains the same.
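To make the student side a little more concrete, here is a minimal sketch of one way a shortlist could be produced: describe each college and each student as a vector of preference scores, then rank colleges by cosine similarity. Everything in it is an assumption for illustration. The four features, the college names, and the numbers are made up, and a real system would need far more data and a learned model rather than hand-entered scores.

```python
from math import sqrt

# Hypothetical feature order (all values invented for illustration):
# [urban setting, walkability, small class sizes, engineering strength]

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile carries no preference information
    return dot / (norm_a * norm_b)

def top_matches(student, colleges, k=15):
    """Rank colleges by similarity to the student's profile and keep the top k."""
    scored = [(name, cosine_similarity(student, vec)) for name, vec in colleges.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Made-up college profiles and one made-up student profile.
colleges = {
    "College A": [1.0, 0.9, 0.2, 0.8],
    "College B": [0.1, 0.3, 0.9, 0.4],
    "College C": [0.8, 0.7, 0.5, 0.9],
}
student = [0.9, 0.8, 0.3, 1.0]

shortlist = top_matches(student, colleges, k=2)
for name, score in shortlist:
    print(f"{name}: {score:.3f}")
```

The shortlist is only a starting map. Nothing stops a student from applying to a school that scored lower.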

The main difference is that students now have a map. It doesn’t mean they can’t look outside the list, but it’s a much better place to start than everywhere. And maybe, over time, it would mean that students could get the applications back down to five. Five is a number of schools you could visit — although it’s doubtful how useful this is in the decision process. Five is a number where you might talk to one or two current students or young alumni at each school. Five is a reasonable solution set where focused diligence might actually be useful.

On the other side, colleges would have a data-supported hypothesis of students most likely to fit into the community. And thrive. And matriculate. And graduate. They might decide to take a closer look at those applications. Not that they couldn’t admit other students, but a little more information can’t hurt.
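On the college side, the "data-supported hypothesis" could be as simple as a predicted probability per applicant. The sketch below uses a logistic function over three hypothetical signals. The feature names, weights, and bias are invented for illustration. In practice they would have to be learned from a school's own historical admissions and enrollment data.

```python
from math import exp

# Hypothetical weights and bias, invented for illustration only.
# A real model would learn these from past admissions outcomes.
WEIGHTS = {
    "visited_campus": 1.2,
    "fit_score": 2.0,
    "distance_hundreds_of_miles": -0.3,
}
BIAS = -1.0

def matriculation_probability(applicant):
    """Logistic score: squash a weighted sum of signals into the range (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + exp(-z))

def rank_applicants(applicants):
    """Return applicant names ordered from most to least likely to matriculate."""
    return sorted(
        applicants,
        key=lambda name: matriculation_probability(applicants[name]),
        reverse=True,
    )

# Two made-up applicants.
applicants = {
    "Applicant 1": {"visited_campus": 1, "fit_score": 0.9, "distance_hundreds_of_miles": 2},
    "Applicant 2": {"visited_campus": 0, "fit_score": 0.4, "distance_hundreds_of_miles": 10},
}

ranking = rank_applicants(applicants)
for name in ranking:
    print(name, round(matriculation_probability(applicants[name]), 3))
```

The ranking is a prompt for a closer read of an application, not a decision.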

A flaw in this idea is that over time, if both sides stick to the list, you’d create very homogeneous schools. Which is just what admissions committees are trying to avoid. However, when you add back the self-selection of the students that matriculate, the current system is flawed as well.

I’m sure there are more holes in my idea. But I think there has to be a better way than the way we’re doing it today.

What do you think? Is this a good application for machine learning?