The Harry Potter Sorting Hat AI: complete code

A baby programmer
2 min read · Dec 16, 2023


The main problem every Data Analyst has to face is gathering data. Most of the time this is a soul-breaking process, and the biggest problem of all is that data is not always available.

Instead of a real dataset, you might only find some measurements published somewhere. I will show you how you can build a classifier simply by relying on those measurements.

The Harry Potter Sorting Hat

In my previous article, I described without using code how to replicate the algorithm behind the Harry Potter Sorting Hat.

My assumption is that the Sorting Hat selects people based on their personality profiles. Therefore, the input of my model will be the 5 traits of the Big 5 personality model, and the output will be one of the Harry Potter houses.

According to this research (University of California, 2019), this is how the Big 5 traits map to the four Hogwarts houses.

The data used to write this research paper is available at this source. My personal thanks go to Lea Jakob for helping me find this link; she and her colleagues are passionate about promoting Open Science practices. In this article, however, I will work in a rather unconventional, yet creative way: I will only use a graph published in the research paper, take measurements from it, and replicate the data through simulations.

The purpose of the experiment can be put this way: how can we use this data to train an AI that, based on an individual's personality traits, can tell which Hogwarts house that person belongs to?
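To make the goal concrete, here is a minimal sketch of what such a sorter could look like once every house has a Normal distribution per trait. It is not the model from the research paper, and it is only one simple option (a Gaussian naive Bayes rule). The means and the standard deviation below are arbitrary placeholders, not values measured from the paper, and the names HOUSE_MEANS, SIGMA and sort_student are made up for this example.

```python
import math

TRAITS = ("openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism")

# PLACEHOLDER trait means per house, in the order of TRAITS.
# These numbers are invented for illustration, NOT taken from the paper.
HOUSE_MEANS = {
    "Gryffindor": [0.6, 0.5, 0.7, 0.5, 0.4],
    "Hufflepuff": [0.5, 0.6, 0.5, 0.7, 0.5],
    "Ravenclaw":  [0.7, 0.6, 0.4, 0.5, 0.5],
    "Slytherin":  [0.5, 0.5, 0.5, 0.3, 0.5],
}
SIGMA = 0.15  # PLACEHOLDER: the same spread for every house and trait


def log_density(x, mu, sigma):
    """Log of the Normal density at x, without the constant term."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)


def sort_student(profile):
    """Return the house whose trait distributions best fit the profile.

    Traits are treated as independent (the 'naive' assumption), so the
    per-trait log-densities are simply added up for each house.
    """
    scores = {
        house: sum(log_density(x, mu, SIGMA) for x, mu in zip(profile, means))
        for house, means in HOUSE_MEANS.items()
    }
    return max(scores, key=scores.get)


# A made-up profile, one score per trait in the order of TRAITS.
student = [0.7, 0.6, 0.4, 0.5, 0.5]
house = sort_student(student)
```

With real measurements, each house would get its own mean and standard deviation for every trait instead of these placeholders.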

Converting boxplots to Normal Distributions

To be surgically precise, we are going to measure (Paint will do just fine) the distance in pixels of Q1, Q2, and Q3 of every boxplot from a reference point: 0.4.
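The step from pixel measurements to a Normal distribution can be sketched like this. It is only a sketch under a few assumptions: the reference point is the 0.4 mark on the trait axis, the pixel distances and the PIXELS_PER_UNIT scale are made-up numbers (not real measurements from the graph), and each trait is roughly symmetric, so the median (Q2) can stand in for the mean. For a Normal distribution, Q3 - Q1 is about 1.349 standard deviations, which gives the spread.

```python
from statistics import NormalDist

# z-score of the 75th percentile of a standard Normal (about 0.6745)
Z_Q3 = NormalDist().inv_cdf(0.75)


def pixels_to_value(pixels, pixels_per_unit, reference=0.4):
    """Convert a pixel distance from the reference mark into an axis value."""
    return reference + pixels / pixels_per_unit


def normal_from_quartiles(q1, q2, q3):
    """Build a Normal distribution from the three quartiles of a boxplot.

    Assumes symmetry: the mean is taken to be the median (q2), and the
    standard deviation is derived from the interquartile range.
    """
    sigma = (q3 - q1) / (2 * Z_Q3)
    return NormalDist(mu=q2, sigma=sigma)


# HYPOTHETICAL measurements for one boxplot (one trait of one house).
PIXELS_PER_UNIT = 300            # assumed: pixels between 0.4 and 1.4 on the axis
q1_px, q2_px, q3_px = 30, 60, 90

q1, q2, q3 = (pixels_to_value(p, PIXELS_PER_UNIT) for p in (q1_px, q2_px, q3_px))
dist = normal_from_quartiles(q1, q2, q3)

# Simulate 1000 students for this trait and house.
samples = dist.samples(1000, seed=42)
```

Repeating this for every boxplot gives one simulated column per trait and house, which together form the training data for a classifier.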

A big thanks to you for reading this!

You can follow me on Telegram, GitHub, and Medium.
