Stack Ranking — Building the base
The base of any AI-based product is its algorithms. When many organizations are working to solve the same problem, your algorithm becomes your USP (Unique Selling Proposition); it defines your individuality. In this blog post, I will explain how we laid the groundwork for Param’s Stack-ranking solution.
Defining the problem statement:
A streamlined hiring process is vital to the success of any business. However, one of the biggest problems in the recruitment space is time. Great candidates do not stay active on the job market for long, yet recruiters carry a heavy operational load of manual work. Repetitive, low-value tasks like resume screening take up as much as 23 hours for every hire made, leaving recruiters with less time for high-value, high-impact tasks like candidate engagement.
Resume screening is one such task: high-volume, monotonous, repetitive and time-consuming. It also tends to be one of the bigger bottlenecks in the hiring process. Recruiters are often subjective about their screening criteria, basing them on conditions such as academic institutions, previous employers or the presence of certain keywords on the resume. Resume screening therefore tends to be driven more by gut instinct than by data. This is why hiring managers review the recruiter-screened candidates further and shortlist far fewer candidates from the ‘screened’ lot. The problem is not only that recruiters screen in too many resumes (which the hiring manager then screens out) but also that they may screen out resumes that might have been a good fit, because of subjective and inconsistent screening criteria.
Automating the tedious tasks in recruiting has been an ongoing effort for some time now. One of the biggest use cases of automation in recruitment has been resume screening, a stage where a resume spends 23%* of its time during the hiring process, second only to the hiring manager review at about 37%* (*Source: iCIMS e-book: Strategies to Improve the Recruiter and Line Manager Relationship)
Automated resume screening involves screening, matching and stack-ranking applicant resumes for a given job description.
Ranking candidate resumes against given job descriptions gives us two types of information:
- Which candidates are the most suitable for a given job A.
- Which of jobs A, B, C or D is the most suitable for a given candidate.
The selection criteria for candidates can be based on multiple factors — some internal to these documents (i.e., listed on the job description) as well as certain external factors. I’ll proceed to explain the science behind Param’s Stack-ranking solution.
Task — Finding Data scientists among 10,000 profiles
Recruiters work on hiring for multiple positions at the same time, so over a period of a few years a recruiter accumulates thousands of candidate profiles. This can be a mixed bag, ranging from Java, .NET and Python programmers to data engineers and even sales and marketing profiles, depending on what roles they’re primarily hiring for.
Defining your requirements is the first step, which means drafting a well-written job description. It should include details like the skill set required, the target experience bracket, educational qualifications and any other criteria the organization sets as mandatory.
Using Bag of words and TF-IDF models:
The ‘Bag of Words’ model and ‘Term Frequency-Inverse Document Frequency’ (TF-IDF for short) are among the most widely used methods for representing a document in terms of numbers and vectors (collections of numbers), since the only thing a machine understands is numbers.
Let’s consider a subset of the above task and take the following set of resumes and a JD as an example (though solving this problem is far more complicated in real life).
JD1 (Data Scientist) — “We need someone who can develop production applications on Python in NLP and knows numpy.”
Resume1 (Java Developer) — “I have worked on Java and Spring, and developed production applications”
Resume2 (Python Developer) — “I have worked on Python and Django and integrated work on NLP team in a production application”
Resume3 (Salesperson) — “I have worked on increasing the YOY growth for all the products of the organization”
Resume4 (Data Scientist) — “I have worked on Python and developed production applications in NLP”
Resume5 (Data Scientist 2) — “I have worked on Python and numpy to develop computer vision production applications”
Now we will rank these five dummy resumes against the given job description.
First, let’s run them through the Bag of Words model.
The Bag of Words model builds a vocabulary of every word that appears across the given documents and counts how often each one occurs in each document.
Since we are also considering bigrams (sets of two words) in the model, ‘computer vision’ is listed as one of the elements.
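As a minimal sketch of this counting step (assuming simple lower-casing and whitespace tokenization, which is cruder than a production tokenizer), a bag of words with bigrams can be built in plain Python:

```python
from collections import Counter

def bag_of_words(text):
    """Count unigrams and bigrams in a lower-cased, whitespace-tokenized document."""
    tokens = text.lower().split()
    counts = Counter(tokens)  # unigram counts
    # add bigram counts, so phrases like 'computer vision' become features too
    counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    return counts

resume5 = "I have worked on Python and numpy to develop computer vision production applications"
counts = bag_of_words(resume5)
print(counts["python"])           # 1
print(counts["computer vision"])  # the bigram is a feature in its own right: 1
```

A library such as scikit-learn’s `CountVectorizer` with `ngram_range=(1, 2)` does the same job at scale; the hand-rolled version just makes the mechanics visible.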
Next, we’ll pass this through the TF-IDF model.
TF-IDF is where things get interesting.
It is a statistical measure that considers your entire collection of documents when weighting each word.
Term Frequency is simply the count of a word’s occurrences in a document, divided by the total number of words in that document to give a normalized term frequency.
TF(t) = (Number of times a term appears in a document) / (Total number of terms in the document).
Inverse Document Frequency (IDF) signifies the importance of a word in the whole corpus.
It represents how rare a given term is across the collection.
Let’s go back to the example where there are 10,000 resumes and we need to find a candidate with NLP, computer vision or other data science skills. If these terms are specified in your JD and only 50 resumes contain them, the importance of these words increases for our model, resulting in higher scores for the resumes that contain them.
Inverse document frequency is represented as :
IDF(t) = log_e(Total number of documents / Number of documents with term t in it).
Next, we compute each document’s similarity to JD1.
These similarities are computed with a weighted ensemble (a combination) of the above models.
JD1 matched exactly with itself, which is no surprise. We observe that Res4 is the best performer of the lot, which we can also judge at a glance from the text.
Here we also observe that Res2 and Res5 have very similar scores, as both contain a mix of terms present in the JD. Distinguishing between these two is the next challenge we take up.
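To make the similarity step concrete, here is a minimal, unigram-only sketch in plain Python that builds a TF-IDF vector per document and ranks the resumes by cosine similarity to JD1. This is an illustration, not Param’s production model: the real ensemble also uses bigrams and weighting, so the exact scores differ, but the top candidate comes out the same.

```python
import math

# the toy JD and resumes from the example (lower-cased, punctuation stripped)
corpus = {
    "JD1":  "we need someone who can develop production applications on python in nlp and knows numpy",
    "Res1": "i have worked on java and spring and developed production applications",
    "Res2": "i have worked on python and django and integrated work on nlp team in a production application",
    "Res3": "i have worked on increasing the yoy growth for all the products of the organization",
    "Res4": "i have worked on python and developed production applications in nlp",
    "Res5": "i have worked on python and numpy to develop computer vision production applications",
}
tokenized = {name: text.split() for name, text in corpus.items()}
all_docs = list(tokenized.values())
vocab = sorted({t for toks in all_docs for t in toks})

def tfidf_vector(tokens):
    # one TF-IDF weight per vocabulary term
    vec = []
    for term in vocab:
        tf = tokens.count(term) / len(tokens)
        df = sum(1 for d in all_docs if term in d)
        vec.append(tf * math.log(len(all_docs) / df))
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

jd_vec = tfidf_vector(tokenized["JD1"])
scores = {name: cosine(jd_vec, tfidf_vector(toks))
          for name, toks in tokenized.items() if name != "JD1"}

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
# Res4 comes out on top; Res3 shares no weighted terms with the JD and scores 0
```

Cosine similarity compares the direction of two vectors rather than their length, so a long resume is not rewarded simply for repeating words.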
This method helps us find suitable candidates with the required skills from a long list of resumes. It is a good approach to begin with, but not the ideal one. In the next article, we’ll see how we give meaning to words and use semantic similarity to redefine our approach.