MATCHING RESUMES WITH JD USING UNIVERSAL SENTENCE ENCODER (USE)
Resumes contain a lot of information, but not all of it matters for a given role. What we look for depends on our requirements: we might care about specific entities such as years of experience or job roles. Automatically matching resumes against a job description (JD) greatly reduces the time spent skimming through resumes. This article demonstrates how to match resumes with job descriptions using the Universal Sentence Encoder.
UNIVERSAL SENTENCE ENCODER
Universal Sentence Encoder (USE) is a pre-trained model available on TensorFlow Hub. It encodes text into high-dimensional vectors (512 dimensions for the model used here) that can be used for text classification, semantic similarity, clustering, and other natural language tasks. The encoder component comes in two variants:
i) Transformer Encoder
ii) Deep Averaging Network (DAN)
The model used here is “https://tfhub.dev/google/universal-sentence-encoder/4”, which implements the DAN architecture.
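For intuition, the DAN variant can be pictured as averaging the embeddings of the input tokens and then passing that average through a feed-forward network. The block below is a toy sketch of that idea only, not the USE implementation: the vocabulary, the vector sizes, the random weights, and the function name toy_dan_encode are all made up for illustration.

```python
import math
import random


def toy_dan_encode(tokens, word_vectors, weights, biases):
    """Average the token vectors, then apply feed-forward tanh layers."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no known tokens to average")
    dim = len(known[0])
    # Step 1: order-insensitive average of the token embeddings.
    hidden = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
    # Step 2: pass the average through each dense layer in turn.
    for layer_weights, layer_biases in zip(weights, biases):
        hidden = [
            math.tanh(sum(w * h for w, h in zip(row, hidden)) + bias)
            for row, bias in zip(layer_weights, layer_biases)
        ]
    return hidden


# Hypothetical toy parameters: 4-dimensional word vectors, one 4 -> 3 layer.
rng = random.Random(0)
vocab = ["python", "statistics", "machine", "learning"]
word_vectors = {w: [rng.uniform(-1, 1) for _ in range(4)] for w in vocab}
weights = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]]
biases = [[0.0, 0.0, 0.0]]

a = toy_dan_encode(["machine", "learning"], word_vectors, weights, biases)
b = toy_dan_encode(["learning", "machine"], word_vectors, weights, biases)
```

In this toy version, a and b are the same vector because averaging discards word order. That simplicity is what makes a DAN-style encoder cheap to run compared with a Transformer encoder.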
Prerequisites:
- TensorFlow 2.x and the tensorflow_hub package (the code in this article calls hub.load)
We follow these steps:
- Set up the module.
- Extract the text from each resume and pass it through the Universal Sentence Encoder to obtain its embedding.
- Embed the job description and compute its cosine similarity with each resume embedding.
- Rank the resumes in descending order of similarity, so that the closest matches to the job description come first.
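The steps above can be sketched end to end without downloading the model. In the block below, toy_embed is a deliberately crude stand-in for the Universal Sentence Encoder (it only counts words from a small, made-up vocabulary), and the resume texts and the names rank_resumes and toy_embed are hypothetical. Only the embed, compare, and rank flow is meant to carry over.

```python
import math
from collections import Counter


def toy_embed(text, vocabulary):
    """Stand-in for USE: count how often each vocabulary word occurs."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def rank_resumes(job_description, resumes, embed):
    """Return (name, score) pairs sorted from best to worst match."""
    query_vec = embed(job_description)
    scored = [(name, cosine(query_vec, embed(text))) for name, text in resumes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical data for illustration only.
vocabulary = ["python", "statistics", "regression", "sales", "marketing"]
resumes = {
    "resume_a.txt": "python statistics regression python",
    "resume_b.txt": "sales marketing python",
    "resume_c.txt": "sales marketing",
}
jd = "python statistics regression"
ranking = rank_resumes(jd, resumes, lambda text: toy_embed(text, vocabulary))
for name, score in ranking:
    print(name, round(score, 3))
```

To use the real encoder, the embed argument would be replaced by a function that calls the loaded TensorFlow Hub model and returns its output vector. The ranking logic stays the same.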
SAMPLE JOB DESCRIPTION
Data Scientist/ ML Engineer
We are looking for a data scientist that will help us discover the information hidden in vast amounts of data and help us make smarter decisions to deliver even better products. Your primary focus will be in applying data mining techniques, doing statistical analysis, and building high quality prediction systems integrated with our products. Data Scientist at GameChange must be an energetic self-starter who can quickly grasp the company’s vision, develop specific tactical plans, and begin implementation upon appropriate approvals. The candidate must be resourceful and able to deliver on a plan defined.
Skills required -
- SVM, Decision Forests, CNN, RNN, LSTM etc.
- Experience with common data science toolkits, such as R, Weka, NumPy, OpenCV, MatLab, etc.
- Great communication skills
- Good applied statistics skills, such as distributions, statistical testing, regression, etc.
- Experience with data visualisation tools, such as D3.js, GGplot, etc.
Job Type: Full-time
Salary: ₹2,000,000.00 - ₹2,500,000.00 per year
Experience:
Data Science: 2 years (Preferred)
work: 1 year (Preferred)
Education:
import docx2txt
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"
model = hub.load(module_url)
print("module %s loaded" % module_url)

def cosine(u, v):
    # Flatten the (1, 512) model outputs so the result is a scalar.
    u, v = np.ravel(u), np.ravel(v)
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

query = docx2txt.process("/content/sample_data/Data Scientist.docx")
query_vec = model([query])  # embed the job description once

# files1 is assumed to hold the paths of the plain-text resume files.
file_name, org_list = [], []
for single_file1 in files1:
    with open(single_file1, 'r') as f1:
        sentences_list = [f1.read()]  # assumption: one text block per resume
    for sent in sentences_list:
        sim = cosine(query_vec, model([sent]))
        file_name.append(single_file1)
        org_list.append(sim)

# Pair each file with its score and sort by score, highest first.
mapped = list(zip(file_name, org_list))
res = sorted(mapped, key=lambda x: x[1], reverse=True)
for i in res:
    print(i)
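The text-extraction step depends on the resume file format. The block below is a minimal sketch of one way to handle it, assuming the resumes are either .txt or .docx files in a single folder. The helper names load_resume_text and collect_resumes are hypothetical, and docx2txt.process is the same call this article uses to read the job description. Other formats (PDF, for example) are simply skipped here.

```python
import os
import tempfile


def load_resume_text(path):
    """Return the text of a .txt or .docx resume; raise ValueError otherwise."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".docx":
        # Third-party package, imported lazily so .txt-only runs work without it.
        import docx2txt
        return docx2txt.process(path)
    if ext == ".txt":
        with open(path, "r", encoding="utf-8", errors="ignore") as handle:
            return handle.read()
    raise ValueError(f"unsupported resume format: {ext!r}")


def collect_resumes(folder):
    """Map each readable resume path in the folder to its extracted text."""
    texts = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        try:
            texts[path] = load_resume_text(path)
        except ValueError:
            continue  # skip formats this sketch does not handle
    return texts


# Small demonstration with throwaway files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "resume_a.txt"), "w", encoding="utf-8") as handle:
        handle.write("Data scientist with Python experience")
    with open(os.path.join(folder, "notes.pdf"), "w", encoding="utf-8") as handle:
        handle.write("not handled by this sketch")
    resumes = collect_resumes(folder)
    loaded_names = [os.path.basename(path) for path in resumes]

print(loaded_names)
```

The dictionary returned by collect_resumes can then feed the similarity loop: its keys play the role of the file names and its values are the texts to embed.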