Building a Knowledge Graph for Job Search using BERT Transformer

A guide on how to create a knowledge graph using NER and relation extraction

Walid Amamou
May 17 · 8 min read
Knowledge Graph Network

Introduction:

LinkedIn job recommendations
Job analysis pipeline

Data Extraction:

# Assumes `nlp` (NER) and `nlp2` (relation extraction) are spaCy pipelines
# loaded earlier, and that the relation component fills `doc._.rel`.
def analyze(text):
    skills = []
    experience_year = []
    experience_skills = []
    diploma = []
    diploma_major = []
    for doc in nlp.pipe(text, disable=["tagger"]):
        skills = [e.text for e in doc.ents if e.label_ == 'SKILLS']
        # Run the relation extraction components on the NER output
        for name, proc in nlp2.pipeline:
            doc = proc(doc)
        # doc._.rel maps (head start, child start) to per-label scores
        for value, rel_dict in doc._.rel.items():
            for e in doc.ents:
                for b in doc.ents:
                    if e.start == value[0] and b.start == value[1]:
                        if rel_dict['EXPERIENCE_IN'] >= 0.9:
                            experience_skills.append(b.text)
                            experience_year.append(e.text)
                        if rel_dict['DEGREE_IN'] >= 0.9:
                            diploma_major.append(b.text)
                            diploma.append(e.text)
    return skills, experience_skills, experience_year, diploma, diploma_major

import json

def analyze_jobs(item):
    records = []
    for i, row in enumerate(item['Description']):
        try:
            skill, experience_skills, experience_year, diploma, diploma_major = analyze([row])
            records.append({
                'Job ID': item['JOBID'][i], 'Title': item['Title'][i],
                'Location': item['Location'][i], 'Link': item['Link'][i],
                'Category': item['Category'][i], 'document': row,
                'skills': skill, 'experience skills': experience_skills,
                'experience years': experience_year,
                'diploma': diploma, 'diploma_major': diploma_major,
            })
        except Exception:
            # Skip rows that fail to parse instead of aborting the whole run
            continue
    # Dump the list in one call so the file is valid JSON (no trailing comma)
    with open('./path_to_job_descriptions', 'w', encoding='utf-8') as file:
        json.dump(records, file, ensure_ascii=False)

# `path` is assumed to hold the job table with the columns indexed above
analyze_jobs(path)
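
The nested loops in analyze pair every entity with every other entity just to find the two spans a relation refers to. As a minimal sketch of the same thresholding step, the version below looks entities up by their start token instead. It assumes the relation scores have the shape the code above implies, a dict keyed by (head start, child start) with one score per label. The function name pair_relations, the ents_by_start mapping and the sample scores are hypothetical and exist only for illustration.

```python
def pair_relations(rel, ents_by_start, threshold=0.9):
    """Collect (head_text, child_text) pairs per label whose score meets the threshold.

    rel: dict mapping (head_start, child_start) -> {label: score}
    ents_by_start: dict mapping an entity's start token index -> entity text
    """
    pairs = {}
    for (head_start, child_start), scores in rel.items():
        head = ents_by_start.get(head_start)
        child = ents_by_start.get(child_start)
        if head is None or child is None:
            # The relation points at a span that is not a known entity
            continue
        for label, score in scores.items():
            if score >= threshold:
                pairs.setdefault(label, []).append((head, child))
    return pairs


# Hypothetical scores, made up for illustration only
rel = {
    (0, 5): {"EXPERIENCE_IN": 0.97, "DEGREE_IN": 0.01},
    (7, 9): {"EXPERIENCE_IN": 0.02, "DEGREE_IN": 0.95},
    (0, 9): {"EXPERIENCE_IN": 0.10, "DEGREE_IN": 0.03},
}
ents_by_start = {0: "5+ years", 5: "Python", 7: "Master", 9: "Computer Science"}

result = pair_relations(rel, ents_by_start)
print(result)
```

Only the first two pairs clear the 0.9 threshold, so the low-scoring (0, 9) pair is dropped. With spaCy, a mapping like ents_by_start could be built from doc.ents as {e.start: e.text for e in doc.ents}.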

Data Exploration:

Diploma distribution across multiple fields
Diploma major distribution
# Diploma
('Master', 54), ('PHD', 49), ('Bachelor', 19)
# Diploma major
('Computer Science', 36), ('engineering', 12), ('Machine Learning', 9), ('Statistics', 8), ('AI', 6)
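
Counts like the ones above can be produced with a plain frequency tally over the extracted records. The sketch below is one way to do it, assuming each record is a dict with the 'diploma' and 'diploma_major' lists written by analyze_jobs. The helper name field_distribution and the sample records are hypothetical.

```python
from collections import Counter


def field_distribution(records, field):
    """Count how often each value of a list-valued field appears across records."""
    counts = Counter()
    for record in records:
        counts.update(record.get(field, []))
    # most_common() sorts by count, highest first
    return counts.most_common()


# Made-up records in the shape written by analyze_jobs
records = [
    {"diploma": ["Master", "PHD"], "diploma_major": ["Computer Science"]},
    {"diploma": ["Master"], "diploma_major": ["Statistics", "Computer Science"]},
    {"diploma": [], "diploma_major": []},
]

diploma_counts = field_distribution(records, "diploma")
major_counts = field_distribution(records, "diploma_major")
print(diploma_counts)
print(major_counts)
```

Values are counted exactly as extracted, so spellings that differ only in case (for example 'engineering' and 'Engineering') would be tallied separately unless they are normalized first.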

Knowledge Graph

from pyvis.network import Network

# data_graph and data_graph_resume are assumed to be tables (for example
# pandas DataFrames) built beforehand from the extraction output.
job_net = Network(height='1000px', width='100%', bgcolor='#222222', font_color='white')
job_net.barnes_hut()

sources = data_graph['Job ID']
targets = data_graph['skills']
values = data_graph['years skills']
sources_resume = data_graph_resume['document']
targets_resume = data_graph_resume['skills']

edge_data = zip(sources, targets, values)
resume_edge = zip(sources_resume, targets_resume)

# Job -> skill edges, colored by the years of experience required
for src, dst, w in edge_data:
    job_net.add_node(src, src, color='#dd4b39', title=src)
    job_net.add_node(dst, dst, title=dst)
    if str(w).isdigit():
        w = int(w)
        if w <= 1:
            # Assumption: green marks one year or less
            color = '#00ff1e'
        elif w <= 5:
            color = '#FFFF00'
        else:
            color = '#dd4b39'
        job_net.add_edge(src, dst, value=w, color=color, label=str(w))
    else:
        # No usable year count: draw a thin dashed edge
        job_net.add_edge(src, dst, value=0.1, dashes=True)

# Resume -> skill edges, all attached to a single 'resume' node
for _, dst in resume_edge:
    src = 'resume'
    job_net.add_node(src, src, color='#dd4b39', title=src)
    job_net.add_node(dst, dst, title=dst)
    job_net.add_edge(src, dst, color='#00ff1e')

# Add neighbor data to node hover data
neighbor_map = job_net.get_adj_list()
for node in job_net.nodes:
    node['title'] += ' Neighbors:<br>' + '<br>'.join(neighbor_map[node['id']])
    node['value'] = len(neighbor_map[node['id']])

job_net.show_buttons(filter_=['physics'])
job_net.show('job_knowledge_graph.html')
Knowledge graph
# JOB ID              # Connections
GO4919194241794048    7
GO5957370192396288    7
GO5859529717907456    7
GO5266284713148416    7
FB189313482022978     7
FB386661248778231     7
Knowledge graph of the highest job matches
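
A ranking like the one above can be approximated without the graph by counting, for each job, how many of its skills also appear in the resume. This is a minimal sketch under the assumption that a "connection" means a skill node shared by the job and the resume. The function name, the job IDs and the skill lists are hypothetical.

```python
def rank_jobs_by_shared_skills(job_skills, resume_skills):
    """Rank job IDs by the number of skills they share with the resume.

    job_skills: dict mapping job ID -> list of skill strings
    resume_skills: list of skill strings extracted from the resume
    """
    resume = {skill.lower() for skill in resume_skills}
    scores = {
        job_id: len({skill.lower() for skill in skills} & resume)
        for job_id, skills in job_skills.items()
    }
    # Highest overlap first; ties broken alphabetically by job ID
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up jobs and resume for illustration
job_skills = {
    "JOB_A": ["Python", "SQL", "Spark"],
    "JOB_B": ["python", "NLP"],
    "JOB_C": ["Java"],
}
resume_skills = ["Python", "NLP", "SQL"]

ranking = rank_jobs_by_shared_skills(job_skills, resume_skills)
print(ranking)
```

Skills are lowercased before comparison so that "Python" and "python" count as the same node, which a graph keyed on raw strings would not do.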

Skills Augmentation

Word cloud of skills in software engineering
Word cloud of skills in hardware engineering
Word cloud of skills in research
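
One possible reading of skills augmentation is to list the skills that are frequent in a job category but absent from the resume. The sketch below does that with the same per-category skill frequencies a word cloud would be drawn from. It assumes records carry the 'Category' and 'skills' keys written by analyze_jobs. The function name missing_skills and the sample data are hypothetical.

```python
from collections import Counter


def missing_skills(records, category, resume_skills, top_n=5):
    """Return the most frequent skills in a category that the resume lacks."""
    have = {skill.lower() for skill in resume_skills}
    counts = Counter(
        skill.lower()
        for record in records
        if record.get("Category") == category
        for skill in record.get("skills", [])
    )
    # Keep frequency order, drop skills the resume already lists
    gaps = [(skill, n) for skill, n in counts.most_common() if skill not in have]
    return gaps[:top_n]


# Made-up records for illustration
records = [
    {"Category": "Software Engineering", "skills": ["Python", "Kubernetes", "Java"]},
    {"Category": "Software Engineering", "skills": ["Java", "Kubernetes"]},
    {"Category": "Research", "skills": ["PyTorch"]},
]

gaps = missing_skills(records, "Software Engineering", ["Python"])
print(gaps)
```

The counts returned for a category could also be passed to a word cloud library as frequencies, though that step is not shown here.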

Conclusion:

MLearning.ai

Data Scientists must think like an artist when finding a solution

Walid Amamou

Written by

Founder of UBIAI, annotation tool for NLP applications | PhD in Physics.
