Week 1 — This is the way!

Baran Orhan
AIN311 Fall 2022 Projects
Nov 13, 2022

by Erdem Korhan Erdem and Baran Orhan

Greetings, and get ready for a unique course recommendation system.

We are introducing a new recommendation system for the droids out there. Hey, human, I know you are reading this. No worries. We have the same courses for you.

What to expect on this journey:

  • Web Scraping
  • NLP usage
  • Named Entity Normalization (NEN) and Named Entity Recognition (NER)
  • Related Works

Why do people need it?

  • We know you are bored of those capitalist course advertisements and those meaningless recommendations based on user ratings (you probably never even thought about it, but come on, we need a machine learning task to work on).
  • Have you ever wondered which skills the engineers currently working in your dream field have? Would it not be nice to have a system that recommends courses so that you can gain the skills of people already employed in that field? We mainly mean engineers working in Silicon Valley.

We know what you are thinking. THIS IS THE WAY!!

How?

In our project, we will try to develop a course recommendation system that uses CS engineers’ skills and course outcomes as its primary data sources. The main idea is to extract skills from course descriptions and compare them with the engineers’ skills to make a meaningful recommendation. For the skill extraction task, we plan to use two NLP techniques: Named Entity Recognition (NER), which finds skill mentions in text, and Named Entity Normalization (NEN), which maps different spellings of the same skill to one canonical name.
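To make the idea concrete, here is a minimal sketch of the extract-and-compare pipeline. It is not our final method: a hand-written alias table stands in for real NER and NEN models, and every name in it (`SKILL_ALIASES`, `extract_skills`, `rank_courses`, the sample engineers and courses) is a made-up assumption for illustration.

```python
import re
from collections import Counter

# Hypothetical alias table: surface form -> canonical skill name.
# This lookup is a toy stand-in for the NER + NEN steps.
SKILL_ALIASES = {
    "js": "javascript",
    "javascript": "javascript",
    "reactjs": "react",
    "react": "react",
    "k8s": "kubernetes",
    "kubernetes": "kubernetes",
    "docker": "docker",
    "python": "python",
    "sql": "sql",
}

def extract_skills(text):
    """Return the set of canonical skills mentioned in a free-text description."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return {SKILL_ALIASES[t] for t in tokens if t in SKILL_ALIASES}

def rank_courses(courses, engineer_profiles):
    """Score each course by how often its skills appear across engineer profiles."""
    demand = Counter(
        skill for profile in engineer_profiles for skill in extract_skills(profile)
    )
    scores = {
        title: sum(demand[s] for s in extract_skills(desc))
        for title, desc in courses.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up sample data, not scraped from anywhere.
engineers = ["Python, SQL, Docker", "python and k8s", "JS, ReactJS"]
courses = {
    "Backend Bootcamp": "Learn Python and SQL, then deploy with Docker.",
    "Frontend Basics": "Build user interfaces with React and JavaScript.",
}
ranking = rank_courses(courses, engineers)
print(ranking)
```

With this toy data, the backend course ranks first because its skills show up more often in the engineer profiles. A real system would replace the alias table with trained models and would probably weight skills per field.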

Challenging parts

  • Extracting skills from the raw text that Udemy course pages provide
  • Creating our own dataset. There is no publicly available dataset suitable for our task, so we are going to collect the data ourselves by web scraping. We will collect the skills of nearly 1000 engineers working in specific fields (backend, frontend, data science, DevOps, QA) and the outcome sections of online courses.
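As a rough idea of what the scraping step could look like, the sketch below pulls the outcome items out of an HTML snippet using only Python's standard library. The markup is hypothetical: the `outcomes` class name and the page structure are assumptions made for this example, not the actual layout of Udemy or any other site, and the `OutcomeParser` class is our own illustrative helper.

```python
from html.parser import HTMLParser

class OutcomeParser(HTMLParser):
    """Collect the text of <li> items inside <ul class="outcomes"> (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.in_list = False
        self.in_item = False
        self.outcomes = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "ul" and ("class", "outcomes") in attrs:
            self.in_list = True
        elif tag == "li" and self.in_list:
            self.in_item = True
            self.outcomes.append("")

    def handle_endtag(self, tag):
        if tag == "ul":
            self.in_list = False
        elif tag == "li":
            self.in_item = False

    def handle_data(self, data):
        # Only keep text that sits inside a tracked list item
        if self.in_item:
            self.outcomes[-1] += data

# Made-up page fragment standing in for a downloaded course page.
page = """
<h2>What you'll learn</h2>
<ul class="outcomes">
  <li>Build REST APIs with Python</li>
  <li>Write SQL queries</li>
</ul>
<ul><li>Unrelated footer link</li></ul>
"""
parser = OutcomeParser()
parser.feed(page)
outcomes = [item.strip() for item in parser.outcomes]
print(outcomes)
```

The parser keeps only the items from the outcomes list and ignores the unrelated list that follows. On real pages, the selectors would have to be adapted after inspecting the actual HTML, and the site's terms of use should be checked first.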

Next week, we plan to research NER and NEN techniques and how to collect the data from websites.

This is my Halloween costume as IG-11

End of the first week. See you soon, stranger!
