Week 4 — This is the way.

Erdem Korhan Erdem
AIN311 Fall 2022 Projects
2 min read · Dec 12, 2022

by Baran Orhan and Erdem Korhan Erdem

Welcome back! We hope you are not bored of our data collection progress yet. Unsurprisingly, we are.

a tragicomic meme explaining our progress [1]

As mentioned in Week 2, we are still setting up the necessary data. We have finished scraping the courses’ outcomes. However, thanks to LinkedIn, we cannot scrape engineers’ skills as quickly as the course outcomes because we keep getting banned. Each ban forces us to create a new fake account to scrape the rest, which slows down our progress. But no worries, we are eager to complete this process as quickly as possible.

Labeling Data

Besides finishing the skills data collection, we also need to label the course outcomes according to the BIO (beginning-inside-outside) tagging scheme. Here is an example of the BIO tagging scheme applied to another dataset.

BIO tagging scheme [2]

In this labeling technique, a word labeled B marks the beginning of an entity, a word labeled I lies inside the entity, and a word labeled O does not belong to any entity. Since we have finished scraping the course outcomes, we are now going to label them by hand accordingly. Very exciting, huh? :D
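To show how the three letters work together, here is a minimal sketch that reads a BIO-tagged sequence back into entities. The function name `bio_to_entities`, the example sentence, and the PER/LOC tags are made up for illustration and do not come from our dataset.

```python
def bio_to_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) pairs."""
    entities = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag continues the entity opened by the preceding B- tag.
            current_tokens.append(token)
        else:
            # O (or a stray I- tag) ends whatever entity was open.
            if current_tokens:
                entities.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# Made-up example sentence, not taken from our dataset.
tokens = ["Alex", "Smith", "moved", "to", "New", "York"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC"]
print(bio_to_entities(tokens, tags))
```

The B tag is what lets two entities of the same type sit next to each other without being merged into one.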

Here is a sample course outcome labeled with the BIO tagging scheme:

The description above was taken from a backend course, so the complete labels look like B-Backend, I-Backend, etc.
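To make the label format concrete, here is a small sketch of how a hand-marked span could be turned into one tag per word. The outcome sentence, the marked span, and the helper name `spans_to_bio` are invented for illustration; only the B-Backend / I-Backend label names come from our setup.

```python
def spans_to_bio(tokens, spans):
    """Turn (start, end, label) token spans into one BIO tag per token.

    `end` is exclusive, and spans are assumed not to overlap.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical course outcome, invented for illustration.
tokens = "Students will be able to build REST APIs".split()
spans = [(6, 8, "Backend")]  # covers the words "REST APIs"
for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

Storing the annotation as spans and generating the tags from them is one way to keep hand labeling less error-prone than typing a tag for every word.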

Possible Model Architectures

We are also researching which architecture to use for our model. According to several sources, most models developed for Named Entity Recognition rely on RNNs and LSTMs, so we will most likely use these approaches as well.

RNN visualization [3]
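To give a feeling for why recurrent models suit this task, here is a rough sketch of a plain (Elman) RNN that emits one tag per word. This is not our model: the weights are random and untrained, so the predicted tags are meaningless, and the sentence, tag set, hidden size, and function names are all assumptions chosen for illustration. It uses only the Python standard library.

```python
import math
import random

TAGS = ["O", "B-Backend", "I-Backend"]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_tag(tokens, vocab, hidden_size=4, seed=0):
    """Run an untrained Elman RNN over the tokens and pick one tag per token."""
    rng = random.Random(seed)
    w_xh = random_matrix(hidden_size, len(vocab), rng)   # input -> hidden
    w_hh = random_matrix(hidden_size, hidden_size, rng)  # hidden -> hidden
    w_hy = random_matrix(len(TAGS), hidden_size, rng)    # hidden -> tag scores
    hidden = [0.0] * hidden_size
    predicted = []
    for token in tokens:
        # One-hot encode the current word.
        x = [1.0 if i == vocab[token] else 0.0 for i in range(len(vocab))]
        # h_t = tanh(W_xh x_t + W_hh h_{t-1}): the hidden state carries
        # information about earlier words forward to the current one.
        hidden = [
            math.tanh(a + b)
            for a, b in zip(matvec(w_xh, x), matvec(w_hh, hidden))
        ]
        scores = matvec(w_hy, hidden)
        predicted.append(TAGS[scores.index(max(scores))])
    return predicted


# Hypothetical course outcome, invented for illustration.
tokens = "Students will be able to build REST APIs".split()
vocab = {token: i for i, token in enumerate(sorted(set(tokens)))}
print(list(zip(tokens, rnn_tag(tokens, vocab))))
```

The point of the sketch is the loop: each word's tag depends on the hidden state, which in turn depends on every word seen before it. An LSTM keeps the same per-word structure but replaces the single tanh update with gated updates.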

Before the progress report is due, we plan to resolve the data-related issues and build a simple model to get a first baseline performance. Thanks for your time, and see you in next week’s blog.

We hope that this is the way :)

Erdem Korhan Erdem is an Artificial Intelligence Engineering senior student at Hacettepe University.