Photo by Olav Ahrens Røtne on Unsplash

INDEX : Step by step approach

SECTION 1 : Understanding the problem & data.
1. Detailed overview
2. The business problem
3. About the dataset
4. Exploratory data analysis and pre-processing
-----------------------------------------------------
SECTION 2 : The action plan.
5. Evaluation metric
6. Loss function
7. Baseline model
7.1. K-Fold cross validation
7.2. Post-processing : binning
7.3. Error Analysis
7.3.1. Why are these features not performing well?
7.3.2. Possible workarounds
7.3.3. Limitations with current LSTM model
8. Model with SOTA pretrained embeddings
8.1. BERT
8.2. USE
8.3. XLNet
8.4. RoBERTa
-----------------------------------------------------
SECTION 3 : Inferences and analysis.
9. Final results
9.1. Difference between baseline model and final_model.
9.2. …

Photo by cottonbro from Pexels

Let’s take a quick look at Stack Overflow before we dive into the project itself. Stack Overflow is one of the largest Q&A platforms for computer programmers. People post questions on a wide range of topics (mostly related to computer programming), and fellow users try to resolve them as helpfully as possible.

INDEX : Step by step approach

SECTION 1 : Brief overview
1. Business problem : Need for a search engine.
2.1. Dataset
2.2. The process flow
2.3. High level Overview
3. Exploratory data analysis and Data pre-processing
-----------------------------------------------------
SECTION 2 : The attack plan

4. Modelling : The tag predictor
4.1. A TAG Predictor model
4.2. TRAIN_TEST_SPLIT
4.3. Time based splitting Modelling
4.4. GRU based Encoder-decoder seq2seq model
4.5. Model embedding
4.6. Word2Vec embedding
4.7. Multi-label target problem
5. LDA (Latent Dirichlet allocation) : Topic Modelling
6. Okapi BM25 Score : simplest searching technique
7. Sentence embedding : BERT
8. Sentence embedding : Universal sentence encoder
-----------------------------------------------------

SECTION 3 : Productionizing the solution
9. Entire pipeline deployment on a remote server
9.1. A Cloud platform
9.2. Web App using Flask
-----------------------------------------------------
SECTION 4 : Results and conclusion

10. Results and conclusion
10.1. Final Results : BERT
10.2. Final Results : USE
10.3. …

About

Akshay Vispute

NLP, computer vision enthusiast
