Experienced Data Scientist, NLP Interview Questions, Journey & Process

Shweta Gargade
3 min read · Nov 30, 2022

Learning data science is easy, but preparing for an interview is not. During a data science interview, the interviewer will ask a wide range of questions, including ones you never expected.

If you are wondering how to prepare for an interview, this series is for you. I’ll share my own journey through data science job interviews. In this series I’ll cover the following topics: Python, machine learning, deep learning, NLP, statistics, and many more. Along with the questions, I’ll mention the interview steps and the company name.

Company name: CitiusTech, Duration: 60 mins

Part 1: Introduction about yourself and your data science journey — 5 mins

Part 2: Explain your projects in detail — 30 mins. The discussion included questions about the end-to-end project pipeline, challenges, and solutions.

Part 3: Theoretical questions — 20 mins
1. What are the characteristics of the normal distribution?
2. What is the central limit theorem?
3. What percentage of the data lies within 2 sigma of the mean in a normal distribution?
4. What is a p-value?
5. What is a type I error?
6. Why are the mean and variance of the Poisson distribution the same?
7. If we have a low-variance variable and a high-variance variable, which one do you think is better for a machine learning model?
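
For questions 3 and 6 above, a quick numerical check can be useful while preparing. The following is a minimal sketch that uses only the Python standard library. The function names `coverage_within` and `poisson_mean_and_variance` and the rate value 4.0 are illustrative assumptions, not something from the interview.

```python
import math
from statistics import NormalDist


def coverage_within(k_sigma: float) -> float:
    """Probability mass of a normal distribution within k_sigma standard deviations of the mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k_sigma) - std_normal.cdf(-k_sigma)


def poisson_mean_and_variance(lam: float, max_k: int = 200):
    """Mean and variance computed directly from the Poisson pmf, truncated at max_k."""
    pmf = math.exp(-lam)  # P(X = 0)
    mean = 0.0
    second_moment = 0.0
    for k in range(1, max_k + 1):
        pmf *= lam / k  # P(X = k) obtained from P(X = k - 1)
        mean += k * pmf
        second_moment += k * k * pmf
    return mean, second_moment - mean ** 2


print(f"within 2 sigma: {coverage_within(2):.4f}")
mean, variance = poisson_mean_and_variance(4.0)  # 4.0 is an arbitrary example rate
print(f"Poisson(4): mean={mean:.4f}, variance={variance:.4f}")
```

The first value comes out at roughly 0.9545, which is the usual "about 95%" rule of thumb, and the second line shows the mean and variance both matching the rate parameter.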

Part 4: Code Understanding — 5 mins
Understand the given code and explain it to me: what is written here, and what will the output of the entire code be? (A machine learning project with a DumbModel class containing various user-defined functions.)

Part 5: If you have any questions — 5 mins
Discussed the role, job responsibilities, and tech stack.

Company name: Retail Capital, Duration: 60 mins

Part 1: Introduction and explain your data science projects — 5 mins

Part 2: Project discussion, data understanding, understanding of healthcare terms, deployment

Part 3: Theoretical questions
1. What is multi-label classification, and how is it different from multi-class classification?
2. Can we build a multi-label classifier using Random Forest?
3. What is the BERT model?
4. What are knowledge graphs?
5. What are undirected, unidirectional, and bidirectional graphs?
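
Questions 1, 2, and 5 above are easier to reason about with concrete data structures. Below is a minimal standard-library sketch, not a definitive answer: the helpers `to_indicator_matrix` and `build_adjacency` and the label names are hypothetical. It shows how a multi-label target differs from a multi-class one, and how a directed edge differs from an undirected one.

```python
from collections import defaultdict


def to_indicator_matrix(label_sets, classes):
    """Encode each sample's set of labels as a 0/1 row, one column per class."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


def build_adjacency(edges, directed):
    """Adjacency sets; an undirected edge is stored in both directions."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
        if not directed:
            adjacency[target].add(source)
    return dict(adjacency)


classes = ["cardiology", "oncology", "radiology"]  # hypothetical label names

# Multi-class: exactly one label per sample.
multi_class_y = ["oncology", "cardiology"]

# Multi-label: any number of labels per sample (including none), as indicator rows.
multi_label_y = to_indicator_matrix([{"oncology", "radiology"}, set()], classes)
print(multi_label_y)  # [[0, 1, 1], [0, 0, 0]]

edges = [("A", "B"), ("B", "C")]
directed_graph = build_adjacency(edges, directed=True)     # A -> B and B -> C only
undirected_graph = build_adjacency(edges, directed=False)  # every edge usable both ways
print(directed_graph)
print(undirected_graph)
```

On the Random Forest question: scikit-learn's `RandomForestClassifier` is documented as accepting a two-dimensional target, so an indicator matrix shaped like `multi_label_y` is one common way to train a multi-label model with it. A bidirectional relation in a directed graph can be represented as two opposite directed edges.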

Part 4: Python coding

Company name: Prisonforce.ai, Duration: 60 mins

Part 1: Introduction and explain your data science projects — 5 mins

Part 2: Theoretical questions — 25 mins
1. What is time series analysis?
2. What is an auto-regressive (AR) model?
3. Give me a real-life example of moving average and AR.
4. What is the sigmoid function in logistic regression?
5. What if we use some other activation function in logistic regression? What output should we expect?
6. Explain the LSTM architecture.
7. What is the range of the tanh function?
8. What if we interchange the activation functions in an LSTM (sigmoid with tanh and tanh with sigmoid)? What output should we expect, and is it wrong?
9. What is the central limit theorem?
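
Several of the questions above (AR, moving average, sigmoid, and the range of tanh) can be explored with a few lines of code. This is a minimal standard-library sketch under stated assumptions: the function names are illustrative, `rolling_mean` is a simple smoothing average (which is not the same thing as an MA(q) model), and the noise values are hand-picked so the output is deterministic.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_ar1(phi, noise, x0=0.0):
    """AR(1) process: x_t = phi * x_{t-1} + noise_t."""
    series = []
    previous = x0
    for shock in noise:
        previous = phi * previous + shock
        series.append(previous)
    return series


def rolling_mean(values, window):
    """Simple moving average over a fixed-size trailing window."""
    return [
        sum(values[i - window:i]) / window
        for i in range(window, len(values) + 1)
    ]


# Sigmoid outputs stay between 0 and 1, tanh outputs stay between -1 and 1.
for z in (-5.0, 0.0, 5.0):
    print(f"z={z:+.1f}  sigmoid={sigmoid(z):.4f}  tanh={math.tanh(z):+.4f}")

# A single shock of 1.0 decays geometrically when phi = 0.5.
print(simulate_ar1(phi=0.5, noise=[1.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25]
print(rolling_mean([1, 2, 3, 4, 5], window=3))       # [2.0, 3.0, 4.0]
```

The different output ranges are a useful starting point for the questions about swapping activation functions: sigmoid values behave like gates or probabilities, while tanh values are centered on zero.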

Part 3: Fundamental questions — 15 mins
1. Explain how NER typically works.
2. What is the difference between traditional NER and spaCy?
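
One common way to frame NER is as sequence labeling, where a model assigns each token a BIO tag (B = beginning of an entity, I = inside, O = outside) and the tags are then grouped into entity spans. The sketch below shows only that grouping step. It assumes the tags have already been predicted; the sentence, tags, and the `bio_to_spans` helper are hypothetical and hand-written for illustration.

```python
def bio_to_spans(tokens, tags):
    """Group B-/I- tagged tokens into (entity_text, label) pairs."""
    spans = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that was still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continue the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


tokens = ["Ada", "Lovelace", "worked", "in", "London"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]  # hand-written, not model output
print(bio_to_spans(tokens, tags))
# [('Ada Lovelace', 'PER'), ('London', 'LOC')]
```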

Part 4: Probability questions — 5 mins
1. Let’s assume we initially have just one ball in a bag (and we don’t know its color). We then add a red ball to the bag and shuffle it. What is the probability of drawing a red ball from the bag?
2. If we have the numbers from 0 to infinity, what is the probability that a number is divisible by 120?
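
Both probability questions depend on assumptions that are worth stating out loud in an interview. The sketch below is one possible reading, not the expected answer: for the first question it assumes the unknown ball is red with some prior probability (1/2 in the example), and for the second it interprets "probability" as the share of integers below a large bound, since a uniform distribution over all non-negative integers does not exist. The function names are illustrative.

```python
from fractions import Fraction


def prob_draw_red(prior_first_is_red: Fraction) -> Fraction:
    """Probability of drawing red after adding one red ball to a bag with one unknown ball."""
    # If the unknown ball is red, both balls are red, so the draw is red for sure.
    # Otherwise exactly one of the two balls is red, so the draw is red half the time.
    return prior_first_is_red * 1 + (1 - prior_first_is_red) * Fraction(1, 2)


def fraction_divisible(divisor: int, upper: int) -> Fraction:
    """Share of the integers 0, 1, ..., upper - 1 that are divisible by divisor."""
    count = (upper - 1) // divisor + 1  # multiples: 0, divisor, 2 * divisor, ...
    return Fraction(count, upper)


# Assumed prior: the unknown ball is equally likely to be red or not red.
print(prob_draw_red(Fraction(1, 2)))    # 3/4

# The share approaches 1/120 as the upper bound grows.
print(fraction_divisible(120, 1_000))    # 9/1000
print(fraction_divisible(120, 120_000))  # 1/120
```

A different prior for the unknown ball changes the first answer, which is why stating the assumption matters more than the number itself.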

I will add interview process questions from other companies in upcoming parts of this series. Stay tuned!

Shweta Gargade

Senior Data Scientist | NLP & Speech Researcher | Helping Freshers | LinkedIn:@shwetagargade