How to Find a Data Scientist Job Part II
3 min read · Sep 9, 2020
Recently I was interviewed by around ten companies for a data scientist position. I have written Part I and Part III of this series, sharing tips about interviewing and job searching. Here I’d like to share some of the questions that came up most often during those interviews. I have also linked to some answers.
Technical Questions
- Linear/logistic regression. Focus on the details, such as: what are the assumptions of linear regression? What’s the loss function for logistic regression?
- Ridge Regression and LASSO. What’s the difference between them? What would the weights look like in each case?
- PCA. Please explain how PCA works. The hyperlink points to a PDF of the Stanford CS229 Machine Learning lecture notes, which walk through PCA in detail.
- Cross-Validation. How would you use Cross-Validation?
- Bias and variance. How do you find the balance between bias and variance?
- Ensemble methods. What’s the difference between bagging and boosting? How do random forest and XGBoost work? In which algorithm are the decision trees deeper, and why?
- K-means. Explain how the K-means algorithm works.
- Learning Rate. What will happen if the learning rate is too big/small?
- Central limit theorem. What’s your understanding of the Central limit theorem?
- Covariance, Correlation coefficient, and R². What’s the relationship between them?
- Bayes’ theorem. Typically there will be a question asking you to calculate a posterior probability. What’s the equation of Bayes’ theorem, and how are the prior and the posterior defined?
- Sampling. Given access to a uniform random number generator over [0, 1], how would you generate a sample from a particular (absolutely continuous for simplicity) distribution?
- Hypothesis testing. Khan Academy provides a great video series about this topic.
- Confidence Interval. You can also gain related knowledge from Khan Academy.
- Cloud computing. Do you have exposure to cloud computing? Can you explain cloud computing to non-tech staff?
- Error Metric. What’s the difference between MSE and MAE? If the prediction must be a single constant, which constant is the best choice under MSE, and which under MAE? The link directs to a Coursera notebook. You can find the detailed derivation in “Metrics_video2_constants_for_MSE_and_MAE.ipynb”.
- Neural networks. Describe a neural network structure that you are familiar with.
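For the Bayes’ theorem question, the calculation usually has the same shape every time. Here is a minimal sketch in Python. The function name and the numbers (a 1% prior, a 95% true positive rate, a 5% false positive rate) are made up for illustration and do not come from any particular interview.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H given evidence E.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Total probability of the evidence:
    # P(E) = P(E | H) P(H) + P(E | not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(H | E) = P(E | H) P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical numbers: a rare condition and a fairly accurate test.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 4))
```

With these assumed numbers the posterior comes out at roughly 16%, far below the 95% many people guess, because the prior is so small.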
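For the sampling question, one common answer is inverse transform sampling: draw u from the uniform generator and return the inverse CDF evaluated at u. Below is a small sketch that uses the exponential distribution as the target, since its inverse CDF has a closed form. The choice of distribution, the rate of 2.0, and the function name are my own assumptions for the example.

```python
import math
import random


def sample_exponential(rate, rng=random):
    """Draw one sample from Exponential(rate) by inverse transform sampling."""
    u = rng.random()  # uniform on [0, 1)
    # CDF: F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -ln(1 - u) / rate.
    # 1 - u lies in (0, 1], so the logarithm is always defined.
    return -math.log(1.0 - u) / rate


random.seed(0)  # fixed seed so the run is repeatable
samples = [sample_exponential(2.0) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)  # should land close to the true mean 1 / rate = 0.5
```

The same idea works for any continuous distribution whose inverse CDF you can compute, analytically or numerically.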
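For the error metric question, the standard result is that the mean is the best constant under MSE and the median is the best constant under MAE. The sketch below checks this by brute force on a tiny made-up dataset with one outlier. The data and the candidate grid are assumptions chosen only to make the effect visible.

```python
import statistics


def mse(targets, c):
    """Mean squared error of predicting the constant c for every target."""
    return sum((y - c) ** 2 for y in targets) / len(targets)


def mae(targets, c):
    """Mean absolute error of predicting the constant c for every target."""
    return sum(abs(y - c) for y in targets) / len(targets)


targets = [1.0, 2.0, 3.0, 4.0, 100.0]  # made-up data with one outlier
mean_c = statistics.mean(targets)
median_c = statistics.median(targets)

# Brute-force search over candidate constants 0.0, 0.1, ..., 100.0
candidates = [i / 10 for i in range(0, 1001)]
best_mse_c = min(candidates, key=lambda c: mse(targets, c))
best_mae_c = min(candidates, key=lambda c: mae(targets, c))

print("mean:", mean_c, "best constant for MSE:", best_mse_c)
print("median:", median_c, "best constant for MAE:", best_mae_c)
```

The outlier drags the MSE-optimal constant toward it, while the MAE-optimal constant stays with the bulk of the data, which is a handy way to explain why MAE is more robust to outliers.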
Behavioral Questions
- Tell me about yourself/work/project experience.
- Why would you like to work as a data scientist?
- What can you bring to the company if you get the opportunity?
- What’s motivating you to work for our company?
- Which principle of our company do you like most, and why?
- Why did you leave your last job? What is the part that you enjoy the most/least?
- Tell me about a time when you made a mistake.
- Have you ever disagreed with your manager or a senior colleague, and how did you handle it?
- Have you ever worked with someone who was hard to cooperate with, and how did you work with them?
- What do you usually do when you are free? What’s your hobby?
- What would you do given enough money?
- Tell me about a project you’ve done. If you had the chance to do it again, how would you improve it?
Comment if you have interesting questions to be added to this list.