Data Science Interview Questions and Answers

Maha K
3 min read · May 4, 2023

Here are interview questions for a data science position:

  1. What is the difference between Gradient Boosting and Random Forest?
  2. What are the different types of sampling methods that you have used?
  3. Explain the difference between bagging and boosting models.
  4. What is cross-entropy?
  5. What is multicollinearity, and how do you fix it in a regression?
  6. What is the significance of log odds?
  7. Give some problems or scenarios where the MapReduce model works well and where it doesn’t.
  8. Given a dataset with employee ID and manager ID columns, find the employees who are also managers.
  9. A person is using a search engine to find something, and you know nothing about them. How would you design an algorithm that predicts what they need after they have typed only a few letters?

Here are answers:

  1. Gradient Boosting and Random Forest are both ensemble machine learning algorithms used for regression and classification tasks. Random Forest builds multiple decision trees and aggregates their predictions to obtain the final prediction, whereas Gradient Boosting builds decision trees sequentially, where each subsequent tree tries to improve the…
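To make the contrast between independent and sequential tree building concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not a full implementation of either algorithm: the trees are one-split regression stumps on a single feature, the loss is squared error, and the names `fit_stump`, `bagged_stumps`, `boosted_stumps`, and the learning rate `lr` are hypothetical. With only one feature, the per-split feature subsampling that a real Random Forest performs is omitted, so the first ensemble is plain bagging.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D inputs by minimizing squared error."""
    best = None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        t = (lo + hi) / 2  # candidate threshold between two distinct x values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # all x values identical: fall back to the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Bagging: each stump is fit independently on a bootstrap sample; predictions are averaged."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(s(x) for s in stumps) / len(stumps)

def boosted_stumps(xs, ys, n_trees=25, lr=0.5):
    """Boosting (squared error): each stump is fit to the residuals of the ensemble so far."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + lr * sum(s(x) for s in stumps)

def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data (made up for illustration): y = x^2 on the integers 0..9.
xs = list(range(10))
ys = [x * x for x in xs]

single = fit_stump(xs, ys)
bagged = bagged_stumps(xs, ys)
boosted = boosted_stumps(xs, ys)

print("single stump MSE:", mse(single, xs, ys))
print("bagged stumps MSE:", mse(bagged, xs, ys))
print("boosted stumps MSE:", mse(boosted, xs, ys))
```

The structural difference is in the loops: the bagging loop never looks at earlier stumps, so its iterations could run in parallel, while each boosting iteration depends on the predictions accumulated by all previous ones.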
