Machine Learning Interview Questions 2023

Ashesh Nath Mishra
4 min read · Jul 11, 2023

Based on my recent interview experience, I have compiled a list of questions across various machine learning topics. This list contains only questions; I plan to write a separate blog post with the answers.

Machine Learning Interview Questions

  1. What is Regression?
  2. What are the types of regression?
  3. Explain the assumptions of Linear regression.
  4. How to evaluate regression models?
  5. Explain the different types of model testing methods.
  6. Explain the benefits of cross-validation.
  7. What is overfitting?
  8. What are different ways to fix overfitting?
  9. How to handle imbalanced data in a dataset?
  10. What to do if your data is skewed?
  11. If your data has outliers, how do you identify them, and what are the different ways to handle them?
  12. What is the difference between lasso and ridge regression?
  13. What is cross-entropy?
  14. What is the difference between precision and recall?
  15. What is a confusion matrix?
  16. What is AUC?
  17. What is the bias-variance trade-off?
  18. What is correlation? How is it related to covariance?
  19. What are activation functions? Explain the different types. Where and why are they used? Sigmoid vs. tanh vs. ReLU?
  20. What is classification?
  21. How to evaluate classification models?
  22. What are ensemble models?
  23. What is clustering?
  24. What are types of clustering?
  25. How to decide K in k-means clustering? Can we automate it? If yes, how?
  26. What is hypothesis testing?
  27. What is A/B testing?
  28. How does time series work? Explain how to approach a time series problem step by step.
  29. How do you treat seasonality in time series?
  30. What to do if your time series data is not stationary?
  31. What is the difference between ARIMA and SARIMA?
  32. What is autocorrelation?
  33. Can we use deep learning for time series forecasting? If yes, which models to use? How do they work?
  34. What is a null hypothesis?

NLP Questions

  1. What is a word embedding in NLP? What are the types of embeddings? Which one is best to use?
  2. What are n-grams?
  3. What is the bag-of-words model?
  4. What is named entity recognition? Is it a classification or regression model?
  5. How to approach a named entity recognition problem?
  6. How to handle large data size in case of a NER?
  7. What is transformer architecture?
  8. How does BERT work? Explain its architecture.
  9. What is attention? Why is it important?
  10. How to create a model similar to BERT manually?
  11. What is an LLM? How are LLMs designed?
  12. In which cases are LLMs disadvantageous?
  13. What are the benefits of LLMs?
  14. How does ChatGPT (or any generative AI chatbot) remember previous text messages?
  15. When should you use open-source models like Falcon or Llama over ChatGPT?
  16. How does an LLM like ChatGPT work in the background?
  17. Can we create a private LLM like ChatGPT that runs only on on-premises devices? If yes, how?
  18. How to fine-tune an LLM?
  19. How does ChatGPT generate a text response? Can it generate one without any manual intervention? If yes, how?
  20. Let’s say you have created a chatbot. How would you test it? Can you test it without launching it to customers?
  21. How to design conversations in a chatbot?
  22. How do Dialogflow and Rasa work in designing chatbots?
  23. How to evaluate a chatbot?
  24. What metrics to use to monitor a chatbot’s performance once it’s live?
  25. How does a chatbot understand what the user is saying and produce a response?
  26. Let’s say you need to generate a caption from an input image. How would you design the system? What steps and models would you use? How does image embedding work? How is an image converted into input for text generation?
  27. What is topic modelling? What steps and models to use? Given a text corpus of conversations, how would you generate topics from an unlabeled, unsorted corpus?
  28. What is sentiment analysis? How to approach it? Which models to use? How to evaluate it?
  29. How can we create prompt templates and use them in LLMs?
  30. How are vector stores/DBs used to store information for LLMs?

Other ML Questions

I haven’t been asked the following questions yet; I am adding them to cover other important topics.

  1. What is an RNN? Where is it used? What are the drawbacks of RNNs?
  2. What is the vanishing gradient problem, and where does it occur? How to fix it?
  3. What is an LSTM? How does it work? Explain the LSTM architecture.
  4. What are bagging and boosting? Explain XGBoost.
  5. Why do we need an optimizer? Explain the Adam optimizer.
  6. What is LDA in ML?
  7. What is PCA? Why do we use it? What alternatives are there?
  8. What is pruning in decision trees?
  9. What is backpropagation?
  10. What is gradient descent?
  11. What is bootstrapping?
  12. What is conditional probability?
  13. What is word2vec?
  14. What are seq2seq and encoder-decoder neural networks?

I will keep updating this list. Thanks for reading.
