Cheat Sheets for Machine Learning Interview Topics

A couple of years ago, I started applying for internships in Machine Learning and ML system design. By then I had been studying and actively researching ML for a few years, and I was familiar with most of the basic topics. But when I started interviewing, I realized that although I had a general understanding of the topics, I needed a quick review before I could answer questions about them well.

So I decided to refresh my concepts. Because I had to go through the topics again before every interview, I created handwritten notes. Skimming through them was much easier than going through slides and book chapters, and it gave my understanding a quick boost in a short amount of time. I then decided to convert the handwritten notes into compact cheat sheets that might come in handy for ML interviews and for daily data-scientist life in general.

The rest of the article is based on those cheat sheets. For each topic, I provide

  • An overview in the form of a cheat sheet
  • Example interview questions
  • Suggested articles for a detailed understanding of the topic.

_Note 1: These cheat sheets are aimed at refreshing the concepts and are not meant to provide an in-depth understanding of the topics for beginners._

_Note 2: The article is regularly updated with more cheat sheets._

_Source: All of these cheat sheets (and more) can be downloaded in PDF format from_ www.cheatsheets.aqeel-anwar.com

Bias and Variance in Machine Learning Models

a) Overview:

Bias-variance tradeoff cheat sheet — Image by Author

b) Example Questions:

  1. What is Bias in ML models?
  2. What is Variance in ML models?
  3. What is the trade-off between bias and variance?
  4. What are the demerits of a high bias / high variance ML model?
  5. How do you select the model (high bias or high variance) based on the training data size?
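
The questions above can be explored numerically. The following is a minimal sketch, not taken from the cheat sheet: it assumes a synthetic target `sin(2πx)`, Gaussian noise, and polynomial models fitted with NumPy, and the function name `bias_variance` and all parameter values are illustrative choices. It refits a simple model (degree 1) and a flexible model (degree 7) on many noisy training sets and estimates the squared bias and the variance of each.

```python
import numpy as np

def bias_variance(degree, n_trials=200, n_train=20, noise_std=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit (illustrative setup)."""
    rng = np.random.default_rng(seed)
    true_fn = lambda x: np.sin(2 * np.pi * x)   # assumed ground-truth function
    x_train = np.linspace(0.0, 1.0, n_train)    # fixed training inputs
    x_test = np.linspace(0.05, 0.95, 50)        # points where predictions are compared
    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        # Each trial draws a fresh noisy training set and refits the model.
        y_train = true_fn(x_train) + rng.normal(scale=noise_std, size=n_train)
        coeffs = np.polyfit(x_train, y_train, degree)
        preds[t] = np.polyval(coeffs, x_test)
    mean_pred = preds.mean(axis=0)
    bias_sq = np.mean((mean_pred - true_fn(x_test)) ** 2)  # error of the average model
    variance = np.mean(preds.var(axis=0))                  # spread across training sets
    return bias_sq, variance

bias_simple, var_simple = bias_variance(degree=1)
bias_flexible, var_flexible = bias_variance(degree=7)
print(f"degree 1: bias^2={bias_simple:.4f}, variance={var_simple:.4f}")
print(f"degree 7: bias^2={bias_flexible:.4f}, variance={var_flexible:.4f}")
```

In this setup the straight line cannot follow the sine curve, so its bias dominates, while the degree-7 fit tracks the curve on average but changes more from one training set to the next.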

c) Detailed Article:

Imbalanced data in Machine Learning

a) Overview:

Imbalanced data in classification — Image by Author

b) Example Questions:

  1. What is imbalanced data in classification?
  2. Is accuracy a good performance metric? When does it fail to capture the performance of an ML system?
  3. What are Precision and Recall? Give an example.
  4. How can we address the issue of imbalanced data?
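
To make questions 2 and 3 concrete, here is a small sketch in plain Python. The dataset (95 negatives, 5 positives) and the "always predict negative" classifier are made up for illustration, and the helper names are hypothetical. It shows how accuracy can look high while recall on the minority class is zero.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

def precision(tp, fp):
    # Convention assumed here: report 0.0 when nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# Illustrative imbalanced dataset: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
prec = precision(tp, fp)
rec = recall(tp, fn)
print(acc, prec, rec)
```

The classifier is right on 95 of 100 samples yet never finds a single positive, which is why precision and recall are reported alongside accuracy on imbalanced data.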

c) Detailed Articles:

Bayes’ Theorem

a) Overview:

Bayes’ Theorem and Classifier — Image by Author

b) Example Questions:

  1. What is Bayes’ theorem?
  2. Work through a toy example that applies Bayes’ theorem.
  3. What is the difference between MLE and MAP?
  4. When are MAP and MLE equal?
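
As one possible toy example for questions 2 to 4, the sketch below uses invented numbers: a disease with 1% prevalence and a test with 95% sensitivity and a 5% false-positive rate, followed by a coin observed to land heads 7 times in 10 tosses. The function names and the Beta prior parameters `a` and `b` are assumptions made for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' theorem for a binary hypothesis H and evidence E."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

def mle_heads(heads, tosses):
    """Maximum likelihood estimate of the heads probability."""
    return heads / tosses

def map_heads(heads, tosses, a, b):
    """MAP estimate under a Beta(a, b) prior: the mode of the Beta posterior.

    Valid when heads + a > 1 and tosses - heads + b > 1.
    """
    return (heads + a - 1) / (tosses + a + b - 2)

# Illustrative medical-test numbers (not from the cheat sheet).
p_disease_given_positive = posterior(prior=0.01, likelihood=0.95,
                                     false_positive_rate=0.05)

mle = mle_heads(7, 10)
map_uniform = map_heads(7, 10, a=1, b=1)    # uniform prior
map_informed = map_heads(7, 10, a=5, b=5)   # prior belief that the coin is fair

print(p_disease_given_positive, mle, map_uniform, map_informed)
```

With these numbers the posterior probability of disease after a positive test is only about 16%, because the prior is so low. The MAP estimate equals the MLE under the uniform prior, and it is pulled toward 0.5 under the informative prior.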

c) Detailed Articles:

Principal Component Analysis and Dimensionality Reduction

a) Overview:

PCA and Dimensionality Reduction — Image by Author

b) Example Questions:

  1. What is Principal Component Analysis?
  2. How can we use PCA to reduce dimensions?
  3. What do the eigenvalues signify in the context of PCA? (The larger an eigenvalue, the more of the data's variance, and hence information, is preserved when we keep the corresponding eigenvector as a feature vector for our data)
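
The role of the eigenvalues can be seen in a short NumPy sketch. It is one common way to compute PCA (eigendecomposition of the sample covariance matrix), assumed here for illustration; the function name `pca` and the synthetic data are hypothetical.

```python
import numpy as np

def pca(X, n_components):
    """PCA via eigendecomposition of the sample covariance matrix."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]            # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]       # top eigenvectors as columns
    projected = X_centered @ components          # reduced-dimension data
    explained_ratio = eigvals / eigvals.sum()    # share of variance per component
    return projected, eigvals, explained_ratio

# Synthetic data with very different spread along each axis (illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.1])

Z, eigvals, ratio = pca(X, n_components=2)
print(Z.shape, ratio)
```

The variance of the data projected onto an eigenvector equals the corresponding eigenvalue, so dropping the components with the smallest eigenvalues discards the least variance.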

c) Detailed Articles:

Regression in Machine Learning

a) Overview:

Regression Analysis — Image by Author

b) Example Questions:

  1. What is Regression in ML?
  2. How can we introduce regularization in regression? (LASSO and Ridge)
  3. What impact do LASSO and Ridge regression have on the weights of the model? (Ridge tries to reduce the size of the learned weights, whereas LASSO tries to force some of them to exactly zero, creating a sparser set of weights)
  4. When does the prediction by Bayesian linear regression approach the prediction of linear regression? (When the number of data points is large enough)
  5. Is logistic regression a misnomer? (Arguably yes: it is used for classification rather than regression, although the classifier is built on a regression of the log-odds)
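
The answer to question 3 can be checked on synthetic data. The sketch below is only one way to do it: Ridge uses its closed-form solution, and LASSO is solved with a basic proximal-gradient (ISTA) loop that I chose because it needs nothing beyond NumPy. The data, the penalty value `lam`, and all function names are illustrative assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form minimizer of 0.5*||y - Xw||^2 + 0.5*lam*||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink toward zero and clip at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_fit(X, y, lam, n_iter=5000):
    """Minimize 0.5*||y - Xw||^2 + lam*||w||_1 with ISTA (proximal gradient)."""
    w = np.zeros(X.shape[1])
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Synthetic problem: only the first two of five features matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = np.array([3.0, -2.0, 0.0, 0.0, 0.0])
y = X @ w_true + rng.normal(scale=0.1, size=100)

w_ols = np.linalg.lstsq(X, y, rcond=None)[0]
w_ridge = ridge_fit(X, y, lam=20.0)
w_lasso = lasso_fit(X, y, lam=20.0)
print("OLS:  ", w_ols)
print("Ridge:", w_ridge)
print("LASSO:", w_lasso)
```

Ridge shrinks every weight but leaves all of them nonzero, while the soft-thresholding step in LASSO sets the weights of the irrelevant features to exactly zero.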

c) Detailed Articles:

Regularization in Machine Learning

a) Overview:

Regularization in ML — Image by Author

b) Example Questions:

  1. What is regularization in ML?
  2. How can we address over-fitting?
  3. What is K-fold cross-validation?
  4. What is the difference between L1 and L2 regularization?
  5. Why do we use dropout?
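
For question 3, the splitting logic behind K-fold cross-validation can be sketched in a few lines. This is a minimal NumPy version written for illustration; the generator name `k_fold_indices` and its parameters are hypothetical, and a library routine would normally be used in practice.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for K-fold cross-validation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)    # shuffle once before splitting
    folds = np.array_split(indices, k)      # k nearly equal folds
    for i in range(k):
        val_idx = folds[i]                  # fold i is held out for validation
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, val_idx in splits:
    print("train:", np.sort(train_idx), "val:", np.sort(val_idx))
```

Each sample is used for validation exactly once and for training in the other k-1 rounds. The validation scores from the k rounds are then averaged to estimate how well the model generalizes.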

c) Detailed Articles:

Famous DNNs in Machine Learning

a) Overview:

Famous CNNs — Image by Author

b) Example Questions:

  1. How does ResNet address the problem of vanishing gradients?
  2. What is one of the key features of the Inception network?
  3. What are shortcut connections in ResNet?
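
Questions 1 and 3 can be illustrated with a toy NumPy model. This is a simplified sketch and not the actual ResNet architecture: each block is assumed to be two small fully connected layers with a ReLU in between, the weights are deliberately tiny, and all names are hypothetical. It compares the input-to-output Jacobian of a 20-block stack with and without the shortcut `x + F(x)`.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def block_jacobian(x, W1, W2):
    """Jacobian of F(x) = W2 @ relu(W1 @ x) with respect to x."""
    mask = (W1 @ x > 0).astype(float)       # derivative of ReLU at W1 @ x
    return W2 @ (mask[:, None] * W1)

def stack_jacobian_norm(x, weights, use_shortcut):
    """Frobenius norm of d(output)/d(input) for a stack of blocks."""
    d = x.shape[0]
    J_total = np.eye(d)
    for W1, W2 in weights:
        J_f = block_jacobian(x, W1, W2)
        fx = W2 @ relu(W1 @ x)
        if use_shortcut:
            J_block = np.eye(d) + J_f       # shortcut adds an identity term
            x = x + fx
        else:
            J_block = J_f
            x = fx
        J_total = J_block @ J_total         # chain rule across blocks
    return np.linalg.norm(J_total)

rng = np.random.default_rng(0)
d, n_blocks = 8, 20
# Small random weights (illustrative) so each F has a small Jacobian.
weights = [(0.05 * rng.normal(size=(d, d)), 0.05 * rng.normal(size=(d, d)))
           for _ in range(n_blocks)]
x0 = rng.normal(size=d)

plain_norm = stack_jacobian_norm(x0, weights, use_shortcut=False)
residual_norm = stack_jacobian_norm(x0, weights, use_shortcut=True)
print(plain_norm, residual_norm)
```

Without shortcuts the Jacobian is a product of 20 small matrices and collapses toward zero. With shortcuts each factor is the identity plus a small term, so the gradient can still reach the early layers.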

c) Detailed Articles:

Summary

This article provides a list of cheat sheets covering important topics for machine learning interviews, each followed by some example questions. More topics and cheat sheets are added to the article over time.

This article was originally published on Medium.com

