A couple of years ago, I started applying for internships in Machine Learning (ML) and ML system design. By then I had been studying and actively researching ML for a few years, and I was familiar with most of the basic topics. But when I started interviewing, I realized that although I had a general understanding of these topics, I needed a quick review before I could answer questions about them well.

So I decided to refresh my concepts. Because I had to go through the topics again before every interview, I created handwritten notes. Skimming through them was much easier than going through slides and book chapters, and it gave my understanding a quick boost in a short amount of time. I decided to convert my handwritten notes into compact cheat sheets that might come in handy for ML interviews and for daily data-scientist life in general.

The rest of the article is based on those cheat sheets. For each topic, I provide:

- An overview in the form of a cheat sheet
- Example interview questions
- Suggested articles for a detailed understanding of the topic

_Note 1:_ These cheat sheets are aimed at refreshing the concepts and are not meant to provide an in-depth understanding of the topics for beginners.

_Note 2:_ The article is constantly updated with more cheat sheets.

_Source:_ All of these cheat sheets (and more) can be downloaded in PDF format from www.cheatsheets.aqeel-anwar.com

Bias-variance tradeoff cheat sheet — Image by author

- What is Bias in ML models?
- What is Variance in ML models?
- What is the trade-off between bias and variance?
- What are the demerits of a high bias / high variance ML model?
- How do you select the model (high bias or high variance) based on the training data size?

Imbalanced data in classification — Image by Author

- What is imbalanced data in classification?
- Is accuracy a good performance metric? When does it fail to capture the performance of an ML system?
- What are Precision and Recall? Give an example.
- How do you address the issue of imbalanced data?
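
To see why accuracy can mislead on imbalanced data, here is a minimal sketch in plain Python. The labels and both sets of predictions are made up for illustration (95 negatives, 5 positives), and the helper names are my own rather than anything from the cheat sheet.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    # Convention assumed here: report 0.0 when nothing was predicted positive.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    # Same convention: 0.0 when there are no actual positives.
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced data set: 95 negatives followed by 5 positives.
y_true = [0] * 95 + [1] * 5

# A "classifier" that always predicts the majority class.
always_negative = [0] * 100

# A classifier that raises 6 false alarms and finds 4 of the 5 positives.
detector = [1] * 6 + [0] * 89 + [1] * 4 + [0] * 1

for name, y_pred in [("always negative", always_negative), ("detector", detector)]:
    print(
        name,
        "accuracy:", accuracy(y_true, y_pred),
        "precision:", precision(y_true, y_pred),
        "recall:", recall(y_true, y_pred),
    )
```

In this made-up case the always-negative model has the higher accuracy (0.95 versus 0.93) even though it finds none of the positives, which is exactly what recall exposes.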

Bayes’ Theorem and Classifier — Image by Author

- What is Bayes’ theorem?
- Toy example to implement Bayes’ theorem
- What is the difference between MLE and MAP?
- When are MAP and MLE equal?
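
As one possible toy example for the questions above, the sketch below applies Bayes' theorem to a hypothetical diagnostic test and then compares MLE and MAP for a coin-flip model. All numbers (prevalence, test rates, flip counts) are invented for illustration, and the MAP formula assumes a Beta(a, b) prior with a, b >= 1 on the heads probability.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' theorem: P(H | E) = P(E | H) P(H) / P(E)."""
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical test: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p_disease_given_positive = posterior(
    prior=0.01, likelihood=0.95, likelihood_given_not=0.05
)
print("P(disease | positive test) =", round(p_disease_given_positive, 4))


def mle_heads(heads, flips):
    """Maximum likelihood estimate of P(heads) for a Bernoulli coin."""
    return heads / flips


def map_heads(heads, flips, a, b):
    """MAP estimate of P(heads) under an assumed Beta(a, b) prior (mode of the posterior)."""
    return (heads + a - 1) / (flips + a + b - 2)


heads, flips = 7, 10  # made-up observations
print("MLE:", mle_heads(heads, flips))
print("MAP, uniform Beta(1, 1) prior:", map_heads(heads, flips, a=1, b=1))
print("MAP, Beta(5, 5) prior:", map_heads(heads, flips, a=5, b=5))
```

With the uniform Beta(1, 1) prior the MAP estimate coincides with the MLE, while the Beta(5, 5) prior pulls the estimate toward 0.5. The gap between the two also shrinks as the number of flips grows.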

PCA and Dimensionality Reduction — Image by Author

- What is Principal Component Analysis?
- How can we use PCA to reduce dimensions?
- What do the eigenvalues signify in the context of PCA?
*(The greater the magnitude of an eigenvalue, the more information is preserved if we keep its corresponding eigenvector as a feature vector for our data)*
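
The role of the eigenvalues can be sketched on a tiny two-dimensional data set. The points below are made up, and the code uses the closed-form eigenvalues of a symmetric 2x2 matrix so that it needs only the standard library; a real pipeline would normally rely on a linear algebra library instead.

```python
import math


def covariance_2d(points):
    """Sample covariance entries (sxx, sxy, syy) of a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def eigenvalues_sym_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    center = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return center + radius, center - radius


# Illustrative, strongly correlated 2-D data.
points = [
    (2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
    (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9),
]

sxx, sxy, syy = covariance_2d(points)
largest, smallest = eigenvalues_sym_2x2(sxx, sxy, syy)

# Fraction of the total variance kept if we project onto the first component only.
explained_ratio = largest / (largest + smallest)
print("eigenvalues:", largest, smallest)
print("variance kept by the first principal component:", explained_ratio)
```

Because the eigenvalues sum to the total variance of the data, the ratio printed at the end is the share of variance retained when the two dimensions are reduced to one.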

Regression Analysis — Image by Author

- What is Regression in ML?
- How can we introduce regularization in regression?
*(LASSO and Ridge)*
- What impact do LASSO and Ridge regression have on the weights of the model?
*(Ridge tries to reduce the size of the weights learned, whereas LASSO tries to force them to zero, creating a sparser set of weights)*
- When does the prediction by Bayesian linear regression approach the prediction of linear regression?
*(When the number of data points is large enough)*
- Is logistic regression a misnomer?
*(Yes, because it is used for classification rather than regression, although it is built on a regression model)*
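
The different effect of Ridge and LASSO on the weights can be illustrated in a simplified setting. The sketch below assumes an orthonormal design matrix (XᵀX = I) and the objectives ½‖y − Xw‖² + (λ/2)‖w‖² for Ridge and ½‖y − Xw‖² + λ‖w‖₁ for LASSO; under those assumptions both solutions have a closed form in terms of the ordinary least-squares weights. The starting weights and λ are made up.

```python
def ridge_weights(w_ols, lam):
    """Ridge under the orthonormal-design assumption: uniform shrinkage."""
    return [w / (1.0 + lam) for w in w_ols]


def soft_threshold(w, lam):
    """Move w toward zero by lam, and clamp to exactly zero inside [-lam, lam]."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


def lasso_weights(w_ols, lam):
    """LASSO under the orthonormal-design assumption: soft-thresholding."""
    return [soft_threshold(w, lam) for w in w_ols]


# Hypothetical least-squares weights and regularization strength.
w_ols = [3.0, -0.3, 0.8, 0.1]
lam = 0.5

ridge = ridge_weights(w_ols, lam)
lasso = lasso_weights(w_ols, lam)

print("OLS:  ", w_ols)
print("Ridge:", ridge)  # every weight is smaller, none is exactly zero
print("LASSO:", lasso)  # small weights are set exactly to zero
print("zeros in LASSO solution:", sum(1 for w in lasso if w == 0.0))
```

In the general case (correlated features) neither solution is this simple, but the qualitative picture is the same: Ridge shrinks all weights, LASSO zeroes some of them out.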

Regularization in ML — Image by Author

- What is regularization in ML?
- How can we address over-fitting?
- What is K-fold cross-validation?
- What is the difference between L1 and L2 regularization?
- Why do we use dropout?
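
For the K-fold question, a minimal sketch of how the index split might be generated is shown below. The function name is hypothetical, and the folds are contiguous and unshuffled to keep the example short; in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=10, k=3))
for i, (train, validation) in enumerate(folds):
    print(f"fold {i}: train={train} validation={validation}")
```

Each sample lands in the validation set exactly once, so averaging the k validation scores gives an estimate of generalization performance that uses all of the data.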

Famous CNNs — Image by Author

- How does ResNet address the vanishing gradient problem?
- What is one of the key features of the Inception network?
- What are shortcut connections in ResNet?
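
The intuition behind shortcut connections can be shown with a deliberately simplified scalar model, which is not an actual ResNet. Assume every layer computes F(x) with the same small local derivative d. A plain stack outputs F(x), so its gradient is multiplied by d at every layer; a residual block outputs x + F(x), so its gradient is multiplied by 1 + d. The depth and the value of d below are arbitrary.

```python
def gradient_through_stack(local_derivatives, residual):
    """Chain rule through a stack of scalar layers.

    Plain layer:    y = F(x)      ->  dy/dx = F'(x)
    Residual layer: y = x + F(x)  ->  dy/dx = 1 + F'(x)
    """
    gradient = 1.0
    for d in local_derivatives:
        gradient *= (1.0 + d) if residual else d
    return gradient


depth = 20
local_derivatives = [0.01] * depth  # assumed small derivative of F at every layer

plain_gradient = gradient_through_stack(local_derivatives, residual=False)
residual_gradient = gradient_through_stack(local_derivatives, residual=True)

print("plain stack gradient:   ", plain_gradient)
print("residual stack gradient:", residual_gradient)
```

The identity path contributes the constant 1 to each factor, so the gradient reaching the early layers does not collapse toward zero even when the learned branch F contributes very little.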

This article provides a list of cheat sheets covering important topics for machine learning interviews, each followed by some example questions. More topics and cheat sheets are continually being added to the article.

This article was originally published on Medium.com

