20 Questions to Ace Before Getting a Machine Learning Job: An Introduction to Machine Learning Thanks to a Tweet

Call it a lucky find on Twitter. It definitely is. Santiago tweeted 20 questions you need to ace before getting a machine learning job. I figured I’d use these questions to understand developers’ work better and maybe get a glimpse into future applications.

The first questions were about various basic concepts of machine learning. Let’s imagine, for example, that we are given a puzzle as a gift. How do we put it together? Do we need the finished puzzle poster as a reference, or do we assemble the edges first? Do we try to sort the pieces by color? Depending on what kind of information we have, it makes sense to choose a different method and, therefore, a different algorithm.

The same applies to machine learning approaches such as supervised, unsupervised, semi-supervised, and reinforcement learning. Scientists decide on a training model for their algorithms depending on the available data and the research question. Santiago named this part of his tweet the “warming up” phase. A key takeaway for non-techies:

  • Depending on the research question, developers use a specific algorithm. The decision is based on data first, then the training model, and lastly, the algorithm.

The “Getting Deeper” Phase

In this phase, Santiago asks questions about supervised learning. How do we learn when supervised? The teacher labels things, and we accept those labels as true (e.g., knowing all presidents by heart). In machine learning, this means that the algorithm is trained on a labeled dataset, and the labels serve as an answer key for checking its predictions on the training data. The question of when to use classification rather than regression brings up the two main kinds of problems where supervised learning is useful. According to Isha Salian, “classification problems ask the algorithm to predict a discrete value, identifying the input data as a member of a particular class or group, regression problems look at continuous data.” Therefore, we need to remember as non-techies:

  • Supervised learning models need to have a clean, well-labeled set of available reference points or a ground truth to train the algorithm.
  • If we want to distinguish dogs and cats in pictures and have a well-labeled data set, classification is the natural fit, because the answer is a discrete class (dog or cat) rather than a continuous value. Whatever the scientist chooses, the choice depends on the research question and the possibilities that the data offers.
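
To make the difference between classification and regression concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the data values are invented, body weight stands in for the pictures mentioned above, and the two helper functions (`nearest_neighbor_classify` and `fit_line`) are hypothetical names chosen for this example, not something taken from Santiago's tweet.

```python
def nearest_neighbor_classify(train, x):
    """Classification: return the label of the closest labeled example."""
    closest = min(train, key=lambda pair: abs(pair[0] - x))
    return closest[1]


def fit_line(points):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in points)
        / sum((x - mean_x) ** 2 for x, _ in points)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented labeled data: body weight in kg -> animal (a discrete class).
animals = [(3.5, "cat"), (4.2, "cat"), (5.0, "cat"),
           (18.0, "dog"), (25.0, "dog"), (30.0, "dog")]

# Invented labeled data: apartment size in square meters -> monthly rent
# (a continuous value).
rents = [(30, 600), (50, 1000), (70, 1400), (90, 1800)]

# Classification answers "which group?" with one of a fixed set of labels.
label = nearest_neighbor_classify(animals, 22.0)

# Regression answers "how much?" with a number on a continuous scale.
slope, intercept = fit_line(rents)
predicted_rent = slope * 60 + intercept

print(label)
print(predicted_rent)
```

Both functions learn from labeled examples, which is what makes them supervised. The only difference is the kind of answer: the first returns a category, the second a number.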

