10 Application-Centric & Open Data Science Interview Questions You Forgot to Prepare For

In the corporate data science world, technical skill is important, but being business-minded and a good problem solver is more important (unless you’re going to work in a research department, of course). Being a data scientist is not solely about memorizing the technicalities of algorithms and methods, but about being able to apply them creatively to real-world problems that are far messier than textbook ones.
Many data science interview question compilations put too much emphasis on technicality and too little on the creative, problem-solving side of business data science. In reality, companies won’t necessarily ask extremely technical questions (although it is good to prepare for them anyway), but they will ask how you would apply your knowledge, whatever level you are applying for.
Unfortunately, many prospective data scientists forget to hone their problem-solving skills and are flustered when they need to answer an open-ended question.
Have you prepared enough?
Regardless of how great you are at problem solving, it’s never wrong to practice more. Here are 10 application-centric, open-ended questions, 10 in-depth answers, and a few tips along the way on mastering the more open-ended part of your data science interview. The exact questions may not (perhaps, even probably will not) be directly asked in an interview, but it’s the method of answering and the way of thinking that gets you hired, not the mere parroting of Wikipedia definitions.
See how many you can answer — good luck!
1 | Your customer conversion rate is very low. What might be causing this problem, and how would you fix it?
Customer conversion rate is a measurement of how many ‘interested’ users actually end up bringing profit to the company. A high conversion rate means that the business has a strong appeal, whereas a low conversion rate means the opposite. A low conversion rate also leads to a higher Customer Acquisition Cost, or the cost in marketing and deals for a customer to be ‘acquired’, which means wasted money.
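To make the link between the two metrics concrete, here is a minimal sketch in Python. The function names and all of the figures are made up for illustration; the point is only that, for the same marketing spend and the same audience, a lower conversion rate translates directly into a higher Customer Acquisition Cost.

```python
def conversion_rate(converted: int, interested: int) -> float:
    """Share of interested users who ended up paying."""
    if interested <= 0:
        raise ValueError("interested must be positive")
    return converted / interested


def customer_acquisition_cost(spend: float, converted: int) -> float:
    """Marketing spend divided by the number of customers acquired."""
    if converted <= 0:
        raise ValueError("converted must be positive")
    return spend / converted


# Hypothetical figures: same spend and audience, two different outcomes.
spend, interested = 10_000.0, 5_000
for converted in (250, 50):
    rate = conversion_rate(converted, interested)
    cac = customer_acquisition_cost(spend, converted)
    print(f"conversion {rate:.1%} -> CAC ${cac:,.2f}")
```

With these invented numbers, dropping from 250 to 50 paying customers multiplies the cost of acquiring each one by five.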
Off the top of your head, there are a couple of reasons why prospective customers are not being convinced to carry through with the sale: an unclear pitch, a pitch that doesn’t seem relevant, and so-called ‘flow-blockers’ like bad design or the impression of being very complex and unapproachable. For design choices, or for deciding which message to put on the homepage of a website, A/B tests can be used to find the version that gives a customer the best chance of proceeding with a sale.
A/B testing doesn’t address one issue, however: personalization, which is so important in modern marketing. Marketers can no longer assume that everyone who receives their message will be receptive to it; instead, to maximize the chance that a message lands, we must implement dynamic marketing. Based on clickthrough rates from previous email campaigns or other advertisements, one could create a machine learning model to predict which emails are most effective at getting the user on the site. For instance, consider three email headers that appeal to different emotions and wants (namely, time, ego, and free stuff).
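One common way to read the result of such an A/B test is a two-proportion z-test on the conversion counts of the two variants. The sketch below is one possible approach, not the only valid analysis; the function name and the visitor counts are assumptions made up for illustration.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both variants convert equally."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value


# Hypothetical counts: variant A is the current homepage, B the redesign.
z, p = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in conversion between the two pages is unlikely to be chance alone. In an interview it is worth adding that the sample size and the test duration should be fixed before looking at the results.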
“Jeans and jackets remain at 60% price only for 1 more day!”
“Get the latest and trendiest clothing for school”
“Purchase an item and win a free jacket!”
Say that a user was shown to be responsive only to the second email. Based on each user’s attributes and history, we could build a model that sends users the emails they are most likely to respond to, or that are most likely to lead to a sale.
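As a rough illustration of the idea, the sketch below replaces the machine learning model with a much simpler stand-in: a smoothed click rate per user segment and email type, estimated from a historical log. The segment names, the log, and the smoothing constants are all hypothetical; a real system would more likely train a classifier on many user attributes.

```python
from collections import defaultdict

# The three appeals from the example headers: time, ego, and free stuff.
EMAILS = ("time", "ego", "free_stuff")


def fit_click_rates(log, prior_clicks=1.0, prior_sends=2.0):
    """Estimate a smoothed click rate per (segment, email) pair.

    `log` is an iterable of (segment, email, clicked) tuples. The priors
    pull rates toward 0.5 so one lucky click does not dominate.
    """
    sends = defaultdict(float)
    clicks = defaultdict(float)
    for segment, email, clicked in log:
        sends[(segment, email)] += 1
        clicks[(segment, email)] += 1 if clicked else 0
    segments = {segment for segment, _ in sends}
    return {
        (segment, email): (clicks[(segment, email)] + prior_clicks)
        / (sends[(segment, email)] + prior_sends)
        for segment in segments
        for email in EMAILS
    }


def best_email(rates, segment):
    """Pick the email with the highest estimated click rate for a segment."""
    return max(EMAILS, key=lambda email: rates.get((segment, email), 0.5))


# Hypothetical history of past sends.
log = [
    ("student", "time", False),
    ("student", "ego", True),
    ("student", "ego", True),
    ("student", "free_stuff", False),
    ("parent", "time", True),
    ("parent", "time", True),
    ("parent", "ego", False),
    ("parent", "free_stuff", True),
    ("parent", "free_stuff", False),
]
rates = fit_click_rates(log)
for segment in ("student", "parent"):
    print(segment, "->", best_email(rates, segment))
```

The same structure carries over to a real model: learn from past responses, score each candidate email for each user, and send the one with the highest predicted response.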
After these changes are implemented — and existing failing marketing campaigns are analyzed and discontinued if necessary — the customer conversion rate should go up.
Data scientists need to be able to look outside the narrow scope of analysis and understand how it fits within a business. Even though data analysts probably don’t know too much about user experience and design, taking common-sense steps to understand how data analysis serves the company’s strategy is a must.
2 | What is a KPI?
KPI stands for Key Performance Indicator and can be more generally referred to as a ‘metric’. KPIs are a set of quantifiable measurements used to gauge the performance of a product or new feature. When something new is released, companies cannot simply assume that it will improve performance (sometimes “less is more”), so the effect has to be measured.
You can’t improve what you don’t measure.
- Peter Drucker
In real data science applications, the result to measure (the y-variable) is often not as clear and straightforward as one would hope. Choosing the KPI is often a difficult task, because a metric may not reflect what you actually care about, and if the metric is chosen badly, the analysis built on it will be misleading and the project can fail.
There are three things a good KPI should be:
Sensitive. The metric needs to be sensitive to changes, preferably directly impacted by the new feature instead of measuring a downstream response. For instance, changing the layout of a webpage may increase user engagement (measured by clicks or mouse movement), which indirectly increases profits. In this case, profit would not be a suitable KPI because it is only indirectly affected, which leaves the analysis more open to unexplained external factors.
Forward-looking. Lag is too often an issue in measuring a KPI. Consider, for instance, the number of coronavirus cases, a metric we are all too familiar with for gauging how bad the coronavirus was on a certain day. This is a backward-looking KPI because one day’s increase in new cases reflects how many people were infected a while ago. The real number of new infections today will only be measured a few days from now. If a KPI cannot be forward-looking, the lag needs to at least be predictable so that it can be factored out.
Reflective of Performance & Values. Depending on how you measure the KPI, the same change can produce very different performance results. Consider, for instance, a self-driving car company, which measures its KPI not only in miles per accident but also in the number of times the car brakes. This is because self-driving cars prioritize not only safety but also customer experience: a self-driving car that brakes every other second can be very safe, but also extremely unpleasant to ride in. Choosing only the immediately obvious KPI may make a new release that doesn’t align with the company’s values look like a high performer.
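The self-driving example can be sketched in a few lines of Python to show how the choice of KPI changes which release looks best. The release names and every number here are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Release:
    name: str
    miles: float
    accidents: int
    brake_events: int

    @property
    def miles_per_accident(self) -> float:
        # Higher is safer; zero accidents means none observed so far.
        return self.miles / self.accidents if self.accidents else float("inf")

    @property
    def brakes_per_mile(self) -> float:
        # Lower means a smoother ride.
        return self.brake_events / self.miles


# Hypothetical test-drive data for two software releases.
releases = [
    Release("cautious", miles=100_000, accidents=1, brake_events=400_000),
    Release("balanced", miles=100_000, accidents=2, brake_events=60_000),
]

safest = max(releases, key=lambda r: r.miles_per_accident)
smoothest = min(releases, key=lambda r: r.brakes_per_mile)
print(f"best by miles per accident: {safest.name}")
print(f"best by brakes per mile: {smoothest.name}")
```

Judged on miles per accident alone, the "cautious" release wins even though it brakes four times per mile. Tracking both metrics makes the trade-off visible instead of hiding it.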
When interviewers ask for a definition, don’t just give them one: give examples of its applications, weaknesses, strengths, and other higher-level information.


towardsdatascience.com
