SQL is extremely powerful and has a lot of functionality. When it comes to data science interviews, however, there are really only a handful of core concepts that most companies test. These 10 concepts show up the most often because they have the most application in real-life settings.

In this article, I’m going to go over what I think are the 10 most important SQL concepts that you should focus the majority of your time on when prepping for interviews.

With that said, here we go!

1. CASE WHEN

You’ll most likely see many questions that require the use of CASE WHEN statements, and that’s simply because it’s such a versatile concept.

It allows you to write complex conditional statements if you want to allocate a certain value or class depending on other variables.

Less commonly known, it also allows you to pivot data. For example, if you have a month column, and you want to create an individual column for each month, you can use CASE WHEN statements to pivot the data.

Example Question: Write an SQL query to reformat the table so that there is a revenue column for each month.

Initial table:
+------+---------+-------+
| id   | revenue | month |
+------+---------+-------+
| 1    | 8000    | Jan   |
| 2    | 9000    | Jan   |
| 3    | 10000   | Feb   |
| 1    | 7000    | Feb   |
| 1    | 6000    | Mar   |
+------+---------+-------+

Result table:
+------+-------------+-------------+-------------+-----+-----------+
| id   | Jan_Revenue | Feb_Revenue | Mar_Revenue | ... | Dec_Revenue |
+------+-------------+-------------+-------------+-----+-----------+
| 1    | 8000        | 7000        | 6000        | ... | null        |
| 2    | 9000        | null        | null        | ... | null        |
| 3    | null        | 10000       | null        | ... | null        |
+------+-------------+-------------+-------------+-----+-----------+

2. SELECT DISTINCT

SELECT DISTINCT is something that you should always have at the back of your head. It’s extremely common to use SELECT DISTINCT statements with aggregate functions (which is #3).

For example, if you have a table that shows customer orders, you may be asked to calculate the average number of orders made per customer. In this case, you would want to count the total number of orders over the count of the total number of customers. It may look something like this:

SELECT
   COUNT(order_id) / COUNT(DISTINCT customer_id) as orders_per_cust
FROM
   customer_orders

#programming #education #technology #data-science #work #data analysis

Ten SQL Concepts You Should Know for Data Science Interviews
4.20 GEEK