In 2012, Harvard Business Review declared data scientist “the sexiest job of the 21st century.” Glassdoor added fuel to the fire in 2016 by putting data scientist at number one on its list of the best jobs of the year. Its number one job in 2019? You guessed it: data scientist. So what is it about these “high-ranking professionals with the training and curiosity to make discoveries in the world of big data” that is just so damn sexy?

For starters, there’s more data than ever, and most of it was created very recently. In 2017, data management platform Domo estimated that 90% of all data had been created within the prior two years, to the tune of 2.5 quintillion bytes of data per day. It’s no wonder that companies want to discover how best to use this unprecedented volume of data to their benefit.

However, voluminous data isn’t necessarily relevant data. This mismatch has had companies scrambling over the past decade to implement programs that interpret this data and generate actionable insights. As Duke economist Dan Ariely stated in 2013:

“Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…”

Data scientists, therefore, have become highly valuable for their ability to make sense of this sea of information. By drawing connections between different data points and assessing the bigger picture of why that data might matter, data scientists are able to help companies surface meaningful insights that can lead to better decisions.

Why Data is Meaningless Without Context

On its surface, data is simply a collection of numbers, words, and so on. It becomes meaningful only when it is presented with context. Take, for example, this Google Trends graph:

[Image: Google Trends graph of search interest over time, search term not shown]

Without any context, it’s completely meaningless. There is certainly something cyclical about the pattern, but it’s not clear what is being measured, why the cycle occurs, or why it matters. Now, if I told you that the trend graph was for the search term “pie,” you could start drawing meaningful conclusions about the data: namely, something every year is causing a spike in searches for the term “pie.” As I continue to provide contextual clues, the data becomes more and more relevant. For example, allow me to clarify that the search is limited to the United States, and that each year shows one major spike and one minor spike:

[Images: Google Trends graphs for “pie” in the United States, showing one major and one minor spike each year]

Knowing that searches for “pie” in the United States peak significantly in November every year, we can begin to make inferences about what the data means. At this point, you might be saying to yourself, “Clearly, searches for pie are spiking around Thanksgiving.” However, this is where context becomes critically important. If you were born and raised in the US, chances are you have the context to know that Thanksgiving is in November and that pie is often served at Thanksgiving, so you can infer that the spike in pie searches reflects people planning their Thanksgiving celebrations. If you were not raised in the US, however, you might lack the contextual reference to understand why pie searches peak in November.

So what about that minor spike in March? As you may have guessed by now, it is likely a result of the lesser-known holiday of Pi Day, which falls on March 14th every year (3.14…get it?). Once again, this data is only insightful to those who possess the proper context. If you’ve never heard of Pi Day, the March spike might not give you much insight.

With the proper context, decisions can be based not only on the data but also on what the data means. In our pie example, the CMO of a hypothetical pie manufacturer would be armed with information about traffic habits (data), as well as topical interests related to those habits (context). Taken together, this could support an ad campaign where the data informs when and how to reach consumers, and the context informs what they might find relevant and interesting.
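To make the pairing of data and context a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the monthly numbers are invented (they are not real Google Trends values), and the `find_spikes` and `annotate` helpers are hypothetical names, not part of any library. The idea is simply that the data can tell you when something happens, while a separate layer of context suggests why.

```python
# Hypothetical monthly search interest for "pie" on a 0-100 scale.
# These values are invented for illustration; they are NOT real Google Trends data.
MONTHLY_INTEREST = {
    "Jan": 30, "Feb": 32, "Mar": 48, "Apr": 34, "May": 32, "Jun": 31,
    "Jul": 30, "Aug": 31, "Sep": 34, "Oct": 45, "Nov": 100, "Dec": 60,
}

# Context a US-based analyst might already carry around (also an assumption).
CONTEXT = {
    "Mar": "Pi Day (March 14)",
    "Nov": "Thanksgiving",
}

MONTHS = list(MONTHLY_INTEREST)


def find_spikes(interest):
    """Return months that are higher than both neighboring months, largest first."""
    values = [interest[month] for month in MONTHS]
    spikes = []
    for i, month in enumerate(MONTHS):
        prev_value = values[i - 1]                   # index -1 wraps January back to December
        next_value = values[(i + 1) % len(values)]   # wraps December forward to January
        if values[i] > prev_value and values[i] > next_value:
            spikes.append(month)
    return sorted(spikes, key=lambda month: interest[month], reverse=True)


def annotate(spikes, context):
    """Pair each spike with whatever context is known about that month."""
    return [(month, context.get(month, "no context available")) for month in spikes]


for month, reason in annotate(find_spikes(MONTHLY_INTEREST), CONTEXT):
    print(f"{month}: {MONTHLY_INTEREST[month]} ({reason})")
```

With these made-up numbers, the data alone flags November and March as the spikes. Pass an empty context dictionary to `annotate` and the same spikes come back labeled "no context available", which is roughly where a reader who has never heard of Thanksgiving or Pi Day would be left.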

It bears mentioning that this example is an oversimplification; a real analysis would likely require additional data and context gathered through both quantitative and qualitative means. Regardless, hopefully it’s becoming clear that those sexy data scientists can help improve decision making by researching trends and contextual clues. What may be less clear is how they could possibly begin to do this with 2.5 quintillion bytes of new data every day. This is where technology enters the equation.


Layers of Data: How Context Helps Make Better Decisions