In this article, I share my thoughts on which challenging data science problems with business value we can solve amid Covid-19. Covid-19 is the Novel Coronavirus disease of 20**19** [1]. This article is useful both for data science enthusiasts, who can use it to identify, formulate, and solve business problems, and for leaders, who can direct their teams toward the data science problems relevant to their business. I have observed _two main trends_ in the industry:
○ We have a business problem, but we are not sure what data would help solve it.
○ We have data, but we are not sure what business problems to solve with it.
The starting point differs between these two cases. However, the business problem and the data needed to solve it are equally important in the data science world. The focus of this article is:
Which business problems should we solve?
Which data sets relevant to Covid-19 are available?
Which Machine Learning, Deep Learning, and statistical techniques can be used to solve these problems, and what challenges are involved?
A summary of the problems to solve, specific to industry verticals
As mentioned in my earlier article [2], there are 7 types of data, namely **numerical, categorical, text, image, video, speech, and signals**, from which data science problems can be built, irrespective of the industry or domain.
Table-1 summarizes the types of problems to solve with _numerical_ and _categorical_ data, the data science techniques to use, and the challenges in solving those problems. The core business problems are “What is the impact of Covid-19 on my business?” and “How much at risk are we of contracting Covid-19?”. These business problems are formulated as multi-step data science problems, as listed in Table-1.
Table-1: List of problems to solve with Numerical and Categorical data amid Covid-19
Here is the list of available data sets for solving the above problems. Along with these data sets, you may need data specific to your organization, which you will have access to. You can load the raw open-source data directly in your code (Python notebook), or download the .csv files and then load them for further processing.
https://github.com/CSSEGISandData/COVID-19
https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset
https://data.humdata.org/dataset/novel-coronavirus-2019-ncov-cases
Interesting articles on how to process the loaded data are referenced in [3].
https://datahub.io/core/gdp#data
https://tradingeconomics.com/country-list/gdp-growth-rate
https://data.worldbank.org/topic/economy-and-growth
https://datahub.io/core/cpi#data
https://datahub.io/core/employment-us#data
https://datahub.io/core/population
https://datahub.io/core/gold-prices
https://www.kaggle.com/muthuj7/weather-dataset
https://datahub.io/core/house-prices-us#data
https://github.com/closedloop-ai/cv19index
https://www.kaggle.com/jaisimha34/covid19-drug-discovery/data
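As a rough illustration of loading one of the .csv files listed above for further processing, here is a minimal sketch using only the Python standard library. It assumes the wide time-series layout used in the CSSEGISandData repository (a few identifying columns followed by one column per date). The column names, the helper functions `wide_to_long` and `total_by_country`, and the sample rows are all illustrative assumptions, not part of any data set's official tooling.

```python
import csv
import io

# Assumed layout: a few identifying columns followed by one column per date.
# These column names are an assumption about the downloaded file.
ID_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")


def wide_to_long(csv_file):
    """Turn one-column-per-date rows into (country, date, cases) records."""
    records = []
    for row in csv.DictReader(csv_file):
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue  # skip identifying columns, keep only date columns
            records.append((row["Country/Region"], column, int(value)))
    return records


def total_by_country(records, date):
    """Sum the counts of all provinces of each country for a given date."""
    totals = {}
    for country, record_date, cases in records:
        if record_date == date:
            totals[country] = totals.get(country, 0) + cases
    return totals


# Tiny made-up sample in the assumed layout, so the sketch runs offline.
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
,Italy,41.9,12.6,0,2
Hubei,China,30.9,112.3,444,444
Beijing,China,40.2,116.4,14,22
"""

records = wide_to_long(io.StringIO(SAMPLE))
print(total_by_country(records, "1/23/20"))
```

For a downloaded file, the same function can be called on a file opened with `open(path, newline="")` instead of the in-memory sample. In a notebook, pandas offers a more compact route to the same reshaping (`read_csv` followed by `melt`), and real files may need extra handling for missing or non-integer values.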
If you are comfortable with text processing, the following problems may interest you. Table-2 summarizes a list of problems, the corresponding text data, the techniques for solving them, and the associated challenges. Recent advances in Bidirectional Encoder Representations from Transformers (BERT) play a crucial role in solving these kinds of problems.
Table-2: List of problems to solve with Text data amid Covid-19
The available data set links are as follows:
https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge
https://www.acaps.org/covid19-government-measures-dataset
https://github.com/yaqingwang/EANN-KDD18