Let’s go through these steps one by one and see how each plays a vital role in learning data science.
The first stage of the framework is to develop a business understanding. For this, you have to carry out two steps:
In the same way, learners should select the business domain in which they plan to become a data scientist and then work toward becoming a subject-matter expert (SME) in that domain. Without business domain knowledge, they cannot meet the Step 1 requirements and will eventually face difficulties in the other steps. Once we know the business, we can apply our data science skills to drive more value out of that business's data.
_Some trending domains:_ Healthcare, Fintech, Real Estate, E-commerce, EduTech, etc.
This stage comprises two key steps: understanding the available data and identifying new relevant data needed to solve the business problem.
To describe data, a learner needs to know one programming language (R, Python, or SAS) and Excel, so that they can do a first-level analysis and then create a data dictionary.
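To make the idea of a data dictionary concrete, here is a minimal sketch in Python using only the standard library. The sample data, the column names, and the helper functions `infer_type` and `build_data_dictionary` are all hypothetical, chosen only for illustration; a real project would read an actual file and probably record more detail per column.

```python
import csv
import io

# Hypothetical sample data; in practice this would come from a real file.
SAMPLE_CSV = """customer_id,dob,city,monthly_spend
101,1990-04-12,Pune,2500.50
102,1985-11-30,Mumbai,
103,1998-01-05,Pune,1800.00
"""


def infer_type(values):
    """Guess a simple type label from the non-empty values of a column."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    # Try the strictest numeric type first, then fall back.
    for label, caster in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                caster(v)
            return label
        except ValueError:
            continue
    return "text"


def build_data_dictionary(csv_text):
    """Return one entry per column: name, inferred type, missing and distinct counts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    dictionary = []
    for column in reader.fieldnames:
        values = [row[column] for row in rows]
        dictionary.append({
            "column": column,
            "type": infer_type(values),
            "missing": sum(1 for v in values if v == ""),
            "distinct": len({v for v in values if v != ""}),
        })
    return dictionary


data_dictionary = build_data_dictionary(SAMPLE_CSV)
for entry in data_dictionary:
    print(entry)
```

Even a rough table like this tells you which columns are numeric, which have gaps, and which look like categories, which is most of what a first-level analysis needs.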
To explore data, learners need to know _exploratory data analysis (EDA)_, which relies on statistics knowledge.
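As a small illustration of how statistics feeds into EDA, the sketch below summarizes one numeric column and flags possible outliers with the common 1.5 × IQR rule of thumb. The `monthly_spend` values and the helper names `summarize` and `iqr_outliers` are made up for this example, and the outlier rule is just one of several reasonable choices.

```python
import statistics

# Hypothetical monthly spend values for a handful of customers.
monthly_spend = [1800.0, 2100.0, 2500.5, 2300.0, 9800.0, 2000.0, 2200.0]


def summarize(values):
    """Compute basic descriptive statistics for a numeric column."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "q1": q1,
        "q3": q3,
    }


def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


summary = summarize(monthly_spend)
outliers = iqr_outliers(monthly_spend)
print(summary)
print("Possible outliers:", outliers)
```

A mean that sits well above the median, as it does here, is a typical hint that a few large values are pulling the average up and deserve a closer look.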
“Give me six hours to chop down a tree and I will spend the first four sharpening the axe.”
Data preparation is the most important and time-consuming step. Here the data is prepared through preprocessing such as data transformation, aggregation, etc. We can also create new attributes from our existing ones; these new attributes are called derived attributes, e.g. deriving age from date of birth (DOB).
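Here is a minimal sketch of both ideas, a derived attribute and an aggregation, in plain Python. The records, the field names, the fixed reference date, and the helper `age_on` are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date


def age_on(dob, as_of):
    """Return the number of completed years between dob and as_of."""
    had_birthday = (as_of.month, as_of.day) >= (dob.month, dob.day)
    return as_of.year - dob.year - (0 if had_birthday else 1)


# Hypothetical customer records.
records = [
    {"customer_id": 101, "dob": date(1990, 4, 12), "city": "Pune", "monthly_spend": 2500.5},
    {"customer_id": 102, "dob": date(1985, 11, 30), "city": "Mumbai", "monthly_spend": 3100.0},
    {"customer_id": 103, "dob": date(1998, 1, 5), "city": "Pune", "monthly_spend": 1800.0},
]

# A fixed reference date keeps the example reproducible;
# in practice you might use date.today() instead.
as_of = date(2024, 1, 1)

# Derived attribute: add "age", computed from the existing "dob" field.
prepared = [{**r, "age": age_on(r["dob"], as_of)} for r in records]

# Aggregation: total monthly spend per city.
spend_by_city = defaultdict(float)
for r in prepared:
    spend_by_city[r["city"]] += r["monthly_spend"]

print([r["age"] for r in prepared])
print(dict(spend_by_city))
```

Comparing the (month, day) pairs is what keeps the age correct for people whose birthday has not yet come around in the reference year.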
Data preparation involves several rigorous steps, including the following: