No matter what kind of data science project one is assigned to, making sense of the dataset and cleaning it always critical for success. The first step is to understand the data using exploratory data analysis (EDA)as it helps us create the logical approach for solving the business problem. It also allows us to identify the issues like outliers existing in our dataset.

It is necessary to clean up these issues before starting any analysis because if our data is spewing garbage, so will our analysis. Moreover, the insights from such an analysis won’t tie up with the theoretical or business knowledge of our clients and they may lose confidence in our work.Even if the clients end up making a decision based on such an analysis but the end result will turn out to be wrong and we will be in a lot of trouble! Thus, how well we clean and understand the data has a tremendous impact on the quality of the results.

Things get slightly more complicated when we deal with datasets having hidden properties like time series datasets. Time series datasets is a special type of data which is ordered chronologically and needs special attention for handling it’s intrinsic elements like trend and seasonality.


For these reasons, We will be focusing on a step-by-step guideline that walks through the EDA and data cleaning process one can follow while working with multivariate time series data.

#data-science #time-series-analysis #outlier-detection #python #forecasting

Cleaning and Understanding Multivariate Time Series Data
7.85 GEEK