When analyzing and modelling data, a significant amount of time is spent preparing the data: loading, cleansing, transforming, and reorganizing. These tasks are often reported to take 80% or more of an analyst’s time. Often the way data is stored in files or databases is not the right format for a particular task. In this article, I will take you through techniques for data preparation and data cleaning with Python.
Fortunately, pandas, along with the built-in features of the Python language, provides a high-level, flexible, and fast set of tools for manipulating data into the right form. So we only need pandas and a few NumPy functions for data cleaning with Python.
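To give a feel for what this looks like in practice, here is a minimal sketch of a cleaning pass using only pandas and NumPy. The data, the column names (`city`, `temp_c`), and the choice to fill missing temperatures with the column mean are all assumptions made up for illustration; your own data will likely call for different steps.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with typical problems: stray whitespace,
# inconsistent capitalisation, a duplicated row, and missing values.
raw = pd.DataFrame(
    {
        "city": [" Paris", "london ", "london ", "Berlin", None],
        "temp_c": [21.0, 17.0, 17.0, np.nan, 18.0],
    }
)

cleaned = (
    raw.dropna(subset=["city"])  # drop rows that have no city at all
    .assign(city=lambda df: df["city"].str.strip().str.title())  # normalise text
    .drop_duplicates()  # remove rows that are now exact repeats
    .reset_index(drop=True)
)

# Fill the remaining missing temperature with the column mean
# (an illustrative choice, not a general recommendation).
cleaned["temp_c"] = cleaned["temp_c"].fillna(cleaned["temp_c"].mean())

print(cleaned)
```

Each step here is a single pandas call, which is what makes the library convenient for this kind of work: dropping, normalising, de-duplicating, and filling can be chained into one readable pipeline.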