Exploratory Data Analysis on Heart Disease UCI data set

A complete step-by-step exploratory data analysis with simple explanations.

Motivation

Exploratory Data Analysis (EDA) is a pre-processing step for understanding the data. There are numerous methods and steps for performing EDA; however, most of them are specific, focusing on either visualization or distribution, and are incomplete. Therefore, here I will walk through, step by step, how to understand, explore, and extract information from the data to answer questions or test assumptions. There is no fixed sequence of steps or methods to follow; however, this project will provide some insight into EDA for you and my future self.

Introduction

Cardiovascular diseases (CVDs), or heart diseases, are the number one cause of death globally, with 17.9 million deaths each year. Hypertension, diabetes, overweight, and unhealthy lifestyles jointly contribute to CVDs. You can read more on heart disease statistics and causes for your own understanding. This project covers both manual exploratory data analysis and automated analysis with pandas profiling, in a Jupyter Notebook on Google Colab. The dataset used in this project is the UCI Heart Disease dataset, and both the data and the code for this project are available on my GitHub repository.

Data Set Explanations

The original dataset contains 76 features, or attributes, from 303 patients; however, published studies use only 14 of these features, the ones relevant to predicting heart disease. Hence, here we will use the dataset of 303 patients restricted to this 14-feature subset.
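As a quick sanity check on the numbers above, here is a minimal sketch that confirms a loaded table has 303 rows and the 14 expected columns. The column names are an assumption based on the commonly distributed 14-feature CSV (the original UCI files name the last column `num` rather than `target`), and both `check_shape` and the `heart.csv` path are hypothetical names used only for illustration.

```python
import pandas as pd

# Assumed names of the 14-feature subset; adjust to match your copy of the data.
EXPECTED_COLUMNS = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target",
]


def check_shape(df, n_rows=303, columns=EXPECTED_COLUMNS):
    """Return a list of problems found; an empty list means the table looks as expected."""
    problems = []
    if len(df) != n_rows:
        problems.append(f"expected {n_rows} rows, found {len(df)}")
    missing = [c for c in columns if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    return problems


# Usage (hypothetical file name, not run here):
# df = pd.read_csv("heart.csv")
# print(check_shape(df))
```

Running a check like this before any plotting makes it obvious early on whether the file is the 14-feature version or a different variant of the dataset.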

towardsdatascience.com
