Predicting Heart Failure Survival with Machine Learning Models 

Predicting Heart Failure Survival with Machine Learning Models 

A step-by-step pythonic walk-through of the analysis of survival data along with domain-level explanation. Cardiovascular diseases are diseases of the heart and blood vessels and they typically include heart attacks, strokes, and heart failures [1]. According to the World Health Organization (WHO), cardiovascular diseases like ischaemic heart disease and stroke have been the leading causes of deaths worldwide for the last decade and a half [2].

Preface

Cardiovascular diseases are diseases of the heart and blood vessels and they typically include heart attacks, strokes, and heart failures [1]. According to the World Health Organization (WHO), cardiovascular diseases like ischaemic heart disease and stroke have been the leading causes of deaths worldwide for the last decade and a half [2].


Motivation

A few months ago, a new heart failure dataset was uploaded on Kaggle. This dataset contained health records of 299 anonymized patients and had 12 clinical and lifestyle features. The task was to predict heart failure using these features.

Through this post, I aim to document my workflow on this task and present it as a research exercise. So this would naturally involve a bit of domain knowledge, references to journal papers, and deriving insights from them.

Warning: This post is nearly 10 minutes long and things may get a little dense as you scroll down, but I encourage you to give it a shot.


About the data

The dataset was originally released by Ahmed et al., in 2017 [3] as a supplement to their analysis of survival of heart failure patients at Faisalabad Institute of Cardiology and at the Allied Hospital in Faisalabad, Pakistan. The dataset was subsequently accessed and analyzed by Chicco and Jurman in 2020 to predict heart failures using a bunch of machine learning techniques [4]. The dataset hosted on Kaggle cites these authors and their research paper.

The dataset primarily consists of clinical and lifestyle features of 105 female and 194 male heart failure patients. You can find each feature explained in the figure below.

Image for post

Fig. 1 — Clinical and lifestyle features of 299 patients in the dataset (credit: author)

Project Workflow

The workflow would be pretty straightforward —

  1. *Data Preprocessing — *Cleaning the data, imputing missing values, creating new features if needed, etc.
  2. *Exploratory Data Analysis — *This would involve summary statistics, plotting relationships, mapping trends, etc.
  3. *Model Building — *Building a baseline prediction model, followed by at least 2 classification models to train and test.

heart-disease data-science machine-learning exploratory-data-analysis data-visualization

Bootstrap 5 Complete Course with Examples

Bootstrap 5 Tutorial - Bootstrap 5 Crash Course for Beginners

Nest.JS Tutorial for Beginners

Hello Vue 3: A First Look at Vue 3 and the Composition API

Building a simple Applications with Vue 3

Deno Crash Course: Explore Deno and Create a full REST API with Deno

How to Build a Real-time Chat App with Deno and WebSockets

Convert HTML to Markdown Online

HTML entity encoder decoder Online

Exploratory Data Analysis is a significant part of Data Science

You will discover Exploratory Data Analysis (EDA), the techniques and tactics that you can use, and why you should be performing EDA on your next problem.

Exploratory Data Analysis is a significant part of Data Science

Data science is omnipresent to advanced statistical and machine learning methods. For whatever length of time that there is data to analyse, the need to investigate is obvious.

15 Machine Learning and Data Science Project Ideas with Datasets

Learning is a new fun in the field of Machine Learning and Data Science. In this article, we’ll be discussing 15 machine learning and data science projects.

Exploratory Data Analysis

Suppose you are looking to book a flight ticket for a trip of yours. Now, you will not go directly to a specific site and book the first ticket that you see.

Exploratory Data Analysis on Heart Disease UCI data set**

A complete step-by-step exploratory data analysis with simple explanation. The dataset used in this project is UCI Heart Disease dataset, and both data and code for this project are available on my GitHub repository.