Survival analysis for predictive maintenance of turbofan engines

<disclaimer: I aim to showcase the effect of different methods and choices made during model development. These effects are often shown using the test set, something which is considered (very) bad practice but helps for educational purposes.>

Welcome to another installment of the ‘Exploring NASA’s turbofan dataset’ series. This will be the fourth and final analysis on the first dataset (FD001), in which all engines run under the same operating condition and develop the same fault.

In my last post we delved into time-series analysis and explored distributed lag models for predictive maintenance. The final model performed quite well with an RMSE of 20.85. Today we’ll explore survival analysis, a technique I’m eager to try, as I’ve heard and read multiple times that it could be a suitable approach for predictive maintenance. However, I have never encountered an example implementation that satisfied my curiosity. First, what exactly is survival analysis?

Survival analysis primer

Survival analysis originated within the medical sector to answer questions about the lifetimes of specific populations. If you know someone’s age and can predict their lifetime, you can also estimate how much time that person has left to live. The technique is applied in epidemiology and in studies of disease treatment, for example. However, it can also be applied to many other cases where the data consists of durations and time-based events, such as churn prediction and predictive maintenance. Below I quickly summarize a few key concepts used within survival analysis [1, 2]:

Event: The occurrence of a phenomenon of interest, in our case the breakdown of an engine.

Duration: The time from the beginning of the observation until either the event occurs or the observation stops.

Censoring: Censoring occurs when the observation has stopped but the subject of interest has not had its ‘event’ yet.

Survival function: The survival function returns the probability of surviving past time t, i.e. S(t) = P(T > t), where T is the time of the event.

Hazard function: The hazard function returns the risk of the event occurring at time t, given that the event has not occurred before time t. In discrete time this is a conditional probability; in continuous time it is an instantaneous rate.
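To make these concepts concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard non-parametric way to estimate the survival function from durations and event flags. It is an illustration only, not necessarily the method used later in this series. The function name `kaplan_meier` and the toy numbers are my own assumptions and do not come from the turbofan dataset.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return a list of (time, hazard, survival) at each distinct event time.

    durations: observed time per subject (until the event or until observation stopped)
    events:    1 if the event was observed, 0 if the subject was censored
    """
    # Number of observed events per time point; censored subjects are not counted here
    events_at = Counter(t for t, e in zip(durations, events) if e)

    survival = 1.0
    curve = []
    for t in sorted(events_at):
        # Subjects still under observation just before t, censored ones included
        at_risk = sum(1 for d in durations if d >= t)
        # Discrete hazard: share of at-risk subjects that have their event at t
        hazard = events_at[t] / at_risk
        survival *= 1 - hazard
        curve.append((t, hazard, survival))
    return curve


# Hypothetical toy data: cycles observed per engine and whether it broke down
durations = [150, 180, 200, 220, 250, 260]
events = [1, 1, 0, 1, 0, 1]  # 0 = censored (still running when observation stopped)

for t, h, s in kaplan_meier(durations, events):
    print(f"t={t}: hazard={h:.3f}, survival={s:.3f}")
```

Note how the censored engines (200 and 250 cycles) never trigger a drop in the survival curve, yet they still count towards the at-risk group for every event time up to their last observation.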

For me, one of the appealing aspects of survival analysis is the possibility to include subjects (or in our case machines) in the model that did not have their event yet. In more traditional machine learning you would discard ‘incomplete’ or censored subjects from your dataset, which can bias results [3]. With some of the basics explained, it’s time to get started!
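As a small illustration of why discarding censored subjects can bias results, consider the sketch below. The numbers are made up for the example (they are not taken from FD001), and the toy fleet is deliberately constructed so that the censored engines are the longest-running ones.

```python
from statistics import mean

# Hypothetical toy fleet: observed cycles per engine and whether the engine
# broke down (1) or was still running when observation stopped (0, censored).
durations = [150, 180, 200, 220, 250, 260]
failed = [1, 1, 1, 0, 0, 0]

# Naive approach: keep only the engines with an observed breakdown.
naive_mean = mean(d for d, f in zip(durations, failed) if f)

# A censored engine lasted at least as long as its observed duration, so the
# mean over all observed durations is a lower bound on the true mean lifetime.
lower_bound = mean(durations)

print(f"mean lifetime, censored engines dropped: {naive_mean:.1f}")
print(f"lower bound using all engines:           {lower_bound:.1f}")
```

In this toy case the naive estimate falls below even the lower bound, so dropping the censored engines would make the fleet look shorter-lived than it can possibly be.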
