The importance of problem framing for supervised predictive maintenance solutions

Revisiting our assumption of Remaining Useful Life & Support Vector Regression

<disclaimer: I aim to showcase the effect of different methods and choices made during model development. These effects are often shown on the test set, which is considered (very) bad practice but is helpful for educational purposes.>

In my last post, we explored NASA’s FD001 turbofan degradation dataset. To quickly recap, sensors 1, 5, 6, 10, 16, 18 and 19 held no information related to Remaining Useful Life (RUL). After removing these from the data, we fitted a baseline Linear Regression model, which had an RMSE of 31.95. Today we’ll re-examine our assumption of RUL to improve our accuracy, and fit a Support Vector Regression (SVR) in an attempt to improve our results further. Let’s get started!

First, we’ll load the data and inspect the first few rows to confirm it loaded correctly.

```python
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()

from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, r2_score

dir_path = './CMAPSSData/'

# define column names for easy indexing
index_names = ['unit_nr', 'time_cycles']
setting_names = ['setting_1', 'setting_2', 'setting_3']
sensor_names = ['s_{}'.format(i) for i in range(1, 22)]  # s_1 .. s_21
col_names = index_names + setting_names + sensor_names

# inspect first few rows
```
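The loading step itself is not shown in the snippet above, so here is a minimal sketch of how it could look. It assumes the C-MAPSS files are whitespace-separated text without a header row; the helper name `load_cmapss` and the two-row sample are hypothetical and only there so the example runs without the dataset on disk.

```python
import io
import pandas as pd

# Column layout taken from the setup code above.
index_names = ['unit_nr', 'time_cycles']
setting_names = ['setting_1', 'setting_2', 'setting_3']
sensor_names = ['s_{}'.format(i) for i in range(1, 22)]
col_names = index_names + setting_names + sensor_names


def load_cmapss(source, names=col_names):
    """Read a whitespace-separated, header-less file into a DataFrame.

    `source` may be a file path or any file-like object.
    (Hypothetical helper, not part of the original post.)
    """
    return pd.read_csv(source, sep=r'\s+', header=None, names=names)


# Made-up two-row sample in the assumed layout: 2 index columns,
# 3 settings and 21 sensor readings per row.
sample = io.StringIO(
    ' '.join(['1', '1'] + ['0.5'] * 24) + '\n'
    + ' '.join(['1', '2'] + ['0.6'] * 24) + '\n'
)
demo = load_cmapss(sample)
print(demo.shape)   # (2, 26)
print(demo.head())  # inspect first few rows
```

With the dataset in place, the same helper could be pointed at a real file, for example `train = load_cmapss(dir_path + 'train_FD001.txt')`, assuming the training file sits under `dir_path` with that name.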

