In a couple of previous articles, we focused specifically on the performance of machine learning models. First, we talked about how to quantify machine learning model performance and how to improve it with regularization. Then we covered other optimization techniques, from basic ones like Gradient Descent to advanced ones like Adam. Finally, we saw how to perform hyperparameter optimization and find the best “configuration” for your model.

However, what we haven’t considered so far is how we can improve performance by modifying the data itself; our focus has been on the model. In earlier articles about SVM and clustering, we applied some techniques (like scaling) to our data, but we haven’t done a deeper analysis of this process or of how manipulating the dataset can help us improve performance. In this article we do exactly that: we explore the most effective feature engineering techniques, which are often required in order to get good results.

Top 9 Feature Engineering Techniques with Python:

1. Imputation

2. Categorical Encoding

  • 2.1 Label Encoding
  • 2.2 One-Hot Encoding
  • 2.3 Count Encoding
  • 2.4 Target Encoding
  • 2.5 Leave One Out Target Encoding

3. Handling Outliers

4. Binning

5. Scaling

  • 5.1 Standard Scaling
  • 5.2 Min-Max Scaling (Normalization)
  • 5.3 Quantile Transformation

6. Log Transform

7. Feature Selection

8. Feature Grouping

9. Feature Split
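As a quick taste of what follows, here is a minimal sketch of two of the techniques listed above, imputation (1) and min-max scaling (5.2), on a toy NumPy array; the data values are made up for illustration:

```python
import numpy as np

# Toy feature column with one missing value (np.nan).
x = np.array([1.0, 2.0, np.nan, 4.0])

# 1. Imputation: replace missing values with the mean of the observed values.
mean = np.nanmean(x)  # mean over non-missing entries
x_imputed = np.where(np.isnan(x), mean, x)

# 5.2 Min-Max scaling (normalization): map values into the [0, 1] range.
x_scaled = (x_imputed - x_imputed.min()) / (x_imputed.max() - x_imputed.min())

print(x_imputed)  # missing value filled with the mean
print(x_scaled)   # smallest value becomes 0, largest becomes 1
```

In practice you would usually rely on library implementations (e.g. scikit-learn's preprocessing utilities) rather than hand-rolling these, as the article's examples show later.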

