If you have worked with sklearn before, you have certainly come across the struggle of choosing between dataframes and arrays as inputs to your transformers and estimators. Both have their advantages and disadvantages. But once you deploy your model, for example as a service, in many cases it will serve single predictions. Max Halford has shown some great examples of how to modify various sklearn transformers and estimators to serve single predictions with an extra performance boost, with potential response times in the low millisecond range! In this short post we will build on these tricks and develop a full pipeline.

A few months ago Max Halford wrote an awesome blog post in which he described how we can modify sklearn transformers and estimators to handle single data points at higher speed, essentially by using one-dimensional arrays. When you build sklearn model pipelines, they usually work with numpy arrays and pandas dataframes interchangeably. Arrays often provide better performance, because the numpy implementations of many computations are highly performant and often vectorized. But it becomes trickier to control your transformations using column names, which arrays do not have. If you use pandas dataframes you might get worse performance, but your code becomes more readable and column names (i.e. feature names) stick with the data through most transformers. During data exploration and model training you are mostly interested in batch transformations and predictions, but once you deploy your trained model pipeline as a service, you might also be interested in single predictions. In that case, service users will send a payload like below.
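A minimal sketch of what such a single-prediction payload and request could look like (the field names, toy data, and pipeline here are made up for illustration — sklearn expects two-dimensional input, so the single record has to be wrapped in a one-row dataframe):

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Tiny toy training set so the example is self-contained
# (hypothetical feature names).
X_train = pd.DataFrame({
    "age": [23, 45, 31, 52],
    "income": [40_000, 80_000, 55_000, 90_000],
})
y_train = [0, 1, 0, 1]

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
pipeline.fit(X_train, y_train)

# A service user sends a single record as a JSON-like payload.
payload = {"age": 28, "income": 48_000}

# Standard sklearn only accepts 2D input, so the payload is
# wrapped in a one-row dataframe before predicting.
prediction = pipeline.predict(pd.DataFrame([payload]))[0]
print(prediction)
```

Note the overhead: to score one record, the payload is first turned into a full dataframe, and every transformer still runs its batch code path. This is exactly the cost that the single-prediction tricks discussed here aim to remove.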

#machine-learning #python #sklearn #pipeline #performance

Speeding up a sklearn model pipeline to serve single predictions with very low latency