The volume of data keeps growing, and the more data we have, the more problems we can solve with it. Suppose you train a machine learning model and want to save it for later predictions. There are several serialization/deserialization options: pickling (Pickle, cPickle, Joblib, JsonPickle, Dill, MLflow), exporting a pipeline to PMML (sklearn2pmml), and saving to JSON (sklearn_json, sklearn_export, msgpack, JsonPickle).
You can also use the m2cgen library to export your model to Python/Java/C++ code, or write your own serialization and deserialization code.
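As a quick illustration of the first two options above, here is a minimal sketch of saving and loading a scikit-learn model with plain pickle and with joblib. The RandomForestClassifier, dataset, and file names are just examples, not taken from the article:

```python
import pickle

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative model and data (not the article's actual dataset).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Plain pickle: write the fitted model to disk, then load it back for predictions.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

# joblib is often preferred for scikit-learn models that carry large NumPy arrays.
joblib.dump(model, "model.joblib")
restored_joblib = joblib.load("model.joblib")

print(restored.predict(X[:5]), restored_joblib.predict(X[:5]))
```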
When you build a machine learning model on a huge dataset, the model tends to be huge too! The trouble starts when you try to save this enormous serialized object to disk (here I will talk about the large RandomForestClassifier pickle file I had to deal with).
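One way to shrink such a file, sketched below under my own assumptions rather than as the article's exact recipe, is to run the pickle stream through pickletools.optimize (which removes unused PUT opcodes) and then compress it with gzip before writing it out. The model size and file name are illustrative:

```python
import gzip
import pickle
import pickletools

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative large-ish model (not the article's actual one).
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

raw = pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL)
optimized = pickletools.optimize(raw)          # drop unused PUT opcodes from the stream

with gzip.open("model.pkl.gz", "wb") as f:     # compress the optimized stream on disk
    f.write(optimized)

with gzip.open("model.pkl.gz", "rb") as f:     # load it back later for predictions
    restored = pickle.loads(f.read())

print(len(raw), len(optimized))                # the optimized stream is usually somewhat smaller
```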

#big-data #python #pickles #serialization #data-science

The Power of Pickletools: Handling Large Model Pickle Files