5 Things You Don’t Know about PyCaret

PyCaret

PyCaret is an open source machine learning library in Python to train and deploy supervised and unsupervised machine learning models in a low-code environment. It is known for its ease of use and efficiency.

Compared with other open source machine learning libraries, PyCaret is a low-code alternative that can replace hundreds of lines of code with only a few.

You can tune “n parameter” in unsupervised experiments

In unsupervised machine learning, the “n parameter” (the number of clusters in clustering experiments, the fraction of outliers in anomaly detection, or the number of topics in topic modeling) is of fundamental importance.

When the eventual objective of the experiment is to predict an outcome (classification or regression) using the results of an unsupervised experiment, the tune_model() function in the **pycaret.clustering**, **pycaret.anomaly**, and **pycaret.nlp** modules comes in very handy.
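The idea behind this kind of tuning can be sketched without any ML library. The snippet below is only an illustration of the selection loop, not PyCaret's actual implementation: the function name `tune_n` and both callbacks are hypothetical. Each candidate value of n is used to fit an unsupervised model, the result is scored by a supervised step, and the n with the best score wins.

```python
from typing import Callable, Dict, Iterable, Tuple, TypeVar

Features = TypeVar("Features")


def tune_n(
    candidates: Iterable[int],
    fit_unsupervised: Callable[[int], Features],
    score_supervised: Callable[[Features], float],
) -> Tuple[int, Dict[int, float]]:
    """Return the candidate n with the highest downstream score, plus all scores."""
    scores: Dict[int, float] = {}
    for n in candidates:
        # e.g. cluster labels, anomaly flags, or topic weights for this n
        features = fit_unsupervised(n)
        # e.g. a cross-validated metric of a classifier trained on those features
        scores[n] = score_supervised(features)
    if not scores:
        raise ValueError("candidates must not be empty")
    best_n = max(scores, key=lambda n: scores[n])
    return best_n, scores


# Toy stand-ins so the sketch runs on its own:
# the "features" are just n itself, and the score peaks at n = 4.
best_n, scores = tune_n(
    candidates=[2, 4, 8, 16],
    fit_unsupervised=lambda n: n,
    score_supervised=lambda n: -abs(n - 4),
)
print(best_n)
```

In a real experiment the two callbacks would wrap an actual clustering, anomaly detection, or topic model and a supervised estimator.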

To understand this, let’s see an example using the “Kiva” dataset.

This is a micro-banking loan dataset where each row represents a borrower with their relevant information. The column ‘en’ captures the loan application text of each borrower, and the column ‘status’ records whether the borrower defaulted (default = 1, no default = 0).

You can use the **tune_model** function in **pycaret.nlp** to optimize the **num_topics** parameter based on the target variable of a supervised experiment (i.e. finding the number of topics that best improves the prediction of the final target variable). You can define the model used for training with the estimator parameter (‘xgboost’ in this case). The function returns a trained topic model and a visual showing supervised metrics at each iteration.
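A minimal sketch of that call is shown here, assuming a PyCaret 2.x installation where the pycaret.nlp module is available and the ‘kiva’ dataset can be downloaded with get_data. The wrapper function `tune_kiva_topics` is a hypothetical name used only to keep the imports lazy, and ‘lda’ is an assumed choice of topic model.

```python
def tune_kiva_topics():
    """Tune the number of LDA topics against the 'status' target.

    Assumption: PyCaret 2.x, where pycaret.nlp is available.
    """
    # Imported inside the function so that defining it does not require PyCaret.
    from pycaret.datasets import get_data
    from pycaret.nlp import setup, tune_model

    # Kiva loans: 'en' holds the application text, 'status' the default flag.
    data = get_data("kiva")

    # In pycaret.nlp, 'target' names the text column to model.
    setup(data=data, target="en")

    # Search over num_topics, scoring each value with an xgboost classifier
    # trained to predict 'status'.
    tuned_lda = tune_model(
        model="lda",
        supervised_target="status",
        estimator="xgboost",
    )
    return tuned_lda
```

Calling `tune_kiva_topics()` in a notebook should display the metric-per-iteration visual and return the tuned topic model.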

towardsdatascience.com
