In this post we will show how to use the Prevision.io SDK to create a multiclass classification use case, using the white wine quality data from the UCI Machine Learning Repository.
The machine learning objective is to predict white wine quality from its chemical characteristics (acidity, pH, density, sulphates, ...).
Furthermore, we will compare Prevision's performance with that of models we code ourselves, and show how both approaches can be compared within exactly the same scope (same cross-validation and same test evaluation) despite the black-box nature of the AutoML solution offered by the Prevision platform.
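To make "same cross-validation" concrete: one possible way to guarantee identical folds for every model is to compute the fold assignment once and store it as a column. The sketch below is only an illustration under stated assumptions: `add_fold_column` is a hypothetical helper, the `fold` column name is an arbitrary choice, and the small synthetic frame stands in for the wine data. How such a column is passed to the Prevision.io platform is not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold


def add_fold_column(df, target, n_splits=5, seed=42):
    """Return a copy of df with an integer 'fold' column (hypothetical helper)."""
    out = df.copy()
    out["fold"] = -1
    fold_pos = out.columns.get_loc("fold")
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Each row appears in exactly one validation split, so the validation
    # indices alone define the fold assignment.
    for fold_id, (_, valid_idx) in enumerate(skf.split(out, out[target])):
        out.iloc[valid_idx, fold_pos] = fold_id
    return out


# Synthetic stand-in for the wine data (values are made up for illustration).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, size=300),
    "quality": rng.choice([5, 6, 7], size=300),
})

folded = add_fold_column(demo, target="quality")
print(folded["fold"].value_counts().sort_index())
```

Because the seed is fixed, calling the helper again on the same data yields the same assignment, so any model trained against the `fold` column is validated on exactly the same rows.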
Let’s get the dataset:
import pandas as pd
df = pd.read_csv('winequality-white.csv', sep=';')
Let's create a sub-sample (about 20% of the overall dataset) that we will use as a holdout dataset in order to estimate the generalization error of our models. This sub-sample is put aside and not used for training, so it tells us how well our models perform on new data (not seen during the training phase).
There are many ways to create this sample; the simplest is to use train_test_split() from scikit-learn:
from sklearn.model_selection import train_test_split
train_set, test_set = train_test_split(df, test_size=0.2, random_state=42)
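Since this is a classification problem, a variant worth considering is a stratified split, which keeps the class proportions similar in the training and holdout parts. This is not what the split above does; it is a minimal sketch assuming the target column is named `quality`, and it uses a small synthetic frame (made-up values) instead of the real file so that it runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for df (made-up values); 'quality' is the assumed target.
rng = np.random.default_rng(0)
df_demo = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, size=500),
    "quality": rng.choice([5, 6, 7], size=500, p=[0.3, 0.5, 0.2]),
})

# stratify= asks scikit-learn to preserve the class distribution in both parts.
train_demo, test_demo = train_test_split(
    df_demo, test_size=0.2, random_state=42, stratify=df_demo["quality"]
)

# The class proportions should be close in both parts.
train_share = train_demo["quality"].value_counts(normalize=True).sort_index()
test_share = test_demo["quality"].value_counts(normalize=True).sort_index()
print(pd.DataFrame({"train": train_share, "test": test_share}))
```

With the real data, the same call would take `df` and `df['quality']` in place of the synthetic frame.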
During the feature engineering step, you can construct two types of features:
The second kind of feature engineering is supported by the platform: once you launch the use case on your dataset, you can select the transformations you want to apply, and they will be computed automatically and added as stand-alone features. For more information about the feature transformations supported by the platform, consult this link.