
Hyperparameter optimization is often one of the final steps in a data science project. Once you have a shortlist of promising models, you will want to fine-tune them so that they perform better on your particular dataset.

In this post, we will go over three techniques used to find optimal hyperparameters, with examples of how to implement them on Scikit-Learn models and, finally, a neural network in Keras. The three techniques we will discuss are as follows:

- Grid Search
- Randomized Search
- Bayesian Optimization


One option would be to fiddle around with the hyperparameters manually, until you find a great combination of hyperparameter values that optimize your performance metric. This would be very tedious work, and you may not have time to explore many combinations.

Instead, you should get Scikit-Learn’s `GridSearchCV` to do it for you. All you have to do is tell it which hyperparameters you want to experiment with and what values to try out, and it will use cross-validation to evaluate all the possible combinations of hyperparameter values.

Let’s work through an example where we use `GridSearchCV` to search for the best combination of hyperparameter values for a `RandomForestClassifier` trained on the popular MNIST dataset.

To give you a feel for the complexity of the classification task, the figure below shows a few images from the MNIST dataset:

*[Figure: sample images of handwritten digits from MNIST]*

To implement `GridSearchCV` we need to define a few things: first, the hyperparameters we want to experiment with and the values we want to try out. Below we specify this in a dictionary called `param_grid`.

```
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# One way to load MNIST; the original notebook may have prepared
# X_train and y_train differently
X, y = fetch_openml('mnist_784', version=1, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Hyperparameters to experiment with, and the values to try for each
param_grid = {'bootstrap': [True],
              'max_depth': [6, 10],
              'max_features': ['auto', 'sqrt'],
              'min_samples_leaf': [3, 5],
              'min_samples_split': [4, 6],
              'n_estimators': [100, 350]}

forest_clf = RandomForestClassifier()

# Evaluate every combination with five-fold cross-validation
forest_grid_search = GridSearchCV(forest_clf, param_grid, cv=5,
                                  scoring="accuracy",
                                  return_train_score=True,
                                  verbose=True,
                                  n_jobs=-1)
forest_grid_search.fit(X_train, y_train)
```

The `param_grid` tells Scikit-Learn to evaluate 1 × 2 × 2 × 2 × 2 × 2 = 32 combinations of the specified `bootstrap`, `max_depth`, `max_features`, `min_samples_leaf`, `min_samples_split` and `n_estimators` hyperparameter values. The grid search will explore all 32 combinations of the RandomForestClassifier’s hyperparameter values, and it will train each model 5 times (since we are using five-fold cross-validation). In other words, all in all, there will be 32 × 5 = 160 rounds of training! It may take a long time, but when it is done you can get the best combination of hyperparameters like this:

```
forest_grid_search.best_params_
```

```
{'bootstrap': True,
 'max_depth': 10,
 'max_features': 'auto',
 'min_samples_leaf': 3,
 'min_samples_split': 4,
 'n_estimators': 350}
```

Since `n_estimators=350` and `max_depth=10` are the maximum values that were evaluated, you should probably try searching again with higher values; the score may continue to improve.

You can also get the best estimator directly:

```
forest_grid_search.best_estimator_
```

```
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
                       max_depth=10, max_features='auto', max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=3, min_samples_split=4,
                       min_weight_fraction_leaf=0.0, n_estimators=350,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)
```
And of course the evaluation score is also available:

```
forest_grid_search.best_score_
```

```
0.9459
```

Our best score here is 94.59% accuracy, which is not bad for such a small parameter grid.
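The outline above also promises Randomized Search. As a minimal sketch of how it would look with the same classifier (the variables `forest_clf`, `X_train` and `y_train` are carried over from above; the exact distributions are illustrative, not from the original notebook), Scikit-Learn’s `RandomizedSearchCV` samples a fixed number of combinations instead of trying them all:

```
from scipy.stats import randint
from sklearn.model_selection import RandomizedSearchCV

# Sample 10 random combinations from these distributions rather than
# exhaustively evaluating every combination as GridSearchCV does
param_distribs = {'max_depth': randint(low=6, high=20),
                  'min_samples_leaf': randint(low=1, high=10),
                  'n_estimators': randint(low=100, high=500)}

forest_rnd_search = RandomizedSearchCV(forest_clf, param_distribs,
                                       n_iter=10, cv=5,
                                       scoring="accuracy",
                                       random_state=42, n_jobs=-1)
forest_rnd_search.fit(X_train, y_train)
forest_rnd_search.best_params_
```

For Bayesian Optimization, one drop-in option (an assumption on my part; the original notebook may use a different library) is `BayesSearchCV` from the scikit-optimize package, which chooses each new combination based on the scores of the combinations already evaluated:

```
from skopt import BayesSearchCV
from skopt.space import Integer

forest_bayes_search = BayesSearchCV(forest_clf,
                                    {'max_depth': Integer(6, 20),
                                     'min_samples_leaf': Integer(1, 10),
                                     'n_estimators': Integer(100, 500)},
                                    n_iter=10, cv=5,
                                    scoring="accuracy",
                                    random_state=42, n_jobs=-1)
forest_bayes_search.fit(X_train, y_train)
forest_bayes_search.best_params_
```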

#machine-learning #classification #deep learning



So far in our journey through the Machine Learning universe, we covered several big topics. We investigated some **regression** algorithms, **classification** algorithms and algorithms that can be used for both types of problems (**SVM**, **Decision Trees** and **Random Forest**). Apart from that, we dipped our toes in unsupervised learning, saw how we can use this type of learning for **clustering** and learned about several clustering techniques.

We also talked about how to quantify machine learning model **performance** and how to improve it with **regularization**. In all these articles, we used Python for “from scratch” implementations and libraries like **TensorFlow**, **PyTorch** and **SciKit Learn**. The word optimization popped up more than once in these articles, so in this and the next article, we focus on optimization techniques, which are an important part of the machine learning process.

In general, every machine learning algorithm is composed of three integral parts:

- A **loss** function.
- An optimization criterion based on the loss function, such as a **cost** function.
- An **optimization** technique – a process that leverages training data to find a solution for the optimization criterion (cost function).

As you were able to see in previous articles, some algorithms were created intuitively and didn’t have optimization criteria in mind. In fact, mathematical **explanations** of why and how these algorithms work were provided later. Some of these algorithms are **Decision Trees** and **kNN**. Other algorithms, which were developed later, had this in mind beforehand; **SVM** is one example.

During training, we change the parameters of our machine learning model to try to **minimize** the loss function. However, this raises the question of how to change those parameters, by how much, and when. To answer these questions we use **optimizers**. They tie all the different parts of the machine learning algorithm together. So far we have mentioned **Gradient Descent** as an optimization technique, but we haven’t explored it in more detail. In this article, we focus on that and cover the **grandfather** of all optimization techniques and its variations. Note that these techniques are **not** machine learning algorithms themselves. They are solvers of **minimization** problems in which the function to minimize has a gradient in most points of its domain.
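Before we dig deeper, here is a toy sketch of the core idea (illustrative only, not the implementation we build later): a gradient descent step moves each parameter a small distance against the gradient of the loss.

```
import numpy as np

def gradient_descent_step(w, grad_fn, learning_rate=0.01):
    # One generic update: w <- w - learning_rate * dL/dw
    return w - learning_rate * grad_fn(w)

# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w = np.array([0.0])
for _ in range(500):
    w = gradient_descent_step(w, lambda w: 2 * (w - 3))
print(w)  # approaches 3.0, the minimizer of f
```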

The data we use in this article is the famous *Boston Housing Dataset*. This dataset is composed of 14 features and contains information collected by the U.S. Census Service concerning housing in the area of Boston, Massachusetts. It is a small **dataset** with only 506 samples.

For the purpose of this article, make sure that you have installed the following *Python* libraries:

- **NumPy** – Follow **this guide** if you need help with installation.
- **SciKit Learn** – Follow **this guide** if you need help with installation.
- **Pandas** – Follow **this guide** if you need help with installation.

Once installed make sure that you have imported all the necessary modules that are used in this tutorial.

```
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
```

Apart from that, it would be good to be at least familiar with the basics of **linear algebra**, **calculus** and **probability**.

Note that we also use simple **Linear Regression** in all examples. Since we are exploring **optimization** techniques, we picked the easiest machine learning algorithm. You can see more details about Linear Regression **here**. As a quick reminder, the formula for linear regression goes like this:

$$\hat{y} = w \cdot x + b$$

where *w* and *b* are parameters of the machine learning algorithm. The entire point of the training process is to set the correct values for *w* and *b*, so that we get the desired output from the machine learning model. This means that we are trying to make the value of our **error vector** as small as possible, i.e. to find a **global minimum of the cost function**.

One way of solving this problem is to use calculus. We could compute derivatives and then use them to find the extrema of the cost function. However, the cost function is not a function of one or a few variables; it is a function of all the parameters of a machine learning algorithm, so these calculations quickly grow into a monster. That is why we use these optimizers.
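To make this concrete, here is a minimal sketch of the setup described above: standardize the Boston Housing features, then fit linear regression with stochastic gradient descent via `SGDRegressor`. (Note: `load_boston` shipped with Scikit-Learn when this was written but has been removed from recent releases, so treat the loading step as an assumption about your environment.)

```
from sklearn.datasets import load_boston
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = load_boston()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42)

# Gradient descent is sensitive to feature scale, so standardize first
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Fit linear regression by stochastic gradient descent
sgd = SGDRegressor(max_iter=1000, tol=1e-3)
sgd.fit(X_train, y_train)
print(mean_squared_error(y_test, sgd.predict(X_test)))
```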

#ai #machine learning #python #artificaial inteligance #artificial intelligence #batch gradient descent #data science #datascience #deep learning #from scratch #gradient descent #machine learning #machine learning optimizers #ml optimization #optimizers #scikit learn #software #software craft #software craftsmanship #software development #stochastic gradient descent


A milestone for open source projects: French President Emmanuel Macron has recently been introduced to Scikit-learn. In fact, in a recent tweet, Scikit-learn creator and Inria tenured research director Gael Varoquaux announced the presentation of Scikit-learn, with applications of machine learning in digital health, to the president of France.

Describing the advancement of this free machine learning library, he stated: “started from the grassroots, built by a community, we are powering digital revolutions, adding transparency and independence.”

#news #application of scikit learn for machine learning #applications of scikit learn for digital health #scikit learn #scikit learn introduced to french president


Welcome to the DataFlair Keras tutorial. This tutorial will introduce you to everything you need to know to get started with Keras. You will discover the characteristics, features, and various other properties of Keras. This article also explains the different neural network layers and the pre-trained models available in Keras. You will get an idea of how Keras makes it easier to try and experiment with new architectures in neural networks, and how Keras empowers new ideas and implements them in a faster, more efficient way.

Keras is an open-source deep learning framework developed in Python. Developers favor Keras because it is user-friendly, modular, and extensible. Keras allows developers to experiment with neural networks quickly.

Keras is a high-level API and uses TensorFlow, Theano, or CNTK as its backend. It provides a very clean and easy way to create deep learning models.

Keras has the following characteristics:

- It is simple to use and consistent. Since we describe models in Python, it is easy to code, compact, and easy to debug.
- Keras is built on minimal structure; it tries to minimize the user actions required for common use cases.
- Keras allows us to use multiple backends, provides GPU support on CUDA, and allows us to train models on multiple GPUs.
- It offers a consistent API that provides necessary feedback when an error occurs.
- Using Keras, you can customize the functionalities of your code to a great extent. Even small customizations make a big difference because these functionalities are deeply integrated with the low-level backend.

The major benefits of using Keras over other deep learning frameworks are:

- The simple API structure of Keras is designed for both new developers and experts.
- The Keras interface is very user friendly and is pretty optimized for general use cases.
- In Keras, you can write custom blocks to extend it.
- Keras is the second most popular deep learning framework after TensorFlow.
- TensorFlow also provides a Keras implementation via its tf.keras module. You can access all the functionality of Keras in TensorFlow using tf.keras.

Before installing Keras, you should have one of its backends installed; we recommend TensorFlow. Install TensorFlow and Keras using the pip Python package installer.
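For example, from the command line:

```
pip install tensorflow
pip install keras
```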

The basic data structure of Keras is the model, which defines how to organize layers. The simplest type of model is the Sequential model, which adds layers one after another. For more flexible architectures, Keras provides the Functional API, which allows you to take multiple inputs, produce multiple outputs, and define more complex models.
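As a quick illustration (a minimal sketch with arbitrary layer sizes, not code from this tutorial), here is the same small network built with both the Sequential model and the Functional API using tf.keras:

```
from tensorflow import keras
from tensorflow.keras import layers

# Sequential API: a linear stack of layers
model = keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(784,)),
    layers.Dense(10, activation="softmax"),
])

# Functional API: the same network built by wiring tensors explicitly,
# which generalizes to multiple inputs and multiple outputs
inputs = keras.Input(shape=(784,))
x = layers.Dense(64, activation="relu")(inputs)
outputs = layers.Dense(10, activation="softmax")(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs)

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```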

#keras tutorials #introduction to keras #keras models #keras tutorial #layers in keras #why learn keras


This article is a spotlight on the need for the Python deep learning library Keras. Keras offers a uniform interface to various deep learning frameworks, including TensorFlow, Theano, and MXNet. Let us see why you should choose and learn Keras now.

Keras makes deep learning accessible and local on your computer. It also acts as a frontend for big cloud providers. It is the most frequently recommended library for beginners who want to start their journey in machine learning, and it provides a minimal approach to running neural networks that allows students to learn complex features from input data sequentially.

Let us see some of the features of Keras that make it worth learning.

Keras is the easiest machine learning library for beginners to use. Being simple helps it bring machine learning from imagination to reality. It provides an infrastructure that can be learned in very little time. Using Keras, you will be able to stack layers like an expert.

Python is the most popular language for machine learning and data science. Compatibility with Python gives Keras many useful features: writing less code, easy debugging, easy deployment, and extensibility all come from Keras’ support for Python 2.7 and Python 3.6.

Keras, being a high-level API, supports multiple popular and powerful backend frameworks. TensorFlow, Theano, and CNTK are dominant choices for backend computation, and Keras supports all of them.

The importance of Keras has inspired many other innovative tools for exploring deep learning, built on top of Keras. Some of these tools are:

- Deepjazz: deep learning-driven jazz built using Keras and Theano, available on GitHub.
- Eclipse Picasso: a visualization tool that works with Keras checkpoints.
- Auto-Keras: built on top of Keras and used for machine learning model automation.

- Keras allows us to switch between backends as the requirements of our application change. It acts as a wrapper that gives us the freedom to use either TensorFlow, Theano, or any other supported framework.
- Keras is very easy and enjoyable to use. It follows great guiding principles like extensibility, Python nativeness, and modularity.
- Keras can create state-of-the-art implementations of common deep neural networks. These are fast, and it is easy to get them running using Keras.
- Being a Keras user, you will be faster and more productive, and you will have the ability to try more ideas.
- Keras provides multi-GPU and strong distributed support. We can run our deep learning models on large GPU clusters.
- We can deploy Keras deep learning models on multiple platforms. For example, we can deploy in the browser using TensorFlow.js, on the server using either TensorFlow Serving or a Node.js runtime, and on mobile devices (Android or iOS) using TensorFlow Lite.
- Keras has a large ecosystem of products to support your deep learning development. Some of the popular products are TensorFlow Cloud, Keras Tuner, TensorFlow Lite, TensorFlow.js, and TensorFlow Model Optimization.

#keras tutorials #importance of keras #keras features #learn keras #deep learning
