1595852280

Hyperparameter optimization is often one of the final steps in a data science project. Once you have a shortlist of promising models you will want to fine-tune them so that they perform better on your particular dataset.

In this post, we will go over three techniques used to find optimal hyperparameters, with examples of how to implement them on models in Scikit-Learn and finally on a neural network in Keras. The three techniques we will discuss are as follows:

- Grid Search
- Randomized Search
- Bayesian Optimization

You can view the jupyter notebook here.

One option would be to fiddle around with the hyperparameters manually, until you find a great combination of hyperparameter values that optimize your performance metric. This would be very tedious work, and you may not have time to explore many combinations.

Instead, you should get Scikit-Learn’s `GridSearchCV` to do it for you. All you have to do is tell it which hyperparameters you want to experiment with and what values to try out, and it will use cross-validation to evaluate all the possible combinations of hyperparameter values.

Let’s work through an example where we use `GridSearchCV` to search for the best combination of hyperparameter values for a `RandomForestClassifier` trained on the popular MNIST dataset.

To give you a feel for the complexity of the classification task, the figure below shows a few images from the MNIST dataset:

To implement `GridSearchCV` we need to define a few things. The first is the set of hyperparameters we want to experiment with and the values we want to try out. Below we specify this in a dictionary called `param_grid`.

```
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Hyperparameters to search and the candidate values for each.
# Note: for classifiers 'auto' means the same as 'sqrt'; it was deprecated
# in scikit-learn 1.1 and later removed, so drop it on newer versions.
param_grid = {'bootstrap': [True],
              'max_depth': [6, 10],
              'max_features': ['auto', 'sqrt'],
              'min_samples_leaf': [3, 5],
              'min_samples_split': [4, 6],
              'n_estimators': [100, 350]
              }

forest_clf = RandomForestClassifier()
forest_grid_search = GridSearchCV(forest_clf, param_grid, cv=5,
                                  scoring="accuracy",
                                  return_train_score=True,
                                  verbose=True,
                                  n_jobs=-1)

# X_train and y_train are the MNIST training images and labels.
forest_grid_search.fit(X_train, y_train)
```

The `param_grid` tells Scikit-Learn to evaluate 1 x 2 x 2 x 2 x 2 x 2 = 32 combinations of the `bootstrap`, `max_depth`, `max_features`, `min_samples_leaf`, `min_samples_split` and `n_estimators` hyperparameters specified. The grid search will train each of these 32 models 5 times (since we are using five-fold cross-validation). In other words, all in all, there will be 32 x 5 = 160 rounds of training! It may take a long time, but when it is done you can get the best combination of hyperparameters like this:
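The arithmetic above can be checked without training anything. The following is a minimal sketch, assuming the same `param_grid` as in the listing: Scikit-Learn's `ParameterGrid` enumerates the combinations a grid search would try. The names `n_folds`, `n_combinations` and `n_fits` are introduced here only for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the article's listing; ParameterGrid only enumerates
# combinations, it does not validate or train anything.
param_grid = {
    "bootstrap": [True],
    "max_depth": [6, 10],
    "max_features": ["auto", "sqrt"],
    "min_samples_leaf": [3, 5],
    "min_samples_split": [4, 6],
    "n_estimators": [100, 350],
}

n_folds = 5  # matches cv=5 in the grid search
n_combinations = len(ParameterGrid(param_grid))
n_fits = n_combinations * n_folds

print(n_combinations, n_fits)
```

Note that with the default `refit=True`, `GridSearchCV` also trains one final model on the whole training set using the best combination, on top of the cross-validation fits counted here.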

```
forest_grid_search.best_params_
```

```
{'bootstrap': True,
 'max_depth': 10,
 'max_features': 'auto',
 'min_samples_leaf': 3,
 'min_samples_split': 4,
 'n_estimators': 350}
```

Since `n_estimators=350` and `max_depth=10` are the maximum values that were evaluated, you should probably try searching again with higher values; the score may continue to improve.

You can also get the best estimator directly:

```
forest_grid_search.best_estimator_
```

```
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
                       max_depth=10, max_features='auto', max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=3, min_samples_split=4,
                       min_weight_fraction_leaf=0.0, n_estimators=350,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)
```

And of course the evaluation score is also available:

```
forest_grid_search.best_score_
```

```
0.9459
```

Our best score here is 94.59% accuracy, which is not bad for such a small parameter grid.

#machine-learning #classification #deep learning

1603753200

So far in our journey through the Machine Learning universe, we have covered several big topics. We investigated some **regression** algorithms, **classification** algorithms and algorithms that can be used for both types of problems (**SVM**, **Decision Trees** and **Random Forest**). Apart from that, we dipped our toes into unsupervised learning, saw how we can use this type of learning for **clustering** and learned about several clustering techniques.

We also talked about how to quantify machine learning model **performance** and how to improve it with **regularization**. In all these articles, we used Python for “from scratch” implementations and libraries like **TensorFlow**, **PyTorch** and **SciKit Learn**. The word optimization popped up more than once in these articles, so in this and the next article, we focus on optimization techniques, which are an important part of the machine learning process.

In general, every machine learning algorithm is composed of three integral parts:

- A **loss** function.
- Optimization criteria based on the loss function, like a **cost** function.
- **Optimization** technique: this process leverages training data to find a solution for the optimization criteria (cost function).
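To make the first two parts concrete, here is a minimal sketch for linear regression, assuming a squared-error loss and its average (mean squared error) as the cost. The function names `squared_loss` and `mse_cost` and the tiny dataset are illustrative choices, not taken from the article. The third part, the optimization technique, is whatever procedure searches for the parameters that make this cost small.

```python
import numpy as np

def squared_loss(y_true, y_pred):
    """Loss: the error of individual predictions (elementwise on arrays)."""
    return (y_true - y_pred) ** 2

def mse_cost(w, b, X, y):
    """Cost: the loss averaged over the whole training set."""
    return np.mean(squared_loss(y, X @ w + b))

# Tiny illustrative dataset that follows y = 2x + 1 exactly.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Cost for a poor guess (w=0, b=0) and for the true parameters (w=2, b=1).
cost_bad = mse_cost(np.array([0.0]), 0.0, X, y)
cost_good = mse_cost(np.array([2.0]), 1.0, X, y)

print(cost_bad, cost_good)
```

The poor guess has a high cost and the true parameters have a cost of zero, which is exactly the signal an optimization technique uses to decide where to move next.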

As you saw in previous articles, some algorithms were created intuitively, without optimization criteria in mind. In fact, mathematical **explanations** of why and how these algorithms work came later. **Decision Trees** and **kNN** are two such algorithms. Other algorithms, developed later, were designed with optimization criteria in mind from the start. **SVM** is one example.

During training, we change the parameters of our machine learning model to try to **minimize** the loss function. This raises several questions: how should we change those parameters, by how much, and when? To answer these questions we use **optimizers**. They tie the different parts of the machine learning algorithm together. So far we have mentioned **Gradient Descent** as an optimization technique, but we haven’t explored it in detail. In this article, we focus on it: we cover the **grandfather** of all optimization techniques and its variations. Note that these techniques are **not** machine learning algorithms. They are solvers of **minimization** problems in which the function to minimize has a gradient at most points of its domain.
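As a rough illustration of how an optimizer answers the "how, how much and when" questions, here is a minimal batch gradient descent sketch for linear regression with a mean squared error cost. It is an assumption-laden example, not the article's implementation: the function name `batch_gradient_descent`, the learning rate, the number of epochs and the synthetic data are all chosen here for illustration.

```python
import numpy as np

def batch_gradient_descent(X, y, learning_rate=0.1, n_epochs=500):
    """Fit y ~ X @ w + b by following the gradient of the MSE cost."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_epochs):
        error = X @ w + b - y                      # prediction error on all samples
        grad_w = 2.0 / n_samples * (X.T @ error)   # d(MSE)/dw
        grad_b = 2.0 * error.mean()                # d(MSE)/db
        w -= learning_rate * grad_w                # step against the gradient
        b -= learning_rate * grad_b
    return w, b

# Synthetic, noiseless data generated from y = 3x + 2 (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] + 2.0

w, b = batch_gradient_descent(X, y)
print(w, b)
```

The gradient gives the direction of change, the learning rate sets how much to change, and every pass over the full dataset (hence "batch") decides when the update happens.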

The data we use in this article is the famous *Boston Housing Dataset*. This dataset is composed of 14 features and contains information collected by the U.S. Census Service concerning housing in the area of Boston, Massachusetts. It is a small **dataset** with only 506 samples.

For the purpose of this article, make sure that you have installed the following *Python* libraries:

- **NumPy**: follow **this guide** if you need help with installation.
- **SciKit Learn**: follow **this guide** if you need help with installation.
- **Pandas**: follow **this guide** if you need help with installation.

Once they are installed, make sure that you have imported all the modules used in this tutorial.

```
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDRegressor
```

Apart from that, it would be good to be at least familiar with the basics of **linear algebra**, **calculus** and **probability**.

Note that we also use simple **Linear Regression** in all examples. Because we are exploring **optimization** techniques, we picked the easiest machine learning algorithm. You can see more details about Linear Regression **here**. As a quick reminder, the formula for linear regression goes like this:

$$f(x) = wx + b$$

where *w* and *b* are parameters of the machine learning algorithm. The entire point of the training process is to set the correct values of *w* and *b*, so that we get the desired output from the machine learning model. This means that we are trying to make the value of our **error vector** as small as possible, i.e. to find a **global minimum of the cost function**.

One way of solving this problem is to use calculus. We could compute derivatives and then use them to find the extrema of the cost function. However, the cost function is not a function of one or a few variables; it is a function of all parameters of a machine learning algorithm, so these calculations quickly grow into a monster. That is why we use optimizers.
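For plain linear regression with a mean squared error cost, the calculus route can still be carried out: setting the gradient to zero leads to a linear least-squares problem that NumPy can solve directly. The sketch below uses synthetic data rather than the Boston data, and the names `true_w`, `true_b` and `X_aug` are assumptions made for this example.

```python
import numpy as np

# Synthetic, noiseless data from a known linear model (illustrative only).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
true_b = 4.0
y = X @ true_w + true_b

# Append a column of ones so the intercept b is learned as an extra weight.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])

# Least-squares solution: the point where the gradient of the MSE is zero.
solution, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
w, b = solution[:-1], solution[-1]

print(w, b)
```

This works because the cost is a simple quadratic in *w* and *b*. For models with many parameters, or with costs that have no closed-form minimum, iterative optimizers are the practical alternative.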

#ai #machine learning #python #artificaial inteligance #artificial intelligence #batch gradient descent #data science #datascience #deep learning #from scratch #gradient descent #machine learning #machine learning optimizers #ml optimization #optimizers #scikit learn #software #software craft #software craftsmanship #software development #stochastic gradient descent

1618278600

A milestone for open source projects: French President Emmanuel Macron has recently been introduced to Scikit-learn. In a recent tweet, Scikit-learn creator and Inria tenured research director Gael Varoquaux announced that Scikit-learn, along with applications of machine learning in digital health, had been presented to the President of France.

He described the progress of this free machine learning library as follows: “started from the grassroots, built by a community, we are powering digital revolutions, adding transparency and independence.”

#news #application of scikit learn for machine learning #applications of scikit learn for digital health #scikit learn #scikit learn introduced to french president

1595422560

Welcome to the DataFlair Keras Tutorial. This tutorial introduces everything you need to know to get started with Keras. You will discover the characteristics, features, and various other properties of Keras. This article also explains the different neural network layers and the pre-trained models available in Keras. You will see how Keras makes it easier to experiment with new neural network architectures, and how it helps you implement new ideas quickly and efficiently.

Keras is an open-source deep learning framework written in Python. Developers favor Keras because it is user-friendly, modular, and extensible. Keras lets developers experiment quickly with neural networks.

Keras is a high-level API and uses TensorFlow, Theano, or CNTK as its backend. It provides a very clean and easy way to create deep learning models.

Keras has the following characteristics:

- It is simple to use and consistent. Since we describe models in Python, the code is compact, easy to write, and easy to debug.
- Keras is built on a minimal structure and tries to minimize the number of user actions needed for common use cases.
- Keras allows us to use multiple backends, provides GPU support on CUDA, and allows us to train models on multiple GPUs.
- It offers a consistent API that provides necessary feedback when an error occurs.
- Using Keras, you can customize the functionality of your code to a great extent. Even a small customization can make a big difference, because these functionalities are deeply integrated with the low-level backend.

The major benefits of using Keras over other deep learning frameworks are:

- The simple API structure of Keras is designed for both new developers and experts.
- The Keras interface is very user friendly and is pretty optimized for general use cases.
- In Keras, you can write custom blocks to extend it.
- Keras is the second most popular deep learning framework after TensorFlow.
- TensorFlow also provides a Keras implementation in its `tf.keras` module. You can access all the functionalities of Keras in TensorFlow using `tf.keras`.

Before installing Keras, you should have one of its backends installed. We recommend TensorFlow. Install TensorFlow and Keras using pip, the Python package installer.

The basic data structure of Keras is the model, which defines how layers are organized. The simplest type of model is the Sequential model, in which layers are added one after another. For more flexible architectures, Keras provides a Functional API. The Functional API allows a model to take multiple inputs and produce multiple outputs, so you can define more complex models.
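The difference between the two styles is easier to see side by side. The following is a minimal sketch, assuming TensorFlow 2.x with its bundled `tf.keras`; the layer sizes, input shapes and the names `numeric`, `text_features`, `price` and `label` are hypothetical and chosen only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sequential model: a plain stack of layers, one input and one output.
seq_model = keras.Sequential([
    keras.Input(shape=(20,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(1),
])

# Functional API: layers are called on tensors, so a model can have
# several inputs and several outputs.
num_in = keras.Input(shape=(20,), name="numeric")
txt_in = keras.Input(shape=(8,), name="text_features")
merged = layers.concatenate([num_in, txt_in])
hidden = layers.Dense(16, activation="relu")(merged)
price = layers.Dense(1, name="price")(hidden)
label = layers.Dense(3, activation="softmax", name="label")(hidden)
func_model = keras.Model(inputs=[num_in, txt_in], outputs=[price, label])

print(seq_model.output_shape)
print(len(func_model.inputs), len(func_model.outputs))
```

The Sequential version is shorter, but it cannot express the branching in the second model; that is the kind of architecture the Functional API is meant for.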

#keras tutorials #introduction to keras #keras models #keras tutorial #layers in keras #why learn keras

1595414040

This article puts the spotlight on why you need Keras, the Python deep learning library. Keras offers a uniform interface to various deep learning frameworks, including TensorFlow, Theano, and MXNet. Let us see why you should choose and learn Keras now.

Keras makes deep learning accessible and local on your computer. It also acts as a frontend for other big cloud providers. It is the most frequently recommended library for beginners who want to start their journey in machine learning. It provides a minimal approach to running neural networks. This allows students to learn complex features from input data sequentially.

Let us look at some of the features that make Keras worth learning.

Keras is the easiest machine learning library for beginners to use. Its simplicity helps bring machine learning ideas from imagination to reality. It provides an infrastructure that can be learned in very little time. Using Keras, you will be able to stack layers like an expert.

Python is the most popular language for machine learning and data science. Being written in Python gives Keras many useful properties: less code to write, easier debugging, easier deployment, and extensibility. Keras supports Python 2.7 and Python 3.6.

Being a high-level API, Keras supports multiple popular and powerful backend frameworks. TensorFlow, Theano and CNTK are dominant choices for backend computation, and Keras supports all of them.

The importance of Keras has led to many other innovative tools for exploring deep learning. These tools are built on top of Keras, using it as their base. They include:

- Deepjazz: deep learning-driven jazz generation built using Keras and Theano, available on GitHub.
- Eclipse Picasso: It is a visualization tool that works with Keras checkpoints.
- Auto Keras: It is built upon Keras and used for machine learning model automation.

- Keras allows us to switch between backends as our applications require. It acts as a wrapper that lets us use TensorFlow, Theano, or any other supported framework.
- Keras is very easy and enjoyable to use. It follows sound guiding principles such as extensibility, Python nativeness, and modularity.
- Keras provides state-of-the-art implementations of common deep neural networks. These are fast, and it is easy to get them running.
- As a Keras user, you will be faster and more productive, and you will be able to try more ideas.
- Keras provides multi-GPU and strong distributed training support. We can run our deep learning models on large GPU clusters.
- We can deploy Keras deep learning models on multiple platforms. For example, we can deploy in the browser using TensorFlow.js, and on the server using either TensorFlow Serving or the Node.js runtime. On mobile devices such as Android or iOS, we can deploy using TensorFlow Lite.
- Keras has a large ecosystem of products to support your deep learning development. Some of the popular products are Tensorflow Cloud, Keras Tuner, Tensorflow Lite,Tensorflow.js, and Tensorflow Model Optimizatio

#keras tutorials #importance of keras #keras features #learn keras #deep learning

1625043360

So far in our journey through the Machine Learning universe, we have covered several big topics. We investigated some **regression** algorithms, **classification** algorithms and algorithms that can be used for both types of problems (**SVM**, **Decision Trees** and **Random Forest**). Apart from that, we dipped our toes into unsupervised learning, saw how we can use this type of learning for **clustering** and learned about several clustering techniques.

We also talked about how to quantify machine learning model **performance** and how to improve it with **regularization**. In all these articles, we used Python for “from scratch” implementations and libraries like **TensorFlow**, **PyTorch** and **SciKit Learn**. The word optimization popped up more than once in these articles, so in this article, we focus on optimization techniques, which are an important part of the machine learning process.

#ai #machine learning #python #artificaial inteligance #artificial intelligence #batch gradient descent #data science #datascience #deep learning #from scratch #gradient descent #machine learning optimizers #ml optimization #optimizers #scikit learn #software #software craft #software craftsmanship #software development