Rusty Shanahan

Fine Tuning an XGBoost Model

Tuning a model is the way to supercharge it and increase its performance. Let us look at an example comparing an untuned XGBoost model with a tuned XGBoost model based on their RMSE scores. Later, you will learn what each of the XGBoost hyperparameters does.

Below is a code example of an XGBoost model with untuned parameters:

# Importing the necessary libraries
import pandas as pd
import numpy as np
import xgboost as xgb

# Load the data
house = pd.read_csv("ames_housing_trimmed_processed.csv")
X, y = house[house.columns.tolist()[:-1]], house[house.columns.tolist()[-1]]

# Convert the data into a DMatrix
house_dmatrix = xgb.DMatrix(data=X, label=y)

# Parameter configuration (note: newer XGBoost versions name this objective "reg:squarederror")
param_untuned = {"objective": "reg:linear"}

cv_untuned_rmse = xgb.cv(dtrain=house_dmatrix, params=param_untuned, nfold=4,
                         metrics="rmse", as_pandas=True, seed=123)
print("RMSE Untuned: %f" % cv_untuned_rmse["test-rmse-mean"].iloc[-1])

Output: 34624.229980

Now let us look at the RMSE value when the parameters are tuned to some extent:

# Importing the necessary libraries
import pandas as pd
import numpy as np
import xgboost as xgb

# Load the data
house = pd.read_csv("ames_housing_trimmed_processed.csv")
X, y = house[house.columns.tolist()[:-1]], house[house.columns.tolist()[-1]]

# Convert the data into a DMatrix
house_dmatrix = xgb.DMatrix(data=X, label=y)

# Parameter configuration with tuned values
param_tuned = {"objective": "reg:linear", "colsample_bytree": 0.3,
               "learning_rate": 0.1, "max_depth": 5}

cv_tuned_rmse = xgb.cv(dtrain=house_dmatrix, params=param_tuned, nfold=4,
                       num_boost_round=200, metrics="rmse", as_pandas=True, seed=123)
print("RMSE Tuned: %f" % cv_tuned_rmse["test-rmse-mean"].iloc[-1])

Output: 29812.683594

It can be seen that tuning the parameters reduces the RMSE score by roughly 14%: (34624.23 - 29812.68) / 34624.23 ≈ 13.9%.
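The tuned values above were picked by hand. As a minimal sketch of how such a value might be found systematically (reusing house_dmatrix from the snippets above; the candidate depths are illustrative, not from the original post), one can loop over a small grid and keep the setting with the lowest cross-validated RMSE:

# A minimal grid-search sketch over max_depth using xgb.cv.
# Reuses house_dmatrix from above; the candidate values are illustrative.
best_rmse, best_depth = float("inf"), None
for depth in [2, 5, 10]:
    params = {"objective": "reg:linear", "max_depth": depth}
    cv = xgb.cv(dtrain=house_dmatrix, params=params, nfold=4,
                num_boost_round=200, metrics="rmse", as_pandas=True, seed=123)
    rmse = cv["test-rmse-mean"].iloc[-1]
    if rmse < best_rmse:
        best_rmse, best_depth = rmse, depth
print("Best max_depth: %d (RMSE: %f)" % (best_depth, best_rmse))

The same loop extends to learning_rate and colsample_bytree, at the cost of one xgb.cv call per parameter combination.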

#machine-learning #hyperparameter #artificial-intelligence #hyperparameter-tuning #xgboost #deep-learning



Michael Hamill

Workshop Alert! Deep Learning Model Deployment & Management

The Association of Data Scientists (AdaSci), the premier global professional body of data science and ML practitioners, has announced a hands-on workshop on deep learning model deployment on Saturday, February 6.

Over the last few years, the applications of deep learning models have increased exponentially, with use cases ranging from automated driving and fraud detection to healthcare, voice assistants, machine translation, and text generation.

Typically, when data scientists start machine learning model development, they focus mostly on the algorithms to use, the feature engineering process, and the hyperparameters that make the model more accurate. However, model deployment is the most critical step in the machine learning pipeline. As a matter of fact, models can only benefit a business if they are deployed and managed correctly, yet model deployment and management is probably the most under-discussed topic.

In this workshop, attendees will learn about the ML lifecycle, from gathering data to deploying models. Researchers and data scientists can build a pipeline to log and deploy machine learning models. Alongside, they will learn about the challenges associated with machine learning models in production and about handling different toolkits to track and monitor these models once deployed.
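As a minimal sketch of what the "logging" step of such a pipeline can look like (this uses MLflow purely as an illustration; the announcement does not name the workshop's actual toolkit, and the model and data here are synthetic):

# Minimal model-logging sketch with MLflow (illustrative; not the workshop's toolkit).
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.random.rand(100, 3)            # synthetic features
y = X @ np.array([1.0, 2.0, 3.0])     # synthetic target

with mlflow.start_run():
    model = LinearRegression().fit(X, y)
    mlflow.log_param("model_type", "LinearRegression")
    mlflow.log_metric("train_r2", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")  # the logged artifact can later be served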

#hands-on-deep-learning #machine-learning-model-deployment #machine-learning-models #model-deployment #model-deployment-workshop

Bret Kinley


BI Database Modeling Tools

At SqlDBM, our BI database modeling tool helps organizations improve their decisions and analyze billions of records in seconds. "Data Warehouse" is currently a trending topic in the data field. We will cover what a Data Warehouse is and how it is created from a SQL script. Visit us to learn more about BI modeling tools and how they work with SQL.

#export-data-model #sql-server-bi-modeling #bi-modeling-tools #sql-server-business-intelligence-modeling-tool

Efficient Hyperparameter Optimization for an XGBoost Model Using Optuna

Introduction:

Hyperparameter optimization is the science of tuning or choosing the best set of hyperparameters for a learning algorithm. A set of optimal hyperparameters has a big impact on the performance of any machine learning algorithm. It is one of the most time-consuming yet crucial steps in the machine learning training pipeline.

A machine learning model has two types of tunable parameters:

  • Model parameters
  • Model hyperparameters

[Image: Model parameters vs. model hyperparameters (source)]

**Model parameters** are learned during the training phase of a model or classifier. For example:

  • coefficients in logistic regression or linear regression
  • weights in an artificial neural network

**Model hyperparameters** are set by the user before the model training phase. For example:

  • ‘C’ (inverse of regularization strength), ‘penalty’ and ‘solver’ in logistic regression
  • ‘learning rate’, ‘batch size’, ‘number of hidden layers’, etc. in an artificial neural network
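As a quick sketch of the distinction in scikit-learn (the data here is synthetic, purely for illustration):

# Sketch: hyperparameters are set before training; parameters are learned from data.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.rand(100, 3)                    # synthetic features
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)     # synthetic binary labels

# Hyperparameters: chosen by the user *before* fitting.
clf = LogisticRegression(C=1.0, penalty="l2", solver="lbfgs")
clf.fit(X, y)

# Model parameters: learned *during* fitting.
print("coefficients:", clf.coef_, "intercept:", clf.intercept_)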

The choice of machine learning model depends on the dataset and the task at hand, i.e., prediction or classification. Each model has its own unique set of hyperparameters, and the task of finding the best combination of these parameters is known as hyperparameter optimization.

There are various methods available for solving the hyperparameter optimization problem. For example:

  • Grid Search
  • Random Search
  • Optuna
  • HyperOpt

In this post, we will focus on the Optuna library, which has one of the most accurate and successful hyperparameter optimization strategies.
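As a minimal, self-contained sketch of what Optuna-driven tuning of an XGBoost model can look like (the data is synthetic and the search ranges are illustrative assumptions, not values from the post):

# Minimal Optuna sketch for tuning XGBoost via cross-validated RMSE.
# Synthetic data and illustrative search ranges; not from the original post.
import numpy as np
import optuna
import xgboost as xgb

X = np.random.rand(200, 5)
y = X @ np.array([1.0, 2.0, 0.5, -1.0, 3.0]) + 0.1 * np.random.randn(200)
dtrain = xgb.DMatrix(data=X, label=y)

def objective(trial):
    params = {
        "objective": "reg:squarederror",
        "max_depth": trial.suggest_int("max_depth", 2, 10),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.2, 1.0),
    }
    cv = xgb.cv(dtrain=dtrain, params=params, nfold=4,
                num_boost_round=200, metrics="rmse", as_pandas=True, seed=123)
    return cv["test-rmse-mean"].iloc[-1]   # Optuna minimizes this value

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)

Each trial proposes a new parameter combination based on the results of earlier trials, which is what lets Optuna outperform a naive grid search on a fixed trial budget.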

#hyperparameter-tuning #optimization-algorithms #xgboost #optuna #machine-learning #algorithms

Kawsar Ahmed

Tuning Model Hyper-Parameters for XGBoost and Kaggle

Properly setting the parameters for XGBoost can increase model accuracy and performance. This is a very important technique for both Kaggle competitions and data science in general. In this video, I show how I automated one popular tuning technique for XGBoost.

Code is here: https://github.com/jeffheaton/jh-kaggle-util

Subscribe: https://www.youtube.com/channel/UCR1-GEpyOPzT2AO4D_eifdw

#kaggle #xgboost #data-science