Tutorial: Uncertainty estimation with CatBoost

Understanding why your model is uncertain and how to estimate the level of uncertainty. This tutorial post details how to quantify both data and knowledge uncertainty in CatBoost.
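A minimal sketch of the kind of workflow the tutorial covers, assuming CatBoost's RMSEWithUncertainty loss and virtual-ensemble prediction API; the data here is synthetic, not from the tutorial:

```python
# Sketch: separating data vs. knowledge uncertainty with CatBoost
# virtual ensembles (synthetic data; see the tutorial for the full method).
import numpy as np
from catboost import CatBoostRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X[:, 0] + 0.3 * rng.normal(size=1000)  # noisy target

# RMSEWithUncertainty makes the model predict a mean and a variance;
# posterior_sampling enables the sampling needed for knowledge uncertainty.
model = CatBoostRegressor(loss_function='RMSEWithUncertainty',
                          posterior_sampling=True, verbose=False)
model.fit(X, y)

# Per the CatBoost docs, each row holds:
# [mean prediction, knowledge uncertainty, data uncertainty].
preds = model.virtual_ensembles_predict(X, prediction_type='TotalUncertainty',
                                        virtual_ensembles_count=10)
mean, knowledge_unc, data_unc = preds[:, 0], preds[:, 1], preds[:, 2]
```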

Gradient Boosting Trees for Classification: A Beginner’s Guide

This article focuses mainly on understanding how Gradient Boosting Trees work for classification problems. We will also discuss some important parameters, along with the advantages and disadvantages of this method.

Predicting Weekly Hotel Cancellations with XGBRegressor

XGBoost can also be used for time series forecasting. This is done by using lags of the time series of interest as separate features in the model. Let’s see how XGBRegressor can be used to help us predict hotel cancellations.
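A rough sketch of the lag-feature framing described above, on a synthetic stand-in for the cancellation series; the lag count and hyperparameters are illustrative, not the article's:

```python
# Sketch: turn a weekly series into a supervised problem via lag features.
import numpy as np
import pandas as pd
from xgboost import XGBRegressor

# Synthetic weekly cancellation counts (illustrative stand-in for the data).
rng = np.random.default_rng(1)
y = pd.Series(50 + 10 * np.sin(np.arange(120) / 8) + rng.normal(0, 3, 120))

# Use the previous 4 weeks as features for the current week.
df = pd.DataFrame({f'lag_{k}': y.shift(k) for k in range(1, 5)})
df['target'] = y
df = df.dropna()

train, test = df.iloc[:-12], df.iloc[-12:]
model = XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(train.drop(columns='target'), train['target'])
forecast = model.predict(test.drop(columns='target'))
```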

Understanding Twitter Engagement with a Constant

This article outlines the solution proposed by the POLINKS team, which ranked sixth in the RecSys Challenge 2020. The challenge is one of the most important competitions in the field of recommender systems.

Training Better Deep Learning Models for Structured Data using Semi-supervised Learning

In this post, we will use semi-supervised learning to improve the performance of deep neural models when applied to structured data in a low data regime. We will show that by using unsupervised pre-training we can make a neural model perform better than gradient boosting.
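A compact sketch of the idea under simple assumptions, using PyTorch and a plain autoencoder for the unsupervised stage; the article's actual architecture and training setup may differ:

```python
# Sketch: pre-train an encoder on all rows without labels, then
# fine-tune a classifier head on the small labeled subset.
import torch
import torch.nn as nn

X_all = torch.randn(5000, 20)          # all rows, unlabeled (toy data)
X_lab = X_all[:200]                    # small labeled subset
y_lab = (X_lab[:, 0] > 0).long()       # toy labels

encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 20))

# Stage 1: unsupervised pre-training with a reconstruction loss.
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(encoder(X_all)), X_all)
    loss.backward()
    opt.step()

# Stage 2: supervised fine-tuning on the labeled subset only.
head = nn.Linear(16, 2)
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()))
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(head(encoder(X_lab)), y_lab)
    loss.backward()
    opt.step()
```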

Modeling Swing Probability

I want to investigate how pitchers can use information about the hitter to give themselves an edge. In particular, I will be trying to measure the probability that a certain batter will swing at a given pitch.

BYOL: Bring Your Own Loss

How we improve delivery time estimation with a custom loss function. Dear connoisseurs, I invite you to take a look inside Careem’s food delivery platform. Specifically, we are going to look at how we use machine learning to improve the customer experience for delivery time tracking.

Understanding gradient boosting from scratch with a small dataset

What is Boosting? Boosting is a very popular ensemble technique in which we combine many weak learners to form a strong learner. Boosting is a sequential operation: we build weak learners in series, each depending on the one before it, i.e. weak learner m depends on the output of weak learner m-1.
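The sequential process described above can be sketched in a few lines: each shallow tree is fit to the residuals its predecessors left behind. Data and hyperparameters here are illustrative:

```python
# From-scratch sketch of gradient boosting with squared-error loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 200)

lr, trees = 0.1, []
pred = np.full_like(y, y.mean())       # start from a constant prediction
for m in range(100):
    residual = y - pred                # what learners 0..m-1 left unexplained
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    trees.append(tree)
    pred += lr * tree.predict(X)       # learner m corrects the ensemble
```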

House Price Prediction in Natural Hazard Prone Areas - Part 2

Different regression models were used: Linear Regression, Decision Tree Regression, Gradient Boosted Regression, and Random Forest Regression. Their performance was compared using R², and based on these scores the better-performing model was suggested for predicting house prices.

Feature Engineering on Date-Time Data

And how to implement it in your forecasting model using Gradient Boosting regression. These features can then be used to improve the performance of machine learning algorithms.
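As a rough illustration (not the article's code), here is how such features might be derived with pandas and fed to a gradient boosting regressor; the column names are hypothetical:

```python
# Sketch: decompose a timestamp into features a tree model can split on.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
df = pd.DataFrame({'timestamp': pd.date_range('2021-01-01', periods=500, freq='h')})
df['demand'] = (100 + 20 * np.sin(df['timestamp'].dt.hour / 24 * 2 * np.pi)
                + rng.normal(0, 5, len(df)))

df['hour'] = df['timestamp'].dt.hour
df['dayofweek'] = df['timestamp'].dt.dayofweek
df['month'] = df['timestamp'].dt.month
df['is_weekend'] = (df['dayofweek'] >= 5).astype(int)

features = ['hour', 'dayofweek', 'month', 'is_weekend']
model = GradientBoostingRegressor().fit(df[features], df['demand'])
```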

Using gradient boosting machines for classification in R

Understanding the factors driving student success so that the Open University can allocate resources to improve it.

Forecasting with web traffic data

Forecasting without using time series models. Typically, forecasting relies on a long history of time series data; in this case, we are going to look at an example where we have only one week’s worth of web traffic data.

Predicting Sentiment of Employee Reviews

Classifying positive and negative sentiment of employee reviews from Indeed.com. In my previous articles, we learned how to scrape, process, and analyze employee reviews from Indeed.com.

A Simple Gradient Boosting Trees Explanation

A simple explanation of gradient boosting trees.

Categorical features parameters in CatBoost

Mastering the parameters you didn’t know existed. CatBoost is an open-source gradient boosting library.
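A small example of the headline parameters, with illustrative data: cat_features marks which columns to treat as categorical, and one_hot_max_size controls when CatBoost one-hot encodes instead of using target statistics:

```python
# Sketch: letting CatBoost handle categorical columns natively.
import pandas as pd
from catboost import CatBoostClassifier

df = pd.DataFrame({
    'city': ['london', 'paris', 'paris', 'berlin', 'london', 'berlin'],
    'device': ['ios', 'android', 'ios', 'ios', 'android', 'android'],
    'clicks': [3, 7, 1, 4, 2, 9],
    'converted': [0, 1, 0, 1, 0, 1],
})

model = CatBoostClassifier(
    cat_features=['city', 'device'],  # no manual encoding needed
    one_hot_max_size=2,               # one-hot encode low-cardinality columns
    iterations=50, verbose=False,
)
model.fit(df.drop(columns='converted'), df['converted'])
```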

NGBoost: Natural Gradients in Gradient Boosting

The reign of the Gradient Boosters was almost complete in the land of tabular data.
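A quick sketch of what sets NGBoost apart, using the ngboost package's interface on synthetic data: it predicts a full probability distribution per row rather than a point estimate (NGBRegressor defaults to a Normal distribution):

```python
# Sketch: probabilistic prediction with NGBoost.
import numpy as np
from ngboost import NGBRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 3))
y = X[:, 0] + 0.5 * rng.normal(size=500)

ngb = NGBRegressor(n_estimators=200, verbose=False).fit(X, y)
dist = ngb.pred_dist(X[:5])       # distribution objects, not scalars
print(dist.params['loc'])         # predicted means
print(dist.params['scale'])       # predicted standard deviations
```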

Gradient Boost Decomposition

To understand the gradient boosting algorithm, this article implements it from first principles, using PyTorch to perform the necessary optimizations.
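A hedged sketch of that first-principles angle, my illustration rather than the article's code: use PyTorch autograd to obtain the negative gradient of any differentiable loss with respect to the current predictions, then fit the next tree to that pseudo-residual:

```python
# Sketch: generic gradient boosting where autograd supplies the
# pseudo-residuals, so the loss can be swapped freely.
import numpy as np
import torch
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 2))
y = torch.tensor(X[:, 0] + 0.1 * rng.normal(size=300))

pred = torch.zeros_like(y, requires_grad=True)
trees, lr = [], 0.1
for m in range(50):
    loss = torch.nn.functional.huber_loss(pred, y)  # any differentiable loss
    grad, = torch.autograd.grad(loss, pred)
    tree = DecisionTreeRegressor(max_depth=2).fit(X, -grad.numpy())
    trees.append(tree)
    with torch.no_grad():
        pred += lr * torch.tensor(tree.predict(X))
```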

How does XGBoost work

XGBoost, developed by Tianqi Chen and Carlos Guestrin, is an ensemble machine learning technique that uses the gradient boosting framework.
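A minimal usage sketch on synthetic data, hyperparameters illustrative: XGBoost builds trees sequentially, each one fit to the gradient of the loss of the ensemble built so far:

```python
# Sketch: basic XGBoost classification.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(6)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X, y)
proba = model.predict_proba(X)[:, 1]
```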

Ensemble Methods (Algorithms)

Tip: You can see the difference from bagging here: boosting reduces the bias (underfitting) rather than the variance. As such, boosting can overfit, as the sketch below illustrates.
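A small experiment to illustrate the tip, on synthetic data with noisy labels: as the number of boosting rounds grows, training accuracy climbs toward 1 while held-out accuracy can stall or drop:

```python
# Sketch: boosting drives training error down, but can overfit noisy labels.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.normal(size=(600, 5))
y = ((X[:, 0] + rng.normal(0, 1.5, 600)) > 0).astype(int)  # noisy labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for n in (10, 100, 1000):
    model = GradientBoostingClassifier(n_estimators=n).fit(X_tr, y_tr)
    print(n, model.score(X_tr, y_tr), model.score(X_te, y_te))
```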

Introduction to the Gradient Boosting Algorithm

The boosting algorithm is one of the most powerful learning ideas introduced in the last twenty years. Gradient boosting is a supervised learning technique.