Feature Engineering on Date-Time Data

And how to implement it in your forecasting model using Gradient Boosting regression.

According to Wikipedia, feature engineering refers to the process of using domain knowledge to extract features from raw data via data mining techniques. These features can then be used to improve the performance of machine learning algorithms.

Feature engineering does not have to be fancy. One simple yet prevalent use case is time-series data, where feature engineering matters because raw time-series data usually contains only a single column representing the time attribute, namely date-time (or timestamp).

Regarding date-time data, feature engineering can be seen as extracting useful information from it as standalone (distinct) features. For example, from the date-time value “2020-07-01 10:21:05”, we might want to extract the following features:

  1. Month: 7
  2. Day of month: 1
  3. Day name: Wednesday (2020-07-01 was a Wednesday)
  4. Hour: 10
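With pandas, each of the features above can be pulled from a datetime column via the `.dt` accessor. Here is a minimal sketch; the DataFrame and column names are illustrative assumptions, not the article's actual dataset:

```python
import pandas as pd

# Hypothetical one-row example using the date-time from the text
df = pd.DataFrame({"date_time": ["2020-07-01 10:21:05"]})
df["date_time"] = pd.to_datetime(df["date_time"])

df["month"] = df["date_time"].dt.month          # 7
df["day_of_month"] = df["date_time"].dt.day     # 1
df["day_name"] = df["date_time"].dt.day_name()  # "Wednesday"
df["hour"] = df["date_time"].dt.hour            # 10
```

Each extracted column can then be used directly as a model feature.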

Extracting such kinds of features from date-time data is precisely the objective of the current article. Afterwards, we will incorporate our engineered features as predictors of a gradient boosting regression model. Specifically, we will forecast metro interstate traffic volume.

Quick summary

This article will cover the following.

A step-by-step guide to extracting the following features from a date-time column.

  1. Month
  2. Day of month
  3. Day name
  4. Hour
  5. Daypart (morning, afternoon, etc.)
  6. Weekend flag (1 if weekend, else 0)
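The last two features in the list are derived rather than read off directly. A sketch of one way to compute them follows; the daypart hour boundaries are an assumption you would adapt to your own definition:

```python
import pandas as pd

df = pd.DataFrame({"date_time": pd.to_datetime(
    ["2020-07-01 10:21:05", "2020-07-04 22:10:00"])})

def to_daypart(hour):
    # Bin boundaries are an assumption; adjust to your own definition
    if 5 <= hour < 12:
        return "morning"
    elif 12 <= hour < 17:
        return "afternoon"
    elif 17 <= hour < 21:
        return "evening"
    else:
        return "night"

df["daypart"] = df["date_time"].dt.hour.map(to_daypart)
# dt.dayofweek: Monday=0 ... Sunday=6, so >= 5 marks Saturday/Sunday
df["is_weekend"] = (df["date_time"].dt.dayofweek >= 5).astype(int)
```

A categorical daypart like this is typically one-hot encoded before being fed to the model.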

How to incorporate those features in a Gradient Boosting regression model to forecast metro interstate traffic volume.
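The overall shape of that workflow can be sketched as below. This uses a synthetic stand-in for the Metro Interstate Traffic Volume data, and the feature names, toy target, and model hyperparameters are all assumptions for illustration, not the article's actual setup:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic hourly data standing in for the real traffic dataset
rng = pd.date_range("2020-01-01", periods=500, freq="h")
df = pd.DataFrame({"date_time": rng})
df["month"] = df["date_time"].dt.month
df["day_of_month"] = df["date_time"].dt.day
df["day_of_week"] = df["date_time"].dt.dayofweek
df["hour"] = df["date_time"].dt.hour
# Toy target: traffic is heavier during daytime hours
df["traffic_volume"] = 1000 + 200 * df["hour"].between(7, 19).astype(int)

features = ["month", "day_of_month", "day_of_week", "hour"]
# shuffle=False keeps the chronological order, as is usual for forecasting
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["traffic_volume"], shuffle=False, test_size=0.2)

model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1)
model.fit(X_train, y_train)
preds = model.predict(X_test)
```

Note the chronological (unshuffled) split: in forecasting, the test set should come after the training set in time to avoid leaking the future into the past.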
