What Are We Forecasting and Why?

At Wix.com, for the last few years we’ve been using time-series forecasting models in our data science projects to forecast Wix’s future collections. This has given the company two important things: (1) better budget planning based on future collections; (2) accurate guidance to the stock market.

Forecasting collections is a challenging task (every forecasting task is), but it’s one where we’ve steadily improved and achieved strong results. In this post I want to share some of our insights and practices for scaling a forecasting project.

Scaling Starts from the Bottom — The Importance of Design

For our system to scale and accommodate more features, more models and eventually more forecasters, we divided it into the building blocks shown above. Each building block stands on its own but also knows how to communicate and work with the others. A short explanation of each follows (I hope to describe them in more depth in the future; comment below or DM me with requests):

  • Data Sources — Used to access and query all of our different tables and DBs.
  • Feature Store — Uses the data from the Data Sources to create time-series features. Given a list of requested features it will return a time-series dataset.
  • Models — Included for clarity; this refers to open-source models for time series.
  • Models Library — Wrappers for Models, with our choice of hyperparameters, compatible with datasets originating from the Feature Store.
  • Forecaster — Takes a dataset from the Feature Store and a model from the Model Library and creates a time-series forecaster for our required task.
  • Airflow DAG — The DAG plays several roles in our system; here (more will follow) we’re referring to its role in scheduling the Forecaster’s run (daily in our case).
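
To make the division of responsibilities concrete, here is a minimal sketch of how such building blocks could fit together. It is not our actual code: every class, method and feature name in it (InMemorySource, FeatureStore.register, MovingAverageModel, "daily_collections" and so on) is hypothetical, the data is synthetic, and a simple moving average stands in for a real open-source model.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Dict, List, Protocol, Sequence


class InMemorySource:
    """Hypothetical Data Source: stands in for a real table or DB."""

    def __init__(self, rows: Dict[date, float]) -> None:
        self._rows = rows

    def query(self, start: date, end: date) -> Dict[date, float]:
        # Return only the rows inside the requested (inclusive) date range.
        return {d: v for d, v in self._rows.items() if start <= d <= end}


class FeatureStore:
    """Hypothetical Feature Store: maps feature names to series builders."""

    def __init__(self) -> None:
        self._builders: Dict[str, Callable[[date, date], List[float]]] = {}

    def register(self, name: str, builder: Callable[[date, date], List[float]]) -> None:
        self._builders[name] = builder

    def get_dataset(self, names: Sequence[str], start: date, end: date) -> Dict[str, List[float]]:
        # Given a list of requested features, return a time-series dataset.
        return {name: self._builders[name](start, end) for name in names}


class Model(Protocol):
    """Interface every wrapper in the (hypothetical) Models Library follows."""

    def fit(self, y: Sequence[float]) -> None: ...
    def predict(self, horizon: int) -> List[float]: ...


class MovingAverageModel:
    """Toy wrapper with a fixed hyperparameter; a placeholder for a real model."""

    def __init__(self, window: int = 3) -> None:
        self.window = window
        self._level = 0.0

    def fit(self, y: Sequence[float]) -> None:
        tail = list(y)[-self.window:]
        if not tail:
            raise ValueError("cannot fit on an empty series")
        self._level = sum(tail) / len(tail)

    def predict(self, horizon: int) -> List[float]:
        # Flat forecast: repeat the mean of the last `window` observations.
        return [self._level] * horizon


@dataclass
class Forecaster:
    """Combines a Feature Store dataset with a Models Library model."""

    store: FeatureStore
    model: Model
    target: str

    def run(self, start: date, end: date, horizon: int) -> List[float]:
        dataset = self.store.get_dataset([self.target], start, end)
        self.model.fit(dataset[self.target])
        return self.model.predict(horizon)


# Synthetic data: ten days of made-up values, 100.0 through 109.0.
start = date(2024, 1, 1)
source = InMemorySource({start + timedelta(days=i): float(100 + i) for i in range(10)})

store = FeatureStore()
store.register(
    "daily_collections",
    lambda s, e: [value for _, value in sorted(source.query(s, e).items())],
)

forecaster = Forecaster(store=store, model=MovingAverageModel(window=3), target="daily_collections")
forecast = forecaster.run(start, start + timedelta(days=9), horizon=2)
print(forecast)
```

Because the Forecaster only depends on the dataset shape and the fit/predict interface, swapping in a different feature list or model wrapper does not require touching the other blocks. In a setup like ours, a scheduled Airflow task would be the thing calling something like `forecaster.run(...)` once a day; that part is left out of the sketch.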

As mentioned, this design supports scaling and, just as importantly, it supports fast experimentation, which is the basis of every successful data science project.


Scaling Your Time Series Forecasting Project