CausalImpact is an R package developed by Google for causal inference using Bayesian structural time-series models. You can find the R version here.
In short, this package makes counterfactual predictions. In other words, it asks what would have happened in a (sort of) parallel universe in which an intervention had never happened. Here is a quick example straight from Google’s website: “Given a response time series (e.g., clicks) and a set of control time series (e.g., clicks in non-affected markets or clicks on other sites), the package constructs a Bayesian structural time-series model. This model is then used to try and predict the counterfactual, i.e., how the response metric would have evolved after the intervention if the intervention had never occurred.”
CausalImpact 1.2.1, Brodersen et al., Annals of Applied Statistics (2015). http://google.github.io/CausalImpact/
Image Description: Part A of the image (original) shows, as the dark continuous line, the time series we are monitoring. The blue dotted line is the counterfactual prediction. The vertical grey line marks the moment when the intervention was made. From that moment onwards, the blue and black lines drift apart. Part B (pointwise) shows the difference between those two lines over time, which in essence is the causal effect we are interested in, while Part C (cumulative) shows that difference accumulated over time.
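To make the pointwise and cumulative panels concrete, here is a minimal sketch of the arithmetic behind them. The numbers and the names `observed`, `counterfactual` and `intervention_index` are made up for illustration and do not come from the package; the real package also reports uncertainty intervals, which this sketch ignores.

```python
from itertools import accumulate

# Made-up values for illustration only (not output of CausalImpact).
observed = [10.0, 11.0, 12.0, 15.0, 16.0, 17.0]
counterfactual = [10.0, 11.0, 12.0, 12.0, 13.0, 13.0]
intervention_index = 3  # hypothetical position of the first post-intervention point

# Pointwise effect: observed minus predicted counterfactual at each time step.
pointwise = [o - c for o, c in zip(observed, counterfactual)]

# Cumulative effect: running sum of the pointwise effect after the intervention.
cumulative = list(accumulate(pointwise[intervention_index:]))

print(pointwise)
print(cumulative)
```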
You can of course work in R directly, but for Python lovers I am not aware of an equivalent package. There are certainly some libraries that implement parts of the original paper, but when I tried a few of those Python implementations I noticed differences in the results. Long story short, this post shows how to run the R package itself from Python. The same approach should generalise to most other R packages.
What worked for me was to create a new Conda environment with both the Python libraries and the core R packages pre-installed. Here is an example:
conda create -n r_env numpy pandas statsmodels r-essentials r-base
Creating the environment can take some time. Also note that using it from a Jupyter notebook requires further configuration, so I tend to edit the code in a regular programming editor and run it from the command line.
We also need rpy2, a Python interface to the R language, which does all the work for us. Installing it with pip is enough:
pip install rpy2
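Before going further, it can help to confirm that the environment really contains everything. The following is only a sanity-check sketch using the standard library; the helper `missing_modules` is a hypothetical name of mine, and the check assumes the R executable is called `R` and is on your PATH.

```python
import importlib.util
import shutil

# Python packages this post relies on.
required_modules = ["numpy", "pandas", "statsmodels", "rpy2"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(required_modules)
r_executable = shutil.which("R")  # None if no R executable is on PATH

print("Missing Python modules:", missing or "none")
print("R executable:", r_executable or "not found on PATH")
```

If anything is reported as missing, activate the Conda environment first (conda activate r_env) and run the check again.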
Load all the libraries as below:
#rpy2 lib
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
from rpy2.robjects import Formula
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.ipython.ggplot import image_png
#typical python libs
import numpy as np
import pandas as pd
import datetime
#arma
from statsmodels.tsa.arima_process import ArmaProcess
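With these imports in place, a toy data set is handy for trying things out. The sketch below is my own assumption of how one might build it: an AR(1) control series from `ArmaProcess`, a response that depends on it, and an artificial lift added after a chosen point. The names `intervention_start` and `effect_size` and all the numbers are hypothetical.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_process import ArmaProcess

np.random.seed(1)  # make the random draws reproducible

# AR(1) process with coefficient 0.9. ArmaProcess expects lag-polynomial
# form: the zero lag is included and the AR coefficient is negated.
ar = np.array([1.0, -0.9])
ma = np.array([1.0])

n_points = 100
intervention_start = 70  # hypothetical index of the first post-intervention point
effect_size = 10.0       # hypothetical lift caused by the intervention

# Control series x1 and a response y that tracks it with noise.
x1 = 100 + ArmaProcess(ar, ma).generate_sample(nsample=n_points)
y = 1.2 * x1 + np.random.normal(size=n_points)

# Inject the artificial effect into the post-intervention part of the response.
y[intervention_start:] += effect_size

data = pd.DataFrame({"y": y, "x1": x1})
pre_period = [0, intervention_start - 1]
post_period = [intervention_start, n_points - 1]

print(data.head())
```

Because the injected lift is known, this kind of data makes it easy to check whether an estimated effect is in the right range.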