Dynamic Time Warping (DTW) is a way to compare two sequences, usually temporal, that do not sync up perfectly. It computes the optimal matching between the two sequences. DTW is useful in many domains, such as speech recognition, data mining, and financial markets. In data mining, it is commonly used to measure the distance between two time series.
In this post, we will go over the mathematics behind DTW. Then, two illustrative examples are provided to make the concept easier to understand. If you are not interested in the math behind it, please jump to the examples.
Let's assume we have two sequences like the following:
X = x[1], x[2], …, x[i], …, x[n]
Y = y[1], y[2], …, y[j], …, y[m]
The sequences X and Y can be arranged to form an n-by-m grid, where each point (i, j) is the alignment between x[i] and y[j].
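To make the grid concrete, here is a minimal sketch that builds it for two short numeric sequences. It assumes scalar samples, so the distance stored at each grid point is simply the absolute difference between the two values. The function name `cost_grid` and the sample values are illustrative, not part of any library. Note that Python lists are 0-indexed, while the text counts from 1.

```python
def cost_grid(x, y):
    """Return an n-by-m grid (list of rows).

    Row i, column j holds the local distance between x[i] and y[j].
    Assumption: samples are plain numbers, so the distance is |x[i] - y[j]|.
    """
    return [[abs(xi - yj) for yj in y] for xi in x]


x = [1, 3, 4]      # n = 3
y = [1, 2, 4, 5]   # m = 4
grid = cost_grid(x, y)

for row in grid:
    print(row)
# [0, 1, 3, 4]
# [2, 1, 1, 2]
# [3, 2, 0, 1]
```

Every cell of this grid is a candidate alignment. A warping path is a walk through the grid that picks a subset of these cells.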
A warping path W maps the elements of X and Y to minimize the distance between them. W is a sequence of grid points (i, j). We will see an example of the warping path later.
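The optimal path to (i_k, j_k) can be computed by:

$$D_{\min}(i_k, j_k) = d(i_k, j_k) + \min_{(i_{k-1},\, j_{k-1})} D_{\min}(i_{k-1}, j_{k-1})$$

where d(i_k, j_k) is the Euclidean distance between x[i_k] and y[j_k], and the minimum is taken over the grid points (i_{k-1}, j_{k-1}) that are allowed to precede (i_k, j_k). Then, the overall path cost can be calculated as

$$D = \sum_{k=1}^{K} d(i_k, j_k)$$

where K is the number of points on the warping path.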
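The recurrence can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation. It assumes scalar samples (so the Euclidean distance reduces to an absolute difference) and assumes the three usual predecessors of a grid point: the cell to the left, the cell below, and the diagonal neighbour. The name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(x, y):
    """Accumulated cost of the cheapest warping path between x and y."""
    n, m = len(x), len(y)
    # D[i][j] is the minimal accumulated cost of reaching grid point (i, j),
    # using 1-based indices as in the text. Row 0 and column 0 are padding.
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0  # lets the path start at (1, 1)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])  # local distance (assumed 1-D)
            D[i][j] = d + min(
                D[i - 1][j],      # predecessor (i-1, j)
                D[i][j - 1],      # predecessor (i, j-1)
                D[i - 1][j - 1],  # predecessor (i-1, j-1)
            )
    return D[n][m]


print(dtw_distance([1, 3, 4], [1, 2, 4, 5]))     # 2.0
print(dtw_distance([1, 2, 3], [1, 1, 2, 2, 3]))  # 0.0
```

The second call returns zero because the second sequence is just a stretched copy of the first, which is exactly the kind of misalignment DTW is meant to absorb. The double loop visits each grid point once, so the cost is proportional to n times m instead of growing with the number of possible paths.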
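The warping path is found using a dynamic programming approach to align the two sequences. Going through all possible paths is "combinatorically explosive" [1]. Therefore, for efficiency, it's important to limit the number of possible warping paths, and hence the following constraints are outlined:
Boundary condition: the warping path starts at (1, 1) and ends at (n, m), so the first and last elements of both sequences are aligned with each other.
Monotonicity condition: the indices i and j never decrease along the path, which preserves the time order of the points.
Continuity (step size) condition: the path advances one step at a time, so i and j each increase by at most 1 between consecutive points and no element is skipped.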
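As a small illustration, the sketch below checks a candidate path against the three constraints usually stated for DTW: boundary, monotonicity, and continuity. It assumes the path is given as a list of 1-based (i, j) tuples and that a step must move to a new grid point. The function name `is_valid_warping_path` is made up for this example.

```python
def is_valid_warping_path(path, n, m):
    """Check a list of 1-based (i, j) grid points against three constraints."""
    # Boundary: start at (1, 1) and end at (n, m).
    if not path or path[0] != (1, 1) or path[-1] != (n, m):
        return False

    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        di, dj = i1 - i0, j1 - j0
        # Monotonicity: neither index may go backwards.
        if di < 0 or dj < 0:
            return False
        # Continuity: each index advances by at most 1, and the step
        # must actually move (assumption: repeated points are rejected).
        if di > 1 or dj > 1 or (di == 0 and dj == 0):
            return False
    return True


print(is_valid_warping_path([(1, 1), (2, 2), (3, 3), (3, 4)], 3, 4))  # True
print(is_valid_warping_path([(1, 1), (3, 4)], 3, 4))                  # False
```

The second path is rejected because it jumps straight from (1, 1) to (3, 4) and skips elements of both sequences.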
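In addition to the above three constraints, there are other, less frequently used conditions for an allowable warping path:
Warping window condition: allowable points are restricted to a window of a given width w around the diagonal, |i - j| ≤ w.
Slope condition: the path is limited in how many consecutive steps it may take in the same direction, which keeps it from becoming too steep or too shallow.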
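One extra condition often seen in the DTW literature is a warping window around the diagonal, sometimes called a Sakoe-Chiba band. The sketch below is one possible way to add it to the basic recurrence, under the same assumptions as before (scalar samples, three predecessors). The `window` parameter and the function name are illustrative. The band is widened to at least |n - m| so that the end point (n, m) stays reachable.

```python
import math


def dtw_distance_windowed(x, y, window):
    """DTW cost when only grid points with |i - j| <= window are allowed."""
    n, m = len(x), len(y)
    # The band must be at least |n - m| wide, or (n, m) cannot be reached.
    w = max(window, abs(n - m))

    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0

    for i in range(1, n + 1):
        # Only visit columns inside the band around the diagonal.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(x[i - 1] - y[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


a = [0, 1, 0, 0, 0]  # peak at position 2
b = [0, 0, 0, 1, 0]  # same peak, shifted to position 4

print(dtw_distance_windowed(a, b, window=1))  # 2.0
print(dtw_distance_windowed(a, b, window=2))  # 0.0
```

With a window of 1 the two peaks cannot be aligned, because that would need the grid point (2, 4), so each peak is matched against a zero. A window of 2 is wide enough to line them up. A narrower window means fewer cells to fill, at the price of ruling out larger shifts.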
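An acceptable warping path is built from combinations of chess king moves, restricted to the following:
Horizontal moves: (i, j) → (i, j+1)
Vertical moves: (i, j) → (i+1, j)
Diagonal moves: (i, j) → (i+1, j+1)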
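To see these moves in action, the following sketch fills the accumulated-cost table and then walks back from (n, m) to (1, 1), at each step choosing the cheapest of the three neighbours a king could have come from. It assumes non-empty sequences of scalar samples and breaks ties in favour of the diagonal move. The name `dtw_path` is hypothetical, and the returned points are 1-based to match the text.

```python
import math


def dtw_path(x, y):
    """Return (total_cost, path), where path is a list of 1-based (i, j)."""
    n, m = len(x), len(y)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            D[i][j] = d + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])

    # Backtrack: undo one king move at a time until (1, 1) is reached.
    i, j = n, m
    path = [(i, j)]
    while (i, j) != (1, 1):
        candidates = [
            (i - 1, j - 1),  # came via a diagonal move
            (i - 1, j),      # came via a vertical move
            (i, j - 1),      # came via a horizontal move
        ]
        # Padding cells hold infinity, so they are never chosen here.
        i, j = min(candidates, key=lambda p: D[p[0]][p[1]])
        path.append((i, j))
    path.reverse()
    return D[n][m], path


cost, path = dtw_path([1, 3, 4], [1, 2, 4, 5])
print(cost)  # 2.0
print(path)  # [(1, 1), (2, 2), (3, 3), (3, 4)]
```

In this small example the path takes two diagonal moves followed by one horizontal move, so the last element of the first sequence is matched with the last two elements of the second.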