An Illustrative Explanation to Dynamic Time Warping

Dynamic Time Warping (DTW) is a way to compare two sequences, usually temporal, that do not sync up perfectly. It calculates the optimal matching between the two sequences. DTW is useful in many domains, such as speech recognition, data mining, and financial markets. In data mining, it is commonly used to measure the distance between two time series.

In this post, we will go over the mathematics behind DTW. Then, two illustrative examples are provided to make the concept easier to understand. If you are not interested in the math, feel free to jump to the examples.

Formulation

Let’s assume we have two sequences like the following:

𝑋 = 𝑥[1], 𝑥[2], …, 𝑥[𝑖], …, 𝑥[𝑛]

𝑌 = 𝑦[1], 𝑦[2], …, 𝑦[𝑗], …, 𝑦[𝑚]

The sequences 𝑋 and 𝑌 can be arranged to form an 𝑛-by-𝑚 grid, where each point (𝑖, 𝑗) is the alignment between 𝑥[𝑖] and 𝑦[𝑗].
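To make the grid concrete, here is a minimal sketch that builds it for two short numeric sequences. The function name `cost_grid` and the sample values are illustrative assumptions, not part of the original formulation. The absolute difference is used as the pointwise distance (the Euclidean distance for scalar samples), and Python's 0-based indices stand in for the 1-based indices used in the text.

```python
def cost_grid(x, y):
    """Return an n-by-m grid whose entry [i][j] is the distance |x[i] - y[j]|.

    Indices are 0-based here, so grid[0][0] corresponds to the point (1, 1)
    in the text's notation.
    """
    return [[abs(xi - yj) for yj in y] for xi in x]


# Hypothetical sample sequences, chosen only for illustration.
X = [1, 2, 3]
Y = [2, 2, 4, 5]

grid = cost_grid(X, Y)
for row in grid:
    print(row)
```

Each row corresponds to one element of 𝑋 and each column to one element of 𝑌, so the grid has 3 rows and 4 columns for these inputs.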

A warping path 𝑊 maps the elements of 𝑋 and 𝑌 to minimize the distance between them. 𝑊 is a sequence of grid points (𝑖, 𝑗). We will see an example of the warping path later.

Warping Path and DTW Distance

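The optimal path to (𝑖_𝑘, 𝑗_𝑘) can be computed by:

𝐷_min(𝑖_𝑘, 𝑗_𝑘) = min over (𝑖_{𝑘−1}, 𝑗_{𝑘−1}) of [ 𝐷_min(𝑖_{𝑘−1}, 𝑗_{𝑘−1}) + 𝑑(𝑥[𝑖_𝑘], 𝑦[𝑗_𝑘]) ]

where 𝐷_min(𝑖_𝑘, 𝑗_𝑘) is the minimum accumulated cost of reaching (𝑖_𝑘, 𝑗_𝑘), the minimum is taken over the allowed predecessor points (𝑖_{𝑘−1}, 𝑗_{𝑘−1}), and 𝑑 is the Euclidean distance. Then, the overall path cost can be calculated as

𝐷 = Σ_𝑘 𝑑(𝑥[𝑖_𝑘], 𝑦[𝑗_𝑘])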
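The dynamic-programming computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the name `dtw_distance` is hypothetical, samples are scalars so the pointwise distance is the absolute difference, and the predecessors of each point are assumed to be its three adjacent neighbours (the horizontal, vertical, and diagonal moves listed later in this post).

```python
import math


def dtw_distance(x, y):
    """Accumulated cost of the cheapest warping path between x and y."""
    n, m = len(x), len(y)
    # acc[i][j] holds the minimum accumulated cost of aligning the first i
    # elements of x with the first j elements of y. Row 0 and column 0 are
    # padding so that the recurrence needs no special cases at the border.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])  # pointwise distance d
            acc[i][j] = cost + min(
                acc[i - 1][j],      # arrived by a vertical move
                acc[i][j - 1],      # arrived by a horizontal move
                acc[i - 1][j - 1],  # arrived by a diagonal move
            )
    return acc[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # a repeated sample costs nothing
print(dtw_distance([1, 2, 3], [2, 2, 4, 5]))
```

The table has (𝑛+1)·(𝑚+1) entries and each one is filled in constant time, so the cost is proportional to 𝑛·𝑚 instead of growing with the number of possible paths.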

Restrictions on the Warping Function

The warping path is found using a dynamic programming approach to align two sequences. Going through all possible paths is “combinatorically explosive” [1]. Therefore, for efficiency purposes, it’s important to limit the number of possible warping paths, and hence the following constraints are outlined:

Boundary condition: This constraint ensures that the warping path begins at the start points of both signals and terminates at their endpoints.

𝑖_1 = 1, 𝑗_1 = 1 and 𝑖_𝐾 = 𝑛, 𝑗_𝐾 = 𝑚, where 𝐾 is the number of points on the warping path

Monotonicity condition: This constraint preserves the time order of points (the path never goes back in time).

𝑖_{𝑘−1} ≤ 𝑖_𝑘 and 𝑗_{𝑘−1} ≤ 𝑗_𝑘

Continuity (step size) condition: This constraint limits the path transitions to adjacent points in time (the path never jumps in time).

𝑖_𝑘 − 𝑖_{𝑘−1} ≤ 1 and 𝑗_𝑘 − 𝑗_{𝑘−1} ≤ 1
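As a quick sanity check, the three conditions above can be expressed as a small validator. This is only a sketch: the function name `is_valid_warping_path` is hypothetical, the path is assumed to be a list of 1-based (𝑖, 𝑗) tuples, and the extra requirement that every step advances in at least one sequence is an added assumption that rules out repeated points.

```python
def is_valid_warping_path(path, n, m):
    """Check the boundary, monotonicity, and continuity conditions.

    path is a list of 1-based (i, j) grid points; n and m are the lengths
    of the two sequences.
    """
    # Boundary condition: start at (1, 1) and end at (n, m).
    if not path or path[0] != (1, 1) or path[-1] != (n, m):
        return False
    for (i_prev, j_prev), (i, j) in zip(path, path[1:]):
        di, dj = i - i_prev, j - j_prev
        # Monotonicity condition: never go back in time.
        if di < 0 or dj < 0:
            return False
        # Continuity condition: never skip over a point in time.
        if di > 1 or dj > 1:
            return False
        # Added assumption: each step must move at least one index forward.
        if di == 0 and dj == 0:
            return False
    return True


print(is_valid_warping_path([(1, 1), (2, 2), (2, 3), (3, 4)], 3, 4))
print(is_valid_warping_path([(1, 1), (3, 4)], 3, 4))  # jumps in time
```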

In addition to the three constraints above, there are other, less frequently used conditions for an allowable warping path:

Warping window condition: Allowable points can be restricted to fall within a given warping window of width 𝜔 (a positive integer).

|𝑖_𝑘 − 𝑗_𝑘| ≤ 𝜔
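One way to apply a warping window is to fill in only the grid cells that lie within the band around the diagonal. The sketch below is an illustration under assumptions: `dtw_distance_windowed` and its `window` parameter are hypothetical names, samples are scalars, and the band is widened to at least |𝑛 − 𝑚| so that the end point (𝑛, 𝑚) stays reachable when the sequences differ in length. That widening is a common convention, not something the constraint itself requires.

```python
import math


def dtw_distance_windowed(x, y, window):
    """DTW cost restricted to grid points with |i - j| <= window."""
    n, m = len(x), len(y)
    # Assumption: widen the band if needed so that (n, m) lies inside it.
    w = max(window, abs(n - m))
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        # Only visit the columns inside the band for this row; cells outside
        # it keep an infinite cost and can never be part of the path.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i][j] = cost + min(
                acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1]
            )
    return acc[n][m]


# With a window of 0 and equal lengths, only the diagonal is allowed, so the
# result is the plain sum of pointwise distances.
print(dtw_distance_windowed([1, 2, 3], [2, 2, 4], window=0))
print(dtw_distance_windowed([1, 2, 3], [2, 2, 4, 5], window=10))
```

A narrower window means fewer cells to fill in, at the price of possibly excluding the unconstrained optimal path.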

Slope condition: The warping path can be constrained by restricting the slope, and consequently avoiding extreme movements in one direction.

An acceptable warping path is built from the following subset of chess king moves:

Horizontal moves: (𝑖, 𝑗) → (𝑖, 𝑗+1)
Vertical moves: (𝑖, 𝑗) → (𝑖+1, 𝑗)
Diagonal moves: (𝑖, 𝑗) → (𝑖+1, 𝑗+1)
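Even with only these three moves, the number of candidate paths grows very quickly, which is why they are not enumerated one by one. The sketch below counts the paths from (1, 1) to (𝑛, 𝑚); the function name `count_warping_paths` is hypothetical and the code is meant purely as an illustration of that growth.

```python
def count_warping_paths(n, m):
    """Count paths from (1, 1) to (n, m) built from horizontal, vertical,
    and diagonal moves."""
    # counts[i][j] is the number of allowed paths ending at the 1-based grid
    # point (i, j); row 0 and column 0 are unused padding.
    counts = [[0] * (m + 1) for _ in range(n + 1)]
    counts[1][1] = 1
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if (i, j) == (1, 1):
                continue
            counts[i][j] = (
                counts[i - 1][j]        # last move was vertical
                + counts[i][j - 1]      # last move was horizontal
                + counts[i - 1][j - 1]  # last move was diagonal
            )
    return counts[n][m]


for size in (2, 3, 4, 10, 20):
    print(size, count_warping_paths(size, size))
```

A 2-by-2 grid has 3 such paths, a 3-by-3 grid has 13, and a 4-by-4 grid already has 63.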

