The Whole is Greater than the Sum of Its Parts


STUMPY is a powerful and scalable Python library for modern time series analysis and, at its core, efficiently computes something called a matrix profile. The goal of this multi-part series is to explain what the matrix profile is and how you can start leveraging STUMPY for all of your modern time series data mining tasks!

Note: These tutorials were originally featured in the STUMPY documentation.

Part 1: The Matrix Profile

Part 2: STUMPY Basics

Part 3: Time Series Chains

Part 4: Semantic Segmentation

Part 5: Fast Approximate Matrix Profiles with STUMPY

Part 6: Matrix Profiles for Streaming Time Series Data

Part 7: Fast Pattern Searching with STUMPY

Beyond Matrix Profiles

At the core of STUMPY, one can take any time series data and efficiently compute something called a matrix profile, which essentially scans along your entire time series with a fixed window size, m, and finds the exact nearest neighbor for every subsequence within your time series. A matrix profile allows you to determine if there are any conserved behaviors (i.e., conserved subsequences/patterns) within your data and, if so, it can tell you exactly where they are located within your time series. In a previous tutorial, we demonstrated how to use STUMPY to easily obtain a matrix profile, learned how to interpret the results, and discovered meaningful motifs and discords. This brute-force approach is very useful when you don’t know what pattern or conserved behavior you are looking for but, for sufficiently large datasets, this exhaustive pairwise search can become quite expensive.
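To make the idea of an exhaustive pairwise search concrete, here is a minimal pure-Python sketch of a naive matrix profile. It is not how STUMPY computes it (STUMPY uses much faster algorithms); the function names, the toy time series, and the exclusion zone of ceil(m / 4) used to skip trivial self-matches are all assumptions made for illustration, as is the choice of z-normalized Euclidean distance.

```python
import math


def znorm(values):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in values]


def naive_matrix_profile(T, m):
    """Brute-force matrix profile, O(n^2 * m); for illustration only."""
    windows = [znorm(T[i:i + m]) for i in range(len(T) - m + 1)]
    excl = math.ceil(m / 4)  # assumed exclusion zone around each subsequence
    profile, indices = [], []
    for i, w in enumerate(windows):
        best_dist, best_idx = math.inf, -1
        for j, v in enumerate(windows):
            if abs(i - j) <= excl:
                continue  # skip the subsequence itself and its close neighbors
            dist = math.dist(w, v)
            if dist < best_dist:
                best_dist, best_idx = dist, j
        profile.append(best_dist)
        indices.append(best_idx)
    return profile, indices


# Hypothetical toy series: the pattern [0, 1, 3, 1] appears at positions 0 and 8
T = [0, 1, 3, 1, 0, 0, 5, 2, 0, 1, 3, 1, 0, 4, 4]
profile, indices = naive_matrix_profile(T, m=4)
print(indices[0], profile[0])
```

Every subsequence is compared against every other subsequence, which is why the cost grows quadratically with the length of the time series.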

However, if you already have a specific user-defined pattern in mind, then you don’t actually need to compute the full matrix profile! For example, maybe you’ve identified an interesting trading strategy based on historical stock market data and you’d like to see if that specific pattern has been observed in the past within one or more stock ticker symbols. In that case, searching for a known pattern or “query” is actually quite straightforward and can be accomplished quickly by using the wonderful core.mass function in STUMPY.
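Conceptually, a query search boils down to a single distance profile: the distance between the query and every window of the time series. The sketch below is a naive pure-Python illustration of that idea, assuming z-normalized Euclidean distance; it is not STUMPY's implementation (the MASS algorithm obtains the same kind of result far more efficiently using FFT-based convolution), and the function names and data are hypothetical.

```python
import math


def znorm(values):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in values]


def distance_profile(Q, T):
    """Distance from query Q to every length-len(Q) window of T (naive)."""
    m = len(Q)
    q = znorm(Q)
    return [math.dist(q, znorm(T[i:i + m])) for i in range(len(T) - m + 1)]


Q = [1, 4, 2]                          # hypothetical query pattern
T = [0, 0, 10, 40, 20, 5, 5, 6, 3, 3]  # hypothetical, independent time series

D = distance_profile(Q, T)
best = min(range(len(D)), key=D.__getitem__)  # start index of the best match
print(best, D[best])
```

Because both the query and each window are z-normalized, the scaled copy of the query at T[2:5] is found as the best match even though its raw values are ten times larger. Only one pass over the time series is needed, rather than the all-pairs comparison behind a full matrix profile.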

In this short tutorial, we’ll take a simple known pattern of interest (i.e., a query subsequence) and we’ll search for this pattern in a separate independent time series. Let’s get started!
