Changepoint Detection

As we saw in trend detection, the detected output is a set of time points where the trend changes. Changepoint detection also detects time points, but each point marks where the probability distribution of the time series changes. Some libraries, such as Greykite, include trend detection in their changepoint detection. In Kats, the two are separated. Let's take a look at Kats changepoint detection first.

Kats Changepoint Detection

The experiments for Kats changepoint detection are shown with humidity and temperature time series data.

Kats' CUSUM Detector

CUSUM Detector detects upward or downward shifts of the mean in a time series. Starting from an initial point, it iteratively calculates the CUSUM (cumulative sum) of the deviations from the mean, and locates a changepoint where the CUSUM is maximized or minimized.
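To make the idea concrete, here is a minimal, library-free sketch of locating a single mean shift with a cumulative sum. It is not Kats' implementation: the function name `cusum_changepoint` is hypothetical, and it does one pass over the data instead of iterating.

```python
from itertools import accumulate
from statistics import fmean


def cusum_changepoint(values, direction="increase"):
    """Return the index of the last point before the suspected mean shift.

    Hypothetical helper for illustration only, not part of Kats.
    """
    mean = fmean(values)
    # Cumulative sum of deviations from the overall mean.
    cusum = list(accumulate(v - mean for v in values))
    indices = range(len(cusum))
    if direction == "increase":
        # Before an upward shift the values sit below the overall mean,
        # so the cumulative sum keeps falling and bottoms out at the shift.
        return min(indices, key=cusum.__getitem__)
    # Before a downward shift the cumulative sum keeps rising instead.
    return max(indices, key=cusum.__getitem__)


series = [10.0] * 6 + [14.0] * 4
print(cusum_changepoint(series, "increase"))  # prints 5
```

The returned index is the last point of the old regime, so the new mean starts at the following point.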

The null hypothesis is H0: There is no change in the mean. We can choose to detect a single changepoint or multiple changepoints. Let's start with single changepoint detection!

Single Changepoint Detection

To use CUSUM Detector in Kats, we first initialize the detector with a time series input, as shown in Step 1. In Step 2, we write a function that passes the detector's parameters and plots the changepoint. Finally, Step 3 shows the output of increasing changepoint detection for both the humidity and the temperature data. The parameter threshold is the significance level for the null hypothesis. As we can see, no significant increasing changepoint is detected in the temperature data at the 0.05 significance level.
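To see how a significance level can turn a candidate changepoint into a yes/no decision, here is a hedged sketch of a textbook likelihood-ratio test for a mean shift. It assumes Gaussian noise with a shared variance and compares the statistic against a chi-square distribution with 2 degrees of freedom; both are assumptions of this sketch, and `mean_shift_pvalue` is a hypothetical name, not a Kats function.

```python
import math
from statistics import fmean


def mean_shift_pvalue(values, cp):
    """p-value for H0 'no change in the mean' at candidate index cp.

    cp is the last index of the first segment. Illustration only.
    """
    before, after = values[: cp + 1], values[cp + 1 :]
    if not before or not after:
        return 1.0  # nothing to compare, so no evidence of a change
    mu_all, mu_before, mu_after = fmean(values), fmean(before), fmean(after)
    # Residual sum of squares under H0 (one mean) and H1 (two means).
    rss0 = sum((v - mu_all) ** 2 for v in values)
    rss1 = sum((v - mu_before) ** 2 for v in before)
    rss1 += sum((v - mu_after) ** 2 for v in after)
    if rss1 == 0.0:
        # Two perfectly flat segments: a certain shift unless all data is flat.
        return 0.0 if rss0 > 0.0 else 1.0
    # Gaussian log-likelihood ratio statistic with a shared variance.
    llr = len(values) * math.log(rss0 / rss1)
    # Assumption: chi-square with 2 degrees of freedom, whose survival
    # function is exp(-x / 2). Clamp to 1.0 against rounding error.
    return min(1.0, math.exp(-llr / 2))


shifted = [10, 11, 10, 11, 10, 11, 14, 15, 14, 15]
threshold = 0.05
print(mean_shift_pvalue(shifted, 5) < threshold)
```

A p-value below the threshold rejects H0, which is what "a significant changepoint" means in the text above.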

Similarly, we can detect a decreasing changepoint for the time series as shown below:

What's more, sometimes we only want to run the detection on part of the time series. In that case we can specify an interest window, and the changepoint detection will only happen within that window. Of course, the detected result is only valid locally within the window and may not be valid for the whole time series.

🌻 Check detailed code in Kats Changepoint Detection >>

Multiple Changepoints Detection

The secret to enabling multiple changepoint detection with CUSUM Detector is to add a sliding window. As the window slides forward with a fixed step size, the detector looks for a changepoint within each window.

In this example, we use a window size of 2000 (meaning there are 2000 records in the window) and a step size of 800 to detect the increasing changepoints. The results are marked with red lines. As you can see below, some red lines are close together. If you want to avoid this, you can set a larger step size so that there is less overlap between windows.

Similarly, here are the detected decreasing changepoints:
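The sliding-window idea can be sketched without any library as follows. This is a simplified illustration, not the code behind the plots: the function names are hypothetical, and `min_jump` is a crude stand-in for the significance test.

```python
from itertools import accumulate
from statistics import fmean


def window_changepoint(values):
    """Index (within the window) where the CUSUM of deviations is lowest."""
    mean = fmean(values)
    cusum = list(accumulate(v - mean for v in values))
    return min(range(len(cusum)), key=cusum.__getitem__)


def sliding_increase_changepoints(values, window_size, step_size, min_jump):
    """Scan overlapping windows and collect upward mean shifts.

    Hypothetical helper; min_jump replaces a proper significance test.
    """
    found = []
    for start in range(0, len(values) - window_size + 1, step_size):
        window = values[start : start + window_size]
        cp = window_changepoint(window)
        before, after = window[: cp + 1], window[cp + 1 :]
        # Keep the candidate only if the mean really jumps upward.
        if after and fmean(after) - fmean(before) >= min_jump:
            found.append(start + cp)  # convert to an index in the full series
    # Overlapping windows can report the same point more than once.
    return sorted(set(found))


series = [1.0] * 10 + [5.0] * 10 + [9.0] * 10
print(sliding_increase_changepoints(series, window_size=12, step_size=6, min_jump=2.0))
```

Because consecutive windows overlap, the same shift is usually seen by more than one window. On real, noisy data those detections land on nearby but different points, which is why a larger step size thins out the red lines.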

🌻 Check detailed code in Kats Changepoint Detection >>

Kats' RobustStat Detector

While detecting multiple changepoints with CUSUM Detector needs a self-implemented rolling window function, RobustStat Detector can do this for you in one run. Similar to CUSUM Detector, it detects changepoints by checking mean shifts. It starts by smoothing the time series with a moving average, then computes the differences between a window of data points and its previous windows (the window size is fixed and is called comparison_window). Finally, it calculates the z-scores and p-values of those differences, and marks the points whose p-value is smaller than the threshold as changepoints.
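The steps above can be sketched roughly as follows. This is a simplified, library-free reading of the description, not Kats' actual code: the function name, the default values, the use of a lagged point difference, and the two-sided p-value are all assumptions of this sketch.

```python
from statistics import NormalDist, fmean, pstdev


def robust_stat_changepoints(values, smoothing_window=3,
                             comparison_window=2, p_threshold=0.01):
    """Flag indices whose smoothed jump is unusually large.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    # 1. Smooth with a trailing moving average.
    smooth = [
        fmean(values[i - smoothing_window + 1 : i + 1])
        for i in range(smoothing_window - 1, len(values))
    ]
    # 2. Compare each smoothed point with the one comparison_window earlier.
    diffs = [
        smooth[i] - smooth[i - comparison_window]
        for i in range(comparison_window, len(smooth))
    ]
    if not diffs:
        return []
    # 3. Turn the differences into z-scores and two-sided p-values.
    mu, sigma = fmean(diffs), pstdev(diffs)
    if sigma == 0:
        return []  # perfectly flat differences: nothing stands out
    normal = NormalDist()
    # Offset that maps a position in diffs back to the original series.
    offset = smoothing_window - 1 + comparison_window
    changepoints = []
    for i, diff in enumerate(diffs):
        z = (diff - mu) / sigma
        p_value = 2 * (1 - normal.cdf(abs(z)))
        if p_value < p_threshold:
            changepoints.append(offset + i)
    return changepoints


series = [5.0] * 20 + [15.0] * 20
print(robust_stat_changepoints(series))
```

Because of the smoothing and the lag, the flagged indices land slightly after the true shift. Note also that a smaller comparison window reacts to shorter jumps, so it tends to flag more points.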

To detect a single changepoint, it looks like this:

However, as we can see, we can't specify whether the changepoint should be an "increasing" or a "decreasing" one. The Kats documentation is too thin to tell you how to set the parameter values properly. Here's how to detect multiple changepoints; the trick seems to be to use a smaller comparison window.

You might have noticed the small size of the visualization. In fact, Lady H. couldn't find a way to adjust the plot size, either because Kats is an incomplete library or because of its embarrassing software design.

🌻 Check detailed code in Kats Changepoint Detection >>

Greykite Changepoint Detection

Greykite's changepoint detection supports both trend and seasonality changepoint identification. The trend changepoint detection process is similar to that of Kats. To detect changepoints, Greykite first preprocesses the time series using mean aggregation to prevent small fluctuations or seasonality effects from being misinterpreted as trends. It then places a large number of potential changepoints uniformly across the entire time span. To refine the results, Greykite applies adaptive lasso to shrink the coefficients of insignificant changepoints to zero. Finally, it performs post-filtering to eliminate changepoints that are too close to each other.
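Three of these steps (mean aggregation, uniform candidate placement, and post-filtering) are easy to sketch in plain Python. The adaptive lasso step is deliberately left out, so the `strengths` values below are made-up numbers standing in for fitted coefficient magnitudes. All function names are hypothetical, and keeping the stronger of two close changepoints is an assumption of this sketch, not confirmed Greykite behavior.

```python
from statistics import fmean


def aggregate_mean(values, block):
    """Average consecutive blocks of `block` points (e.g. days into weeks)."""
    return [fmean(values[i : i + block]) for i in range(0, len(values), block)]


def uniform_candidates(n_points, n_candidates):
    """Evenly spaced interior indices to use as potential changepoints."""
    step = n_points / (n_candidates + 1)
    return [round(step * (k + 1)) for k in range(n_candidates)]


def filter_close(changepoints, strengths, min_distance):
    """Keep the strongest changepoints that are at least min_distance apart."""
    kept = []
    # Visit candidates from strongest to weakest.
    for cp in sorted(changepoints, key=lambda c: -strengths[c]):
        if all(abs(cp - other) >= min_distance for other in kept):
            kept.append(cp)
    return sorted(kept)


print(aggregate_mean([1, 2, 3, 4, 5, 6], block=2))
print(uniform_candidates(n_points=100, n_candidates=4))
# Illustrative strengths only; in practice they would come from the lasso fit.
print(filter_close([20, 22, 40], {20: 0.5, 22: 0.9, 40: 0.3}, min_distance=5))
```

Aggregating first means the candidates are placed on a smoother series, which is how small fluctuations are kept from being read as trend changes.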

While Kats' changepoint detection failed to work on the sales data, Greykite successfully detected changepoints across all three time series: sales, temperature, and humidity data.

Greykite's Trend Changepoint Detection

The configuration for trend detection is shown below. Whether the input is a univariate or a multivariate time series, you just need to specify the name of the time series variable.

The changepoint plot can include more information, such as the trend line and seasonality. The example below shows the trend changepoints on the sales data:

These are the trend changepoints for humidity and temperature data:

However, we don't know whether the detected changepoints here indicate an increasing or a decreasing trend.

🌻 Check detailed code in Greykite Trend Changepoint Detection >>

Greykite's Seasonality Changepoint Detection

The configuration for seasonality changepoint detection is similar to that for Greykite trend detection; you just call a different function:

Greykite will show seasonality changepoints for different components (daily, weekly, yearly). For example, for the sales data, it has detected weekly and yearly changepoints:

These are the seasonality changepoints for humidity and temperature data:

But some items in the plot can be confusing. For example, why can the trend line be much higher than the whole time series? Why can there be a yearly changepoint when the overall time series is less than one year long? Compared with Kats changepoint detection, which output makes more sense to you? You're welcome to share your thoughts here!

🌻 Check detailed code in Greykite Seasonality Changepoint Detection >>

You've made it halfway through Purgatory, and that's an incredible achievement! The most valuable lessons never have shortcuts, but they are the ones that shape us the most. Keep your patience, stay positive, and keep pushing forward!
