Introduction

augurs is a Rust library for time series analysis and forecasting. It provides a comprehensive set of tools for working with time series data, including:

  • Forecasting with multiple algorithms including ETS, MSTL, and Prophet
  • Outlier detection using MAD and DBSCAN
  • Time series clustering with DTW distance metrics
  • Seasonality and changepoint detection
  • Data preprocessing and transformation tools

Built with a focus on performance and ease of use, augurs offers both high-level APIs for common tasks and low-level components for custom implementations. The library also provides Python and JavaScript bindings, making it accessible across different programming environments.

Whether you're building a production forecasting system or analyzing seasonal patterns in data, augurs provides the tools you need for robust time series analysis.

These docs are a work-in-progress; the missing pages will hopefully be added soon.

Installation

Rust

Add augurs to your Cargo.toml. The library is modular, so you only need to enable the features you plan to use:

[dependencies]
augurs = { version = "0.6.0", features = [] }

Available features include:

  • forecaster - High-level forecasting API with data transformations
  • ets - Exponential smoothing models
  • mstl - Multiple Seasonal-Trend decomposition using LOESS
  • outlier - Outlier detection algorithms
  • clustering - Time series clustering algorithms
  • dtw - Dynamic Time Warping distance calculations
  • full - All features
  • prophet - Facebook Prophet forecasting model
  • prophet-cmdstan - Prophet with cmdstan backend
  • prophet-wasmstan - Prophet with WebAssembly stan backend
  • seasons - Seasonality detection

For example, to use forecasting with ETS and MSTL:

[dependencies]
augurs = { version = "0.6.0", features = ["forecaster", "ets", "mstl"] }

Python

The Python bindings can be installed via pip:

pip install augurs

JavaScript

The JavaScript bindings are available through npm:

npm install @bsull/augurs

Quick Start Guide

This guide will help you get started with augurs, a Rust library for time series analysis and forecasting.

Installation

Add augurs to your Cargo.toml:

[dependencies]
augurs = { version = "0.6.0", features = ["forecaster", "ets", "mstl", "outlier"] }

Basic Forecasting

Let's start with a simple example using the MSTL (Multiple Seasonal-Trend decomposition using LOESS) model and a naive trend forecaster:

extern crate augurs;
use augurs::{mstl::MSTLModel, prelude::*};

fn main() {
    // Sample time series data
    let data = &[1.0, 1.2, 1.4, 1.5, 1.4, 1.4, 1.2, 1.5, 1.6, 2.0, 1.9, 1.8, 1.9, 2.0];

    // Create an MSTL model with weekly seasonality (period = 7)
    let mstl = MSTLModel::naive(vec![7]);

    // Fit the model
    let fit = mstl.fit(data).expect("model should fit");

    // Generate forecasts with 95% prediction intervals
    let forecast = fit
        .predict(10, 0.95)
        .expect("forecasting should work");

    println!("Forecast values: {:?}", forecast.point);
    println!("Lower bounds: {:?}", forecast.intervals.as_ref().unwrap().lower);
    println!("Upper bounds: {:?}", forecast.intervals.as_ref().unwrap().upper);
}

Advanced Forecasting with Transforms

For more complex scenarios, you can use the Forecaster API, which applies data transformations before fitting and reverses them on the forecasts:

extern crate augurs;
use augurs::{
    ets::AutoETS,
    forecaster::{transforms::MinMaxScaleParams, Forecaster, Transform},
    mstl::MSTLModel,
};

fn main() {
    let data = &[1.0, 1.2, 1.4, 1.5, f64::NAN, 1.4, 1.2, 1.5, 1.6, 2.0, 1.9, 1.8];

    // Set up model and transforms
    let ets = AutoETS::non_seasonal().into_trend_model();
    let mstl = MSTLModel::new(vec![2], ets);

    let transforms = vec![
        Transform::linear_interpolator(),
        Transform::min_max_scaler(MinMaxScaleParams::from_data(data.iter().copied())),
        Transform::log(),
    ];

    // Create and fit forecaster
    let mut forecaster = Forecaster::new(mstl).with_transforms(transforms);
    forecaster.fit(data).expect("model should fit");

    // Generate forecasts
    let forecast = forecaster
        .predict(5, 0.95)
        .expect("forecasting should work");
}

Outlier Detection

augurs provides multiple algorithms for outlier detection. Here's an example using the MAD (Median Absolute Deviation) detector:

extern crate augurs;
use augurs::outlier::{MADDetector, OutlierDetector};

fn main() {
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5], // This series contains outliers
    ];

    // Create and configure detector
    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Detect outliers
    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    println!("Outlying series indices: {:?}", outliers.outlying_series);
}

Time Series Clustering

You can use DBSCAN clustering with Dynamic Time Warping (DTW) distance:

extern crate augurs;
use augurs::{clustering::DbscanClusterer, dtw::Dtw};

fn main() {
    let series: &[&[f64]] = &[
        &[0.0, 1.0, 2.0, 3.0, 4.0],
        &[0.1, 1.1, 2.1, 3.1, 4.1],
        &[5.0, 6.0, 7.0, 8.0, 9.0],
    ];

    // Compute distance matrix using DTW
    let distance_matrix = Dtw::euclidean()
        .with_window(2)
        .distance_matrix(series);

    // Perform clustering
    let clusters = DbscanClusterer::new(0.5, 2)
        .fit(&distance_matrix);

    println!("Cluster assignments: {:?}", clusters);
}

Core Concepts

augurs is a comprehensive time series analysis library that provides several core capabilities:

Forecasting

Time series forecasting involves predicting future values based on historical patterns. The library supports multiple forecasting methods:

  • MSTL (Multiple Seasonal-Trend decomposition using LOESS)
  • ETS (Error, Trend, Seasonal) models
  • Prophet (Facebook's forecasting tool)
  • Custom models through the Forecaster trait

Clustering

Time series clustering helps identify groups of similar time series within a dataset. Key features include:

  • DBSCAN clustering with DTW (Dynamic Time Warping) distance
  • Flexible distance metrics
  • Parallel processing support for large datasets

Outlier Detection

Outlier detection is the task of identifying one or more time series that deviate significantly from the norm. augurs includes:

  • MAD (Median Absolute Deviation) detection
  • DBSCAN-based outlier detection
  • Customizable sensitivity parameters

Changepoint Detection

augurs re-exports the changepoint crate for detecting changes in time series data:

  • Normal distribution-based changepoint detection
  • Autoregressive Gaussian process changepoint detection

Seasonality Analysis

Understanding seasonal patterns is essential for time series analysis:

  • Automatic period detection
  • Multiple seasonality handling
  • Seasonal decomposition
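
To make the idea of automatic period detection concrete, here is a minimal sketch that picks the lag with the highest autocorrelation. This is a simpler stand-in technique chosen for illustration, not a description of how the seasons feature works internally; the function names `autocorrelation` and `detect_period` are hypothetical and not part of the augurs API.

```rust
/// Autocorrelation of `data` at `lag` (biased estimator).
/// Sketch assumptions: `lag < data.len()` and `data` is not constant.
fn autocorrelation(data: &[f64], lag: usize) -> f64 {
    let n = data.len();
    let mean = data.iter().sum::<f64>() / n as f64;
    let variance: f64 = data.iter().map(|x| (x - mean).powi(2)).sum();
    let covariance: f64 = (0..n - lag)
        .map(|t| (data[t] - mean) * (data[t + lag] - mean))
        .sum();
    covariance / variance
}

/// Return the lag in `min_lag..=max_lag` with the highest autocorrelation,
/// or `None` if no candidate lag is shorter than the series.
fn detect_period(data: &[f64], min_lag: usize, max_lag: usize) -> Option<usize> {
    (min_lag..=max_lag)
        .filter(|&lag| lag < data.len())
        .max_by(|&a, &b| {
            autocorrelation(data, a)
                .partial_cmp(&autocorrelation(data, b))
                .expect("autocorrelation is finite for non-constant data")
        })
}
```

For a series that repeats every 7 observations, calling `detect_period(&data, 2, 14)` on four full cycles returns `Some(7)`. Starting the search at lag 2 avoids picking up the short-range correlation that most smooth series have at lag 1.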

Data Transformations

The library supports various data transformations:

  • Linear interpolation for missing values
  • Min-max scaling
  • Logarithmic transformation
  • Custom transformations through the Transform trait
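
As a rough illustration of what the first two transformations do to the data, the sketch below implements linear interpolation and min-max scaling with only the standard library. It is not the augurs implementation; the function names are hypothetical, and the sketch assumes the series has at least two distinct finite values.

```rust
/// Fill NaN gaps by interpolating linearly between the nearest valid neighbours.
/// Leading or trailing NaNs are left untouched in this sketch.
fn linear_interpolate(data: &[f64]) -> Vec<f64> {
    let mut out = data.to_vec();
    let mut last_valid: Option<usize> = None;
    for i in 0..out.len() {
        if out[i].is_nan() {
            continue;
        }
        if let Some(prev) = last_valid {
            let gap = i - prev;
            if gap > 1 {
                // Spread the change evenly across the missing positions.
                let step = (out[i] - out[prev]) / gap as f64;
                for j in 1..gap {
                    out[prev + j] = out[prev] + step * j as f64;
                }
            }
        }
        last_valid = Some(i);
    }
    out
}

/// Rescale values to [0, 1]; returns the scaled data plus (min, max)
/// so that the scaling can be reversed later.
fn min_max_scale(data: &[f64]) -> (Vec<f64>, f64, f64) {
    let min = data.iter().copied().fold(f64::INFINITY, f64::min);
    let max = data.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let scaled = data.iter().map(|x| (x - min) / (max - min)).collect();
    (scaled, min, max)
}

/// Map scaled values back to the original range.
fn min_max_unscale(scaled: &[f64], min: f64, max: f64) -> Vec<f64> {
    scaled.iter().map(|x| x * (max - min) + min).collect()
}
```

Keeping the `(min, max)` pair around matters because forecasts are produced on the transformed scale and have to be mapped back before they are reported.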

Forecasting with Prophet

This tutorial will guide you through using Facebook's Prophet forecasting model with the WebAssembly-based Stan backend in augurs. Prophet is particularly well-suited for time series that have strong seasonal effects and several seasons of historical data.

Prerequisites

First, add the necessary features to your Cargo.toml:

[dependencies]
augurs = { version = "0.6.0", features = ["prophet", "prophet-wasmstan"] }

Basic Prophet Forecasting

Let's start with a simple example:

extern crate augurs;
use augurs::prophet::{Prophet, TrainingData, wasmstan::WasmstanOptimizer};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create timestamps (as Unix timestamps)
    let timestamps = vec![
        1704067200, // 2024-01-01
        1704153600, // 2024-01-02
        1704240000, // 2024-01-03
        1704326400, // 2024-01-04
        1704412800, // 2024-01-05
        // ... more dates
    ];

    // Your observations
    let values = vec![1.1, 2.1, 3.2, 4.3, 5.5];

    // Create training data
    let data = TrainingData::new(timestamps, values)?;

    // Initialize Prophet with WASMSTAN optimizer
    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(Default::default(), optimizer);

    // Fit the model
    prophet.fit(data, Default::default())?;

    // Make in-sample predictions
    let predictions = prophet.predict(None)?;

    println!("Predictions: {:?}", predictions.yhat.point);
    println!("Lower bounds: {:?}", predictions.yhat.lower.as_ref().unwrap());
    println!("Upper bounds: {:?}", predictions.yhat.upper.as_ref().unwrap());

    Ok(())
}

Adding Regressors

Prophet allows you to include additional regressors to improve your forecasts:

extern crate augurs;
use std::collections::HashMap;
use augurs::prophet::{Prophet, TrainingData, Regressor, wasmstan::WasmstanOptimizer};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create timestamps and values as before
    let timestamps = vec![
        1704067200, // 2024-01-01
        1704153600, // 2024-01-02
        1704240000, // 2024-01-03
        1704326400, // 2024-01-04
        1704412800, // 2024-01-05
    ];
    let values = vec![1.1, 2.1, 3.2, 4.3, 5.5];

    // Create regressors
    let regressors = HashMap::from([
        (
            "temperature".to_string(),
            vec![20.0, 22.0, 21.0, 21.5, 22.5], // temperature values
        ),
    ]);

    // Create training data with regressors
    let data = TrainingData::new(timestamps, values)?
        .with_regressors(regressors)?;

    // Initialize Prophet
    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(Default::default(), optimizer);

    // Add regressors with their modes
    prophet.add_regressor("temperature".to_string(), Regressor::additive());

    // Fit and predict as before
    prophet.fit(data, Default::default())?;
    let predictions = prophet.predict(None)?;

    Ok(())
}

Customizing the Model

Prophet offers several customization options:

extern crate augurs;
use augurs::prophet::{
    Prophet, TrainingData, ProphetOptions, FeatureMode,
    GrowthType, SeasonalityOption,
    wasmstan::WasmstanOptimizer,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure Prophet with custom settings
    let options = ProphetOptions {
        // Set growth model
        growth: GrowthType::Linear,
        // Configure seasonality
        seasonality_mode: FeatureMode::Multiplicative,
        yearly_seasonality: SeasonalityOption::Manual(true),
        weekly_seasonality: SeasonalityOption::Manual(true),
        daily_seasonality: SeasonalityOption::Manual(false),
        ..Default::default()
    };

    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(options, optimizer);

    // Proceed with fitting and prediction...

    Ok(())
}

Working with Future Dates

To forecast into the future, you'll need to create a PredictionData object with the timestamps you want to predict. It must also contain the same regressors as the training data:

extern crate augurs;
use augurs::prophet::{Prophet, PredictionData, wasmstan::WasmstanOptimizer};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Setup and fit model as before...
    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(Default::default(), optimizer);

    let prediction_data = PredictionData::new(vec![
        1704499200, // 2024-01-06
        1704585600, // 2024-01-07
    ]);
    let predictions = prophet.predict(Some(prediction_data))?;

    // Access the forecast values and their bounds
    println!("Predictions: {:?}", predictions.yhat.point);
    println!("Lower bounds: {:?}", predictions.yhat.lower.as_ref().unwrap());
    println!("Upper bounds: {:?}", predictions.yhat.upper.as_ref().unwrap());

    Ok(())
}
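
Rather than typing future timestamps by hand, you can derive them from the last training timestamp. The helper below is a minimal sketch that assumes daily data and timestamps stored as `i64` Unix seconds; `future_daily_timestamps` is a hypothetical name, not an augurs function.

```rust
/// Number of seconds in one day.
const SECONDS_PER_DAY: i64 = 86_400;

/// Build `horizon` daily Unix timestamps (in seconds) that follow `last_timestamp`.
fn future_daily_timestamps(last_timestamp: i64, horizon: usize) -> Vec<i64> {
    (1..=horizon as i64)
        .map(|step| last_timestamp + step * SECONDS_PER_DAY)
        .collect()
}
```

With the training data used earlier, `future_daily_timestamps(1704412800, 2)` yields `[1704499200, 1704585600]`, the timestamps for 2024-01-06 and 2024-01-07.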

Best Practices

  1. Data Preparation

    • Ensure your timestamps are Unix timestamps
    • Handle missing values before passing to Prophet
    • Consider scaling your target variable if values are very large
  2. Model Configuration

    • Start with default settings and adjust based on your needs
    • Use additive seasonality for constant seasonal variations
    • Use multiplicative seasonality when variations scale with the trend
  3. Performance Considerations

    • WASMSTAN runs inside a WASM runtime and may be slower than native code
    • For server-side applications, consider using the prophet-cmdstan feature instead
    • Large datasets may require more computation time
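
The difference between the two seasonality modes is easiest to see numerically. The following sketch combines a trend with a seasonal component in both ways; it is a simplified illustration with hypothetical function names, not the model code, and it treats the multiplicative seasonal term as a fraction of the trend.

```rust
/// Additive seasonality: the seasonal swing is the same size at every trend level.
fn additive(trend: &[f64], seasonal: &[f64]) -> Vec<f64> {
    trend.iter().zip(seasonal).map(|(t, s)| t + s).collect()
}

/// Multiplicative seasonality: the seasonal swing grows and shrinks with the trend.
fn multiplicative(trend: &[f64], seasonal: &[f64]) -> Vec<f64> {
    trend.iter().zip(seasonal).map(|(t, s)| t * (1.0 + s)).collect()
}
```

For a trend of `[10.0, 20.0, 40.0]` and a seasonal term of `[0.5, -0.5, 0.5]`, the additive result is `[10.5, 19.5, 40.5]` while the multiplicative result is `[15.0, 10.0, 60.0]`: the swing stays at 0.5 in the first case and grows from 5 to 20 in the second.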

Troubleshooting

Common issues and their solutions:

  • Invalid timestamps: Ensure timestamps are Unix timestamps in seconds
  • Missing values: Prophet can handle some missing values, but it's better to preprocess them
  • Convergence issues: Try adjusting the number of iterations or sampling parameters

Automated Outlier Detection

This tutorial demonstrates how to use augurs to automatically detect outliers in time series data. We'll explore both the MAD (Median Absolute Deviation) and DBSCAN approaches to outlier detection.

MAD-based Outlier Detection

The MAD detector is ideal for identifying time series that deviate significantly from the typical behavior pattern:

extern crate augurs;
use augurs::outlier::{MADDetector, OutlierDetector};

fn main() {
    // Example time series data
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5], // This series contains outliers
    ];

    // Create detector with 50% sensitivity
    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Process and detect outliers
    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    println!("Outlying series indices: {:?}", outliers.outlying_series);
    println!("Series scores: {:?}", outliers.series_results);
}
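
If you have not met the median absolute deviation before, the sketch below computes it for a single list of values and flags points that sit too many MADs away from the median. It only illustrates the statistic; it is not how MADDetector is implemented, and the `threshold` parameter and all function names are hypothetical.

```rust
/// Median of a slice (sketch: assumes non-empty input without NaNs).
fn median(values: &[f64]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("input contains no NaNs"));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Median absolute deviation: the median of |x - median(x)|.
fn mad(values: &[f64]) -> f64 {
    let med = median(values);
    let deviations: Vec<f64> = values.iter().map(|x| (x - med).abs()).collect();
    median(&deviations)
}

/// Indices of values that lie more than `threshold` MADs from the median.
fn mad_outliers(values: &[f64], threshold: f64) -> Vec<usize> {
    let med = median(values);
    let spread = mad(values);
    values
        .iter()
        .enumerate()
        .filter(|&(_, &x)| (x - med).abs() > threshold * spread)
        .map(|(i, _)| i)
        .collect()
}
```

Because both the centre and the spread are medians, a single extreme value barely moves them, which is why MAD is more robust than a mean and standard deviation for this job.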

DBSCAN-based Outlier Detection

DBSCAN is particularly effective when your time series have seasonal patterns:

extern crate augurs;
use augurs::outlier::{DbscanDetector, OutlierDetector};

fn main() {
    // Example time series data
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5], // This series behaves differently
    ];

    // Create and configure detector
    let mut detector = DbscanDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Enable parallel processing (requires 'parallel' feature)
    detector = detector.parallelize(true);

    // Process and detect outliers
    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    println!("Outlying series indices: {:?}", outliers.outlying_series);
    println!("Series scores: {:?}", outliers.series_results);
}

Handling Results

The outlier detection results provide several useful pieces of information:

extern crate augurs;
use augurs::outlier::{MADDetector, OutlierDetector};

fn main() {
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5],
    ];

    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    // Get indices of outlying series
    for &idx in &outliers.outlying_series {
        println!("Series {} is an outlier", idx);
    }

    // Examine detailed results for each series
    for (idx, result) in outliers.series_results.iter().enumerate() {
        println!("Series {}: outlier = {}", idx, result.is_outlier);
        println!("Scores: {:?}", result.scores);
    }
}

Best Practices

  1. Choosing a Detector

    • Use MAD when you expect series to move within a stable band
    • Use DBSCAN when series may have seasonality or complex patterns
  2. Sensitivity Tuning

    • Start with 0.5 sensitivity and adjust based on results
    • Lower values (closer to 0.0) are more sensitive
    • Higher values (closer to 1.0) are more selective
  3. Performance Optimization

    • Enable parallelization for large datasets
    • Consider preprocessing data to remove noise
    • Handle missing values before detection

Example: Real-time Monitoring

Here's an example of using outlier detection in a monitoring context:

extern crate augurs;
use augurs::outlier::{MADDetector, OutlierDetector};

/// Returns true if `new_data` is flagged as an outlier relative to `historical_data`.
fn monitor_time_series(historical_data: &[&[f64]], new_data: &[f64]) -> bool {
    // Create a detector with 50% sensitivity
    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Combine historical and new data
    let mut all_series: Vec<&[f64]> = historical_data.to_vec();
    all_series.push(new_data);

    // Check for outliers
    let processed = detector.preprocess(&all_series)
        .expect("input data is valid");
    let outliers = detector.detect(&processed)
        .expect("detection succeeds");

    // Check if new series (last index) is an outlier
    outliers.outlying_series.contains(&(all_series.len() - 1))
}

Time Series Clustering

Time series clustering is a technique used to group similar time series together. This can be useful for finding patterns in data, detecting anomalies, or reducing the dimensionality of large datasets.

augurs provides several clustering algorithms, including DBSCAN (Density-Based Spatial Clustering of Applications with Noise). DBSCAN is particularly well-suited for time series data as it:

  • Doesn't require specifying the number of clusters upfront
  • Can find arbitrarily shaped clusters
  • Can identify noise points that don't belong to any cluster
  • Works well with Dynamic Time Warping (DTW) distance measures

Basic Example

Let's start with a simple example using DBSCAN clustering:

extern crate augurs;
use augurs::{clustering::DbscanClusterer, dtw::Dtw};

// Sample time series data
const SERIES: &[&[f64]] = &[
    &[0.0, 1.0, 2.0, 3.0, 4.0],
    &[0.1, 1.1, 2.1, 3.1, 4.1],
    &[5.0, 6.0, 7.0, 8.0, 9.0],
    &[5.1, 6.1, 7.1, 8.1, 9.1],
    &[10.0, 11.0, 12.0, 13.0, 14.0],
];

fn main() {
    // Compute distance matrix using DTW
    let distance_matrix = Dtw::euclidean()
        .with_window(2)
        .with_lower_bound(4.0)
        .with_upper_bound(10.0)
        .with_max_distance(10.0)
        .distance_matrix(SERIES);

    // Set DBSCAN parameters
    let epsilon = 0.5;
    let min_cluster_size = 2;

    // Perform clustering
    let clusters: Vec<isize> = DbscanClusterer::new(epsilon, min_cluster_size)
        .fit(&distance_matrix);

    // Clusters are labeled: -1 for noise, 0+ for cluster membership
    assert_eq!(clusters, vec![0, 0, 1, 1, -1]);
}

Understanding Parameters

DTW Parameters

  • window: Size of the Sakoe-Chiba band for constraining DTW computation
  • lower_bound: Minimum distance to consider
  • upper_bound: Maximum distance to consider
  • max_distance: Early termination threshold
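
To show what the window parameter constrains, here is a minimal dynamic-programming sketch of DTW with a Sakoe-Chiba band. It uses the absolute difference as the pointwise cost and omits the bound and early-termination options; it is an illustration only, and `dtw_distance` is a hypothetical name rather than part of the augurs API.

```rust
/// DTW distance between `a` and `b`, restricted to a Sakoe-Chiba band
/// of half-width `window` around the diagonal.
fn dtw_distance(a: &[f64], b: &[f64], window: usize) -> f64 {
    let (n, m) = (a.len(), b.len());
    // The band must be at least as wide as the length difference,
    // otherwise no warping path can reach the final cell.
    let w = window.max(n.abs_diff(m));
    // cost[i][j] = cheapest alignment of a[..i] with b[..j].
    let mut cost = vec![vec![f64::INFINITY; m + 1]; n + 1];
    cost[0][0] = 0.0;
    for i in 1..=n {
        let lo = i.saturating_sub(w).max(1);
        let hi = (i + w).min(m);
        for j in lo..=hi {
            let d = (a[i - 1] - b[j - 1]).abs();
            let best = cost[i - 1][j]
                .min(cost[i][j - 1])
                .min(cost[i - 1][j - 1]);
            cost[i][j] = d + best;
        }
    }
    cost[n][m]
}
```

With `a = [0, 0, 1, 2, 3]` and `b = [0, 1, 2, 3, 3]`, a window of 0 forces a point-by-point comparison and gives a distance of 3, while a window of 1 lets the shifted series align and gives 0. A narrower band is cheaper to compute but tolerates less shifting.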

DBSCAN Parameters

  • epsilon: Maximum distance between two points for one to be considered in the neighborhood of the other
  • min_cluster_size: Minimum number of points required to form a dense region
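
The way these two parameters interact can be seen in a compact sketch of DBSCAN over a precomputed distance matrix. This is a textbook-style illustration, not the DbscanClusterer implementation; the `dbscan` function is hypothetical, and it assumes that a point counts towards its own neighbourhood.

```rust
/// Cluster points given a symmetric distance matrix.
/// Returns one label per point: -1 for noise, 0+ for cluster membership.
fn dbscan(distances: &[Vec<f64>], epsilon: f64, min_cluster_size: usize) -> Vec<isize> {
    let n = distances.len();
    // Neighbourhood of `i`: every point within `epsilon`, including `i` itself.
    let neighbours = |i: usize| -> Vec<usize> {
        (0..n).filter(|&j| distances[i][j] <= epsilon).collect()
    };
    let mut labels = vec![-1isize; n];
    let mut visited = vec![false; n];
    let mut next_cluster = 0isize;
    for i in 0..n {
        if visited[i] {
            continue;
        }
        visited[i] = true;
        let seeds = neighbours(i);
        if seeds.len() < min_cluster_size {
            continue; // not a core point; stays noise unless a cluster claims it later
        }
        labels[i] = next_cluster;
        let mut queue = seeds;
        while let Some(j) = queue.pop() {
            if labels[j] == -1 {
                labels[j] = next_cluster; // border or core point joins the cluster
            }
            if visited[j] {
                continue;
            }
            visited[j] = true;
            let reachable = neighbours(j);
            if reachable.len() >= min_cluster_size {
                queue.extend(reachable); // `j` is a core point: keep expanding
            }
        }
        next_cluster += 1;
    }
    labels
}
```

For five points at positions 0.0, 0.1, 5.0, 5.1 and 10.0 on a line, with `epsilon = 0.5` and `min_cluster_size = 2`, the labels are `[0, 0, 1, 1, -1]`: two tight pairs form clusters and the isolated point is noise.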

Best Practices

  1. Distance Measure Selection

    • Use DTW for time series that might be shifted or warped
    • Consider the computational cost of DTW for large datasets
    • Experiment with different window sizes to balance accuracy and performance
  2. Parameter Tuning

    • Start with a relatively large epsilon and reduce it if clusters are too large
    • Set min_cluster_size based on your domain knowledge
    • Use the DTW window parameter to prevent pathological alignments
  3. Performance Optimization

    • Enable parallel processing for large datasets
    • Use DTW bounds to speed up distance calculations
    • Consider downsampling very long time series

Example: Clustering with Multiple Distance Measures

The following function clusters the same series under two DTW distance measures and prints both sets of assignments for comparison:

extern crate augurs;
use augurs::{clustering::DbscanClusterer, dtw::Dtw};

fn compare_distance_measures(series: &[&[f64]]) {
    // Euclidean DTW
    let euclidean_matrix = Dtw::euclidean()
        .distance_matrix(series);
    let euclidean_clusters = DbscanClusterer::new(0.5, 2)
        .fit(&euclidean_matrix);

    // Manhattan DTW
    let manhattan_matrix = Dtw::manhattan()
        .distance_matrix(series);
    let manhattan_clusters = DbscanClusterer::new(0.5, 2)
        .fit(&manhattan_matrix);

    // Compare results
    println!("Euclidean clusters: {:?}", euclidean_clusters);
    println!("Manhattan clusters: {:?}", manhattan_clusters);
}

API Documentation

API docs are available on docs.rs.

Contributing

Thank you for your interest in contributing to augurs! We welcome contributions from everyone.

The augurs repository can be found on GitHub.

Ways to Contribute

There are many ways to contribute:

  • Report bugs and request features in the issue tracker
  • Submit pull requests to fix issues or add features
  • Improve documentation
  • Share your experiences using augurs

Development Setup

  1. Fork and clone the repository
  2. Install Rust via rustup if you haven't already
  3. Install just for running tasks
  4. Build the WASM component:
     just build-component
  5. Start building and checking the project using bacon:
     just build
  6. Run tests:
     just test

Pull Request Process

  1. Fork the repository and create a new branch from main
  2. Make your changes and add tests if applicable
  3. Update documentation as needed
  4. Run tests and clippy to ensure everything passes
  5. Submit a pull request describing your changes

Code Style

  • Follow Rust standard style guidelines
  • Run cargo fmt before committing
  • Run cargo clippy and address any warnings
  • Add documentation comments for public APIs

Testing

  • Add tests for new functionality
  • Ensure existing tests pass
  • Include both unit tests and integration tests where appropriate

License

By contributing, you agree that your contributions will be licensed under the same terms as the project (the MIT license or the Apache 2.0 license).