Introduction
augurs is a Rust library for time series analysis and forecasting. It provides a comprehensive set of tools for working with time series data, including:
- Forecasting with multiple algorithms including ETS, MSTL, and Prophet
- Outlier detection using MAD and DBSCAN
- Time series clustering with DTW distance metrics
- Seasonality and changepoint detection
- Data preprocessing and transformation tools
Built with a focus on performance and ease of use, augurs offers both high-level APIs for common tasks and low-level components for custom implementations. The library supports Python and JavaScript bindings, making it accessible across different programming environments.
Whether you're building a production forecasting system or analyzing seasonal patterns in data, augurs provides the tools you need for robust time series analysis.
These docs are a work-in-progress; the missing pages will hopefully be added soon.
Installation
Rust
Add augurs to your Cargo.toml. The library is modular, so you only need to enable the features you plan to use:
[dependencies]
augurs = { version = "0.6.0", features = [] }
Available features include:
- forecaster - High-level forecasting API with data transformations
- ets - Exponential smoothing models
- mstl - Multiple Seasonal-Trend decomposition using LOESS
- outlier - Outlier detection algorithms
- clustering - Time series clustering algorithms
- dtw - Dynamic Time Warping distance calculations
- full - All features
- prophet - Facebook Prophet forecasting model
- prophet-cmdstan - Prophet with cmdstan backend
- prophet-wasmstan - Prophet with WebAssembly stan backend
- seasons - Seasonality detection
For example, to use forecasting with ETS and MSTL:
[dependencies]
augurs = { version = "0.6.0", features = ["forecaster", "ets", "mstl"] }
Python
The Python bindings can be installed via pip:
pip install augurs
JavaScript
The JavaScript bindings are available through npm:
npm install @bsull/augurs
Quick Start Guide
This guide will help you get started with augurs, a Rust library for time series analysis and forecasting.
Installation
Add augurs to your Cargo.toml:
[dependencies]
augurs = { version = "0.6.0", features = ["forecaster", "ets", "mstl", "outlier"] }
Basic Forecasting
Let's start with a simple example using the MSTL (Multiple Seasonal-Trend decomposition using LOESS) model and a naive trend forecaster:
extern crate augurs;

use augurs::{mstl::MSTLModel, prelude::*};

fn main() {
    // Sample time series data
    let data = &[1.0, 1.2, 1.4, 1.5, 1.4, 1.4, 1.2, 1.5, 1.6, 2.0, 1.9, 1.8, 1.9, 2.0];

    // Create an MSTL model with weekly seasonality (period = 7)
    let mstl = MSTLModel::naive(vec![7]);

    // Fit the model
    let fit = mstl.fit(data).expect("model should fit");

    // Generate forecasts with 95% prediction intervals
    let forecast = fit
        .predict(10, 0.95)
        .expect("forecasting should work");

    println!("Forecast values: {:?}", forecast.point);
    println!("Lower bounds: {:?}", forecast.intervals.as_ref().unwrap().lower);
    println!("Upper bounds: {:?}", forecast.intervals.as_ref().unwrap().upper);
}
Advanced Forecasting with Transforms
For more complex scenarios, you can use the Forecaster API, which supports data transformations:
extern crate augurs;

use augurs::{
    ets::AutoETS,
    forecaster::{transforms::MinMaxScaleParams, Forecaster, Transform},
    mstl::MSTLModel,
};

fn main() {
    let data = &[1.0, 1.2, 1.4, 1.5, f64::NAN, 1.4, 1.2, 1.5, 1.6, 2.0, 1.9, 1.8];

    // Set up model and transforms
    let ets = AutoETS::non_seasonal().into_trend_model();
    let mstl = MSTLModel::new(vec![2], ets);
    let transforms = vec![
        Transform::linear_interpolator(),
        Transform::min_max_scaler(MinMaxScaleParams::from_data(data.iter().copied())),
        Transform::log(),
    ];

    // Create and fit forecaster
    let mut forecaster = Forecaster::new(mstl).with_transforms(transforms);
    forecaster.fit(data).expect("model should fit");

    // Generate forecasts
    let forecast = forecaster
        .predict(5, 0.95)
        .expect("forecasting should work");
}
Outlier Detection
augurs provides multiple algorithms for outlier detection. Here's an example using the MAD (Median Absolute Deviation) detector:
extern crate augurs;

use augurs::outlier::{MADDetector, OutlierDetector};

fn main() {
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5], // This series contains outliers
    ];

    // Create and configure detector
    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Detect outliers
    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    println!("Outlying series indices: {:?}", outliers.outlying_series);
}
Time Series Clustering
You can use DBSCAN clustering with Dynamic Time Warping (DTW) distance:
extern crate augurs;

use augurs::{clustering::DbscanClusterer, dtw::Dtw};

fn main() {
    let series: &[&[f64]] = &[
        &[0.0, 1.0, 2.0, 3.0, 4.0],
        &[0.1, 1.1, 2.1, 3.1, 4.1],
        &[5.0, 6.0, 7.0, 8.0, 9.0],
    ];

    // Compute distance matrix using DTW
    let distance_matrix = Dtw::euclidean()
        .with_window(2)
        .distance_matrix(series);

    // Perform clustering
    let clusters = DbscanClusterer::new(0.5, 2)
        .fit(&distance_matrix);

    println!("Cluster assignments: {:?}", clusters);
}
Next Steps
- Learn more about forecasting methods
- Explore outlier detection algorithms
- Understand seasonality analysis
- Check out the complete API documentation
Core Concepts
augurs is a comprehensive time series analysis library that provides several core capabilities:
Forecasting
Time series forecasting involves predicting future values based on historical patterns. The library supports multiple forecasting methods:
- MSTL (Multiple Seasonal-Trend decomposition using LOESS)
- ETS (Error, Trend, Seasonal) models
- Prophet (Facebook's forecasting tool)
- Custom models through the Forecaster trait
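To give a feel for what the simplest member of the ETS family computes, here is a minimal std-only sketch of simple exponential smoothing. It does not use augurs and is not how the library's ets module is implemented; the function name, the fixed alpha, and the choice to initialise the level from the first observation are assumptions made for illustration.

```rust
/// Simple exponential smoothing (illustrative sketch, not the augurs API).
/// Returns the final smoothed level, which is also the flat forecast for
/// every future step. Returns `None` for empty input.
fn simple_exponential_smoothing(data: &[f64], alpha: f64) -> Option<f64> {
    let (first, rest) = data.split_first()?;
    // Assumption: initialise the level with the first observation.
    let mut level = *first;
    for &y in rest {
        // New level is a weighted average of the observation and the old level.
        level = alpha * y + (1.0 - alpha) * level;
    }
    Some(level)
}

fn main() {
    let data = [1.0, 2.0, 3.0];
    let level = simple_exponential_smoothing(&data, 0.5).expect("data is non-empty");
    println!("Flat forecast for all horizons: {}", level);
}
```

A full ETS model would also estimate alpha from the data and may add trend and seasonal components, which this sketch leaves out.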
Clustering
Time series clustering helps identify groups of similar time series within a dataset. Key features include:
- DBSCAN clustering with DTW (Dynamic Time Warping) distance
- Flexible distance metrics
- Parallel processing support for large datasets
Outlier Detection
Outlier detection is the task of identifying one or more time series that deviate significantly from the norm. augurs includes:
- MAD (Median Absolute Deviation) detection
- DBSCAN-based outlier detection
- Customizable sensitivity parameters
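The statistic behind MAD-based detection can be sketched in a few lines of std-only Rust. This is only the textbook median absolute deviation over a single slice of values; the augurs detector works across multiple series and its scoring and thresholds are not reproduced here. The helper names are hypothetical.

```rust
/// Median of a slice (sorts a copy). Returns `None` for empty input.
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Median absolute deviation: the median of |x - median(x)|.
fn median_absolute_deviation(values: &[f64]) -> Option<f64> {
    let med = median(values)?;
    let deviations: Vec<f64> = values.iter().map(|v| (v - med).abs()).collect();
    median(&deviations)
}

fn main() {
    let values = [1.0, 2.0, 3.0, 4.0, 100.0];
    let med = median(&values).expect("non-empty");
    let mad = median_absolute_deviation(&values).expect("non-empty");
    // Score each point by how many MADs it lies from the median.
    for v in values.iter() {
        println!("value {} -> score {}", v, (v - med).abs() / mad);
    }
}
```

Because both the centre and the spread are medians, a single extreme value such as 100.0 barely moves them, which is why it stands out so clearly in the scores.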
Changepoint Detection
augurs re-exports the changepoint crate for detecting changes in time series data:
- Normal distribution-based changepoint detection
- Autoregressive Gaussian process changepoint detection
Seasonality Analysis
Understanding seasonal patterns is essential for time series analysis:
- Automatic period detection
- Multiple seasonality handling
- Seasonal decomposition
Data Transformations
The library supports various data transformations:
- Linear interpolation for missing values
- Min-max scaling
- Logarithmic transformation
- Custom transformations through the Transform trait
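The first two transformations listed above are simple enough to sketch without the library. The functions below are hypothetical std-only helpers, not the augurs Transform implementations; in particular, leaving leading and trailing NaN values untouched and scaling to the range [0, 1] are assumptions of this sketch.

```rust
/// Fill interior NaN gaps by linear interpolation between the nearest valid
/// neighbours. Leading and trailing NaNs are left as-is in this sketch.
fn interpolate_linear(data: &[f64]) -> Vec<f64> {
    let mut out = data.to_vec();
    let mut last_valid: Option<usize> = None;
    for i in 0..out.len() {
        if out[i].is_nan() {
            continue;
        }
        if let Some(prev) = last_valid {
            let gap = i - prev;
            if gap > 1 {
                // Evenly spaced steps between the two valid endpoints.
                let step = (out[i] - out[prev]) / gap as f64;
                for k in 1..gap {
                    out[prev + k] = out[prev] + step * k as f64;
                }
            }
        }
        last_valid = Some(i);
    }
    out
}

/// Scale values to [0, 1] using the observed minimum and maximum.
fn min_max_scale(data: &[f64]) -> Vec<f64> {
    let min = data.iter().copied().fold(f64::INFINITY, f64::min);
    let max = data.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    data.iter()
        .map(|v| if range == 0.0 { 0.0 } else { (v - min) / range })
        .collect()
}

fn main() {
    let raw = [1.0, f64::NAN, 3.0, 5.0];
    let filled = interpolate_linear(&raw);
    let scaled = min_max_scale(&filled);
    // The logarithmic transformation is just `ln`, inverted with `exp`.
    let logged: Vec<f64> = filled.iter().map(|v| v.ln()).collect();
    println!("filled: {:?}", filled);
    println!("scaled: {:?}", scaled);
    println!("logged: {:?}", logged);
}
```

Note that the order of transformations matters: the log of a value scaled to exactly 0.0 is negative infinity, so this sketch applies the log to the interpolated data rather than to the scaled data.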
Forecasting with Prophet
This tutorial will guide you through using Facebook's Prophet forecasting model with the WebAssembly-based Stan backend in augurs. Prophet is particularly well-suited for time series that have strong seasonal effects and several seasons of historical data.
Prerequisites
First, add the necessary features to your Cargo.toml:
[dependencies]
augurs = { version = "0.6.0", features = ["prophet", "prophet-wasmstan"] }
Basic Prophet Forecasting
Let's start with a simple example:
extern crate augurs;

use augurs::prophet::{Prophet, TrainingData, wasmstan::WasmstanOptimizer};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create timestamps (as Unix timestamps)
    let timestamps = vec![
        1704067200, // 2024-01-01
        1704153600, // 2024-01-02
        1704240000, // 2024-01-03
        1704326400, // 2024-01-04
        1704412800, // 2024-01-05
        // ... more dates
    ];

    // Your observations
    let values = vec![1.1, 2.1, 3.2, 4.3, 5.5];

    // Create training data
    let data = TrainingData::new(timestamps, values)?;

    // Initialize Prophet with WASMSTAN optimizer
    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(Default::default(), optimizer);

    // Fit the model
    prophet.fit(data, Default::default())?;

    // Make in-sample predictions
    let predictions = prophet.predict(None)?;

    println!("Predictions: {:?}", predictions.yhat.point);
    println!("Lower bounds: {:?}", predictions.yhat.lower.unwrap());
    println!("Upper bounds: {:?}", predictions.yhat.upper.unwrap());

    Ok(())
}
Adding Regressors
Prophet allows you to include additional regressors to improve your forecasts:
extern crate augurs;

use std::collections::HashMap;

use augurs::prophet::{Prophet, TrainingData, Regressor, wasmstan::WasmstanOptimizer};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create timestamps and values as before
    let timestamps = vec![
        1704067200, // 2024-01-01
        1704153600, // 2024-01-02
        1704240000, // 2024-01-03
        1704326400, // 2024-01-04
        1704412800, // 2024-01-05
    ];
    let values = vec![1.1, 2.1, 3.2, 4.3, 5.5];

    // Create regressors
    let regressors = HashMap::from([
        (
            "temperature".to_string(),
            vec![20.0, 22.0, 21.0, 21.5, 22.5], // temperature values
        ),
    ]);

    // Create training data with regressors
    let data = TrainingData::new(timestamps, values)?
        .with_regressors(regressors)?;

    // Initialize Prophet
    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(Default::default(), optimizer);

    // Add regressors with their modes
    prophet.add_regressor("temperature".to_string(), Regressor::additive());

    // Fit and predict as before
    prophet.fit(data, Default::default())?;
    let predictions = prophet.predict(None)?;

    Ok(())
}
Customizing the Model
Prophet offers several customization options:
extern crate augurs;

use augurs::prophet::{
    Prophet, TrainingData, ProphetOptions, FeatureMode, GrowthType, SeasonalityOption,
    wasmstan::WasmstanOptimizer,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure Prophet with custom settings
    let options = ProphetOptions {
        // Set growth model
        growth: GrowthType::Linear,
        // Configure seasonality
        seasonality_mode: FeatureMode::Multiplicative,
        yearly_seasonality: SeasonalityOption::Manual(true),
        weekly_seasonality: SeasonalityOption::Manual(true),
        daily_seasonality: SeasonalityOption::Manual(false),
        ..Default::default()
    };

    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(options, optimizer);

    // Proceed with fitting and prediction...

    Ok(())
}
Working with Future Dates
To forecast into the future, you'll need to create a PredictionData object with the timestamps you want to predict. It must also contain the same regressors as the training data:
extern crate augurs;

use augurs::prophet::{Prophet, PredictionData, wasmstan::WasmstanOptimizer};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Setup and fit model as before...
    let optimizer = WasmstanOptimizer::new();
    let mut prophet = Prophet::new(Default::default(), optimizer);

    let prediction_data = PredictionData::new(vec![
        1704499200, // 2024-01-06
        1704585600, // 2024-01-07
    ]);
    let predictions = prophet.predict(Some(prediction_data))?;

    // Access the forecasted values, and their bounds.
    println!("Predictions: {:?}", predictions.yhat.point);
    println!("Lower bounds: {:?}", predictions.yhat.lower.as_ref().unwrap());
    println!("Upper bounds: {:?}", predictions.yhat.upper.as_ref().unwrap());

    Ok(())
}
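The future timestamps in the example are simply the last training timestamp plus whole days in seconds. A small std-only helper along these lines can generate them; the function name is hypothetical and the use of i64 for Unix seconds is an assumption of this sketch, so check the timestamp type expected by PredictionData before relying on it.

```rust
/// Number of seconds in one day.
const SECONDS_PER_DAY: i64 = 86_400;

/// Generate `count` daily Unix timestamps (in seconds), starting one day
/// after `last`. Hypothetical helper, not part of augurs.
fn future_daily_timestamps(last: i64, count: usize) -> Vec<i64> {
    (1..=count as i64)
        .map(|i| last + i * SECONDS_PER_DAY)
        .collect()
}

fn main() {
    let last_training = 1_704_412_800; // 2024-01-05
    let future = future_daily_timestamps(last_training, 2);
    // These match the 2024-01-06 and 2024-01-07 timestamps used above.
    println!("{:?}", future);
}
```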
Best Practices
- Data Preparation
  - Ensure your timestamps are Unix timestamps
  - Handle missing values before passing to Prophet
  - Consider scaling your target variable if values are very large
- Model Configuration
  - Start with default settings and adjust based on your needs
  - Use additive seasonality for constant seasonal variations
  - Use multiplicative seasonality when variations scale with the trend
- Performance Considerations
  - WASMSTAN runs inside a WASM runtime and may be slower than native code
  - For server-side applications, consider using the prophet-cmdstan feature instead
  - Large datasets may require more computation time
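The difference between additive and multiplicative seasonality mentioned under Model Configuration can be shown with plain arithmetic. The sketch below is a simplified illustration with hypothetical function names; it ignores regressors, holidays, and noise, and does not call augurs.

```rust
/// Additive mode: the seasonal effect is a fixed offset in the units of y.
fn additive(trend: f64, seasonal: f64) -> f64 {
    trend + seasonal
}

/// Multiplicative mode: the seasonal effect is a fraction of the trend.
fn multiplicative(trend: f64, seasonal_fraction: f64) -> f64 {
    trend * (1.0 + seasonal_fraction)
}

fn main() {
    let trend = [100.0, 200.0, 400.0];
    for t in trend.iter() {
        // The additive swing stays at +10 units as the trend grows;
        // the multiplicative swing is +10% of the trend, so it grows with it.
        println!(
            "trend {}: additive {}, multiplicative {}",
            t,
            additive(*t, 10.0),
            multiplicative(*t, 0.1)
        );
    }
}
```

If the seasonal peaks in your data get taller as the overall level rises, that is a sign the multiplicative mode may be the better fit.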
Troubleshooting
Common issues and their solutions:
- Invalid timestamps: Ensure timestamps are Unix timestamps in seconds
- Missing values: Prophet can handle some missing values, but it's better to preprocess them
- Convergence issues: Try adjusting the number of iterations or sampling parameters
Next Steps
- Learn about changepoint detection
- Explore seasonal decomposition
- Understand cross-validation
Automated Outlier Detection
This tutorial demonstrates how to use augurs to automatically detect outliers in time series data. We'll explore both the MAD (Median Absolute Deviation) and DBSCAN approaches to outlier detection.
MAD-based Outlier Detection
The MAD detector is ideal for identifying time series that deviate significantly from the typical behavior pattern:
extern crate augurs;

use augurs::outlier::{MADDetector, OutlierDetector};

fn main() {
    // Example time series data
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5], // This series contains outliers
    ];

    // Create detector with 50% sensitivity
    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Process and detect outliers
    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    println!("Outlying series indices: {:?}", outliers.outlying_series);
    println!("Series scores: {:?}", outliers.series_results);
}
DBSCAN-based Outlier Detection
DBSCAN is particularly effective when your time series have seasonal patterns:
extern crate augurs;

use augurs::outlier::{DbscanDetector, OutlierDetector};

fn main() {
    // Example time series data
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5], // This series behaves differently
    ];

    // Create and configure detector
    let mut detector = DbscanDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Enable parallel processing (requires 'parallel' feature)
    detector = detector.parallelize(true);

    // Process and detect outliers
    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    println!("Outlying series indices: {:?}", outliers.outlying_series);
    println!("Series scores: {:?}", outliers.series_results);
}
Handling Results
The outlier detection results provide several useful pieces of information:
extern crate augurs;

use augurs::outlier::{MADDetector, OutlierDetector};

fn main() {
    let series: &[&[f64]] = &[
        &[1.0, 2.0, 1.5, 2.3],
        &[1.9, 2.2, 1.2, 2.4],
        &[1.5, 2.1, 6.4, 8.5],
    ];

    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");
    let processed = detector.preprocess(series).expect("input data is valid");
    let outliers = detector.detect(&processed).expect("detection succeeds");

    // Get indices of outlying series
    for &idx in &outliers.outlying_series {
        println!("Series {} is an outlier", idx);
    }

    // Examine detailed results for each series
    for (idx, result) in outliers.series_results.iter().enumerate() {
        println!("Series {}: outlier = {}", idx, result.is_outlier);
        println!("Scores: {:?}", result.scores);
    }
}
Best Practices
- Choosing a Detector
  - Use MAD when you expect series to move within a stable band
  - Use DBSCAN when series may have seasonality or complex patterns
- Sensitivity Tuning
  - Start with 0.5 sensitivity and adjust based on results
  - Lower values (closer to 0.0) are more sensitive
  - Higher values (closer to 1.0) are more selective
- Performance Optimization
  - Enable parallelization for large datasets
  - Consider preprocessing data to remove noise
  - Handle missing values before detection
Example: Real-time Monitoring
Here's an example of using outlier detection in a monitoring context:
extern crate augurs;

use augurs::outlier::{MADDetector, OutlierDetector};

fn monitor_time_series(historical_data: &[&[f64]], new_data: &[f64]) -> bool {
    // Create detector from historical data
    let detector = MADDetector::with_sensitivity(0.5)
        .expect("sensitivity is between 0.0 and 1.0");

    // Combine historical and new data
    let mut all_series: Vec<&[f64]> = historical_data.to_vec();
    all_series.push(new_data);

    // Check for outliers
    let processed = detector.preprocess(&all_series)
        .expect("input data is valid");
    let outliers = detector.detect(&processed)
        .expect("detection succeeds");

    // Check if new series (last index) is an outlier
    outliers.outlying_series.contains(&(all_series.len() - 1))
}
Time Series Clustering
Time series clustering is a technique used to group similar time series together. This can be useful for finding patterns in data, detecting anomalies, or reducing the dimensionality of large datasets.
augurs provides several clustering algorithms, including DBSCAN (Density-Based Spatial Clustering of Applications with Noise). DBSCAN is particularly well-suited for time series data as it:
- Doesn't require specifying the number of clusters upfront
- Can find arbitrarily shaped clusters
- Can identify noise points that don't belong to any cluster
- Works well with Dynamic Time Warping (DTW) distance measures
Basic Example
Let's start with a simple example using DBSCAN clustering:
extern crate augurs;

use augurs::{clustering::DbscanClusterer, dtw::Dtw};

// Sample time series data
const SERIES: &[&[f64]] = &[
    &[0.0, 1.0, 2.0, 3.0, 4.0],
    &[0.1, 1.1, 2.1, 3.1, 4.1],
    &[5.0, 6.0, 7.0, 8.0, 9.0],
    &[5.1, 6.1, 7.1, 8.1, 9.1],
    &[10.0, 11.0, 12.0, 13.0, 14.0],
];

fn main() {
    // Compute distance matrix using DTW
    let distance_matrix = Dtw::euclidean()
        .with_window(2)
        .with_lower_bound(4.0)
        .with_upper_bound(10.0)
        .with_max_distance(10.0)
        .distance_matrix(SERIES);

    // Set DBSCAN parameters
    let epsilon = 0.5;
    let min_cluster_size = 2;

    // Perform clustering
    let clusters: Vec<isize> = DbscanClusterer::new(epsilon, min_cluster_size)
        .fit(&distance_matrix);

    // Clusters are labeled: -1 for noise, 0+ for cluster membership
    assert_eq!(clusters, vec![0, 0, 1, 1, -1]);
}
Understanding Parameters
DTW Parameters
- window: Size of the Sakoe-Chiba band for constraining DTW computation
- lower_bound: Minimum distance to consider
- upper_bound: Maximum distance to consider
- max_distance: Early termination threshold
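To make the window parameter concrete, here is a minimal std-only sketch of DTW with an optional Sakoe-Chiba band. It is not the augurs implementation: the function name is hypothetical, the bounds and early termination options are omitted, and the convention of summing squared differences and taking a square root at the end is an assumption about what "Euclidean" DTW means here.

```rust
/// DTW distance with squared pointwise cost and an optional Sakoe-Chiba
/// window (illustrative sketch). `window` limits how far apart the aligned
/// indices i and j may be; `None` means unconstrained.
fn dtw_distance(a: &[f64], b: &[f64], window: Option<usize>) -> f64 {
    let (n, m) = (a.len(), b.len());
    // The band must be at least |n - m| wide, or the end cell is unreachable.
    let w = window.unwrap_or(n.max(m)).max(n.abs_diff(m));
    // cost[i][j] = cheapest cumulative cost aligning a[..i] with b[..j].
    let mut cost = vec![vec![f64::INFINITY; m + 1]; n + 1];
    cost[0][0] = 0.0;
    for i in 1..=n {
        let lo = i.saturating_sub(w).max(1);
        let hi = (i + w).min(m);
        for j in lo..=hi {
            let d = (a[i - 1] - b[j - 1]).powi(2);
            let best = cost[i - 1][j]
                .min(cost[i][j - 1])
                .min(cost[i - 1][j - 1]);
            cost[i][j] = d + best;
        }
    }
    cost[n][m].sqrt()
}

fn main() {
    let a = [0.0, 0.0, 1.0, 2.0];
    let b = [0.0, 1.0, 2.0, 2.0]; // same shape as `a`, shifted by one step
    println!("unconstrained: {}", dtw_distance(&a, &b, None));
    println!("window = 1:    {}", dtw_distance(&a, &b, Some(1)));
    println!("window = 0:    {}", dtw_distance(&a, &b, Some(0)));
}
```

A window of 0 forces a point-by-point comparison, so the shifted series no longer align perfectly; widening the band lets DTW absorb the shift at the cost of filling in more cells of the table.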
DBSCAN Parameters
- epsilon: Maximum distance between two points for one to be considered in the neighborhood of the other
- min_cluster_size: Minimum number of points required to form a dense region
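The way epsilon and min_cluster_size interact is easiest to see in a small implementation. The following std-only sketch runs a textbook DBSCAN over a precomputed distance matrix; it is not the DbscanClusterer source, the function name is hypothetical, and counting a point as part of its own neighbourhood is an assumption of this sketch.

```rust
/// Minimal DBSCAN over a precomputed, symmetric distance matrix.
/// Returns one label per point: -1 for noise, 0 and up for cluster ids.
fn dbscan(distances: &[Vec<f64>], epsilon: f64, min_cluster_size: usize) -> Vec<isize> {
    const UNVISITED: isize = -2;
    const NOISE: isize = -1;
    let n = distances.len();
    // All points within `epsilon` of point i (including i itself).
    let neighbours = |i: usize| -> Vec<usize> {
        (0..n).filter(|&j| distances[i][j] <= epsilon).collect()
    };
    let mut labels = vec![UNVISITED; n];
    let mut cluster = 0;
    for i in 0..n {
        if labels[i] != UNVISITED {
            continue;
        }
        let seeds = neighbours(i);
        if seeds.len() < min_cluster_size {
            // Not dense enough to start a cluster; may become a border point later.
            labels[i] = NOISE;
            continue;
        }
        labels[i] = cluster;
        let mut queue = seeds;
        while let Some(j) = queue.pop() {
            if labels[j] == NOISE {
                labels[j] = cluster; // border point reached from a core point
            }
            if labels[j] != UNVISITED {
                continue;
            }
            labels[j] = cluster;
            let next = neighbours(j);
            if next.len() >= min_cluster_size {
                queue.extend(next); // j is a core point, keep expanding
            }
        }
        cluster += 1;
    }
    labels
}

fn main() {
    // Five points on a line; distances are absolute differences.
    let xs = [0.0_f64, 0.2, 5.0, 5.2, 10.0];
    let distances: Vec<Vec<f64>> = xs
        .iter()
        .map(|a| xs.iter().map(|b| (a - b).abs()).collect())
        .collect();
    let labels = dbscan(&distances, 0.5, 2);
    println!("{:?}", labels);
}
```

In this toy input the last point has no neighbour within epsilon, so it is labelled as noise, while the two close pairs each form a cluster.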
Best Practices
- Distance Measure Selection
  - Use DTW for time series that might be shifted or warped
  - Consider the computational cost of DTW for large datasets
  - Experiment with different window sizes to balance accuracy and performance
- Parameter Tuning
  - Start with a relatively large epsilon and reduce it if clusters are too large
  - Set min_cluster_size based on your domain knowledge
  - Use the DTW window parameter to prevent pathological alignments
- Performance Optimization
  - Enable parallel processing for large datasets
  - Use DTW bounds to speed up distance calculations
  - Consider downsampling very long time series
Example: Clustering with Multiple Distance Measures
extern crate augurs;

use augurs::{clustering::DbscanClusterer, dtw::Dtw};

fn compare_distance_measures(series: &[&[f64]]) {
    // Euclidean DTW
    let euclidean_matrix = Dtw::euclidean()
        .distance_matrix(series);
    let euclidean_clusters = DbscanClusterer::new(0.5, 2)
        .fit(&euclidean_matrix);

    // Manhattan DTW
    let manhattan_matrix = Dtw::manhattan()
        .distance_matrix(series);
    let manhattan_clusters = DbscanClusterer::new(0.5, 2)
        .fit(&manhattan_matrix);

    // Compare results
    println!("Euclidean clusters: {:?}", euclidean_clusters);
    println!("Manhattan clusters: {:?}", manhattan_clusters);
}
Next Steps
- Learn about outlier detection using clustering
- Explore seasonality analysis for clustered time series
- Understand feature extraction for time series
API Documentation
API docs are available on docs.rs.
Contributing
Thank you for your interest in contributing to augurs! We welcome contributions from everyone.
The augurs repository can be found on GitHub.
Ways to Contribute
There are many ways to contribute:
- Report bugs and request features in the issue tracker
- Submit pull requests to fix issues or add features
- Improve documentation
- Share your experiences using augurs
Development Setup
- Fork and clone the repository
- Install Rust via rustup if you haven't already
- Install just for running tasks
- Build the WASM component: just build-component
- Start building and checking the project using bacon: just build
- Run tests: just test
Pull Request Process
- Fork the repository and create a new branch from main
- Make your changes and add tests if applicable
- Update documentation as needed
- Run tests and clippy to ensure everything passes
- Submit a pull request describing your changes
Code Style
- Follow Rust standard style guidelines
- Run cargo fmt before committing
- Run cargo clippy and address any warnings
- Add documentation comments for public APIs
Testing
- Add tests for new functionality
- Ensure existing tests pass
- Include both unit tests and integration tests where appropriate
License
By contributing, you agree that your contributions will be licensed under the same terms as the project (MIT OR Apache-2.0).