Distance Metrics
Aeon provides specialized distance functions for measuring similarity between time series, compatible with both aeon and scikit-learn estimators.
Distance Categories
Elastic Distances
Allow flexible temporal alignment between series:
Dynamic Time Warping Family:
- dtw - Classic Dynamic Time Warping
- ddtw - Derivative DTW (compares derivatives)
- wdtw - Weighted DTW (penalizes warping by location)
- wddtw - Weighted Derivative DTW
- shape_dtw - Shape-based DTW
Edit-Based:
- erp - Edit distance with Real Penalty
- edr - Edit Distance on Real sequences
- lcss - Longest Common SubSequence
- twe - Time Warp Edit distance
Specialized:
- msm - Move-Split-Merge distance
- adtw - Amerced DTW
- sbd - Shape-Based Distance
Use when: Time series may have temporal shifts, speed variations, or phase differences.
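To see why elastic alignment helps with temporal shifts, here is a minimal pure-Python sketch of the classic DTW recurrence. It is not aeon's implementation: `dtw_sketch` and `squared_euclidean` are hypothetical helper names, and the squared difference used as the local cost is an assumption made for illustration.

```python
import math

def dtw_sketch(x, y):
    """Illustrative DTW between two univariate sequences (not aeon's code)."""
    n, m = len(x), len(y)
    # cost[i][j] = cheapest cumulative cost of aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (x[i - 1] - y[j - 1]) ** 2  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # x advances, y repeats a point
                cost[i][j - 1],      # y advances, x repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

def squared_euclidean(x, y):
    """Lock-step comparison: point i is only ever compared with point i."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

x = [0, 0, 1, 2, 1, 0, 0]
y = [0, 1, 2, 1, 0, 0, 0]  # same bump, one step earlier

print(dtw_sketch(x, y))         # 0.0: warping absorbs the shift
print(squared_euclidean(x, y))  # 4: the shift is counted as error
```

The same recurrence also handles series of different lengths, since `n` and `m` are independent.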
Lock-Step Distances
Compare time series point-by-point without alignment:
- euclidean - Euclidean distance (L2 norm)
- manhattan - Manhattan distance (L1 norm)
- minkowski - Generalized Minkowski distance (Lp norm)
- squared - Squared Euclidean distance
Use when: Series are already aligned, computational speed matters, or no temporal warping is expected.
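The four lock-step distances are all variations on one formula. The following pure-Python sketch shows how they relate; the function names mirror the list above but are defined here for illustration only and are not aeon's functions.

```python
def _check_equal_length(x, y):
    if len(x) != len(y):
        raise ValueError("lock-step distances need equal-length series")

def minkowski(x, y, p=2.0):
    """Lp norm of the point-by-point differences."""
    _check_equal_length(x, y)
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def manhattan(x, y):
    return minkowski(x, y, p=1.0)  # L1 norm

def euclidean(x, y):
    return minkowski(x, y, p=2.0)  # L2 norm

def squared(x, y):
    """Squared Euclidean: skips the square root, preserves the ranking."""
    _check_equal_length(x, y)
    return sum((a - b) ** 2 for a, b in zip(x, y))

x = [0.0, 3.0]
y = [4.0, 0.0]
print(manhattan(x, y), euclidean(x, y), squared(x, y))
```

Each function makes a single pass over the series, which is why these distances are the cheapest option when no alignment is needed.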
Usage Patterns
Computing Single Distance
from aeon.distances import dtw_distance
# Distance between two time series
distance = dtw_distance(x, y)
# With window constraint (Sakoe-Chiba band)
distance = dtw_distance(x, y, window=0.1)
Pairwise Distance Matrix
from aeon.distances import dtw_pairwise_distance
# All pairwise distances in collection
X = [series1, series2, series3, series4]
distance_matrix = dtw_pairwise_distance(X)
# Cross-collection distances
distance_matrix = dtw_pairwise_distance(X_train, X_test)
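The layout of the two kinds of matrix can be sketched in plain Python. This is an illustration rather than aeon's code: `pairwise_sketch` and `squared_euclidean` are hypothetical names, and the self-comparison branch assumes a symmetric distance with zero self-distance.

```python
def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def pairwise_sketch(X, Y=None, distance=squared_euclidean):
    """Return matrix[i][j] = distance(X[i], Y[j]); Y defaults to X."""
    if Y is None:
        # Self-comparison: assumed symmetric, so each pair is computed once
        # and mirrored; the diagonal stays at zero.
        n = len(X)
        matrix = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                matrix[i][j] = matrix[j][i] = float(distance(X[i], X[j]))
        return matrix
    # Cross-collection: one row per series in X, one column per series in Y.
    return [[float(distance(a, b)) for b in Y] for a in X]

X_train = [[0, 0, 1], [0, 1, 1], [2, 2, 2]]
X_test = [[0, 0, 0], [2, 2, 1]]

within = pairwise_sketch(X_train)         # 3 x 3, symmetric
cross = pairwise_sketch(X_train, X_test)  # 3 x 2
print(within)
print(cross)
```

The first argument determines the rows and the second the columns, so swapping the two collections transposes the result.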
Cost Matrix and Alignment Path
from aeon.distances import dtw_cost_matrix, dtw_alignment_path
# Get full cost matrix
cost_matrix = dtw_cost_matrix(x, y)
# Get optimal alignment path together with the DTW distance
path, distance = dtw_alignment_path(x, y)
# path is a list of index pairs: [(0, 0), (1, 1), (2, 1), (2, 2), ...]
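How an alignment path falls out of a cost matrix can be sketched by backtracking from the last cell to the first. The code below is a pure-Python illustration under the same assumptions as a textbook DTW (squared-difference local cost, three allowed moves); `dtw_path_sketch` is a hypothetical name and the tie-breaking rule is a choice made for this example.

```python
import math

def dtw_path_sketch(x, y):
    """Return (path, distance) for two univariate sequences."""
    n, m = len(x), len(y)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )

    # Walk back from the final cell, always moving to the cheapest
    # predecessor. Border cells hold infinity, so the walk stays in bounds.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return path, cost[n][m]

path, distance = dtw_path_sketch([1, 2, 3], [1, 2, 2, 3])
print(path)      # [(0, 0), (1, 1), (1, 2), (2, 3)]
print(distance)  # 0.0
```

Here index 1 of the first series is matched to indices 1 and 2 of the second, which is how the repeated value is absorbed at no cost.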
Using with Estimators
from aeon.classification.distance_based import KNeighborsTimeSeriesClassifier
# Use DTW distance in classifier
clf = KNeighborsTimeSeriesClassifier(
n_neighbors=5,
distance="dtw",
distance_params={"window": 0.2}
)
clf.fit(X_train, y_train)
Distance Parameters
Window Constraints
Limit warping path deviation (improves speed and prevents pathological warping):
# Sakoe-Chiba band: window as fraction of series length
dtw_distance(x, y, window=0.1) # Allow 10% deviation
# Itakura parallelogram: maximum slope, given as a proportion between 0 and 1
dtw_distance(x, y, itakura_max_slope=0.5)
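To make the effect of a Sakoe-Chiba band concrete, the sketch below restricts a textbook DTW to cells within a fixed radius of the diagonal and counts how many cells it fills. It is an illustration only: `windowed_dtw_sketch` is a hypothetical name, equal-length input is assumed, and the convention that the radius is `window * n` rounded down is an assumption that may differ from aeon's exact bounding rule.

```python
import math

def windowed_dtw_sketch(x, y, window=None):
    """Return (distance, cells_filled) for equal-length univariate series."""
    n = len(x)
    if len(y) != n:
        raise ValueError("this sketch assumes equal-length series")
    # Assumed convention: band radius = window * n, rounded down.
    radius = n if window is None else int(window * n)
    cost = [[math.inf] * (n + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    cells = 0
    for i in range(1, n + 1):
        first = max(1, i - radius)
        last = min(n, i + radius)
        for j in range(first, last + 1):  # only cells with |i - j| <= radius
            local = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
            cells += 1
    return cost[n][n], cells

x = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0]
y = [0, 1, 2, 1, 0, 0, 0, 0, 0, 0]  # same bump, three steps earlier

print(windowed_dtw_sketch(x, y))              # full matrix
print(windowed_dtw_sketch(x, y, window=0.5))  # radius 5: shift still fits
print(windowed_dtw_sketch(x, y, window=0.1))  # radius 1: shift no longer fits
```

A band wider than the true shift leaves the distance unchanged while filling fewer cells; a band narrower than the shift forces a worse alignment, so the window acts as a prior on how much warping is plausible.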
Normalization
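Z-normalize series before computing a distance when differences in offset and scale should not count. This assumes x and y are NumPy arrays and that the distance function itself applies no normalization, so the inputs are normalized first:
# z-normalize each series along the time axis, then compare
x_norm = (x - x.mean(axis=-1, keepdims=True)) / x.std(axis=-1, keepdims=True)
y_norm = (y - y.mean(axis=-1, keepdims=True)) / y.std(axis=-1, keepdims=True)
distance = dtw_distance(x_norm, y_norm)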
Distance-Specific Parameters
from aeon.distances import erp_distance, lcss_distance, twe_distance
# ERP: penalty for gaps
distance = erp_distance(x, y, g=0.5)
# TWE: stiffness and penalty parameters
distance = twe_distance(x, y, nu=0.001, lmbda=1.0)
# LCSS: epsilon threshold for matching
distance = lcss_distance(x, y, epsilon=0.5)
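As an example of what such a parameter controls, the sketch below shows the role of epsilon in an LCSS-style distance: two points count as a match when they differ by at most epsilon, and the distance is one minus the matched fraction. This is a pure-Python illustration, not aeon's implementation; `lcss_distance_sketch` is a hypothetical name and normalizing by the shorter length is an assumption.

```python
def lcss_distance_sketch(x, y, epsilon=1.0):
    """1 - (longest common subsequence length / shorter series length)."""
    n, m = len(x), len(y)
    # length[i][j] = longest common subsequence of x[:i] and y[:j]
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(x[i - 1] - y[j - 1]) <= epsilon:
                # Points are close enough to count as a match.
                length[i][j] = length[i - 1][j - 1] + 1
            else:
                # Skip a point in one of the series.
                length[i][j] = max(length[i - 1][j], length[i][j - 1])
    return 1.0 - length[n][m] / min(n, m)

x = [1.0, 2.0, 3.0, 4.0]
y = [1.2, 2.1, 9.0, 4.3]  # third point is an outlier

print(lcss_distance_sketch(x, y, epsilon=0.5))   # 0.25: 3 of 4 points match
print(lcss_distance_sketch(x, y, epsilon=0.05))  # 1.0: nothing matches
```

The outlier costs one unmatched point regardless of its magnitude, which is the source of the robustness attributed to LCSS. Too small an epsilon rejects every match, so the threshold needs to reflect the noise level of the data.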
Algorithm Selection
By Use Case:
- Temporal misalignment: DTW, DDTW, WDTW
- Speed variations: DTW with window constraint
- Shape similarity: Shape DTW, SBD
- Edit operations: ERP, EDR, LCSS
- Derivative matching: DDTW
- Computational speed: Euclidean, Manhattan
- Outlier robustness: Manhattan, LCSS
By Computational Cost:
- Fastest: Euclidean (O(n))
- Fast: Constrained DTW (O(nw), where w is the window size)
- Medium: Full DTW (O(n²))
- Slower: Complex elastic distances (ERP, TWE, MSM)
Quick Reference Table
| Distance | Alignment | Speed | Robustness | Interpretability |
|---|---|---|---|---|
| Euclidean | Lock-step | Very Fast | Low | High |
| DTW | Elastic | Medium | Medium | Medium |
| DDTW | Elastic | Medium | High | Medium |
| WDTW | Elastic | Medium | Medium | Medium |
| ERP | Edit-based | Slow | High | Low |
| LCSS | Edit-based | Slow | Very High | Low |
| Shape DTW | Elastic | Medium | Medium | High |
Best Practices
1. Normalization
Most distances are sensitive to scale; normalize when differences in amplitude or offset are not meaningful:
from aeon.transformations.collection import Normalizer
normalizer = Normalizer()
X_normalized = normalizer.fit_transform(X)
2. Window Constraints
For DTW variants, use window constraints for speed and better generalization:
# Start with 10-20% window
distance = dtw_distance(x, y, window=0.1)
3. Series Length
- Equal-length required: Most lock-step distances
- Unequal-length supported: Elastic distances (DTW, ERP, etc.)
4. Multivariate Series
Most distances support multivariate time series:
# x.shape = (n_channels, n_timepoints)
distance = dtw_distance(x_multivariate, y_multivariate)
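One common way to extend DTW to several channels is to warp all channels together and sum the squared differences across channels at each pair of time points. The sketch below illustrates that idea with nested lists laid out as (n_channels, n_timepoints). Whether aeon uses exactly this local cost is an assumption, and `multivariate_dtw_sketch` is a hypothetical name.

```python
import math

def multivariate_dtw_sketch(x, y):
    """DTW over series given as lists of channels (n_channels, n_timepoints)."""
    if len(x) != len(y):
        raise ValueError("both series need the same number of channels")
    n, m = len(x[0]), len(y[0])  # lengths may differ between x and y
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: squared differences summed over channels.
            local = sum((cx[i - 1] - cy[j - 1]) ** 2 for cx, cy in zip(x, y))
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]

# Two channels; y repeats the first time point of x, so it is one step longer.
x = [[0, 1, 2, 1, 0],
     [5, 5, 6, 5, 5]]
y = [[0, 0, 1, 2, 1, 0],
     [5, 5, 5, 6, 5, 5]]

print(multivariate_dtw_sketch(x, y))  # 0.0
```

Because one warping path is shared by all channels, this variant assumes the channels are shifted together rather than independently.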
5. Performance Optimization
- Use numba-compiled implementations (default in aeon)
- Consider lock-step distances if alignment is not needed
- Use windowed DTW instead of full DTW
- Precompute distance matrices for repeated use
6. Choosing the Right Distance
# Quick decision tree; alternatives are noted in the comments
def choose_distance(series_aligned, need_speed, temporal_shifts_expected,
                    outliers_present, derivatives_matter):
    if series_aligned:
        return "euclidean"
    if need_speed:
        return "dtw"  # with window constraint
    if temporal_shifts_expected:
        return "dtw"  # or "shape_dtw"
    if outliers_present:
        return "lcss"  # or "manhattan"
    if derivatives_matter:
        return "ddtw"  # or "wddtw"
    return None  # no rule matched
Integration with scikit-learn
Aeon distances work with sklearn estimators:
from sklearn.neighbors import KNeighborsClassifier
from aeon.distances import dtw_pairwise_distance
# Precompute distance matrix
X_train_distances = dtw_pairwise_distance(X_train)
# Use with sklearn
clf = KNeighborsClassifier(metric='precomputed')
clf.fit(X_train_distances, y_train)
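At prediction time, an estimator fitted with metric='precomputed' needs the distances from each query series to each training series, that is, a matrix of shape (n_test, n_train). With aeon this would presumably come from dtw_pairwise_distance(X_test, X_train), with the test collection first. The pure-Python sketch below mimics what a nearest-neighbour vote does with such a matrix; `knn_predict_precomputed` and the sample values are hypothetical and only illustrate the expected layout.

```python
from collections import Counter

def knn_predict_precomputed(test_to_train, y_train, n_neighbors=1):
    """Predict labels from a (n_test, n_train) distance matrix."""
    predictions = []
    for row in test_to_train:
        if len(row) != len(y_train):
            raise ValueError("each row needs one distance per training series")
        # Indices of the training series closest to this test series.
        nearest = sorted(range(len(row)), key=row.__getitem__)[:n_neighbors]
        votes = Counter(y_train[i] for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions

y_train = ["up", "up", "down"]
# Rows: test series. Columns: training series, in the order used for fitting.
test_to_train = [
    [0.2, 0.4, 3.0],
    [2.5, 2.9, 0.1],
]

print(knn_predict_precomputed(test_to_train, y_train))  # ['up', 'down']
```

Passing a matrix with the collections swapped (train rows, test columns) is a frequent mistake; the length check above catches it whenever the two collections differ in size.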
Available Distance Functions
Get list of all available distances:
from aeon.distances import get_distance_function_names
print(get_distance_function_names())
# Prints a list of names such as 'dtw', 'ddtw', 'wdtw', 'euclidean', 'erp', 'edr', ...
Retrieve specific distance function:
from aeon.distances import get_distance_function
distance_func = get_distance_function("dtw")
result = distance_func(x, y, window=0.1)