Transformations
Aeon provides extensive transformation capabilities for preprocessing, feature extraction, and representation learning from time series data.
Transformation Types
Aeon distinguishes between:
- CollectionTransformers: Transform multiple time series (collections)
- SeriesTransformers: Transform individual time series
Collection Transformers
Convolution-Based Feature Extraction
Fast, scalable feature generation using random kernels:
- RocketTransformer: Random convolutional kernels
- MiniRocketTransformer: Simplified ROCKET for speed
- MultiRocketTransformer: Enhanced ROCKET variant
- HydraTransformer: Multi-resolution dilated convolutions
- MultiRocketHydraTransformer: Combines ROCKET and Hydra
- ROCKETGPU: GPU-accelerated variant
Use when: Need fast, scalable features for any ML algorithm, or strong baseline performance.
Statistical Feature Extraction
Domain-agnostic features based on time series characteristics:
- Catch22: 22 canonical time-series characteristics
- TSFresh: Comprehensive automated feature extraction (100+ features)
- TSFreshRelevant: Feature extraction with relevance filtering
- SevenNumberSummary: Descriptive statistics (mean, std, quantiles)
Use when: Need interpretable features, domain-agnostic approach, or feeding traditional ML.
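For a concrete example, here is a minimal sketch that feeds Catch22 features into a scikit-learn classifier (the feature_based import path follows aeon's module layout but may differ across versions; check your installation):
from aeon.transformations.collection.feature_based import Catch22
from aeon.datasets import load_classification
from sklearn.ensemble import RandomForestClassifier
X_train, y_train = load_classification("GunPoint", split="train")
X_test, y_test = load_classification("GunPoint", split="test")
# Each series is reduced to the 22 canonical catch22 features
c22 = Catch22()
X_train_f = c22.fit_transform(X_train)
X_test_f = c22.transform(X_test)
clf = RandomForestClassifier().fit(X_train_f, y_train)
print(clf.score(X_test_f, y_test))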
Dictionary-Based Representations
Symbolic approximations for discrete representations:
- SAX: Symbolic Aggregate approXimation
- PAA: Piecewise Aggregate Approximation
- SFA: Symbolic Fourier Approximation
- SFAFast: Optimized SFA
- SFAWhole: SFA on entire series (no windowing)
- BORF: Bag-of-Receptive-Fields
Use when: Need discrete/symbolic representation, dimensionality reduction, interpretability.
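As an illustration, a minimal SAX sketch; the n_segments and alphabet_size parameter names follow standard SAX terminology and are assumptions here, so verify them against the class docstring:
from aeon.transformations.collection.dictionary_based import SAX
from aeon.datasets import load_classification
X, _ = load_classification("GunPoint", split="train")
# Compress each series to 8 PAA segments, then map each segment
# to one of 4 symbols via Gaussian breakpoints
sax = SAX(n_segments=8, alphabet_size=4)
X_symbolic = sax.fit_transform(X)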
Shapelet-Based Features
Discriminative subsequence extraction:
- RandomShapeletTransform: Random discriminative shapelets
- RandomDilatedShapeletTransform: Dilated shapelets for multi-scale patterns
- SAST: Scalable And Accurate Subsequence Transform
- RSAST: Randomized SAST
Use when: Need interpretable discriminative patterns, phase-invariant features.
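A minimal sketch of a shapelet transform follows; fitting is supervised, so labels are required. The parameter names are assumptions, so check them against the RandomShapeletTransform docstring:
from aeon.transformations.collection.shapelet_based import RandomShapeletTransform
from aeon.datasets import load_classification
X_train, y_train = load_classification("GunPoint", split="train")
# Each output column is the minimum distance from a series to one
# of the retained discriminative shapelets
st = RandomShapeletTransform(n_shapelet_samples=500, max_shapelets=20)
X_shapelet_f = st.fit_transform(X_train, y_train)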
Interval-Based Features
Statistical summaries from time intervals:
- RandomIntervals: Features from random intervals
- SupervisedIntervals: Supervised interval selection
- QUANTTransformer: Quantile-based interval features
Use when: Predictive patterns are localized to specific time windows.
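A minimal interval-feature sketch; n_intervals is an assumed parameter name, so check the docstring:
from aeon.transformations.collection.interval_based import RandomIntervals
from aeon.datasets import load_classification
X_train, y_train = load_classification("GunPoint", split="train")
# Summary statistics (mean, std, slope, ...) over randomly drawn intervals
rint = RandomIntervals(n_intervals=10)
X_interval_f = rint.fit_transform(X_train, y_train)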
Preprocessing Transformations
Data preparation and normalization:
- MinMaxScaler: Scale to [0, 1] range
- Normalizer: Z-normalization (zero mean, unit variance)
- Centerer: Center to zero mean
- SimpleImputer: Fill missing values
- DownsampleTransformer: Reduce temporal resolution
- Tabularizer: Convert time series to tabular format
Use when: Need standardization, missing value handling, format conversion.
Specialized Transformations
Advanced analysis methods:
- MatrixProfile: Computes distance profiles for pattern discovery
- DWTTransformer: Discrete Wavelet Transform
- AutocorrelationFunctionTransformer: ACF computation
- Dobin: Distance-based Outlier BasIs using Neighbors
- SignatureTransformer: Path signature methods
- PLATransformer: Piecewise Linear Approximation
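For instance, a hedged sketch of the Discrete Wavelet Transform on a collection (the n_levels parameter name is an assumption; consult the DWTTransformer docstring):
from aeon.transformations.collection import DWTTransformer
from aeon.datasets import load_classification
X, _ = load_classification("GunPoint", split="train")
# Replace each series with its wavelet approximation and detail coefficients
dwt = DWTTransformer(n_levels=3)
X_wavelet = dwt.fit_transform(X)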
Class Imbalance Handling
- ADASYN: Adaptive Synthetic Sampling
- SMOTE: Synthetic Minority Over-sampling
- OHIT: Over-sampling with Highly Imbalanced Time series
Use when: Classification with imbalanced classes.
Pipeline Composition
- CollectionTransformerPipeline: Chain multiple collection transformers
Series Transformers
Transform individual time series (e.g., for preprocessing in forecasting).
Statistical Analysis
- AutoCorrelationSeriesTransformer: Autocorrelation function
- StatsModelsACF: ACF using statsmodels
- StatsModelsPACF: Partial autocorrelation using statsmodels
Smoothing and Filtering
- ExponentialSmoothing: Exponentially weighted moving average
- MovingAverage: Simple or weighted moving average
- SavitzkyGolayFilter: Polynomial smoothing
- GaussianFilter: Gaussian kernel smoothing
- BKFilter: Baxter-King bandpass filter
- DiscreteFourierApproximation: Fourier-based filtering
Use when: Need noise reduction, trend extraction, or frequency filtering.
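For example, a minimal Savitzky-Golay smoothing sketch; the window_length and polyorder names mirror scipy's savgol_filter and are assumptions for aeon's wrapper:
import numpy as np
from aeon.transformations.series import SavitzkyGolayFilter
# Noisy sine wave standing in for a real measured series
y = np.sin(np.linspace(0, 10, 200)) + np.random.normal(scale=0.3, size=200)
# Fit a degree-3 polynomial over a sliding 15-point window
sg = SavitzkyGolayFilter(window_length=15, polyorder=3)
y_smooth = sg.fit_transform(y)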
Dimensionality Reduction
- PCASeriesTransformer: Principal component analysis
- PLASeriesTransformer: Piecewise Linear Approximation
Other Transformations
- BoxCoxTransformer: Variance stabilization
- LogTransformer: Logarithmic scaling
- ClaSPTransformer: Classification Score Profile
Pipeline Composition
- SeriesTransformerPipeline: Chain series transformers
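A minimal sketch of chaining series transformers, mirroring the CollectionTransformerPipeline usage in the quick start below; the SeriesTransformerPipeline import path and (name, transformer) tuple syntax are assumptions:
import numpy as np
from aeon.transformations.series import (
    SeriesTransformerPipeline,
    BoxCoxTransformer,
    MovingAverage
)
# Box-Cox requires strictly positive values
y = np.abs(np.random.normal(size=200)) + 0.1
# Stabilize variance first, then smooth
pipeline = SeriesTransformerPipeline([
    ('boxcox', BoxCoxTransformer()),
    ('smooth', MovingAverage(window_size=5))
])
y_tf = pipeline.fit_transform(y)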
Quick Start: Feature Extraction
from aeon.transformations.collection.convolution_based import RocketTransformer
from aeon.classification.sklearn import RotationForest
from aeon.datasets import load_classification
# Load data
X_train, y_train = load_classification("GunPoint", split="train")
X_test, y_test = load_classification("GunPoint", split="test")
# Extract ROCKET features
rocket = RocketTransformer()
X_train_features = rocket.fit_transform(X_train)
X_test_features = rocket.transform(X_test)
# Use with any sklearn classifier
clf = RotationForest()
clf.fit(X_train_features, y_train)
accuracy = clf.score(X_test_features, y_test)
Quick Start: Preprocessing Pipeline
from aeon.transformations.collection import (
MinMaxScaler,
SimpleImputer,
CollectionTransformerPipeline
)
# Build preprocessing pipeline
pipeline = CollectionTransformerPipeline([
('imputer', SimpleImputer(strategy='mean')),
('scaler', MinMaxScaler())
])
X_transformed = pipeline.fit_transform(X_train)
Quick Start: Series Smoothing
import numpy as np
from aeon.transformations.series import MovingAverage
# Smooth an individual time series (here a noisy example signal)
y = np.sin(np.linspace(0, 10, 200)) + np.random.normal(scale=0.2, size=200)
smoother = MovingAverage(window_size=5)
y_smoothed = smoother.fit_transform(y)
Algorithm Selection
For Feature Extraction:
- Speed + Performance: MiniRocketTransformer
- Interpretability: Catch22, TSFresh
- Dimensionality reduction: PAA, SAX, PCA
- Discriminative patterns: Shapelet transforms
- Comprehensive features: TSFresh (with longer runtime)
For Preprocessing:
- Normalization: Normalizer, MinMaxScaler
- Smoothing: MovingAverage, SavitzkyGolayFilter
- Missing values: SimpleImputer
- Frequency analysis: DWTTransformer, Fourier methods
For Symbolic Representation:
- Fast approximation: PAA
- Alphabet-based: SAX
- Frequency-based: SFA, SFAFast
Best Practices
- Fit on training data only: Avoid data leakage
  transformer.fit(X_train)
  X_train_tf = transformer.transform(X_train)
  X_test_tf = transformer.transform(X_test)
- Pipeline composition: Chain transformers for complex workflows
  pipeline = CollectionTransformerPipeline([
      ('imputer', SimpleImputer()),
      ('scaler', Normalizer()),
      ('features', RocketTransformer())
  ])
- Feature selection: TSFresh can generate many features; consider selecting a subset
  from sklearn.feature_selection import SelectKBest
  selector = SelectKBest(k=100)
  X_selected = selector.fit_transform(X_features, y)
- Memory considerations: Some transformers are memory-intensive on large datasets
  - Use MiniRocketTransformer instead of RocketTransformer for speed
  - Consider downsampling very long series
  - Use ROCKETGPU for GPU acceleration
- Domain knowledge: Choose transformations that match the domain (see the sketch after this list):
  - Periodic data: Fourier-based methods
  - Noisy data: Smoothing filters
  - Spike detection: Wavelet transforms
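To make the domain-knowledge point concrete, here is a hedged sketch of smoothing a noisy periodic signal with a Gaussian filter (the sigma parameter name is an assumption; check the GaussianFilter docstring):
import numpy as np
from aeon.transformations.series import GaussianFilter
# Periodic signal corrupted by high-frequency noise
t = np.linspace(0, 4 * np.pi, 400)
y = np.sin(t) + np.random.normal(scale=0.4, size=t.shape)
# Gaussian kernel smoothing attenuates the noise while keeping the cycle
gf = GaussianFilter(sigma=2.0)
y_denoised = gf.fit_transform(y)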