# Transformations Aeon provides extensive transformation capabilities for preprocessing, feature extraction, and representation learning from time series data. ## Transformation Types Aeon distinguishes between: - **CollectionTransformers**: Transform multiple time series (collections) - **SeriesTransformers**: Transform individual time series ## Collection Transformers ### Convolution-Based Feature Extraction Fast, scalable feature generation using random kernels: - `RocketTransformer` - Random convolutional kernels - `MiniRocketTransformer` - Simplified ROCKET for speed - `MultiRocketTransformer` - Enhanced ROCKET variant - `HydraTransformer` - Multi-resolution dilated convolutions - `MultiRocketHydraTransformer` - Combines ROCKET and Hydra - `ROCKETGPU` - GPU-accelerated variant **Use when**: Need fast, scalable features for any ML algorithm, strong baseline performance. ### Statistical Feature Extraction Domain-agnostic features based on time series characteristics: - `Catch22` - 22 canonical time-series characteristics - `TSFresh` - Comprehensive automated feature extraction (100+ features) - `TSFreshRelevant` - Feature extraction with relevance filtering - `SevenNumberSummary` - Descriptive statistics (mean, std, quantiles) **Use when**: Need interpretable features, domain-agnostic approach, or feeding traditional ML. ### Dictionary-Based Representations Symbolic approximations for discrete representations: - `SAX` - Symbolic Aggregate approXimation - `PAA` - Piecewise Aggregate Approximation - `SFA` - Symbolic Fourier Approximation - `SFAFast` - Optimized SFA - `SFAWhole` - SFA on entire series (no windowing) - `BORF` - Bag-of-Receptive-Fields **Use when**: Need discrete/symbolic representation, dimensionality reduction, interpretability. ### Shapelet-Based Features Discriminative subsequence extraction: - `RandomShapeletTransform` - Random discriminative shapelets - `RandomDilatedShapeletTransform` - Dilated shapelets for multi-scale - `SAST` - Scalable And Accurate Subsequence Transform - `RSAST` - Randomized SAST **Use when**: Need interpretable discriminative patterns, phase-invariant features. ### Interval-Based Features Statistical summaries from time intervals: - `RandomIntervals` - Features from random intervals - `SupervisedIntervals` - Supervised interval selection - `QUANTTransformer` - Quantile-based interval features **Use when**: Predictive patterns localized to specific windows. ### Preprocessing Transformations Data preparation and normalization: - `MinMaxScaler` - Scale to [0, 1] range - `Normalizer` - Z-normalization (zero mean, unit variance) - `Centerer` - Center to zero mean - `SimpleImputer` - Fill missing values - `DownsampleTransformer` - Reduce temporal resolution - `Tabularizer` - Convert time series to tabular format **Use when**: Need standardization, missing value handling, format conversion. ### Specialized Transformations Advanced analysis methods: - `MatrixProfile` - Computes distance profiles for pattern discovery - `DWTTransformer` - Discrete Wavelet Transform - `AutocorrelationFunctionTransformer` - ACF computation - `Dobin` - Distance-based Outlier BasIs using Neighbors - `SignatureTransformer` - Path signature methods - `PLATransformer` - Piecewise Linear Approximation ### Class Imbalance Handling - `ADASYN` - Adaptive Synthetic Sampling - `SMOTE` - Synthetic Minority Over-sampling - `OHIT` - Over-sampling with Highly Imbalanced Time series **Use when**: Classification with imbalanced classes. ### Pipeline Composition - `CollectionTransformerPipeline` - Chain multiple transformers ## Series Transformers Transform individual time series (e.g., for preprocessing in forecasting). ### Statistical Analysis - `AutoCorrelationSeriesTransformer` - Autocorrelation - `StatsModelsACF` - ACF using statsmodels - `StatsModelsPACF` - Partial autocorrelation ### Smoothing and Filtering - `ExponentialSmoothing` - Exponentially weighted moving average - `MovingAverage` - Simple or weighted moving average - `SavitzkyGolayFilter` - Polynomial smoothing - `GaussianFilter` - Gaussian kernel smoothing - `BKFilter` - Baxter-King bandpass filter - `DiscreteFourierApproximation` - Fourier-based filtering **Use when**: Need noise reduction, trend extraction, or frequency filtering. ### Dimensionality Reduction - `PCASeriesTransformer` - Principal component analysis - `PlASeriesTransformer` - Piecewise Linear Approximation ### Transformations - `BoxCoxTransformer` - Variance stabilization - `LogTransformer` - Logarithmic scaling - `ClaSPTransformer` - Classification Score Profile ### Pipeline Composition - `SeriesTransformerPipeline` - Chain series transformers ## Quick Start: Feature Extraction ```python from aeon.transformations.collection.convolution_based import RocketTransformer from aeon.classification.sklearn import RotationForest from aeon.datasets import load_classification # Load data X_train, y_train = load_classification("GunPoint", split="train") X_test, y_test = load_classification("GunPoint", split="test") # Extract ROCKET features rocket = RocketTransformer() X_train_features = rocket.fit_transform(X_train) X_test_features = rocket.transform(X_test) # Use with any sklearn classifier clf = RotationForest() clf.fit(X_train_features, y_train) accuracy = clf.score(X_test_features, y_test) ``` ## Quick Start: Preprocessing Pipeline ```python from aeon.transformations.collection import ( MinMaxScaler, SimpleImputer, CollectionTransformerPipeline ) # Build preprocessing pipeline pipeline = CollectionTransformerPipeline([ ('imputer', SimpleImputer(strategy='mean')), ('scaler', MinMaxScaler()) ]) X_transformed = pipeline.fit_transform(X_train) ``` ## Quick Start: Series Smoothing ```python from aeon.transformations.series import MovingAverage # Smooth individual time series smoother = MovingAverage(window_size=5) y_smoothed = smoother.fit_transform(y) ``` ## Algorithm Selection ### For Feature Extraction: - **Speed + Performance**: MiniRocketTransformer - **Interpretability**: Catch22, TSFresh - **Dimensionality reduction**: PAA, SAX, PCA - **Discriminative patterns**: Shapelet transforms - **Comprehensive features**: TSFresh (with longer runtime) ### For Preprocessing: - **Normalization**: Normalizer, MinMaxScaler - **Smoothing**: MovingAverage, SavitzkyGolayFilter - **Missing values**: SimpleImputer - **Frequency analysis**: DWTTransformer, Fourier methods ### For Symbolic Representation: - **Fast approximation**: PAA - **Alphabet-based**: SAX - **Frequency-based**: SFA, SFAFast ## Best Practices 1. **Fit on training data only**: Avoid data leakage ```python transformer.fit(X_train) X_train_tf = transformer.transform(X_train) X_test_tf = transformer.transform(X_test) ``` 2. **Pipeline composition**: Chain transformers for complex workflows ```python pipeline = CollectionTransformerPipeline([ ('imputer', SimpleImputer()), ('scaler', Normalizer()), ('features', RocketTransformer()) ]) ``` 3. **Feature selection**: TSFresh can generate many features; consider selection ```python from sklearn.feature_selection import SelectKBest selector = SelectKBest(k=100) X_selected = selector.fit_transform(X_features, y) ``` 4. **Memory considerations**: Some transformers memory-intensive on large datasets - Use MiniRocket instead of ROCKET for speed - Consider downsampling for very long series - Use ROCKETGPU for GPU acceleration 5. **Domain knowledge**: Choose transformations matching domain: - Periodic data: Fourier-based methods - Noisy data: Smoothing filters - Spike detection: Wavelet transforms