# Deep Learning Networks
Aeon provides neural network architectures specifically designed for time series tasks. These networks serve as building blocks for classification, regression, clustering, and forecasting.
## Core Network Architectures

### Convolutional Networks

**FCNNetwork** - Fully Convolutional Network
- Three convolutional blocks with batch normalization
- Global average pooling for dimensionality reduction
- Use when: Need simple yet effective CNN baseline

**ResNetNetwork** - Residual Network
- Residual blocks with skip connections
- Prevents vanishing gradients in deep networks
- Use when: Deep networks needed, training stability important

**InceptionNetwork** - Inception Modules
- Multi-scale feature extraction with parallel convolutions
- Different kernel sizes capture patterns at various scales
- Use when: Patterns exist at multiple temporal scales

**TimeCNNNetwork** - Standard CNN
- Basic convolutional architecture
- Use when: Simple CNN sufficient, interpretability valued

**DisjointCNNNetwork** - Separate Pathways
- Disjoint convolutional pathways
- Use when: Different feature extraction strategies needed

**DCNNNetwork** - Dilated CNN
- Dilated convolutions for large receptive fields
- Use when: Long-range dependencies without many layers
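These network classes live in `aeon.networks` and define architecture only: they build a Keras layer graph rather than train a model. A minimal sketch of building one directly (assuming a univariate series of length 100; aeon's deep estimators pass input shapes as `(n_timepoints, n_channels)`):

```python
from aeon.networks import FCNNetwork

# build_network constructs the Keras graph and returns its input and
# output layers; training is handled by the wrapping estimator.
network = FCNNetwork()
input_layer, output_layer = network.build_network(input_shape=(100, 1))
print(input_layer.shape, output_layer.shape)
```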
### Recurrent Networks

**RecurrentNetwork** - RNN/LSTM/GRU
- Configurable cell type (RNN, LSTM, GRU)
- Sequential modeling of temporal dependencies
- Use when: Sequential dependencies critical, variable-length series
### Temporal Convolutional Network

**TCNNetwork** - Temporal Convolutional Network
- Dilated causal convolutions
- Large receptive field without recurrence
- Use when: Long sequences, need parallelizable architecture
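To see why dilated causal convolutions reach far back in time with few layers, consider the receptive field of a stack of them: each layer with kernel size k and dilation d extends it by (k - 1) * d time steps. A back-of-the-envelope calculation (the doubling dilation schedule below is illustrative, not TCNNetwork's actual default):

```python
# Receptive field of stacked dilated causal convolutions
kernel_size = 3
dilations = [1, 2, 4, 8, 16, 32]  # illustrative doubling schedule

receptive_field = 1 + sum((kernel_size - 1) * d for d in dilations)
print(receptive_field)  # 127 time steps from only 6 layers
```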
### Multi-Layer Perceptron

**MLPNetwork** - Basic Feedforward
- Simple fully-connected layers
- Flattens the time series before processing
- Use when: Baseline needed, computational limits, or simple patterns
## Encoder-Based Architectures

Networks designed for representation learning and clustering.
### Autoencoder Variants

**EncoderNetwork** - Generic Encoder
- Flexible encoder structure
- Use when: Custom encoding needed

**AEFCNNetwork** - FCN-based Autoencoder
- Fully convolutional encoder-decoder
- Use when: Need convolutional representation learning

**AEResNetNetwork** - ResNet Autoencoder
- Residual blocks in encoder-decoder
- Use when: Deep autoencoding with skip connections

**AEDCNNNetwork** - Dilated CNN Autoencoder
- Dilated convolutions for compression
- Use when: Need large receptive field in autoencoder

**AEDRNNNetwork** - Dilated RNN Autoencoder
- Dilated recurrent connections
- Use when: Sequential patterns with long-range dependencies

**AEBiGRUNetwork** - Bidirectional GRU
- Bidirectional recurrent encoding
- Use when: Context from both directions helpful

**AEAttentionBiGRUNetwork** - Attention + BiGRU
- Attention mechanism on BiGRU outputs
- Use when: Need to focus on important time steps
## Specialized Architectures

**LITENetwork** - Lightweight Inception Time Ensemble
- Efficient inception-based architecture
- LITEMV variant for multivariate series
- Use when: Need efficiency with strong performance

**DeepARNetwork** - Probabilistic Forecasting
- Autoregressive RNN for forecasting
- Produces probabilistic predictions
- Use when: Need forecast uncertainty quantification
## Usage with Estimators

Networks are typically used within estimators, not directly:

```python
import numpy as np
from aeon.classification.deep_learning import FCNClassifier
from aeon.regression.deep_learning import ResNetRegressor
from aeon.clustering.deep_learning import AEFCNClusterer

# Dummy data in aeon's collection format: (n_cases, n_channels, n_timepoints)
X_train = np.random.random((40, 1, 100))
y_train = np.random.randint(0, 2, 40)  # class labels
y_train_reg = np.random.random(40)     # continuous targets

# Classification with FCN
clf = FCNClassifier(n_epochs=100, batch_size=16)
clf.fit(X_train, y_train)

# Regression with ResNet
reg = ResNetRegressor(n_epochs=100)
reg.fit(X_train, y_train_reg)

# Clustering with an FCN-based autoencoder
clusterer = AEFCNClusterer(n_clusters=3, n_epochs=100)
labels = clusterer.fit_predict(X_train)
```
## Custom Network Configuration

Many networks accept configuration parameters:

```python
import tensorflow as tf
from aeon.classification.deep_learning import FCNClassifier

# Configure the FCN layers
clf = FCNClassifier(
    n_epochs=200,
    batch_size=32,
    kernel_size=[7, 5, 3],      # kernel size for each convolutional layer
    n_filters=[128, 256, 128],  # filters per layer
    # aeon's deep estimators take a Keras optimizer rather than a bare
    # learning_rate argument
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
)
```
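Available parameters differ between architectures. Because aeon estimators follow the scikit-learn interface, `get_params` lists everything an estimator exposes:

```python
from aeon.classification.deep_learning import FCNClassifier

# Print every tunable constructor parameter with its current value
print(FCNClassifier().get_params())
```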
## Base Classes

- BaseDeepLearningNetwork - Abstract base for all networks
- BaseDeepRegressor - Base for deep regression
- BaseDeepClassifier - Base for deep classification
- BaseDeepForecaster - Base for deep forecasting

Extend these to implement custom architectures.
## Training Considerations

### Hyperparameters

Key hyperparameters to tune:

- n_epochs - training iterations (50-200 typical)
- batch_size - samples per batch (16-64 typical)
- learning_rate - step size (0.0001-0.01), set via the optimizer in aeon
- Network-specific: layers, filters, kernel sizes
### Callbacks

Many networks support Keras callbacks for training monitoring:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from aeon.classification.deep_learning import FCNClassifier

# aeon trains on the full training set without an internal validation split,
# so monitor the training loss rather than the Keras default "val_loss"
clf = FCNClassifier(
    n_epochs=200,
    callbacks=[
        EarlyStopping(monitor="loss", patience=20, restore_best_weights=True),
        ReduceLROnPlateau(monitor="loss", patience=10, factor=0.5),
    ],
)
```
### GPU Acceleration

Deep learning networks benefit from a GPU:

```python
import os

# Set before importing TensorFlow/aeon so the setting takes effect
os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # use the first GPU

from aeon.classification.deep_learning import InceptionTimeClassifier

# TensorFlow automatically places computation on the GPU if one is visible
clf = InceptionTimeClassifier(n_epochs=100)
clf.fit(X_train, y_train)
```
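Before launching a long run, it is worth confirming that TensorFlow can actually see a GPU:

```python
import tensorflow as tf

# An empty list means training will silently fall back to the CPU
print(tf.config.list_physical_devices('GPU'))
```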
## Architecture Selection

**By Task:**

- Classification: InceptionNetwork, ResNetNetwork, FCNNetwork
- Regression: InceptionNetwork, ResNetNetwork, TCNNetwork
- Forecasting: TCNNetwork, DeepARNetwork, RecurrentNetwork
- Clustering: AEFCNNetwork, AEResNetNetwork, AEAttentionBiGRUNetwork

**By Data Characteristics:**

- Long sequences: TCNNetwork, DCNNNetwork (dilated convolutions)
- Short sequences: MLPNetwork, FCNNetwork
- Multivariate: InceptionNetwork, FCNNetwork, LITENetwork
- Variable length: RecurrentNetwork with masking
- Multi-scale patterns: InceptionNetwork

**By Computational Resources:**

- Limited compute: MLPNetwork, LITENetwork
- Moderate compute: FCNNetwork, TimeCNNNetwork
- High compute available: InceptionNetwork, ResNetNetwork
- GPU available: Any deep network (major speedup)
## Best Practices

### 1. Data Preparation

Normalize input data:

```python
from aeon.transformations.collection import Normalizer

normalizer = Normalizer()
X_train_norm = normalizer.fit_transform(X_train)
X_test_norm = normalizer.transform(X_test)
```
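Normalizer standardizes each series to zero mean and unit variance, which matches the preprocessing these architectures are usually benchmarked with. Fit it on the training set only, as above, so no test information leaks into the scaling.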
### 2. Training/Validation Split

Hold out a validation set to check for overfitting:

```python
from sklearn.model_selection import train_test_split
from aeon.classification.deep_learning import FCNClassifier

X_train_fit, X_val, y_train_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, stratify=y_train
)

# aeon estimators are fit with (X, y) only; evaluate on the held-out
# split afterwards with score
clf = FCNClassifier(n_epochs=200)
clf.fit(X_train_fit, y_train_fit)
val_accuracy = clf.score(X_val, y_val)
```
### 3. Start Simple

Begin with simpler architectures before complex ones (see the sketch after this list):

- Try MLPNetwork or FCNNetwork first
- If results are insufficient, try ResNetNetwork or InceptionNetwork
- Consider ensembles if single models remain insufficient
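A quick way to work up the ladder is to benchmark the candidates on a validation split before committing to the most expensive one. A minimal sketch, reusing the split from above (the classifier names are the aeon estimator counterparts of the networks; epoch counts are illustrative):

```python
from aeon.classification.deep_learning import (
    FCNClassifier,
    MLPClassifier,
    ResNetClassifier,
)

# Compare cheap architectures first; only escalate if accuracy is lacking
candidates = {
    "MLP": MLPClassifier(n_epochs=100),
    "FCN": FCNClassifier(n_epochs=100),
    "ResNet": ResNetClassifier(n_epochs=100),
}
for name, model in candidates.items():
    model.fit(X_train_fit, y_train_fit)
    print(name, model.score(X_val, y_val))
```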
### 4. Hyperparameter Tuning

Use grid search or random search (see the note after the example):

```python
from sklearn.model_selection import GridSearchCV
from aeon.classification.deep_learning import FCNClassifier

# Learning rate is tuned through the optimizer in aeon, so this grid
# covers the directly exposed parameters
param_grid = {
    'n_epochs': [100, 200],
    'batch_size': [16, 32],
}
clf = FCNClassifier()
grid = GridSearchCV(clf, param_grid, cv=3)
grid.fit(X_train, y_train)
```
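Keep in mind that every grid point trains a full network for each CV fold: the small grid above already fits 2 × 2 × 3 = 12 models. Keep grids small, or prefer RandomizedSearchCV, when tuning deep estimators.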
### 5. Regularization

Prevent overfitting:

- Use dropout (if the network supports it)
- Early stopping
- Data augmentation (if available)
- Reduce model complexity
### 6. Reproducibility

Set random seeds:

```python
import random

import numpy as np
import tensorflow as tf

seed = 42
np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)
```
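aeon's deep estimators also accept a random_state parameter, which is worth setting alongside the global seeds. Note that some GPU operations can remain nondeterministic even when everything is seeded.

```python
from aeon.classification.deep_learning import FCNClassifier

# Seed the estimator itself in addition to the global generators above
clf = FCNClassifier(n_epochs=100, random_state=seed)
```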