Files
gh-k-dense-ai-claude-scient…/skills/aeon/references/networks.md
2025-11-30 08:30:10 +08:00

290 lines
7.7 KiB
Markdown

# Deep Learning Networks
Aeon provides neural network architectures specifically designed for time series tasks. These networks serve as building blocks for classification, regression, clustering, and forecasting.
## Core Network Architectures
### Convolutional Networks
**FCNNetwork** - Fully Convolutional Network
- Three convolutional blocks with batch normalization
- Global average pooling for dimensionality reduction
- **Use when**: Need simple yet effective CNN baseline
**ResNetNetwork** - Residual Network
- Residual blocks with skip connections
- Prevents vanishing gradients in deep networks
- **Use when**: Deep networks needed, training stability important
**InceptionNetwork** - Inception Modules
- Multi-scale feature extraction with parallel convolutions
- Different kernel sizes capture patterns at various scales
- **Use when**: Patterns exist at multiple temporal scales
**TimeCNNNetwork** - Standard CNN
- Basic convolutional architecture
- **Use when**: Simple CNN sufficient, interpretability valued
**DisjointCNNNetwork** - Separate Pathways
- Disjoint convolutional pathways
- **Use when**: Different feature extraction strategies needed
**DCNNNetwork** - Dilated CNN
- Dilated convolutions for large receptive fields
- **Use when**: Long-range dependencies without many layers
### Recurrent Networks
**RecurrentNetwork** - RNN/LSTM/GRU
- Configurable cell type (RNN, LSTM, GRU)
- Sequential modeling of temporal dependencies
- **Use when**: Sequential dependencies critical, variable-length series
### Temporal Convolutional Network
**TCNNetwork** - Temporal Convolutional Network
- Dilated causal convolutions
- Large receptive field without recurrence
- **Use when**: Long sequences, need parallelizable architecture
### Multi-Layer Perceptron
**MLPNetwork** - Basic Feedforward
- Simple fully-connected layers
- Flattens time series before processing
- **Use when**: Baseline needed, computational limits, or simple patterns
## Encoder-Based Architectures
Networks designed for representation learning and clustering.
### Autoencoder Variants
**EncoderNetwork** - Generic Encoder
- Flexible encoder structure
- **Use when**: Custom encoding needed
**AEFCNNetwork** - FCN-based Autoencoder
- Fully convolutional encoder-decoder
- **Use when**: Need convolutional representation learning
**AEResNetNetwork** - ResNet Autoencoder
- Residual blocks in encoder-decoder
- **Use when**: Deep autoencoding with skip connections
**AEDCNNNetwork** - Dilated CNN Autoencoder
- Dilated convolutions for compression
- **Use when**: Need large receptive field in autoencoder
**AEDRNNNetwork** - Dilated RNN Autoencoder
- Dilated recurrent connections
- **Use when**: Sequential patterns with long-range dependencies
**AEBiGRUNetwork** - Bidirectional GRU
- Bidirectional recurrent encoding
- **Use when**: Context from both directions helpful
**AEAttentionBiGRUNetwork** - Attention + BiGRU
- Attention mechanism on BiGRU outputs
- **Use when**: Need to focus on important time steps
## Specialized Architectures
**LITENetwork** - Lightweight Inception Time Ensemble
- Efficient inception-based architecture
- LITEMV variant for multivariate series
- **Use when**: Need efficiency with strong performance
**DeepARNetwork** - Probabilistic Forecasting
- Autoregressive RNN for forecasting
- Produces probabilistic predictions
- **Use when**: Need forecast uncertainty quantification
## Usage with Estimators
Networks are typically used within estimators, not directly:
```python
from aeon.classification.deep_learning import FCNClassifier
from aeon.regression.deep_learning import ResNetRegressor
from aeon.clustering.deep_learning import AEFCNClusterer
# Classification with FCN
clf = FCNClassifier(n_epochs=100, batch_size=16)
clf.fit(X_train, y_train)
# Regression with ResNet
reg = ResNetRegressor(n_epochs=100)
reg.fit(X_train, y_train)
# Clustering with autoencoder
clusterer = AEFCNClusterer(n_clusters=3, n_epochs=100)
labels = clusterer.fit_predict(X_train)
```
## Custom Network Configuration
Many networks accept configuration parameters:
```python
# Configure FCN layers
clf = FCNClassifier(
n_epochs=200,
batch_size=32,
kernel_size=[7, 5, 3], # Kernel sizes for each layer
n_filters=[128, 256, 128], # Filters per layer
learning_rate=0.001
)
```
## Base Classes
- `BaseDeepLearningNetwork` - Abstract base for all networks
- `BaseDeepRegressor` - Base for deep regression
- `BaseDeepClassifier` - Base for deep classification
- `BaseDeepForecaster` - Base for deep forecasting
Extend these to implement custom architectures.
## Training Considerations
### Hyperparameters
Key hyperparameters to tune:
- `n_epochs` - Training iterations (50-200 typical)
- `batch_size` - Samples per batch (16-64 typical)
- `learning_rate` - Step size (0.0001-0.01)
- Network-specific: layers, filters, kernel sizes
### Callbacks
Many networks support callbacks for training monitoring:
```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
clf = FCNClassifier(
n_epochs=200,
callbacks=[
EarlyStopping(patience=20, restore_best_weights=True),
ReduceLROnPlateau(patience=10, factor=0.5)
]
)
```
### GPU Acceleration
Deep learning networks benefit from GPU:
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0' # Use first GPU
# Networks automatically use GPU if available
clf = InceptionTimeClassifier(n_epochs=100)
clf.fit(X_train, y_train)
```
## Architecture Selection
### By Task:
**Classification**: InceptionNetwork, ResNetNetwork, FCNNetwork
**Regression**: InceptionNetwork, ResNetNetwork, TCNNetwork
**Forecasting**: TCNNetwork, DeepARNetwork, RecurrentNetwork
**Clustering**: AEFCNNetwork, AEResNetNetwork, AEAttentionBiGRUNetwork
### By Data Characteristics:
**Long sequences**: TCNNetwork, DCNNNetwork (dilated convolutions)
**Short sequences**: MLPNetwork, FCNNetwork
**Multivariate**: InceptionNetwork, FCNNetwork, LITENetwork
**Variable length**: RecurrentNetwork with masking
**Multi-scale patterns**: InceptionNetwork
### By Computational Resources:
**Limited compute**: MLPNetwork, LITENetwork
**Moderate compute**: FCNNetwork, TimeCNNNetwork
**High compute available**: InceptionNetwork, ResNetNetwork
**GPU available**: Any deep network (major speedup)
## Best Practices
### 1. Data Preparation
Normalize input data:
```python
from aeon.transformations.collection import Normalizer
normalizer = Normalizer()
X_train_norm = normalizer.fit_transform(X_train)
X_test_norm = normalizer.transform(X_test)
```
### 2. Training/Validation Split
Use validation set for early stopping:
```python
from sklearn.model_selection import train_test_split
X_train_fit, X_val, y_train_fit, y_val = train_test_split(
X_train, y_train, test_size=0.2, stratify=y_train
)
clf = FCNClassifier(n_epochs=200)
clf.fit(X_train_fit, y_train_fit, validation_data=(X_val, y_val))
```
### 3. Start Simple
Begin with simpler architectures before complex ones:
1. Try MLPNetwork or FCNNetwork first
2. If insufficient, try ResNetNetwork or InceptionNetwork
3. Consider ensembles if single models insufficient
### 4. Hyperparameter Tuning
Use grid search or random search:
```python
from sklearn.model_selection import GridSearchCV
param_grid = {
'n_epochs': [100, 200],
'batch_size': [16, 32],
'learning_rate': [0.001, 0.0001]
}
clf = FCNClassifier()
grid = GridSearchCV(clf, param_grid, cv=3)
grid.fit(X_train, y_train)
```
### 5. Regularization
Prevent overfitting:
- Use dropout (if network supports)
- Early stopping
- Data augmentation (if available)
- Reduce model complexity
### 6. Reproducibility
Set random seeds:
```python
import numpy as np
import random
import tensorflow as tf
seed = 42
np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)
```