# Deep Learning Networks

Aeon provides neural network architectures specifically designed for time series tasks. These networks serve as building blocks for classification, regression, clustering, and forecasting.
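
The network classes described below are assumed to be importable from the `aeon.networks` module (estimator wrappers such as `FCNClassifier` live in the task-specific modules shown later):

```python
# Import paths assumed from aeon's documented layout; verify for your version
from aeon.networks import FCNNetwork, ResNetNetwork, InceptionNetwork
```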

## Core Network Architectures

### Convolutional Networks

**FCNNetwork** - Fully Convolutional Network
- Three convolutional blocks with batch normalization
- Global average pooling for dimensionality reduction
- **Use when**: Need simple yet effective CNN baseline

**ResNetNetwork** - Residual Network
- Residual blocks with skip connections
- Prevents vanishing gradients in deep networks
- **Use when**: Deep networks needed, training stability important

**InceptionNetwork** - Inception Modules
- Multi-scale feature extraction with parallel convolutions
- Different kernel sizes capture patterns at various scales
- **Use when**: Patterns exist at multiple temporal scales

**TimeCNNNetwork** - Standard CNN
- Basic convolutional architecture
- **Use when**: Simple CNN sufficient, interpretability valued

**DisjointCNNNetwork** - Separate Pathways
- Disjoint convolutional pathways
- **Use when**: Different feature extraction strategies needed

**DCNNNetwork** - Dilated CNN
- Dilated convolutions for large receptive fields
- **Use when**: Long-range dependencies without many layers

### Recurrent Networks

**RecurrentNetwork** - RNN/LSTM/GRU
- Configurable cell type (RNN, LSTM, GRU)
- Sequential modeling of temporal dependencies
- **Use when**: Sequential dependencies critical, variable-length series

### Temporal Convolutional Network

**TCNNetwork** - Temporal Convolutional Network
- Dilated causal convolutions
- Large receptive field without recurrence
- **Use when**: Long sequences, need parallelizable architecture

### Multi-Layer Perceptron

**MLPNetwork** - Basic Feedforward
- Simple fully-connected layers
- Flattens the time series before processing
- **Use when**: Baseline needed, computational limits, or simple patterns

## Encoder-Based Architectures

Networks designed for representation learning and clustering.

### Autoencoder Variants

**EncoderNetwork** - Generic Encoder
- Flexible encoder structure
- **Use when**: Custom encoding needed

**AEFCNNetwork** - FCN-based Autoencoder
- Fully convolutional encoder-decoder
- **Use when**: Need convolutional representation learning

**AEResNetNetwork** - ResNet Autoencoder
- Residual blocks in encoder-decoder
- **Use when**: Deep autoencoding with skip connections

**AEDCNNNetwork** - Dilated CNN Autoencoder
- Dilated convolutions for compression
- **Use when**: Need large receptive field in autoencoder

**AEDRNNNetwork** - Dilated RNN Autoencoder
- Dilated recurrent connections
- **Use when**: Sequential patterns with long-range dependencies

**AEBiGRUNetwork** - Bidirectional GRU Autoencoder
- Bidirectional recurrent encoding
- **Use when**: Context from both directions helpful

**AEAttentionBiGRUNetwork** - Attention + BiGRU Autoencoder
- Attention mechanism on BiGRU outputs
- **Use when**: Need to focus on important time steps

## Specialized Architectures

**LITENetwork** - Lightweight Inception Time Ensemble
- Efficient inception-based architecture
- LITEMV variant for multivariate series
- **Use when**: Need efficiency with strong performance

**DeepARNetwork** - Probabilistic Forecasting
- Autoregressive RNN for forecasting
- Produces probabilistic predictions
- **Use when**: Need forecast uncertainty quantification

## Usage with Estimators

Networks are typically used within estimators, not directly:

```python
from aeon.classification.deep_learning import FCNClassifier
from aeon.regression.deep_learning import ResNetRegressor
from aeon.clustering.deep_learning import AEFCNClusterer

# Classification with FCN
clf = FCNClassifier(n_epochs=100, batch_size=16)
clf.fit(X_train, y_train)

# Regression with ResNet
reg = ResNetRegressor(n_epochs=100)
reg.fit(X_train, y_train)

# Clustering with autoencoder
clusterer = AEFCNClusterer(n_clusters=3, n_epochs=100)
labels = clusterer.fit_predict(X_train)
```
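
If you do need the raw architecture, for example to embed it in a larger Keras model, the network classes expose a `build_network` method that returns the input and output layers. A minimal sketch, assuming the `(n_timepoints, n_channels)` input layout used for the underlying Keras model (verify both the layout and the method contract against your aeon version):

```python
import tensorflow as tf
from aeon.networks import FCNNetwork

# Build the FCN graph for series of length 100 with 2 channels
# (the shape layout is an assumption; check the aeon docs).
network = FCNNetwork()
input_layer, output_layer = network.build_network(input_shape=(100, 2))

# Attach a task-specific head, here a 3-class softmax, and compile as usual
head = tf.keras.layers.Dense(3, activation="softmax")(output_layer)
model = tf.keras.Model(inputs=input_layer, outputs=head)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```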

## Custom Network Configuration

Many networks accept configuration parameters:

```python
# Configure FCN layers
clf = FCNClassifier(
    n_epochs=200,
    batch_size=32,
    kernel_size=[7, 5, 3],  # Kernel sizes for each layer
    n_filters=[128, 256, 128],  # Filters per layer
    learning_rate=0.001
)
```
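
Parameter names vary between estimators. Because aeon estimators follow the scikit-learn interface, you can list what a given estimator actually accepts with `get_params` before configuring it:

```python
from aeon.classification.deep_learning import FCNClassifier

# Print the configurable parameters and their defaults (scikit-learn API)
for name, value in sorted(FCNClassifier().get_params().items()):
    print(f"{name} = {value!r}")
```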
## Base Classes
|
||||
|
||||
- `BaseDeepLearningNetwork` - Abstract base for all networks
|
||||
- `BaseDeepRegressor` - Base for deep regression
|
||||
- `BaseDeepClassifier` - Base for deep classification
|
||||
- `BaseDeepForecaster` - Base for deep forecasting
|
||||
|
||||
Extend these to implement custom architectures.
|
||||
|
||||
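
A custom network mainly needs a `build_network` method that returns the Keras input and output layers. A minimal sketch, assuming `BaseDeepLearningNetwork` can be imported from `aeon.networks.base` and that returning `(input_layer, output_layer)` is the expected contract (verify both against your aeon version):

```python
import tensorflow as tf
from aeon.networks.base import BaseDeepLearningNetwork  # import path assumed


class TwoBlockCNNNetwork(BaseDeepLearningNetwork):
    """Hypothetical two-block CNN, for illustration only."""

    def build_network(self, input_shape, **kwargs):
        input_layer = tf.keras.layers.Input(input_shape)
        x = input_layer
        # Two conv blocks: convolution -> batch norm -> ReLU
        for filters, kernel in [(64, 7), (32, 5)]:
            x = tf.keras.layers.Conv1D(filters, kernel, padding="same")(x)
            x = tf.keras.layers.BatchNormalization()(x)
            x = tf.keras.layers.Activation("relu")(x)
        # Global pooling gives a fixed-size embedding regardless of length
        output_layer = tf.keras.layers.GlobalAveragePooling1D()(x)
        return input_layer, output_layer
```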

## Training Considerations

### Hyperparameters

Key hyperparameters to tune:

- `n_epochs` - Number of passes over the training data (50-200 typical)
- `batch_size` - Samples per gradient update (16-64 typical)
- `learning_rate` - Optimizer step size (0.0001-0.01)
- Network-specific: layers, filters, kernel sizes

### Callbacks

Many networks support callbacks for training monitoring:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

clf = FCNClassifier(
    n_epochs=200,
    callbacks=[
        EarlyStopping(patience=20, restore_best_weights=True),
        ReduceLROnPlateau(patience=10, factor=0.5)
    ]
)
```

### GPU Acceleration

Deep learning networks benefit substantially from a GPU:

```python
import os

# Select the first GPU; must be set before TensorFlow is initialized
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from aeon.classification.deep_learning import InceptionTimeClassifier

# Networks automatically use the GPU if one is available
clf = InceptionTimeClassifier(n_epochs=100)
clf.fit(X_train, y_train)
```

## Architecture Selection

### By Task

- **Classification**: InceptionNetwork, ResNetNetwork, FCNNetwork
- **Regression**: InceptionNetwork, ResNetNetwork, TCNNetwork
- **Forecasting**: TCNNetwork, DeepARNetwork, RecurrentNetwork
- **Clustering**: AEFCNNetwork, AEResNetNetwork, AEAttentionBiGRUNetwork

### By Data Characteristics

- **Long sequences**: TCNNetwork, DCNNNetwork (dilated convolutions)
- **Short sequences**: MLPNetwork, FCNNetwork
- **Multivariate**: InceptionNetwork, FCNNetwork, LITENetwork
- **Variable length**: RecurrentNetwork with masking
- **Multi-scale patterns**: InceptionNetwork

### By Computational Resources

- **Limited compute**: MLPNetwork, LITENetwork
- **Moderate compute**: FCNNetwork, TimeCNNNetwork
- **High compute available**: InceptionNetwork, ResNetNetwork
- **GPU available**: Any deep network (major speedup)
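
One way to keep this guidance at hand in code is a small lookup table. The dictionary below simply restates the task recommendations above, best choice first; the keys and structure are illustrative, not part of aeon:

```python
# Illustrative lookup encoding the task recommendations above;
# values are network class names from aeon.networks.
RECOMMENDED_NETWORKS = {
    "classification": ["InceptionNetwork", "ResNetNetwork", "FCNNetwork"],
    "regression": ["InceptionNetwork", "ResNetNetwork", "TCNNetwork"],
    "forecasting": ["TCNNetwork", "DeepARNetwork", "RecurrentNetwork"],
    "clustering": ["AEFCNNetwork", "AEResNetNetwork", "AEAttentionBiGRUNetwork"],
}

print(RECOMMENDED_NETWORKS["classification"][0])  # InceptionNetwork
```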

## Best Practices

### 1. Data Preparation

Normalize input data:

```python
from aeon.transformations.collection import Normalizer

normalizer = Normalizer()
X_train_norm = normalizer.fit_transform(X_train)
X_test_norm = normalizer.transform(X_test)
```

### 2. Training/Validation Split

Hold out a validation set to check generalization. aeon estimators follow the scikit-learn `fit(X, y)` interface, so evaluate on the held-out split after fitting:

```python
from sklearn.model_selection import train_test_split

X_train_fit, X_val, y_train_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, stratify=y_train
)

clf = FCNClassifier(n_epochs=200)
clf.fit(X_train_fit, y_train_fit)
val_accuracy = clf.score(X_val, y_val)  # accuracy on the held-out split
```

### 3. Start Simple

Begin with simpler architectures before moving to complex ones:

1. Try MLPNetwork or FCNNetwork first
2. If accuracy is insufficient, try ResNetNetwork or InceptionNetwork
3. Consider ensembles if single models remain insufficient

### 4. Hyperparameter Tuning

Use grid search or random search:

```python
from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_epochs': [100, 200],
    'batch_size': [16, 32],
    'learning_rate': [0.001, 0.0001]
}

clf = FCNClassifier()
grid = GridSearchCV(clf, param_grid, cv=3)
grid.fit(X_train, y_train)
```

### 5. Regularization

Prevent overfitting:

- Use dropout (if the network supports it)
- Use early stopping (see the callbacks example above)
- Apply data augmentation (if available)
- Reduce model complexity

### 6. Reproducibility

Set random seeds:

```python
import numpy as np
import random
import tensorflow as tf

seed = 42
np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)

# Many aeon estimators also accept a random_state parameter
```