# Deep Learning Networks

Aeon provides neural network architectures specifically designed for time series tasks. These networks serve as building blocks for classification, regression, clustering, and forecasting.

## Core Network Architectures

### Convolutional Networks

**FCNNetwork** - Fully Convolutional Network
- Three convolutional blocks with batch normalization
- Global average pooling for dimensionality reduction
- **Use when**: Need a simple yet effective CNN baseline

**ResNetNetwork** - Residual Network
- Residual blocks with skip connections
- Prevents vanishing gradients in deep networks
- **Use when**: Deep networks needed, training stability important

**InceptionNetwork** - Inception Modules
- Multi-scale feature extraction with parallel convolutions
- Different kernel sizes capture patterns at various scales
- **Use when**: Patterns exist at multiple temporal scales

**TimeCNNNetwork** - Standard CNN
- Basic convolutional architecture
- **Use when**: Simple CNN sufficient, interpretability valued

**DisjointCNNNetwork** - Separate Pathways
- Disjoint convolutional pathways
- **Use when**: Different feature extraction strategies needed

**DCNNNetwork** - Dilated CNN
- Dilated convolutions for large receptive fields
- **Use when**: Long-range dependencies without many layers

### Recurrent Networks

**RecurrentNetwork** - RNN/LSTM/GRU
- Configurable cell type (RNN, LSTM, GRU)
- Sequential modeling of temporal dependencies
- **Use when**: Sequential dependencies critical, variable-length series

### Temporal Convolutional Network

**TCNNetwork** - Temporal Convolutional Network
- Dilated causal convolutions
- Large receptive field without recurrence
- **Use when**: Long sequences, need parallelizable architecture

### Multi-Layer Perceptron

**MLPNetwork** - Basic Feedforward
- Simple fully-connected layers
- Flattens time series before processing
- **Use when**: Baseline needed, computational limits, or simple patterns
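Because these classes define only the network body, they can be inspected on their own before wrapping them in an estimator. The following is a minimal sketch, assuming the aeon convention that each network exposes a `build_network(input_shape)` method returning Keras input and output layers with `input_shape` given as `(n_timepoints, n_channels)`; exact signatures may vary between versions:

```python
import tensorflow as tf

from aeon.networks import FCNNetwork

# Build the FCN computational graph for series with 100 time points and 1 channel.
# build_network returns the Keras input/output tensors of the network body only;
# task-specific heads (softmax, regression output, decoder) are added by estimators.
network = FCNNetwork()
input_layer, output_layer = network.build_network(input_shape=(100, 1))

# Wrap in a Keras model purely to inspect the architecture
model = tf.keras.Model(inputs=input_layer, outputs=output_layer)
model.summary()
```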
## Encoder-Based Architectures

Networks designed for representation learning and clustering.

### Autoencoder Variants

**EncoderNetwork** - Generic Encoder
- Flexible encoder structure
- **Use when**: Custom encoding needed

**AEFCNNetwork** - FCN-based Autoencoder
- Fully convolutional encoder-decoder
- **Use when**: Need convolutional representation learning

**AEResNetNetwork** - ResNet Autoencoder
- Residual blocks in encoder-decoder
- **Use when**: Deep autoencoding with skip connections

**AEDCNNNetwork** - Dilated CNN Autoencoder
- Dilated convolutions for compression
- **Use when**: Need large receptive field in autoencoder

**AEDRNNNetwork** - Dilated RNN Autoencoder
- Dilated recurrent connections
- **Use when**: Sequential patterns with long-range dependencies

**AEBiGRUNetwork** - Bidirectional GRU
- Bidirectional recurrent encoding
- **Use when**: Context from both directions helpful

**AEAttentionBiGRUNetwork** - Attention + BiGRU
- Attention mechanism on BiGRU outputs
- **Use when**: Need to focus on important time steps

## Specialized Architectures

**LITENetwork** - Lightweight Inception Time Ensemble
- Efficient inception-based architecture
- LITEMV variant for multivariate series
- **Use when**: Need efficiency with strong performance

**DeepARNetwork** - Probabilistic Forecasting
- Autoregressive RNN for forecasting
- Produces probabilistic predictions
- **Use when**: Need forecast uncertainty quantification

## Usage with Estimators

Networks are typically used within estimators, not directly:

```python
from aeon.classification.deep_learning import FCNClassifier
from aeon.clustering.deep_learning import AEFCNClusterer
from aeon.regression.deep_learning import ResNetRegressor

# Classification with FCN
clf = FCNClassifier(n_epochs=100, batch_size=16)
clf.fit(X_train, y_train)

# Regression with ResNet
reg = ResNetRegressor(n_epochs=100)
reg.fit(X_train, y_train)

# Clustering with autoencoder
clusterer = AEFCNClusterer(n_clusters=3, n_epochs=100)
labels = clusterer.fit_predict(X_train)
```

## Custom Network Configuration

Many estimators expose the underlying network's configuration parameters:

```python
import tensorflow as tf

# Configure the FCN layers through the classifier
clf = FCNClassifier(
    n_epochs=200,
    batch_size=32,
    kernel_size=[7, 5, 3],      # Kernel size for each convolutional layer
    n_filters=[128, 256, 128],  # Filters per layer
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # learning rate is set via the optimizer
)
```

## Base Classes

- `BaseDeepLearningNetwork` - Abstract base for all networks
- `BaseDeepRegressor` - Base for deep regression
- `BaseDeepClassifier` - Base for deep classification
- `BaseDeepForecaster` - Base for deep forecasting

Extend these to implement custom architectures.
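As a rough sketch of what a custom network can look like, the example below follows the convention used by the bundled networks of implementing `build_network` to return the Keras input and output layers. The class name, layer choices, and constructor arguments here are illustrative, and the base class's required attributes and constructor may differ between aeon versions; treat this as a starting point rather than a template copied from the library:

```python
import tensorflow as tf

from aeon.networks import BaseDeepLearningNetwork


class MyCustomNetwork(BaseDeepLearningNetwork):
    """Toy two-block CNN illustrating the build_network convention (hypothetical)."""

    def __init__(self, n_filters=64, kernel_size=5):
        self.n_filters = n_filters
        self.kernel_size = kernel_size
        super().__init__()

    def build_network(self, input_shape, **kwargs):
        # input_shape is (n_timepoints, n_channels) for a single series
        input_layer = tf.keras.layers.Input(shape=input_shape)
        x = tf.keras.layers.Conv1D(
            self.n_filters, self.kernel_size, padding="same", activation="relu"
        )(input_layer)
        x = tf.keras.layers.Conv1D(
            self.n_filters, self.kernel_size, padding="same", activation="relu"
        )(x)
        # Global pooling produces a fixed-length embedding; estimators add the task head
        output_layer = tf.keras.layers.GlobalAveragePooling1D()(x)
        return input_layer, output_layer
```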
## Training Considerations

### Hyperparameters

Key hyperparameters to tune:

- `n_epochs` - Training iterations (50-200 typical)
- `batch_size` - Samples per batch (16-64 typical)
- Learning rate - Optimizer step size (0.0001-0.01), usually set through the estimator's `optimizer` parameter
- Network-specific: layers, filters, kernel sizes

### Callbacks

Most deep learning estimators accept Keras callbacks for training monitoring:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

from aeon.classification.deep_learning import FCNClassifier

clf = FCNClassifier(
    n_epochs=200,
    callbacks=[
        EarlyStopping(patience=20, restore_best_weights=True),
        ReduceLROnPlateau(patience=10, factor=0.5),
    ],
)
```

### GPU Acceleration

Deep learning networks benefit from GPU acceleration:

```python
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # Use the first GPU

from aeon.classification.deep_learning import InceptionTimeClassifier

# Networks automatically use the GPU if TensorFlow detects one
clf = InceptionTimeClassifier(n_epochs=100)
clf.fit(X_train, y_train)
```

## Architecture Selection

### By Task

- **Classification**: InceptionNetwork, ResNetNetwork, FCNNetwork
- **Regression**: InceptionNetwork, ResNetNetwork, TCNNetwork
- **Forecasting**: TCNNetwork, DeepARNetwork, RecurrentNetwork
- **Clustering**: AEFCNNetwork, AEResNetNetwork, AEAttentionBiGRUNetwork

### By Data Characteristics

- **Long sequences**: TCNNetwork, DCNNNetwork (dilated convolutions)
- **Short sequences**: MLPNetwork, FCNNetwork
- **Multivariate**: InceptionNetwork, FCNNetwork, LITENetwork
- **Variable length**: RecurrentNetwork with masking
- **Multi-scale patterns**: InceptionNetwork

### By Computational Resources

- **Limited compute**: MLPNetwork, LITENetwork
- **Moderate compute**: FCNNetwork, TimeCNNNetwork
- **High compute available**: InceptionNetwork, ResNetNetwork
- **GPU available**: Any deep network (major speedup)

## Best Practices

### 1. Data Preparation

Normalize input data:

```python
from aeon.transformations.collection import Normalizer

normalizer = Normalizer()
X_train_norm = normalizer.fit_transform(X_train)
X_test_norm = normalizer.transform(X_test)
```

### 2. Training/Validation Split

Hold out a validation set to monitor generalization. Aeon estimators fit on `X, y` only, so evaluate the held-out fold separately:

```python
from sklearn.model_selection import train_test_split

X_train_fit, X_val, y_train_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, stratify=y_train
)

clf = FCNClassifier(n_epochs=200)
clf.fit(X_train_fit, y_train_fit)
val_accuracy = clf.score(X_val, y_val)  # evaluate on the held-out fold
```

### 3. Start Simple

Begin with simpler architectures before complex ones:

1. Try MLPNetwork or FCNNetwork first
2. If insufficient, try ResNetNetwork or InceptionNetwork
3. Consider ensembles if single models are insufficient

### 4. Hyperparameter Tuning

Use grid search or random search:

```python
from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_epochs': [100, 200],
    'batch_size': [16, 32],
    # the learning rate can be varied by passing different optimizer instances
}

clf = FCNClassifier()
grid = GridSearchCV(clf, param_grid, cv=3)
grid.fit(X_train, y_train)
```

### 5. Regularization

Prevent overfitting:

- Use dropout (if the network supports it)
- Early stopping
- Data augmentation (if available)
- Reduce model complexity

### 6. Reproducibility

Set random seeds:

```python
import random

import numpy as np
import tensorflow as tf

seed = 42
np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)
```
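In addition to the global seeds, aeon's deep learning estimators accept a `random_state` parameter, which is worth setting alongside the framework seeds. A short sketch (exact behaviour may vary between estimators and versions):

```python
from aeon.classification.deep_learning import FCNClassifier

# random_state seeds the estimator's own weight initialisation and shuffling
clf = FCNClassifier(n_epochs=100, random_state=42)
clf.fit(X_train, y_train)
```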