Files
gh-k-dense-ai-claude-scient…/skills/neurokit2/references/signal_processing.md
2025-11-30 08:30:10 +08:00

649 lines
16 KiB
Markdown

# General Signal Processing
## Overview
NeuroKit2 provides comprehensive signal processing utilities applicable to any time series data. These functions support filtering, transformation, peak detection, decomposition, and analysis operations that work across all signal types.
## Preprocessing Functions
### signal_filter()
Apply frequency-domain filtering to remove noise or isolate frequency bands.
```python
filtered = nk.signal_filter(signal, sampling_rate=1000, lowcut=None, highcut=None,
method='butterworth', order=5)
```
**Filter types (via lowcut/highcut combinations):**
**Lowpass** (highcut only):
```python
lowpass = nk.signal_filter(signal, sampling_rate=1000, highcut=50)
```
- Removes frequencies above highcut
- Smooths signal, removes high-frequency noise
**Highpass** (lowcut only):
```python
highpass = nk.signal_filter(signal, sampling_rate=1000, lowcut=0.5)
```
- Removes frequencies below lowcut
- Removes baseline drift, DC offset
**Bandpass** (both lowcut and highcut):
```python
bandpass = nk.signal_filter(signal, sampling_rate=1000, lowcut=0.5, highcut=50)
```
- Retains frequencies between lowcut and highcut
- Isolates specific frequency band
**Bandstop/Notch** (powerline removal):
```python
notch = nk.signal_filter(signal, sampling_rate=1000, method='powerline', powerline=50)
```
- Removes 50 or 60 Hz powerline noise
- Narrow notch filter
**Methods:**
- `'butterworth'` (default): Smooth frequency response, flat passband
- `'bessel'`: Linear phase, minimal ringing
- `'chebyshev1'`: Steeper rolloff, ripple in passband
- `'chebyshev2'`: Steeper rolloff, ripple in stopband
- `'elliptic'`: Steepest rolloff, ripple in both bands
- `'powerline'`: Notch filter for 50/60 Hz
**Order parameter:**
- Higher order: Steeper transition, more ringing
- Lower order: Gentler transition, less ringing
- Typical: 2-5 for physiological signals
### signal_sanitize()
Remove invalid values (NaN, inf) and optionally interpolate.
```python
clean_signal = nk.signal_sanitize(signal, interpolate=True)
```
**Use cases:**
- Handle missing data points
- Remove artifacts marked as NaN
- Prepare signal for algorithms requiring continuous data
### signal_resample()
Change sampling rate of signal (upsample or downsample).
```python
resampled = nk.signal_resample(signal, sampling_rate=1000, desired_sampling_rate=500,
method='interpolation')
```
**Methods:**
- `'interpolation'`: Cubic spline interpolation
- `'FFT'`: Frequency-domain resampling
- `'poly'`: Polyphase filtering (best for downsampling)
**Use cases:**
- Match sampling rates across multi-modal recordings
- Reduce data size (downsample)
- Increase temporal resolution (upsample)
### signal_fillmissing()
Interpolate missing or invalid data points.
```python
filled = nk.signal_fillmissing(signal, method='linear')
```
**Methods:**
- `'linear'`: Linear interpolation
- `'nearest'`: Nearest neighbor
- `'pad'`: Forward/backward fill
- `'cubic'`: Cubic spline
- `'polynomial'`: Polynomial fitting
## Transformation Functions
### signal_detrend()
Remove slow trends from signal.
```python
detrended = nk.signal_detrend(signal, method='polynomial', order=1)
```
**Methods:**
- `'polynomial'`: Fit and subtract polynomial (order 1 = linear)
- `'loess'`: Locally weighted regression
- `'tarvainen2002'`: Smoothness priors detrending
**Use cases:**
- Remove baseline drift
- Stabilize mean before analysis
- Prepare for stationarity-assuming algorithms
### signal_decompose()
Decompose signal into constituent components.
```python
components = nk.signal_decompose(signal, sampling_rate=1000, method='emd')
```
**Methods:**
**Empirical Mode Decomposition (EMD):**
```python
components = nk.signal_decompose(signal, sampling_rate=1000, method='emd')
```
- Data-adaptive decomposition into Intrinsic Mode Functions (IMFs)
- Each IMF represents different frequency content (high to low)
- No predefined basis functions
**Singular Spectrum Analysis (SSA):**
```python
components = nk.signal_decompose(signal, method='ssa')
```
- Decomposes into trend, oscillations, and noise
- Based on eigenvalue decomposition of trajectory matrix
**Wavelet decomposition:**
- Time-frequency representation
- Localized in both time and frequency
**Returns:**
- Dictionary with component signals
- Trend, oscillatory components, residual
**Use cases:**
- Isolate physiological rhythms
- Separate signal from noise
- Multi-scale analysis
### signal_recompose()
Reconstruct signal from decomposed components.
```python
reconstructed = nk.signal_recompose(components, indices=[1, 2, 3])
```
**Use case:**
- Selective reconstruction after decomposition
- Remove specific IMFs or components
- Adaptive filtering
### signal_binarize()
Convert continuous signal to binary (0/1) based on threshold.
```python
binary = nk.signal_binarize(signal, method='threshold', threshold=0.5)
```
**Methods:**
- `'threshold'`: Simple threshold
- `'median'`: Median-based
- `'mean'`: Mean-based
- `'quantile'`: Percentile-based
**Use case:**
- Event detection from continuous signal
- Trigger extraction
- State classification
### signal_distort()
Add controlled noise or artifacts for testing.
```python
distorted = nk.signal_distort(signal, sampling_rate=1000, noise_amplitude=0.1,
noise_frequency=50, artifacts_amplitude=0.5)
```
**Parameters:**
- `noise_amplitude`: Gaussian noise level
- `noise_frequency`: Sinusoidal interference (e.g., powerline)
- `artifacts_amplitude`: Random spike artifacts
- `artifacts_number`: Number of artifacts to add
**Use cases:**
- Algorithm robustness testing
- Preprocessing method evaluation
- Realistic data simulation
### signal_interpolate()
Interpolate signal at new time points or fill gaps.
```python
interpolated = nk.signal_interpolate(x_values, y_values, x_new=None, method='quadratic')
```
**Methods:**
- `'linear'`, `'quadratic'`, `'cubic'`: Polynomial interpolation
- `'nearest'`: Nearest neighbor
- `'monotone_cubic'`: Preserves monotonicity
**Use case:**
- Convert irregular samples to regular grid
- Upsample for visualization
- Align signals with different time bases
### signal_merge()
Combine multiple signals with different sampling rates.
```python
merged = nk.signal_merge(signal1, signal2, time1=None, time2=None, sampling_rate=None)
```
**Use case:**
- Multi-modal signal integration
- Combine data from different devices
- Synchronize based on timestamps
### signal_flatline()
Identify periods of constant signal (artifacts or sensor failure).
```python
flatline_mask = nk.signal_flatline(signal, duration=5.0, sampling_rate=1000)
```
**Returns:**
- Binary mask where True indicates flatline periods
- Duration threshold prevents false positives from normal stability
### signal_noise()
Add various types of noise to signal.
```python
noisy = nk.signal_noise(signal, sampling_rate=1000, noise_type='gaussian',
amplitude=0.1)
```
**Noise types:**
- `'gaussian'`: White noise
- `'pink'`: 1/f noise (common in physiological signals)
- `'brown'`: Brownian (random walk)
- `'powerline'`: Sinusoidal interference (50/60 Hz)
### signal_surrogate()
Generate surrogate signals preserving certain properties.
```python
surrogate = nk.signal_surrogate(signal, method='IAAFT')
```
**Methods:**
- `'IAAFT'`: Iterated Amplitude Adjusted Fourier Transform
- Preserves amplitude distribution and power spectrum
- `'random_shuffle'`: Random permutation (null hypothesis testing)
**Use case:**
- Nonlinearity testing
- Null hypothesis generation for statistical tests
## Peak Detection and Correction
### signal_findpeaks()
Detect local maxima (peaks) in signal.
```python
peaks_dict = nk.signal_findpeaks(signal, height_min=None, height_max=None,
relative_height_min=None, relative_height_max=None)
```
**Key parameters:**
- `height_min/max`: Absolute amplitude thresholds
- `relative_height_min/max`: Relative to signal range (0-1)
- `threshold`: Minimum prominence
- `distance`: Minimum samples between peaks
**Returns:**
- Dictionary with:
- `'Peaks'`: Peak indices
- `'Height'`: Peak amplitudes
- `'Distance'`: Inter-peak intervals
**Use cases:**
- Generic peak detection for any signal
- R-peaks, respiratory peaks, pulse peaks
- Event detection
### signal_fixpeaks()
Correct detected peaks for artifacts and anomalies.
```python
corrected = nk.signal_fixpeaks(peaks, sampling_rate=1000, iterative=True,
method='Kubios', interval_min=None, interval_max=None)
```
**Methods:**
- `'Kubios'`: Kubios HRV software method (default)
- `'Malik1996'`: Task Force Standards (1996)
- `'Kamath1993'`: Kamath's approach
**Corrections:**
- Remove physiologically implausible intervals
- Interpolate missing peaks
- Remove extra detected peaks (duplicates)
**Use case:**
- Artifact correction in R-R intervals
- Improve HRV analysis quality
- Respiratory or pulse peak correction
## Analysis Functions
### signal_rate()
Compute instantaneous rate from event occurrences (peaks).
```python
rate = nk.signal_rate(peaks, sampling_rate=1000, desired_length=None)
```
**Method:**
- Calculate inter-event intervals
- Convert to events per minute
- Interpolate to match desired length
**Use case:**
- Heart rate from R-peaks
- Breathing rate from respiratory peaks
- Any periodic event rate
### signal_period()
Find dominant period/frequency in signal.
```python
period = nk.signal_period(signal, sampling_rate=1000, method='autocorrelation',
show=False)
```
**Methods:**
- `'autocorrelation'`: Peak in autocorrelation function
- `'powerspectraldensity'`: Peak in frequency spectrum
**Returns:**
- Period in samples or seconds
- Frequency (1/period) in Hz
**Use case:**
- Detect dominant rhythm
- Estimate fundamental frequency
- Breathing rate, heart rate estimation
### signal_phase()
Compute instantaneous phase of signal.
```python
phase = nk.signal_phase(signal, method='hilbert')
```
**Methods:**
- `'hilbert'`: Hilbert transform (analytic signal)
- `'wavelet'`: Wavelet-based phase
**Returns:**
- Phase in radians (-π to π) or 0 to 1 (normalized)
**Use cases:**
- Phase-locked analysis
- Synchronization measures
- Phase-amplitude coupling
### signal_psd()
Compute Power Spectral Density.
```python
psd, freqs = nk.signal_psd(signal, sampling_rate=1000, method='welch',
max_frequency=None, show=False)
```
**Methods:**
- `'welch'`: Welch's periodogram (windowed FFT, default)
- `'multitapers'`: Multitaper method (superior spectral estimation)
- `'lomb'`: Lomb-Scargle (unevenly sampled data)
- `'burg'`: Autoregressive (parametric)
**Returns:**
- `psd`: Power at each frequency (units²/Hz)
- `freqs`: Frequency bins (Hz)
**Use case:**
- Frequency content analysis
- HRV frequency domain
- Spectral signatures
### signal_power()
Compute power in specific frequency bands.
```python
power_dict = nk.signal_power(signal, sampling_rate=1000, frequency_bands={
'VLF': (0.003, 0.04),
'LF': (0.04, 0.15),
'HF': (0.15, 0.4)
}, method='welch')
```
**Returns:**
- Dictionary with absolute and relative power per band
- Peak frequencies
**Use case:**
- HRV frequency analysis
- EEG band power
- Rhythm quantification
### signal_autocor()
Compute autocorrelation function.
```python
autocorr = nk.signal_autocor(signal, lag=1000, show=False)
```
**Interpretation:**
- High autocorrelation at lag: signal repeats every lag samples
- Periodic signals: peaks at multiples of period
- Random signals: rapid decay to zero
**Use cases:**
- Detect periodicity
- Assess temporal structure
- Memory in signal
### signal_zerocrossings()
Count zero crossings (sign changes).
```python
n_crossings = nk.signal_zerocrossings(signal)
```
**Interpretation:**
- More crossings: higher frequency content
- Related to dominant frequency (rough estimate)
**Use case:**
- Simple frequency estimation
- Signal regularity assessment
### signal_changepoints()
Detect abrupt changes in signal properties (mean, variance).
```python
changepoints = nk.signal_changepoints(signal, penalty=10, method='pelt', show=False)
```
**Methods:**
- `'pelt'`: Pruned Exact Linear Time (fast, exact)
- `'binseg'`: Binary segmentation (faster, approximate)
**Parameters:**
- `penalty`: Controls sensitivity (higher = fewer changepoints)
**Returns:**
- Indices of detected changepoints
- Segments between changepoints
**Use cases:**
- Segment signal into states
- Detect transitions (e.g., sleep stages, arousal states)
- Automatic epoch definition
### signal_synchrony()
Assess synchronization between two signals.
```python
sync = nk.signal_synchrony(signal1, signal2, method='correlation')
```
**Methods:**
- `'correlation'`: Pearson correlation
- `'coherence'`: Frequency-domain coherence
- `'mutual_information'`: Information-theoretic measure
- `'phase'`: Phase locking value
**Use cases:**
- Heart-brain coupling
- Inter-brain synchrony
- Multi-channel coordination
### signal_smooth()
Apply smoothing to reduce noise.
```python
smoothed = nk.signal_smooth(signal, method='convolution', kernel='boxzen', size=10)
```
**Methods:**
- `'convolution'`: Apply kernel (boxcar, Gaussian, etc.)
- `'median'`: Median filter (robust to outliers)
- `'savgol'`: Savitzky-Golay filter (preserves peaks)
- `'loess'`: Locally weighted regression
**Kernel types (for convolution):**
- `'boxcar'`: Simple moving average
- `'gaussian'`: Gaussian-weighted average
- `'hann'`, `'hamming'`, `'blackman'`: Windowing functions
**Use cases:**
- Noise reduction
- Trend extraction
- Visualization enhancement
### signal_timefrequency()
Time-frequency representation (spectrogram).
```python
tf, time, freq = nk.signal_timefrequency(signal, sampling_rate=1000, method='stft',
max_frequency=50, show=False)
```
**Methods:**
- `'stft'`: Short-Time Fourier Transform
- `'cwt'`: Continuous Wavelet Transform
**Returns:**
- `tf`: Time-frequency matrix (power at each time-frequency point)
- `time`: Time bins
- `freq`: Frequency bins
**Use cases:**
- Non-stationary signal analysis
- Time-varying frequency content
- EEG/MEG time-frequency analysis
## Simulation
### signal_simulate()
Generate various synthetic signals for testing.
```python
signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=[5, 10],
amplitude=[1.0, 0.5], noise=0.1)
```
**Signal types:**
- Sinusoidal oscillations (specify frequencies)
- Multiple frequency components
- Gaussian noise
- Combinations
**Use cases:**
- Algorithm testing
- Method validation
- Educational demonstrations
## Visualization
### signal_plot()
Visualize signal and optional markers.
```python
nk.signal_plot(signal, sampling_rate=1000, peaks=None, show=True)
```
**Features:**
- Time axis in seconds
- Peak markers
- Multiple subplots for signal arrays
## Practical Tips
**Choosing filter parameters:**
- **Lowcut**: Set below lowest frequency of interest
- **Highcut**: Set above highest frequency of interest
- **Order**: Start with 2-5, increase if transition too slow
- **Method**: Butterworth is safe default
**Handling edge effects:**
- Filtering introduces artifacts at signal edges
- Pad signal before filtering, then trim
- Or discard initial/final seconds
**Dealing with gaps:**
- Small gaps: `signal_fillmissing()` with interpolation
- Large gaps: Segment signal, analyze separately
- Mark gaps as NaN, use interpolation carefully
**Combining operations:**
```python
# Typical preprocessing pipeline
signal = nk.signal_sanitize(raw_signal) # Remove invalid values
signal = nk.signal_filter(signal, sampling_rate=1000, lowcut=0.5, highcut=40) # Bandpass
signal = nk.signal_detrend(signal, method='polynomial', order=1) # Remove linear trend
```
**Performance considerations:**
- Filtering: FFT-based methods faster for long signals
- Resampling: Downsample early in pipeline to speed up
- Large datasets: Process in chunks if memory-limited
## References
- Virtanen, P., et al. (2020). SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature methods, 17(3), 261-272.
- Tarvainen, M. P., Ranta-aho, P. O., & Karjalainen, P. A. (2002). An advanced detrending method with application to HRV analysis. IEEE Transactions on Biomedical Engineering, 49(2), 172-175.
- Huang, N. E., et al. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London A, 454(1971), 903-995.