649 lines
16 KiB
Markdown
649 lines
16 KiB
Markdown
# General Signal Processing
|
|
|
|
## Overview
|
|
|
|
NeuroKit2 provides comprehensive signal processing utilities applicable to any time series data. These functions support filtering, transformation, peak detection, decomposition, and analysis operations that work across all signal types.
|
|
|
|
## Preprocessing Functions
|
|
|
|
### signal_filter()
|
|
|
|
Apply frequency-domain filtering to remove noise or isolate frequency bands.
|
|
|
|
```python
|
|
filtered = nk.signal_filter(signal, sampling_rate=1000, lowcut=None, highcut=None,
|
|
method='butterworth', order=5)
|
|
```
|
|
|
|
**Filter types (via lowcut/highcut combinations):**
|
|
|
|
**Lowpass** (highcut only):
|
|
```python
|
|
lowpass = nk.signal_filter(signal, sampling_rate=1000, highcut=50)
|
|
```
|
|
- Removes frequencies above highcut
|
|
- Smooths signal, removes high-frequency noise
|
|
|
|
**Highpass** (lowcut only):
|
|
```python
|
|
highpass = nk.signal_filter(signal, sampling_rate=1000, lowcut=0.5)
|
|
```
|
|
- Removes frequencies below lowcut
|
|
- Removes baseline drift, DC offset
|
|
|
|
**Bandpass** (both lowcut and highcut):
|
|
```python
|
|
bandpass = nk.signal_filter(signal, sampling_rate=1000, lowcut=0.5, highcut=50)
|
|
```
|
|
- Retains frequencies between lowcut and highcut
|
|
- Isolates specific frequency band
|
|
|
|
**Bandstop/Notch** (powerline removal):
|
|
```python
|
|
notch = nk.signal_filter(signal, sampling_rate=1000, method='powerline', powerline=50)
|
|
```
|
|
- Removes 50 or 60 Hz powerline noise
|
|
- Narrow notch filter
|
|
|
|
**Methods:**
|
|
- `'butterworth'` (default): Smooth frequency response, flat passband
|
|
- `'bessel'`: Linear phase, minimal ringing
|
|
- `'chebyshev1'`: Steeper rolloff, ripple in passband
|
|
- `'chebyshev2'`: Steeper rolloff, ripple in stopband
|
|
- `'elliptic'`: Steepest rolloff, ripple in both bands
|
|
- `'powerline'`: Notch filter for 50/60 Hz
|
|
|
|
**Order parameter:**
|
|
- Higher order: Steeper transition, more ringing
|
|
- Lower order: Gentler transition, less ringing
|
|
- Typical: 2-5 for physiological signals
|
|
|
|
### signal_sanitize()
|
|
|
|
Remove invalid values (NaN, inf) and optionally interpolate.
|
|
|
|
```python
|
|
clean_signal = nk.signal_sanitize(signal, interpolate=True)
|
|
```
|
|
|
|
**Use cases:**
|
|
- Handle missing data points
|
|
- Remove artifacts marked as NaN
|
|
- Prepare signal for algorithms requiring continuous data
|
|
|
|
### signal_resample()
|
|
|
|
Change sampling rate of signal (upsample or downsample).
|
|
|
|
```python
|
|
resampled = nk.signal_resample(signal, sampling_rate=1000, desired_sampling_rate=500,
|
|
method='interpolation')
|
|
```
|
|
|
|
**Methods:**
|
|
- `'interpolation'`: Cubic spline interpolation
|
|
- `'FFT'`: Frequency-domain resampling
|
|
- `'poly'`: Polyphase filtering (best for downsampling)
|
|
|
|
**Use cases:**
|
|
- Match sampling rates across multi-modal recordings
|
|
- Reduce data size (downsample)
|
|
- Increase temporal resolution (upsample)
|
|
|
|
### signal_fillmissing()
|
|
|
|
Interpolate missing or invalid data points.
|
|
|
|
```python
|
|
filled = nk.signal_fillmissing(signal, method='linear')
|
|
```
|
|
|
|
**Methods:**
|
|
- `'linear'`: Linear interpolation
|
|
- `'nearest'`: Nearest neighbor
|
|
- `'pad'`: Forward/backward fill
|
|
- `'cubic'`: Cubic spline
|
|
- `'polynomial'`: Polynomial fitting
|
|
|
|
## Transformation Functions
|
|
|
|
### signal_detrend()
|
|
|
|
Remove slow trends from signal.
|
|
|
|
```python
|
|
detrended = nk.signal_detrend(signal, method='polynomial', order=1)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'polynomial'`: Fit and subtract polynomial (order 1 = linear)
|
|
- `'loess'`: Locally weighted regression
|
|
- `'tarvainen2002'`: Smoothness priors detrending
|
|
|
|
**Use cases:**
|
|
- Remove baseline drift
|
|
- Stabilize mean before analysis
|
|
- Prepare for stationarity-assuming algorithms
|
|
|
|
### signal_decompose()
|
|
|
|
Decompose signal into constituent components.
|
|
|
|
```python
|
|
components = nk.signal_decompose(signal, sampling_rate=1000, method='emd')
|
|
```
|
|
|
|
**Methods:**
|
|
|
|
**Empirical Mode Decomposition (EMD):**
|
|
```python
|
|
components = nk.signal_decompose(signal, sampling_rate=1000, method='emd')
|
|
```
|
|
- Data-adaptive decomposition into Intrinsic Mode Functions (IMFs)
|
|
- Each IMF represents different frequency content (high to low)
|
|
- No predefined basis functions
|
|
|
|
**Singular Spectrum Analysis (SSA):**
|
|
```python
|
|
components = nk.signal_decompose(signal, method='ssa')
|
|
```
|
|
- Decomposes into trend, oscillations, and noise
|
|
- Based on eigenvalue decomposition of trajectory matrix
|
|
|
|
**Wavelet decomposition:**
|
|
- Time-frequency representation
|
|
- Localized in both time and frequency
|
|
|
|
**Returns:**
|
|
- Dictionary with component signals
|
|
- Trend, oscillatory components, residual
|
|
|
|
**Use cases:**
|
|
- Isolate physiological rhythms
|
|
- Separate signal from noise
|
|
- Multi-scale analysis
|
|
|
|
### signal_recompose()
|
|
|
|
Reconstruct signal from decomposed components.
|
|
|
|
```python
|
|
reconstructed = nk.signal_recompose(components, indices=[1, 2, 3])
|
|
```
|
|
|
|
**Use case:**
|
|
- Selective reconstruction after decomposition
|
|
- Remove specific IMFs or components
|
|
- Adaptive filtering
|
|
|
|
### signal_binarize()
|
|
|
|
Convert continuous signal to binary (0/1) based on threshold.
|
|
|
|
```python
|
|
binary = nk.signal_binarize(signal, method='threshold', threshold=0.5)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'threshold'`: Simple threshold
|
|
- `'median'`: Median-based
|
|
- `'mean'`: Mean-based
|
|
- `'quantile'`: Percentile-based
|
|
|
|
**Use case:**
|
|
- Event detection from continuous signal
|
|
- Trigger extraction
|
|
- State classification
|
|
|
|
### signal_distort()
|
|
|
|
Add controlled noise or artifacts for testing.
|
|
|
|
```python
|
|
distorted = nk.signal_distort(signal, sampling_rate=1000, noise_amplitude=0.1,
|
|
noise_frequency=50, artifacts_amplitude=0.5)
|
|
```
|
|
|
|
**Parameters:**
|
|
- `noise_amplitude`: Gaussian noise level
|
|
- `noise_frequency`: Sinusoidal interference (e.g., powerline)
|
|
- `artifacts_amplitude`: Random spike artifacts
|
|
- `artifacts_number`: Number of artifacts to add
|
|
|
|
**Use cases:**
|
|
- Algorithm robustness testing
|
|
- Preprocessing method evaluation
|
|
- Realistic data simulation
|
|
|
|
### signal_interpolate()
|
|
|
|
Interpolate signal at new time points or fill gaps.
|
|
|
|
```python
|
|
interpolated = nk.signal_interpolate(x_values, y_values, x_new=None, method='quadratic')
|
|
```
|
|
|
|
**Methods:**
|
|
- `'linear'`, `'quadratic'`, `'cubic'`: Polynomial interpolation
|
|
- `'nearest'`: Nearest neighbor
|
|
- `'monotone_cubic'`: Preserves monotonicity
|
|
|
|
**Use case:**
|
|
- Convert irregular samples to regular grid
|
|
- Upsample for visualization
|
|
- Align signals with different time bases
|
|
|
|
### signal_merge()
|
|
|
|
Combine multiple signals with different sampling rates.
|
|
|
|
```python
|
|
merged = nk.signal_merge(signal1, signal2, time1=None, time2=None, sampling_rate=None)
|
|
```
|
|
|
|
**Use case:**
|
|
- Multi-modal signal integration
|
|
- Combine data from different devices
|
|
- Synchronize based on timestamps
|
|
|
|
### signal_flatline()
|
|
|
|
Identify periods of constant signal (artifacts or sensor failure).
|
|
|
|
```python
|
|
flatline_mask = nk.signal_flatline(signal, duration=5.0, sampling_rate=1000)
|
|
```
|
|
|
|
**Returns:**
|
|
- Binary mask where True indicates flatline periods
|
|
- Duration threshold prevents false positives from normal stability
|
|
|
|
### signal_noise()
|
|
|
|
Add various types of noise to signal.
|
|
|
|
```python
|
|
noisy = nk.signal_noise(signal, sampling_rate=1000, noise_type='gaussian',
|
|
amplitude=0.1)
|
|
```
|
|
|
|
**Noise types:**
|
|
- `'gaussian'`: White noise
|
|
- `'pink'`: 1/f noise (common in physiological signals)
|
|
- `'brown'`: Brownian (random walk)
|
|
- `'powerline'`: Sinusoidal interference (50/60 Hz)
|
|
|
|
### signal_surrogate()
|
|
|
|
Generate surrogate signals preserving certain properties.
|
|
|
|
```python
|
|
surrogate = nk.signal_surrogate(signal, method='IAAFT')
|
|
```
|
|
|
|
**Methods:**
|
|
- `'IAAFT'`: Iterated Amplitude Adjusted Fourier Transform
|
|
- Preserves amplitude distribution and power spectrum
|
|
- `'random_shuffle'`: Random permutation (null hypothesis testing)
|
|
|
|
**Use case:**
|
|
- Nonlinearity testing
|
|
- Null hypothesis generation for statistical tests
|
|
|
|
## Peak Detection and Correction
|
|
|
|
### signal_findpeaks()
|
|
|
|
Detect local maxima (peaks) in signal.
|
|
|
|
```python
|
|
peaks_dict = nk.signal_findpeaks(signal, height_min=None, height_max=None,
|
|
relative_height_min=None, relative_height_max=None)
|
|
```
|
|
|
|
**Key parameters:**
|
|
- `height_min/max`: Absolute amplitude thresholds
|
|
- `relative_height_min/max`: Relative to signal range (0-1)
|
|
- `threshold`: Minimum prominence
|
|
- `distance`: Minimum samples between peaks
|
|
|
|
**Returns:**
|
|
- Dictionary with:
|
|
- `'Peaks'`: Peak indices
|
|
- `'Height'`: Peak amplitudes
|
|
- `'Distance'`: Inter-peak intervals
|
|
|
|
**Use cases:**
|
|
- Generic peak detection for any signal
|
|
- R-peaks, respiratory peaks, pulse peaks
|
|
- Event detection
|
|
|
|
### signal_fixpeaks()
|
|
|
|
Correct detected peaks for artifacts and anomalies.
|
|
|
|
```python
|
|
corrected = nk.signal_fixpeaks(peaks, sampling_rate=1000, iterative=True,
|
|
method='Kubios', interval_min=None, interval_max=None)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'Kubios'`: Kubios HRV software method (default)
|
|
- `'Malik1996'`: Task Force Standards (1996)
|
|
- `'Kamath1993'`: Kamath's approach
|
|
|
|
**Corrections:**
|
|
- Remove physiologically implausible intervals
|
|
- Interpolate missing peaks
|
|
- Remove extra detected peaks (duplicates)
|
|
|
|
**Use case:**
|
|
- Artifact correction in R-R intervals
|
|
- Improve HRV analysis quality
|
|
- Respiratory or pulse peak correction
|
|
|
|
## Analysis Functions
|
|
|
|
### signal_rate()
|
|
|
|
Compute instantaneous rate from event occurrences (peaks).
|
|
|
|
```python
|
|
rate = nk.signal_rate(peaks, sampling_rate=1000, desired_length=None)
|
|
```
|
|
|
|
**Method:**
|
|
- Calculate inter-event intervals
|
|
- Convert to events per minute
|
|
- Interpolate to match desired length
|
|
|
|
**Use case:**
|
|
- Heart rate from R-peaks
|
|
- Breathing rate from respiratory peaks
|
|
- Any periodic event rate
|
|
|
|
### signal_period()
|
|
|
|
Find dominant period/frequency in signal.
|
|
|
|
```python
|
|
period = nk.signal_period(signal, sampling_rate=1000, method='autocorrelation',
|
|
show=False)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'autocorrelation'`: Peak in autocorrelation function
|
|
- `'powerspectraldensity'`: Peak in frequency spectrum
|
|
|
|
**Returns:**
|
|
- Period in samples or seconds
|
|
- Frequency (1/period) in Hz
|
|
|
|
**Use case:**
|
|
- Detect dominant rhythm
|
|
- Estimate fundamental frequency
|
|
- Breathing rate, heart rate estimation
|
|
|
|
### signal_phase()
|
|
|
|
Compute instantaneous phase of signal.
|
|
|
|
```python
|
|
phase = nk.signal_phase(signal, method='hilbert')
|
|
```
|
|
|
|
**Methods:**
|
|
- `'hilbert'`: Hilbert transform (analytic signal)
|
|
- `'wavelet'`: Wavelet-based phase
|
|
|
|
**Returns:**
|
|
- Phase in radians (-π to π) or 0 to 1 (normalized)
|
|
|
|
**Use cases:**
|
|
- Phase-locked analysis
|
|
- Synchronization measures
|
|
- Phase-amplitude coupling
|
|
|
|
### signal_psd()
|
|
|
|
Compute Power Spectral Density.
|
|
|
|
```python
|
|
psd, freqs = nk.signal_psd(signal, sampling_rate=1000, method='welch',
|
|
max_frequency=None, show=False)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'welch'`: Welch's periodogram (windowed FFT, default)
|
|
- `'multitapers'`: Multitaper method (superior spectral estimation)
|
|
- `'lomb'`: Lomb-Scargle (unevenly sampled data)
|
|
- `'burg'`: Autoregressive (parametric)
|
|
|
|
**Returns:**
|
|
- `psd`: Power at each frequency (units²/Hz)
|
|
- `freqs`: Frequency bins (Hz)
|
|
|
|
**Use case:**
|
|
- Frequency content analysis
|
|
- HRV frequency domain
|
|
- Spectral signatures
|
|
|
|
### signal_power()
|
|
|
|
Compute power in specific frequency bands.
|
|
|
|
```python
|
|
power_dict = nk.signal_power(signal, sampling_rate=1000, frequency_bands={
|
|
'VLF': (0.003, 0.04),
|
|
'LF': (0.04, 0.15),
|
|
'HF': (0.15, 0.4)
|
|
}, method='welch')
|
|
```
|
|
|
|
**Returns:**
|
|
- Dictionary with absolute and relative power per band
|
|
- Peak frequencies
|
|
|
|
**Use case:**
|
|
- HRV frequency analysis
|
|
- EEG band power
|
|
- Rhythm quantification
|
|
|
|
### signal_autocor()
|
|
|
|
Compute autocorrelation function.
|
|
|
|
```python
|
|
autocorr = nk.signal_autocor(signal, lag=1000, show=False)
|
|
```
|
|
|
|
**Interpretation:**
|
|
- High autocorrelation at lag: signal repeats every lag samples
|
|
- Periodic signals: peaks at multiples of period
|
|
- Random signals: rapid decay to zero
|
|
|
|
**Use cases:**
|
|
- Detect periodicity
|
|
- Assess temporal structure
|
|
- Memory in signal
|
|
|
|
### signal_zerocrossings()
|
|
|
|
Count zero crossings (sign changes).
|
|
|
|
```python
|
|
n_crossings = nk.signal_zerocrossings(signal)
|
|
```
|
|
|
|
**Interpretation:**
|
|
- More crossings: higher frequency content
|
|
- Related to dominant frequency (rough estimate)
|
|
|
|
**Use case:**
|
|
- Simple frequency estimation
|
|
- Signal regularity assessment
|
|
|
|
### signal_changepoints()
|
|
|
|
Detect abrupt changes in signal properties (mean, variance).
|
|
|
|
```python
|
|
changepoints = nk.signal_changepoints(signal, penalty=10, method='pelt', show=False)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'pelt'`: Pruned Exact Linear Time (fast, exact)
|
|
- `'binseg'`: Binary segmentation (faster, approximate)
|
|
|
|
**Parameters:**
|
|
- `penalty`: Controls sensitivity (higher = fewer changepoints)
|
|
|
|
**Returns:**
|
|
- Indices of detected changepoints
|
|
- Segments between changepoints
|
|
|
|
**Use cases:**
|
|
- Segment signal into states
|
|
- Detect transitions (e.g., sleep stages, arousal states)
|
|
- Automatic epoch definition
|
|
|
|
### signal_synchrony()
|
|
|
|
Assess synchronization between two signals.
|
|
|
|
```python
|
|
sync = nk.signal_synchrony(signal1, signal2, method='correlation')
|
|
```
|
|
|
|
**Methods:**
|
|
- `'correlation'`: Pearson correlation
|
|
- `'coherence'`: Frequency-domain coherence
|
|
- `'mutual_information'`: Information-theoretic measure
|
|
- `'phase'`: Phase locking value
|
|
|
|
**Use cases:**
|
|
- Heart-brain coupling
|
|
- Inter-brain synchrony
|
|
- Multi-channel coordination
|
|
|
|
### signal_smooth()
|
|
|
|
Apply smoothing to reduce noise.
|
|
|
|
```python
|
|
smoothed = nk.signal_smooth(signal, method='convolution', kernel='boxzen', size=10)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'convolution'`: Apply kernel (boxcar, Gaussian, etc.)
|
|
- `'median'`: Median filter (robust to outliers)
|
|
- `'savgol'`: Savitzky-Golay filter (preserves peaks)
|
|
- `'loess'`: Locally weighted regression
|
|
|
|
**Kernel types (for convolution):**
|
|
- `'boxcar'`: Simple moving average
|
|
- `'gaussian'`: Gaussian-weighted average
|
|
- `'hann'`, `'hamming'`, `'blackman'`: Windowing functions
|
|
|
|
**Use cases:**
|
|
- Noise reduction
|
|
- Trend extraction
|
|
- Visualization enhancement
|
|
|
|
### signal_timefrequency()
|
|
|
|
Time-frequency representation (spectrogram).
|
|
|
|
```python
|
|
tf, time, freq = nk.signal_timefrequency(signal, sampling_rate=1000, method='stft',
|
|
max_frequency=50, show=False)
|
|
```
|
|
|
|
**Methods:**
|
|
- `'stft'`: Short-Time Fourier Transform
|
|
- `'cwt'`: Continuous Wavelet Transform
|
|
|
|
**Returns:**
|
|
- `tf`: Time-frequency matrix (power at each time-frequency point)
|
|
- `time`: Time bins
|
|
- `freq`: Frequency bins
|
|
|
|
**Use cases:**
|
|
- Non-stationary signal analysis
|
|
- Time-varying frequency content
|
|
- EEG/MEG time-frequency analysis
|
|
|
|
## Simulation
|
|
|
|
### signal_simulate()
|
|
|
|
Generate various synthetic signals for testing.
|
|
|
|
```python
|
|
signal = nk.signal_simulate(duration=10, sampling_rate=1000, frequency=[5, 10],
|
|
amplitude=[1.0, 0.5], noise=0.1)
|
|
```
|
|
|
|
**Signal types:**
|
|
- Sinusoidal oscillations (specify frequencies)
|
|
- Multiple frequency components
|
|
- Gaussian noise
|
|
- Combinations
|
|
|
|
**Use cases:**
|
|
- Algorithm testing
|
|
- Method validation
|
|
- Educational demonstrations
|
|
|
|
## Visualization
|
|
|
|
### signal_plot()
|
|
|
|
Visualize signal and optional markers.
|
|
|
|
```python
|
|
nk.signal_plot(signal, sampling_rate=1000, peaks=None, show=True)
|
|
```
|
|
|
|
**Features:**
|
|
- Time axis in seconds
|
|
- Peak markers
|
|
- Multiple subplots for signal arrays
|
|
|
|
## Practical Tips
|
|
|
|
**Choosing filter parameters:**
|
|
- **Lowcut**: Set below lowest frequency of interest
|
|
- **Highcut**: Set above highest frequency of interest
|
|
- **Order**: Start with 2-5, increase if transition too slow
|
|
- **Method**: Butterworth is safe default
|
|
|
|
**Handling edge effects:**
|
|
- Filtering introduces artifacts at signal edges
|
|
- Pad signal before filtering, then trim
|
|
- Or discard initial/final seconds
|
|
|
|
**Dealing with gaps:**
|
|
- Small gaps: `signal_fillmissing()` with interpolation
|
|
- Large gaps: Segment signal, analyze separately
|
|
- Mark gaps as NaN, use interpolation carefully
|
|
|
|
**Combining operations:**
|
|
```python
|
|
# Typical preprocessing pipeline
|
|
signal = nk.signal_sanitize(raw_signal) # Remove invalid values
|
|
signal = nk.signal_filter(signal, sampling_rate=1000, lowcut=0.5, highcut=40) # Bandpass
|
|
signal = nk.signal_detrend(signal, method='polynomial', order=1) # Remove linear trend
|
|
```
|
|
|
|
**Performance considerations:**
|
|
- Filtering: FFT-based methods faster for long signals
|
|
- Resampling: Downsample early in pipeline to speed up
|
|
- Large datasets: Process in chunks if memory-limited
|
|
|
|
## References
|
|
|
|
- Virtanen, P., et al. (2020). SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature methods, 17(3), 261-272.
|
|
- Tarvainen, M. P., Ranta-aho, P. O., & Karjalainen, P. A. (2002). An advanced detrending method with application to HRV analysis. IEEE Transactions on Biomedical Engineering, 49(2), 172-175.
|
|
- Huang, N. E., et al. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London A, 454(1971), 903-995.
|