Voice Activity Detection
Frame-based voice activity detection on a synthetic audio stream, using RMS energy, zero-crossing rate (ZCR), and spectral centroid as per-frame features. Median smoothing and merging of consecutive speech frames produce clean segment boundaries.
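As a rough illustration of the three per-frame features, here is a minimal sketch. It is not taken from the `speech_vad` example source: the function names `rms`, `zero_crossing_rate`, and `spectral_centroid`, the frame length, and the sample rate are all assumptions, and the centroid uses a naive O(n^2) DFT only to stay dependency-free.

```rust
use std::f32::consts::PI;

/// Root-mean-square energy of one frame (hypothetical helper).
fn rms(frame: &[f32]) -> f32 {
    if frame.is_empty() {
        return 0.0;
    }
    (frame.iter().map(|x| x * x).sum::<f32>() / frame.len() as f32).sqrt()
}

/// Fraction of adjacent sample pairs whose signs differ.
fn zero_crossing_rate(frame: &[f32]) -> f32 {
    if frame.len() < 2 {
        return 0.0;
    }
    let crossings = frame
        .windows(2)
        .filter(|w| (w[0] >= 0.0) != (w[1] >= 0.0))
        .count();
    crossings as f32 / (frame.len() - 1) as f32
}

/// Magnitude-weighted mean frequency in Hz, from a naive DFT over bins 0..=n/2.
fn spectral_centroid(frame: &[f32], sample_rate: f32) -> f32 {
    let n = frame.len();
    if n == 0 {
        return 0.0;
    }
    let (mut weighted, mut total) = (0.0f32, 0.0f32);
    for k in 0..=n / 2 {
        let (mut re, mut im) = (0.0f32, 0.0f32);
        for (t, &x) in frame.iter().enumerate() {
            // Reduce k*t modulo n so the phase stays small and accurate in f32.
            let phase = -2.0 * PI * ((k * t) % n) as f32 / n as f32;
            re += x * phase.cos();
            im += x * phase.sin();
        }
        let mag = (re * re + im * im).sqrt();
        weighted += mag * k as f32 * sample_rate / n as f32;
        total += mag;
    }
    if total > 0.0 { weighted / total } else { 0.0 }
}

fn main() {
    // Assumed test signal: one 256-sample frame of a 1 kHz tone at 8 kHz.
    let sample_rate = 8000.0f32;
    let frame: Vec<f32> = (0..256)
        .map(|t| 0.5 * (2.0 * PI * 1000.0 * t as f32 / sample_rate + 0.3).sin())
        .collect();
    println!("rms      = {:.4}", rms(&frame));
    println!("zcr      = {:.4}", zero_crossing_rate(&frame));
    println!("centroid = {:.1} Hz", spectral_centroid(&frame, sample_rate));
}
```

For real audio an FFT crate would normally replace the naive DFT; it is used here only so the sketch compiles with the standard library alone.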
CLI Equivalent
N/A
Key Concepts
- Frame-level feature extraction (RMS, ZCR, spectral centroid)
- Threshold-based speech/silence classification
- Median smoothing and segment merging
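The classification, smoothing, and merging steps listed above could look roughly like the sketch below. It is an illustration under stated assumptions, not the example's actual code: `Segment`, `classify`, `median_smooth`, and `merge_segments` are hypothetical names, the decision uses RMS alone with an arbitrary threshold of 0.1, and the window size of 3 is chosen only for the demonstration.

```rust
/// Half-open frame range [start, end) labelled as speech (hypothetical type).
#[derive(Debug, PartialEq)]
struct Segment {
    start: usize,
    end: usize,
}

/// Mark a frame as speech when its RMS energy exceeds a fixed threshold.
fn classify(rms_per_frame: &[f32], threshold: f32) -> Vec<bool> {
    rms_per_frame.iter().map(|&e| e > threshold).collect()
}

/// Median filter over boolean decisions, which reduces to a majority vote.
/// The window is clamped at the edges; ties count as silence.
fn median_smooth(flags: &[bool], window: usize) -> Vec<bool> {
    let half = window / 2;
    (0..flags.len())
        .map(|i| {
            let lo = i.saturating_sub(half);
            let hi = (i + half + 1).min(flags.len());
            let votes = flags[lo..hi].iter().filter(|&&f| f).count();
            2 * votes > hi - lo
        })
        .collect()
}

/// Collapse each run of consecutive speech frames into one segment.
fn merge_segments(flags: &[bool]) -> Vec<Segment> {
    let mut segments = Vec::new();
    let mut start = None;
    for (i, &is_speech) in flags.iter().enumerate() {
        match (is_speech, start) {
            (true, None) => start = Some(i),
            (false, Some(s)) => {
                segments.push(Segment { start: s, end: i });
                start = None;
            }
            _ => {}
        }
    }
    // Close a segment that runs to the end of the stream.
    if let Some(s) = start {
        segments.push(Segment { start: s, end: flags.len() });
    }
    segments
}

fn main() {
    // Made-up per-frame RMS values with a one-frame dropout and a one-frame blip.
    let energies = [
        0.01, 0.02, 0.30, 0.35, 0.02, 0.40, 0.33, 0.31, 0.01, 0.01, 0.25, 0.02, 0.01,
    ];
    let raw = classify(&energies, 0.1);
    let smoothed = median_smooth(&raw, 3);
    println!("raw segments:      {:?}", merge_segments(&raw));
    println!("smoothed segments: {:?}", merge_segments(&smoothed));
}
```

Smoothing before merging is what keeps a single misclassified frame from splitting one utterance into two segments or from creating a spurious one-frame segment.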
Run
cargo run --example speech_vad