From 2014 to 2016, I worked as an undergraduate researcher on speech pathology detection projects. One of my main tasks was developing robust voice activity detection for elderly speakers.
In 2015, the most popular approach to voice activity detection was pitch-based endpoint detection. Pitch was commonly computed with the autocorrelation function (ACF) or the average magnitude difference function (AMDF). However, when applied to time-domain speech, ACF and AMDF produce pitch estimates not only for voiced sounds but also for unvoiced sounds.
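To make the ACF and AMDF idea concrete, here is a minimal Python sketch of frame-level pitch estimation. It is not the repository's MATLAB code: the function names, the 60-400 Hz search range, and the plain peak/valley picking are assumptions chosen for illustration.

```python
def acf(frame, max_lag):
    """Autocorrelation r[k] = sum_n x[n] * x[n + k] for k = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def amdf(frame, max_lag):
    """Average magnitude difference d[k] = mean_n |x[n] - x[n + k]|."""
    n = len(frame)
    return [sum(abs(frame[i] - frame[i + k]) for i in range(n - k)) / (n - k)
            for k in range(max_lag + 1)]


def pitch_acf(frame, fs, f_min=60.0, f_max=400.0):
    """Pitch in Hz from the lag with the largest ACF value in the search range."""
    lo, hi = int(fs / f_max), int(fs / f_min)  # assumed search range in samples
    r = acf(frame, hi)
    lag = max(range(lo, hi + 1), key=lambda k: r[k])
    return fs / lag


def pitch_amdf(frame, fs, f_min=60.0, f_max=400.0):
    """Pitch in Hz from the lag with the smallest AMDF value in the search range."""
    lo, hi = int(fs / f_max), int(fs / f_min)  # assumed search range in samples
    d = amdf(frame, hi)
    lag = min(range(lo, hi + 1), key=lambda k: d[k])
    return fs / lag
```

Both estimators return a lag for any frame, including noise-like unvoiced frames, which is the limitation described above: the output alone does not say whether the frame was voiced.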
For speech pathology detection, capturing the voiced sounds produced by the vocal cords is crucial. Elderly people's speech contains many unvoiced sounds, so voice activity detection algorithms that are robust to unvoiced sounds are important for more accurate pathology detection in the elderly population.
I worked on Higher Order Differential Energy Operators (HODEO) to design novel voice activity detection algorithms. Through this work, I delved deeply into speech signal processing and implemented and experimented with various voice activity detection algorithms from scratch!
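As a rough illustration of the operator family, the sketch below uses a commonly used discrete definition of the k-th order differential energy operator, `Y_k[n] = x[n]*x[n+k-2] - x[n-1]*x[n+k-1]`, which reduces to the Teager-Kaiser energy operator for k = 2. The framing and the relative threshold rule are hypothetical additions for illustration; the operator order and the decision rule used in this repository are not stated here, so this should not be read as the repository's algorithm.

```python
def deo(x, k=2):
    """k-th order differential energy operator for k >= 2.

    Y_k[n] = x[n] * x[n + k - 2] - x[n - 1] * x[n + k - 1],
    evaluated only where every index is inside the signal.
    """
    if k < 2:
        raise ValueError("this sketch only handles k >= 2")
    return [x[n] * x[n + k - 2] - x[n - 1] * x[n + k - 1]
            for n in range(1, len(x) - k + 1)]


def frame_energy(x, frame_len, hop, k=2):
    """Mean absolute operator output per frame (hypothetical framing scheme)."""
    e = deo(x, k)
    return [sum(abs(v) for v in e[s:s + frame_len]) / frame_len
            for s in range(0, len(e) - frame_len + 1, hop)]


def detect_active(energies, ratio=0.1):
    """Flag frames above a fraction of the peak energy (assumed decision rule)."""
    threshold = ratio * max(energies)
    return [v > threshold for v in energies]
```

For a pure tone `A*cos(W*n)`, the k = 2 output is the constant `A**2 * sin(W)**2`, so the operator responds to both amplitude and frequency, which is what makes it a candidate feature for separating speech activity from background.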
This GitHub repository provides ACF-, AMDF-, and HODEO-based voice activity detection code.
To use it, provide your recording as data.wav and run 'Code/Moon_2016_Scratch_Voice_Activity_Detection.m'.
We found that the HODEO-based voice activity detection (VAD) approach might be better suited to the elderly population than the ACF- and AMDF-based methods. However, since I had access to only a limited amount of elderly speech data, further validation would require a variety of datasets with appropriate ground truth.
The HODEO-based voice activity detection module was integrated into my Speech Analysis Software after 2016!
Please feel free to contact me at [email protected] if you are interested! :)