Speaker Independent Voiced-Unvoiced Detection Evaluated in Different Speaking Styles

Martin Heckmann, Marco Moebus, Frank Joublin, Christian Goerick, "Speaker Independent Voiced-Unvoiced Detection Evaluated in Different Speaking Styles", Proceedings of Interspeech, pp. 1670-1673, 2006.


We propose a new algorithm for voiced/unvoiced classification of speech at the phoneme or sample level. The algorithm is inspired by auditory-based approaches and combines two cues: one based on the energy distribution of the signal, the other on its harmonicity. To extract the harmonicity, we apply a Gammatone filterbank to the signal and calculate a histogram of the zero crossings of the filter channels. A measure similar to the variance of the zero crossings yields the harmonicity cue. The performance of the algorithm was measured on several minutes of read and spontaneous speech from various speakers. An algorithm proposed by Mustafa et al. \cite{Mustafa06} served as a benchmark. The results show that our algorithm performs significantly better on both read and spontaneous speech and, in particular, seems better able to cope with different speaking styles.
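
The harmonicity cue can be illustrated with a rough sketch. The abstract does not give the exact histogram or variance-like measure, so the code below uses a stand-in: the coefficient of variation of the intervals between upward zero crossings in each Gammatone channel, averaged over channels. The function names, the channel centre frequencies, the ERB-based bandwidth, and the FIR filter approximation are all assumptions made for illustration, and the energy cue is left out. Lower scores suggest a more harmonic (voiced-like) signal.

```python
import math
import random


def gammatone_ir(fc, fs, order=4, duration=0.04):
    """Truncated gammatone impulse response centred on fc (Hz)."""
    # Bandwidth from the Glasberg-Moore ERB formula (an assumption,
    # not a value taken from the paper).
    b = 1.019 * 24.7 * (4.37 * fc / 1000.0 + 1.0)
    return [
        (n / fs) ** (order - 1)
        * math.exp(-2.0 * math.pi * b * n / fs)
        * math.cos(2.0 * math.pi * fc * n / fs)
        for n in range(int(duration * fs))
    ]


def filter_valid(signal, ir):
    """FIR filtering, keeping only samples where the filter fully overlaps."""
    taps = len(ir)
    return [
        sum(ir[k] * signal[n - k] for k in range(taps))
        for n in range(taps - 1, len(signal))
    ]


def upward_zero_crossings(x):
    """Negative-to-positive crossing times in samples, linearly interpolated."""
    times = []
    for n in range(1, len(x)):
        if x[n - 1] < 0.0 <= x[n]:
            times.append(n - 1 + x[n - 1] / (x[n - 1] - x[n]))
    return times


def interval_irregularity(signal, fs,
                          center_freqs=(150, 250, 350, 450, 550, 650, 750, 850)):
    """Mean coefficient of variation of zero-crossing intervals per channel.

    Hypothetical stand-in for the paper's harmonicity measure.
    """
    scores = []
    for fc in center_freqs:
        channel = filter_valid(signal, gammatone_ir(fc, fs))
        times = upward_zero_crossings(channel)
        intervals = [t1 - t0 for t0, t1 in zip(times, times[1:])]
        if len(intervals) < 2:
            continue  # too few crossings to judge regularity
        mean = sum(intervals) / len(intervals)
        var = sum((d - mean) ** 2 for d in intervals) / len(intervals)
        scores.append(math.sqrt(var) / mean)
    return sum(scores) / len(scores) if scores else float("nan")


# Synthetic test signals (not speech data): a 200 Hz harmonic complex
# as a voiced-like sound, and white Gaussian noise as an unvoiced-like sound.
fs = 8000
num_samples = 960  # 120 ms
voiced_like = [
    sum(math.cos(2.0 * math.pi * 200.0 * k * n / fs) / k for k in range(1, 5))
    for n in range(num_samples)
]
rng = random.Random(0)
unvoiced_like = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]

voiced_score = interval_irregularity(voiced_like, fs)
unvoiced_score = interval_irregularity(unvoiced_like, fs)
```

In this sketch the harmonic complex should yield a clearly lower score than the noise, because each channel locks onto a nearby harmonic and its zero crossings become nearly periodic. A real classifier would still need a threshold or a trained decision rule, and would combine this with the energy cue described in the abstract.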
