
Auditory Gist

Ming Liu, "Auditory Gist", 2010.

Abstract

This thesis develops auditory gist analysis and sound signal processing methods for auditory gist classification. More generally, it addresses the problem of obtaining the auditory gist, which is generally assumed to be a rough perception of a sound. The idea of auditory gist comes from visual gist: after just a glance at an image, people can immediately get a rough idea of the image's main scene information. In the same way, the gist of an auditory scene can be perceived from a sound of short duration. After hearing one short sound, people can quickly say what the sound is, where it happened, or give some other description of the environment; this is auditory gist. This thesis obtains auditory gist features with a computational sound processing method: (1) take sounds from the BBC sound library; (2) preprocess the sounds with Gammatone filter banks, inspired by human sound processing; (3) extract spectral-temporal features by audio Gabor filtering; (4) average the features over a few seconds; (5) compress the feature vector with PCA (Principal Component Analysis); (6) use K-nearest neighbor (K-NN) to compare against a database of reference sounds. These features capture rough information about the overall scene. The core of the thesis is how to obtain the auditory gist and perform audio classification. The auditory gist part involves sound signal processing, feature selection, and feature extraction. The audio classification part uses a suitable classification algorithm. The classification results show that the best recognition rate obtained for 10 different scenes was 77.5%, which is substantially better than the random-guess probability of 10.0%. They also demonstrate that the recognition rate improves as the sound duration increases.
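The last three stages of the pipeline (time averaging, PCA compression, and K-NN comparison against reference sounds) can be illustrated with a minimal sketch. It is not the thesis implementation: the Gammatone and Gabor stages are not reproduced, and synthetic per-frame vectors stand in for their output. The function names, the scene labels "street" and "forest", the feature dimension, and the choice of k = 3 are all assumptions made for illustration, and PCA is computed here by simple power iteration so that only the Python standard library is needed.

```python
import math
import random
from collections import Counter

def average_over_time(frames):
    """Collapse a list of per-frame feature vectors into one mean vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def fit_pca(vectors, n_components, n_iter=200, seed=0):
    """Approximate the top principal components by power iteration with deflation."""
    rng = random.Random(seed)
    n, d = len(vectors), len(vectors[0])
    mean = [sum(col) / n for col in zip(*vectors)]
    centered = [[x - m for x, m in zip(v, mean)] for v in vectors]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        w = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(n_iter):
            w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w)) or 1.0
            w = [x / norm for x in w]
        # Rayleigh quotient gives the variance captured along w.
        eigval = sum(w[i] * sum(cov[i][j] * w[j] for j in range(d))
                     for i in range(d))
        components.append(w)
        # Deflate: remove the direction just found from the covariance.
        cov = [[cov[i][j] - eigval * w[i] * w[j] for j in range(d)]
               for i in range(d)]
    return mean, components

def project(vector, mean, components):
    """Map a feature vector onto the retained principal components."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]

def knn_predict(query, references, labels, k=3):
    """Majority vote among the k reference vectors closest to the query."""
    nearest = sorted((math.dist(query, ref), lab)
                     for ref, lab in zip(references, labels))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]

def make_clip(rng, centre, n_frames=20):
    """Synthetic stand-in for per-frame spectral-temporal features (assumption)."""
    return [[c + rng.gauss(0, 0.5) for c in centre] for _ in range(n_frames)]

rng = random.Random(1)
# Hypothetical scene classes with made-up feature centres.
centres = {
    "street": [2.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    "forest": [0.0, 2.0, 0.0, 1.0, 0.0, 0.0],
}
train_x, train_y = [], []
for label, centre in centres.items():
    for _ in range(10):
        train_x.append(average_over_time(make_clip(rng, centre)))
        train_y.append(label)

mean, comps = fit_pca(train_x, n_components=2)
reference = [project(v, mean, comps) for v in train_x]

query_clip = make_clip(rng, centres["forest"])
query = project(average_over_time(query_clip), mean, comps)
prediction = knn_predict(query, reference, train_y, k=3)
print(prediction)
```

Averaging before PCA means each clip is represented by a single vector, regardless of its length. That fits the reported trend that longer sounds are recognised better, since a longer average is less noisy.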


