A Hierarchical Framework for Spectro-Temporal Feature Extraction

Martin Heckmann, Xavier Domont, Frank Joublin, Christian Goerick, "A Hierarchical Framework for Spectro-Temporal Feature Extraction", Speech Communication, vol. 53, no. 5, pp. 736-752, 2011.

Abstract

In this paper, we present a hierarchical framework for the extraction of spectro-temporal acoustic features. The design of the features targets higher robustness in dynamic environments. Motivated by the large gap between human and machine performance in such conditions, we take inspiration from the organization of the mammalian auditory cortex in the design of our features. This includes the joint processing of spectral and temporal information, the organization in hierarchical layers, competition between coequal features, the use of high-dimensional sparse feature spaces, and the learning of the underlying receptive fields in a data-driven manner. Because of these properties, we term them Hierarchical Spectro-Temporal (HIST) features. To learn the features at the first layer, we use Independent Component Analysis (ICA). At the second layer of our feature hierarchy, we apply Non-Negative Sparse Coding (NNSC) to obtain features spanning a larger frequency and time region. We investigate the contribution of the different subparts of this feature extraction process to the overall performance. This includes an analysis of the benefits of the hierarchical processing, a comparison of different feature extraction methods on the first layer, an evaluation of the feature competition, and an investigation of the influence of different receptive field sizes on the second layer. Additionally, we compare our features to MFCC and RASTA features in a continuous digit recognition task in noise, both on a wideband dataset we constructed ourselves and on Aurora-2. We show that a combination of the proposed HIST features and RASTA features yields significant improvements and that the proposed features carry information complementary to RASTA and MFCC features.


