Details

Author(s) / Contributors
Title
Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition
Is part of
  • 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2015, pp. 496-503
Place / Publisher
IEEE
Year of publication
2015
Source
IEEE Xplore
Descriptions/Notes
  • Automatic Speech Recognition systems suffer from severe performance degradation in the presence of myriad complicating factors such as noise, reverberation, multiple speech sources, multiple recording devices, etc. Previous challenges have sparked much innovation when it comes to designing systems capable of handling these complications. In this spirit, the CHiME-3 challenge presents system builders with the task of recognizing speech in a real-world noisy setting wherein speakers talk to an array of 6 microphones in a tablet. In order to address these issues, we explore the effectiveness of first applying a model-based source separation mask to the output of a beamformer that combines the source signals recorded by each microphone, followed by a DNN-based front end spectral mapper that predicts clean filterbank features. The source separation algorithm MESSL (Model-based EM Source Separation and Localization) has been extended from two channels to multiple channels in order to meet the demands of the challenge. We report on interactions between the two systems, cross-cut by the use of a robust beamforming algorithm called BeamformIt. Evaluations of different system settings reveal that combining MESSL and the spectral mapper together on the baseline beamformer algorithm boosts the performance substantially.
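
The processing order described in the abstract (beamform the microphone channels, apply a separation mask, then derive features for the spectral mapper) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, a plain channel average stands in for BeamformIt, a random array stands in for the MESSL mask, and the DNN spectral mapper is omitted.

```python
import numpy as np


def delay_and_sum(stfts):
    """Average multi-channel STFTs of shape (channels, frames, bins).

    Assumption: the channels are already time-aligned. A real beamformer
    such as BeamformIt also estimates and compensates inter-channel delays.
    """
    return stfts.mean(axis=0)


def apply_mask(beamformed, mask):
    """Scale each time-frequency cell by a soft mask clipped to [0, 1]."""
    return beamformed * np.clip(mask, 0.0, 1.0)


def log_power(spec, eps=1e-10):
    """Log power spectrum, a simple stand-in for filterbank features."""
    return np.log(np.abs(spec) ** 2 + eps)


# Placeholder data: 6 microphones, 4 frames, 5 frequency bins.
rng = np.random.default_rng(0)
stfts = rng.standard_normal((6, 4, 5)) + 1j * rng.standard_normal((6, 4, 5))
mask = rng.uniform(size=(4, 5))  # hypothetical mask; MESSL would estimate this

features = log_power(apply_mask(delay_and_sum(stfts), mask))
```

In this sketch `features` has one row per frame and one column per frequency bin; in the described system, features of this kind would be passed to the spectral mapper to predict clean filterbank features.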
