ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, p.7314-7318
2020

Details

Author(s) / Contributors
Title
Deep Audio-Visual Speech Separation with Attention Mechanism
Is part of
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, p.7314-7318
Place / Publisher
IEEE
Year of publication
2020
Source
IEEE Electronic Library Online
Descriptions/Notes
  • Previous work shows that audio-visual fusion is a practical approach to deal with the speech separation task in the cocktail party problem. In this paper, we explore a better strategy to utilize visual representations with the attention mechanism. Compared to the previous baseline only using one visual stream of the target speaker, both speaker-dependent visual streams in the mixed audio are fed into the model, and it also predicts two separated speech streams simultaneously. To further enhance the performance, the attention mechanism is designed on the audio-visual speech separation architecture. The results show that the proposed approach works well in audio-visual speech separation. Our best model achieves an obvious and consistent improvement in speech separation when compared to the traditional method only using the target speaker visual stream.
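
The idea described in the abstract (both speakers' visual streams fed in together, attention over the visual features, two separated outputs predicted at once) can be illustrated with a minimal sketch. This record does not specify the paper's architecture, so everything below is an assumption made for illustration: generic scaled dot-product attention, a single random linear output layer, sigmoid masks, and hypothetical names such as `attend`, `separate`, and `w_out`. It is not the authors' model.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(audio_emb, visual_emb):
    """Scaled dot-product attention (assumed form, not taken from the paper).

    Each audio frame queries one speaker's visual frames and receives a
    weighted summary of them.
    """
    d = audio_emb.shape[-1]
    scores = audio_emb @ visual_emb.T / np.sqrt(d)  # (T_audio, T_visual)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ visual_emb                     # (T_audio, d)


def separate(mix_spec, audio_emb, visual_a, visual_b, w_out):
    """Predict two spectrogram masks at once from both visual streams."""
    ctx_a = attend(audio_emb, visual_a)
    ctx_b = attend(audio_emb, visual_b)
    fused = np.concatenate([audio_emb, ctx_a, ctx_b], axis=-1)  # (T, 3d)
    logits = fused @ w_out                                      # (T, 2F)
    masks = 1.0 / (1.0 + np.exp(-logits))                       # sigmoid in (0, 1)
    mask_a, mask_b = np.split(masks, 2, axis=-1)                # two (T, F) masks
    return mask_a * mix_spec, mask_b * mix_spec


# Toy inputs: random stand-ins for learned embeddings and weights.
rng = np.random.default_rng(0)
T, T_v, d, F = 50, 25, 16, 33
mix_spec = np.abs(rng.normal(size=(T, F)))   # magnitude spectrogram of the mixture
audio_emb = rng.normal(size=(T, d))          # per-frame audio embedding
visual_a = rng.normal(size=(T_v, d))         # visual stream of speaker A
visual_b = rng.normal(size=(T_v, d))         # visual stream of speaker B
w_out = rng.normal(size=(3 * d, 2 * F)) * 0.1

est_a, est_b = separate(mix_spec, audio_emb, visual_a, visual_b, w_out)
print(est_a.shape, est_b.shape)
```

With untrained random weights the outputs carry no meaning; the sketch only shows how data could flow in a design of this kind, where one forward pass yields an estimate for each speaker.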
Language
English
Identifiers
eISSN: 2379-190X
DOI: 10.1109/ICASSP40776.2020.9054180
Title ID: cdi_ieee_primary_9054180
