Sie befinden Sich nicht im Netzwerk der Universität Paderborn. Der Zugriff auf elektronische Ressourcen ist gegebenenfalls nur via VPN oder Shibboleth (DFN-AAI) möglich. mehr Informationen...
Ergebnis 17 von 541

Details

Autor(en) / Beteiligte
Titel
Deep Learning Approaches for Bimodal Speech Emotion Recognition: Advancements, Challenges, and a Multi-Learning Model
Ist Teil von
  • IEEE access, 2023-01, Vol.11, p.1-1
Ort / Verlag
IEEE
Erscheinungsjahr
2023
Link zum Volltext
Quelle
EZB Electronic Journals Library
Beschreibungen/Notizen
  • Though acoustic speech emotion recognition has been studied for a while, bimodal speech emotion recognition using both acoustic and text has gained momentum since speech emotion recognition doesn't only involve the acoustic modality. However, there is less review work on the available bimodal speech emotion recognition (SER) research. The review works available mostly concentrate on the use of convolution neural networks (CNNs) and recurrent neural networks (RNNs). However, recent deep learning techniques like attention mechanisms and fusion strategies have shaped the bimodal SER research without explicit analysis of their significance when used singly or in combination with the traditional deep learning techniques. We therefore, review the recently published literature that involves these deep learning techniques in this paper to ascertain the current trends and challenges of bimodal SER research that have hampered it to be fully deployed in the natural environment for off-the-shelf SER applications. In addition, we carried out experiments to ascertain the optimal combination of acoustic features and the significance of the attention mechanisms and their combination with the traditional deep learning techniques. We propose a multi-technique model called the deep learning-based multi-learning model for emotion recognition (DBMER) that operates with multi-learning capabilities of CNNs, RNNs, and multi-head attention mechanisms. We noted that attention mechanisms play a pivotal role in the performance of bimodal dyadic SER systems. However, few publicly available datasets, the difficulty in acquisition of bimodal SER data, cross-corpus and multilingual studies remain open problems in bimodal SER research. Our experiments on the proposed DBMER model showed that though each of the deep learning techniques benefits the task, the results are more accurate and robust when they are used in careful combination with multi-level fusion approaches.
Sprache
Englisch
Identifikatoren
ISSN: 2169-3536
eISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3325037
Titel-ID: cdi_crossref_primary_10_1109_ACCESS_2023_3325037

Weiterführende Literatur

Empfehlungen zum selben Thema automatisch vorgeschlagen von bX