Deep Learning Approaches for Bimodal Speech Emotion Recognition: Advancements, Challenges, and a Multi-Learning Model

IEEE access, 2023-01, Vol.11, p.1-1

2023

Details

Author(s) / Contributors

Title

Deep Learning Approaches for Bimodal Speech Emotion Recognition: Advancements, Challenges, and a Multi-Learning Model

Is part of

IEEE access, 2023-01, Vol.11, p.1-1

Place / Publisher

IEEE

Year of publication

2023

Source

EZB Electronic Journals Library

Descriptions / Notes

Although acoustic speech emotion recognition has been studied for some time, bimodal speech emotion recognition (SER) using both acoustic and text data has gained momentum, since speech emotion recognition does not involve the acoustic modality alone. However, there is little review work on the available bimodal SER research. The existing reviews mostly concentrate on the use of convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Meanwhile, recent deep learning techniques such as attention mechanisms and fusion strategies have shaped bimodal SER research without explicit analysis of their significance when used alone or in combination with traditional deep learning techniques. In this paper, we therefore review the recently published literature that involves these deep learning techniques to ascertain the current trends and challenges of bimodal SER research that have hindered its full deployment in natural environments for off-the-shelf SER applications. In addition, we carried out experiments to ascertain the optimal combination of acoustic features and the significance of attention mechanisms and their combination with traditional deep learning techniques. We propose a multi-technique model called the deep learning-based multi-learning model for emotion recognition (DBMER), which operates with the multi-learning capabilities of CNNs, RNNs, and multi-head attention mechanisms. We noted that attention mechanisms play a pivotal role in the performance of bimodal dyadic SER systems. However, the scarcity of publicly available datasets, the difficulty of acquiring bimodal SER data, and cross-corpus and multilingual studies remain open problems in bimodal SER research. Our experiments on the proposed DBMER model showed that, although each of the deep learning techniques benefits the task, the results are more accurate and robust when the techniques are used in careful combination with multi-level fusion approaches.

Language: English
Identifiers: ISSN: 2169-3536
eISSN: 2169-3536
DOI: 10.1109/ACCESS.2023.3325037
Title ID: cdi_crossref_primary_10_1109_ACCESS_2023_3325037

Format: –
Keywords: acoustic and lexical data, Acoustics, attention mechanisms, Computational modeling, Data models, Deep learning, Emotion recognition, Feature extraction, Speech recognition
