Details

Author(s) / Contributors
Title
Bodyformer: Semantics-guided 3D Body Gesture Synthesis with Transformer
Is part of
  • ACM Transactions on Graphics, 2023-08, Vol. 42 (4), p. 1-12, Article 43
Place / Publisher
New York, NY, USA: ACM
Year of Publication
2023
Link to full text
Source
ACM Digital Library
Descriptions / Notes
  • Automatic gesture synthesis from speech is a topic that has attracted researchers for applications in remote communication, video games, and the Metaverse. Learning the mapping between speech and 3D full-body gestures is difficult due to the stochastic nature of the problem and the lack of the rich cross-modal data needed for training. In this paper, we propose a novel transformer-based framework for automatic 3D body gesture synthesis from speech. To learn the stochastic nature of body gesture during speech, we propose a variational transformer to effectively model a probabilistic distribution over gestures, which can produce diverse gestures during inference. Furthermore, we introduce a mode positional embedding layer to capture the different motion speeds in different speaking modes. To cope with the scarcity of data, we design an intra-modal pre-training scheme that can learn the complex mapping between speech and 3D gesture from a limited amount of data. Our system is trained with either the Trinity speech-gesture dataset or the Talking With Hands 16.2M dataset. The results show that our system can produce more realistic, appropriate, and diverse body gestures compared to existing state-of-the-art approaches.
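  The abstract names two architectural ideas without detail: a variational transformer over gestures and a mode positional embedding layer. The PyTorch sketch below illustrates one plausible reading of how they could fit together. It is a minimal illustration based only on the abstract, not the paper's implementation; every class name, dimension, and design choice (per-mode learned positional tables, a CVAE-style pooled latent) is an assumption for demonstration purposes.

```python
import torch
import torch.nn as nn


class ModePositionalEmbedding(nn.Module):
    """Hypothetical mode-aware positional embedding.

    The abstract only says the layer captures different motion speeds in
    different speaking modes; here each mode id selects its own learned
    positional table (one plausible reading, not the paper's definition).
    """

    def __init__(self, num_modes: int, max_len: int, d_model: int):
        super().__init__()
        # one learned positional table per speaking mode
        self.tables = nn.Parameter(torch.randn(num_modes, max_len, d_model) * 0.02)

    def forward(self, x: torch.Tensor, mode: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); mode: (batch,) integer speaking-mode ids
        return x + self.tables[mode, : x.size(1)]


class VariationalGestureTransformer(nn.Module):
    """CVAE-style sketch: encode to (mu, logvar), sample z, decode poses."""

    def __init__(self, d_model=256, latent_dim=64, pose_dim=63,
                 num_modes=3, max_len=512):
        super().__init__()
        self.pose_in = nn.Linear(pose_dim, d_model)
        self.mode_pos = ModePositionalEmbedding(num_modes, max_len, d_model)
        make_layer = lambda: nn.TransformerEncoderLayer(
            d_model, nhead=4, dim_feedforward=512, batch_first=True)
        self.encoder = nn.TransformerEncoder(make_layer(), num_layers=2)
        self.decoder = nn.TransformerEncoder(make_layer(), num_layers=2)
        self.to_mu = nn.Linear(d_model, latent_dim)
        self.to_logvar = nn.Linear(d_model, latent_dim)
        self.from_z = nn.Linear(latent_dim, d_model)
        self.pose_out = nn.Linear(d_model, pose_dim)

    def forward(self, speech, poses, mode):
        # speech: (B, T, d_model) pre-extracted features; poses: (B, T, pose_dim)
        h = self.mode_pos(speech + self.pose_in(poses), mode)
        h = self.encoder(h).mean(dim=1)                        # pool over time
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterize
        # condition the decoder on speech plus the sampled latent
        d = self.mode_pos(speech + self.from_z(z).unsqueeze(1), mode)
        return self.pose_out(self.decoder(d)), mu, logvar


if __name__ == "__main__":
    model = VariationalGestureTransformer()
    speech = torch.randn(2, 120, 256)   # two clips of pre-extracted speech features
    poses = torch.randn(2, 120, 63)     # paired gesture frames (63-D pose, illustrative)
    mode = torch.tensor([0, 2])         # speaking-mode ids, one per clip
    pred, mu, logvar = model(speech, poses, mode)
    print(pred.shape)                   # torch.Size([2, 120, 63])
```

  At inference time, z would be drawn from the standard-normal prior rather than the posterior, which is one standard way a variational model produces the diverse gestures the abstract describes.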
Language
English
Identifiers
ISSN: 0730-0301
eISSN: 1557-7368
DOI: 10.1145/3592456
Title ID: cdi_crossref_primary_10_1145_3592456
