ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, p.7644-7648

2020

Author(s) / Contributors

Title

Semi-Supervised Learning Based on Hierarchical Generative Models for End-to-End Speech Synthesis

Is part of

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, p.7644-7648

Place / Publisher

IEEE

Year of publication

2020

Source

IEEE/IET Electronic Library (IEL)

Descriptions/Notes

This paper proposes a general framework of semi-supervised learning based on hierarchical generative models and adapts it to a Japanese end-to-end text-to-speech (TTS) system. In English TTS, several end-to-end systems have recently achieved sound quality close to that of natural human speech. However, in non-alphabetic languages such as Japanese, it is difficult to realize true text-input end-to-end TTS due to character diversity and pitch accents. To address this problem, we propose end-to-end TTS based on semi-supervised learning that makes the most of existing data consisting of any combination of text, phoneme, and waveform as training data. To demonstrate the effectiveness of the proposed system, listening tests were conducted for pronunciation and naturalness. Our results show that the proposed system improves both pronunciation and naturalness.

Language: English
Identifiers: eISSN: 2379-190X
DOI: 10.1109/ICASSP40776.2020.9054466
Title ID: cdi_ieee_primary_9054466

Format: –
Keywords: Adaptation models, End-to-end speech synthesis, hierarchical generative model, Japanese speech synthesis, semi-supervised learning, Semisupervised learning, Signal processing, Speech recognition, Speech synthesis, System analysis and design, Training data, variational auto-encoder
