ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, p.7644-7648
2020

Details

Author(s) / Contributors
Title
Semi-Supervised Learning Based on Hierarchical Generative Models for End-to-End Speech Synthesis
Is part of
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, p.7644-7648
Place / Publisher
IEEE
Year of publication
2020
Source
IEEE/IET Electronic Library (IEL)
Descriptions / Notes
  • This paper proposes a general framework of semi-supervised learning based on hierarchical generative models and adapts it to a Japanese end-to-end text-to-speech (TTS) system. In English TTS, several end-to-end systems have recently achieved sound quality close to that of natural human speech. However, in non-alphabetic languages such as Japanese, it is difficult to realize true text-input end-to-end TTS due to character diversity and pitch accents. To address this problem, we propose end-to-end TTS based on semi-supervised learning that makes the most of existing data consisting of any combination of text, phoneme, and waveform as training data. To demonstrate the effectiveness of the proposed system, listening tests were conducted for pronunciation and naturalness. Our results show that the proposed system improves both pronunciation and naturalness.
