Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis
Is part of
2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014, p.2559-2563
Place / Publisher
IEEE
Publication year
2014
Source
IEEE Electronic Library Online
Descriptions/Notes
A statistical parametric approach to singing voice synthesis based on hidden Markov models (HMMs) has been growing in popularity over the last few years. In this approach, the spectrum, excitation, vibrato, and duration of the singing voice are modeled simultaneously with context-dependent HMMs, and waveforms are generated from the HMMs themselves. Since HMM-based singing voice synthesis systems are "corpus-based," the HMMs corresponding to contextual factors that rarely appear in the training data cannot be well trained. However, it may be difficult to prepare a sufficiently large quantity of singing voice data sung by a single singer. Furthermore, the pitch distribution within each song is imbalanced and constrained by the singer's vocal range. In this paper, we propose "singer adaptive training," which can alleviate this data sparseness problem. Experimental results demonstrated that the proposed technique improved the quality of the synthesized singing voices.
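The core motivation above, that pooled training data is biased by each singer's vocal range, can be illustrated with a toy NumPy sketch. This is not the paper's implementation: the per-singer mean-offset model, the data values, and the function name below are hypothetical, chosen only to show why removing a singer-dependent pitch offset before pooling data helps the shared statistics reflect melodic movement rather than range differences.

```python
import numpy as np

# Hypothetical log-F0 frames from two singers performing the same phrase.
singer_a = np.array([5.0, 5.1, 5.2, 5.1])   # lower vocal range
singer_b = np.array([5.6, 5.7, 5.8, 5.7])   # higher vocal range

def remove_pitch_offset(frames):
    """Subtract the singer's mean log-F0 (a crude per-singer offset model)."""
    return frames - frames.mean()

# Pool the data as adaptive training conceptually would: after normalization.
pooled_norm = np.concatenate([remove_pitch_offset(singer_a),
                              remove_pitch_offset(singer_b)])
pooled_raw = np.concatenate([singer_a, singer_b])

# Without normalization, the pooled variance is dominated by the range
# difference between singers; with it, only melodic movement remains.
var_raw = pooled_raw.var()     # 0.095
var_norm = pooled_norm.var()   # 0.005
assert var_norm < var_raw
```

In an actual adaptive-training setup the offset would be estimated jointly with the HMM parameters rather than as a simple per-singer mean, but the effect on the pooled statistics is the same in spirit.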