
Details

Author(s) / Contributors
Title
English–Korean speech translation corpus (EnKoST-C): Construction procedure and evaluation results
Is part of
  • ETRI Journal, 2023-02, Vol. 45 (1), pp. 18-27
Place / Publisher
한국전자통신연구원
Year of publication
2023
Source
EZB Electronic Journals Library
Descriptions / Notes
  • We present an English-Korean speech translation corpus named EnKoST-C. End-to-end model training for speech translation often suffers from a lack of parallel data, that is, speech in the source language paired with equivalent text in the target language. Most publicly available speech translation corpora were developed for European languages, and there is currently no public corpus for English-Korean end-to-end speech translation. We therefore built EnKoST-C from TED Talks. In the process, we improved the sentence alignment approach by using subtitle timing information and bilingual sentence embeddings. The result is a 559-hour English-Korean speech translation corpus. The proposed sentence alignment approach achieved an F-measure of 0.96. We also report the baseline performance of an English-Korean speech translation model trained on EnKoST-C. EnKoST-C is freely available on a Korean government open data hub site.
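The abstract reports alignment quality as an F-measure but this record does not say how it was computed. As a minimal sketch, assuming the standard definition (the harmonic mean of precision and recall over aligned sentence pairs), it could be calculated as follows. The function name `alignment_f_measure` and the toy pairs are hypothetical and are not taken from the paper.

```python
def alignment_f_measure(predicted, gold):
    """Return (precision, recall, F1) over sets of aligned sentence pairs.

    Each pair is (source_sentence_id, target_sentence_id). This assumes the
    standard pair-level definition; the paper may score alignments differently.
    """
    predicted, gold = set(predicted), set(gold)
    if not predicted or not gold:
        return 0.0, 0.0, 0.0
    correct = len(predicted & gold)          # pairs found in both sets
    precision = correct / len(predicted)     # share of predicted pairs that are right
    recall = correct / len(gold)             # share of gold pairs that were found
    if correct == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: four reference pairs, three predicted, two of them correct.
gold = [(0, 0), (1, 1), (2, 2), (3, 3)]
predicted = [(0, 0), (1, 1), (2, 3)]
precision, recall, f1 = alignment_f_measure(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under this definition, a score of 0.96 would mean that both precision and recall of the aligned pairs are high, since the harmonic mean is pulled toward the lower of the two.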
Language
Korean
Identifiers
ISSN: 1225-6463
eISSN: 2233-7326
Title ID: cdi_kisti_ndsl_JAKO202355350298748