Expert systems with applications, 2022-12, Vol.209, p.118187, Article 118187

Details

Author(s) / Contributors
Title
Low resource machine translation of english–manipuri: A semi-supervised approach
Is part of
  • Expert systems with applications, 2022-12, Vol.209, p.118187, Article 118187
Place / Publisher
Elsevier Ltd
Year of publication
2022
Source
Access via ScienceDirect (Elsevier)
Descriptions/Notes
  • The language barrier is one of the practical challenges human beings face during communication. To overcome it, researchers are focusing on using machines to translate a source language into a target language from the textual representations of the languages. As a result, machine translation (MT) can achieve near human-level translation quality for several resource-rich languages. However, MT performance is still far from production-level quality for low resource languages. This work reports a semi-supervised neural machine translation system that boosts translation quality for an extremely resource-constrained language pair, English–Manipuri. The proposed approach combines self-training and back-translation. The quantitative evaluation shows that system performance improves by +0.9 BLEU after introducing external noise to the input data. Additionally, a multi-reference test dataset developed in-house is used to evaluate the linguistic diversity of the highly agglutinative and morphologically rich Manipuri language. Experimental results attest that the proposed semi-supervised system outperforms the supervised, the pretrained mBART, and existing semi-supervised baselines on automatic scores and subjective evaluation parameters by a significant margin, with improvements of up to +4.5 and +1.2 BLEU over the supervised and mBART baselines, respectively.
  • Back-translation and forward-translation improve low resource machine translation.
  • External perturbations to the noisy synthetic data help the model converge.
  • Linguistic variations are tackled via the inclusion of multiple test references.
  • The proposed method is competitive with pretrained models.
Language
English
Identifiers
ISSN: 0957-4174
eISSN: 1873-6793
DOI: 10.1016/j.eswa.2022.118187
Title ID: cdi_crossref_primary_10_1016_j_eswa_2022_118187
