Research on high-performance English translation based on topic model
Is part of
Digital communications and networks, 2023-04, Vol.9 (2), p.505-511
Place / Publisher
Elsevier B.V.
Year of publication
2023
Source
Free E-Journal (publisher's openly accessible portion only)
Descriptions / Notes
Retelling extraction is an important branch of Natural Language Processing (NLP), and high-quality retelling resources help improve the performance of machine translation. However, traditional methods based on a bilingual parallel corpus often ignore the document background when acquiring and applying retellings. To solve this problem, we introduce topic model information into the translation model and propose a topic-based statistical machine translation method to improve translation performance. In this method, Probabilistic Latent Semantic Analysis (PLSA) is used to obtain the co-occurrence relationship between words and documents through hybrid matrix decomposition. We then design a decoder to simplify the decoding process. Experiments show that the proposed method can effectively improve the accuracy of translation.
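The abstract does not describe how PLSA is fitted, so the following is only a minimal sketch of textbook PLSA trained with expectation-maximization, not the authors' implementation. It assumes a small document-word count matrix and factors it into P(topic | document) and P(word | topic). The function name `plsa`, its parameters, and the toy counts are all hypothetical.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit textbook PLSA with EM (illustrative sketch, not the paper's code).

    counts: list of rows, counts[d][w] = occurrences of word w in document d.
    Returns (p_z_d, p_w_z): P(topic | document) and P(word | topic).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        # Scale a row to sum to 1; fall back to uniform if it has no mass.
        total = sum(row)
        if total == 0:
            return [1.0 / len(row)] * len(row)
        return [value / total for value in row]

    # Random initial distributions (assumption: any positive start works).
    p_z_d = [normalized([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                if total == 0:
                    continue
                # M-step accumulation: expected counts per topic.
                for z in range(n_topics):
                    expected = n * joint[z] / total
                    new_z_d[d][z] += expected
                    new_w_z[z][w] += expected
        p_z_d = [normalized(row) for row in new_z_d]
        p_w_z = [normalized(row) for row in new_w_z]
    return p_z_d, p_w_z


# Toy data (hypothetical): documents 0-1 use words 0-1, documents 2-3 use words 2-3.
toy_counts = [
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
]
doc_topics, topic_words = plsa(toy_counts, n_topics=2)
```

In a topic-based translation setting, a document's row of `doc_topics` could serve as the background signal that conditions word or phrase choices; how the paper actually combines this with its decoder is not stated in the abstract.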