Journal of Xiamen University (Natural Science)

Multilanguage translation model based on incremental self-learning strategy

ZHOU Zhangping, HUANG Rongcheng, WANG Boli, HU Jinming, SHI Xiaodong*, CHEN Yidong

(School of Information Science and Engineering, Xiamen University, Xiamen 360001, China)

Keywords: neural machine translation; multilingual machine translation; incremental self-learning

DOI: 10.6043/j.issn.0438-0479.201811016

Abstract

When no parallel corpus is available between the source and target languages, we propose a multilingual translation model based on an incremental self-learning strategy, which uses bilingual corpora involving a pivot language to train a source-to-target translation model. Under the Transformer architecture, the method improves the BLEU (bilingual evaluation understudy) score by 0.98 percentage points on the multilingual translation evaluation dataset of the 14th China Workshop on Machine Translation (CWMT 2018), compared with conventional pivot-based translation and vanilla neural machine translation (NMT) models trained directly on pseudo-parallel data. We also compare different preprocessing methods, training strategies, and multi-model averaging and ensembling; multi-model ensembling raises the BLEU score by a further 0.53 percentage points over the common multi-model strategy.
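
The abstract names multi-model averaging and ensembling without saying how either was implemented. As a minimal sketch of the usual distinction, and not the authors' code, the Python below assumes that averaging means taking the element-wise mean of the parameters of several checkpoints to obtain a single model, and that ensembling means averaging the next-token probability distributions of several models at decoding time. All function names and the toy numbers are hypothetical.

```python
import math


def average_parameters(checkpoints):
    """Averaging (assumed form): element-wise mean of parameter vectors, giving one model."""
    n = len(checkpoints)
    return [sum(values) / n for values in zip(*checkpoints)]


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_distribution(per_model_logits):
    """Ensembling (assumed form): mean of the models' next-token distributions."""
    dists = [softmax(logits) for logits in per_model_logits]
    n = len(dists)
    return [sum(probs) / n for probs in zip(*dists)]


# Toy example: two checkpoints with two parameters each (hypothetical values).
checkpoints = [[0.0, 2.0], [2.0, 4.0]]
averaged = average_parameters(checkpoints)

# Toy example: two models scoring a three-token vocabulary (hypothetical values).
logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
probs = ensemble_distribution(logits)
```

Under these assumptions, averaging yields one model that costs the same to decode as any single checkpoint, whereas ensembling runs every member model at each decoding step.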

Introduction
1 Multilingual translation model based on incremental self-learning
2 Data processing
3 Experiments
4 Conclusion
