Multilingual unsupervised neural machine translation

(School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China)

unsupervised; neural machine translation; multilingual; multi-task

Multi-language unsupervised neural machine translation
XUE Qingtian,LI Junhui*,GONG Zhengxian

(School of Computer Science and Technology,Soochow University,Suzhou 215006,China)

DOI: 10.6043/j.issn.0438-0479.201908042


Relying on large-scale parallel corpora, neural machine translation has achieved great success on certain language pairs. However, acquiring high-quality parallel corpora remains one of the main difficulties in machine translation research. A feasible solution to this problem is unsupervised neural machine translation (UNMT), which can be trained using only two unrelated monolingual corpora and still obtain good translation results. Inspired by the strong performance of multi-task learning in supervised neural machine translation, this paper explores the application of UNMT to multilingual, multi-task learning. The experiments use three mutually unrelated monolingual corpora and build bidirectional translation tasks between every pair of languages. The results show that, compared with single-task UNMT, this method improves the bilingual evaluation understudy (BLEU) score by up to 2 to 3 percentage points on some language pairs.

Relying on large-scale parallel corpora, neural machine translation has achieved great success on some language pairs. Unfortunately, for the vast majority of language pairs, the acquisition of a high-quality parallel corpus remains one of the main difficulties in machine translation research. To address this problem, we adopt unsupervised neural machine translation (UNMT), which trains a neural machine translation system using only two unrelated monolingual corpora and still obtains good translation results. Inspired by the strong performance of multi-task learning in supervised neural machine translation, we explore the application of UNMT to multilingual, multi-task learning. In our experiments, we use three unrelated monolingual corpora and build bidirectional translation tasks between every pair of languages. The experimental results show that, compared with single-task UNMT, this method improves the BLEU score by up to 2 to 3 percentage points on some language pairs.
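The abstract describes building bidirectional translation tasks between every pair of three languages. A minimal sketch of that task setup, assuming a simple round-robin multi-task schedule (the function names, language codes, and scheduler below are illustrative assumptions, not details from the paper):

```python
# Hypothetical sketch: given three unrelated monolingual corpora, build a
# bidirectional translation task for every language pair (6 directions in
# total), then cycle through them so each direction is trained equally often.
from itertools import permutations

LANGS = ["en", "fr", "de"]  # placeholder language codes; any three work


def build_translation_tasks(langs):
    """Return every ordered (source, target) pair, i.e. both directions
    of each language pair."""
    return list(permutations(langs, 2))


def round_robin(tasks, num_steps):
    """Yield one (source, target) task per training step, cycling through
    all tasks -- a simple multi-task training schedule."""
    for step in range(num_steps):
        yield tasks[step % len(tasks)]


tasks = build_translation_tasks(LANGS)
print(len(tasks))  # -> 6 ordered pairs from 3 languages
for src, tgt in round_robin(tasks, 6):
    print(f"train step: {src} -> {tgt}")
```

With three languages this yields six translation directions; single-task UNMT, by contrast, trains only the two directions of one language pair.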