Journal of Xiamen University (Natural Science)

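How to improve the translation performance of neural machine translation models has long been a popular research topic in academia. For low-resource languages in particular, improving the translation quality of models trained on the existing parallel corpora is a pressing problem. To this end, the co-training method used in traditional statistical machine translation is optimized, a new co-training method is further proposed, and the approach is applied to neural machine translation to improve the translation quality of the original models. Experiments show that co-training significantly improves translation quality in neural machine translation, and that the improvement is more pronounced when the corpus is small (low-resource).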

Improving the performance of neural machine translation systems is an active research topic in academia, especially for low-resource language translation tasks. Co-training is a method that uses large amounts of unlabeled data in addition to a small labeled data set. It labels the unlabeled data with high quality; the newly labeled data is then used to enlarge the original labeled data set, and the translation model is retrained on the enlarged set to obtain a better model. In this paper, we propose to use co-training in neural machine translation. To make the method more widely applicable, we also introduce a new co-training method. Experimental results show that the proposed co-training approach significantly improves translation performance in neural machine translation, especially for low-resource language translation tasks.

Introduction
1 Background
2 Attention-based neural machine translation system
3 Self-training and co-training methods
4 Experiments
5 Conclusion

Fig. 1 Neural machine translation model based on the attention mechanism


Fig. 2 Training three models with three pairs of parallel corpora


Fig. 3 Labeling the multi-source, mutually parallel corpus and choosing the best results


Fig. 4 Enlarging the original corpus with the newly labeled corpus and retraining the model

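The steps named in the captions of Figs. 2 to 4 (train one model per source language, label a multi-source parallel corpus, keep the best result, enlarge the corpora) can be sketched as a single co-training round. This is a minimal illustration, not the paper's implementation: the `Translator` type, the function name `co_training_round`, and the rule that "best" means the highest model confidence score are all assumptions, since the captions do not say how the best result is chosen. Retraining on the enlarged corpora is left to the caller.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical translator interface: source sentence -> (translation, confidence score).
Translator = Callable[[str], Tuple[str, float]]
ParallelCorpus = List[Tuple[str, str]]


def co_training_round(
    translators: Dict[str, Translator],
    multi_source: List[Dict[str, str]],
    corpora: Dict[str, ParallelCorpus],
) -> Dict[str, ParallelCorpus]:
    """Run one labeling round and return enlarged copies of the corpora.

    translators:  one model per source language (Fig. 2).
    multi_source: unlabeled examples; each maps a source language to a
                  sentence, and all sentences in one example are parallel.
    corpora:      the original (source, target) pairs per source language.
    """
    # Copy the corpora so the caller's originals are left unchanged.
    enlarged = {lang: list(pairs) for lang, pairs in corpora.items()}

    for example in multi_source:
        # Every model labels its own side of the example (Fig. 3).
        candidates = [translators[lang](sent) for lang, sent in example.items()]
        # Assumed selection rule: keep the candidate with the highest score.
        best_translation, _ = max(candidates, key=lambda cand: cand[1])
        # Pair every source sentence with the chosen translation (Fig. 4).
        for lang, sent in example.items():
            enlarged[lang].append((sent, best_translation))

    return enlarged
```

Each model would then be retrained on its entry in the returned dictionary, and the round can be repeated with the retrained models.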

Fig. 5 Using the existing parallel corpus to train the translation model and translate the monolingual corpus


Fig. 6 Using the n-gram model to measure the perplexity (PPL) of the translation results in Fig. 5 and selecting parallel sentences with perplexity below the specified parameter x for corpus expansion

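The filtering step in the caption of Fig. 6 can be illustrated with a small perplexity filter. The sketch below assumes a bigram model with add-one smoothing and whitespace tokenization; the caption does not give the n-gram order, the smoothing scheme, or the value of x, so `BigramLM` and `filter_by_perplexity` are hypothetical names standing in for whatever language model the authors used.

```python
import math
from collections import Counter
from typing import List, Tuple


class BigramLM:
    """Add-one smoothed bigram language model (an assumed stand-in)."""

    def __init__(self, sentences: List[List[str]]) -> None:
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        vocab = {"</s>"}
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]
            vocab.update(tokens)
            self.context_counts.update(padded[:-1])
            self.bigram_counts.update(zip(padded[:-1], padded[1:]))
        # Reserve one extra slot for unseen words.
        self.vocab_size = len(vocab) + 1

    def perplexity(self, tokens: List[str]) -> float:
        """Per-token perplexity of one sentence, end-of-sentence included."""
        padded = ["<s>"] + tokens + ["</s>"]
        log_prob = 0.0
        for prev, word in zip(padded[:-1], padded[1:]):
            prob = (self.bigram_counts[(prev, word)] + 1) / (
                self.context_counts[prev] + self.vocab_size
            )
            log_prob += math.log(prob)
        return math.exp(-log_prob / (len(padded) - 1))


def filter_by_perplexity(
    pairs: List[Tuple[str, str]], lm: BigramLM, x: float
) -> List[Tuple[str, str]]:
    """Keep (source, machine translation) pairs whose translation has PPL < x."""
    return [(src, tgt) for src, tgt in pairs if lm.perplexity(tgt.split()) < x]
```

The pairs that pass the filter would be appended to the original parallel corpus before retraining; a lower x keeps fewer but more fluent translations.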

Tab. 1 Best BLEU scores (%) of models trained by different methods under different corpus sizes


Tab. 2 Best BLEU scores (%) of models trained with the new co-training method (co-training_new) under different corpus sizes


