Journal of Xiamen University (Natural Science)

In recent years, neural machine translation (NMT) has developed rapidly. However, because it requires large-scale parallel corpora, it often performs poorly on low-resource languages. Building on an analysis of the encoder-decoder framework and the attention mechanism, and drawing on the idea of dual learning, this paper proposes a semi-supervised neural network model for translating low-resource languages. The model uses a relatively large monolingual corpus together with a small parallel corpus. Experimental results show that, when the parallel corpus is too small to train an ordinary neural network model, the semi-supervised model can still achieve relatively good results. However, the semi-supervised model places very high demands on the amount of monolingual data, which must reach a certain order of magnitude before good performance is attained.

Introduction
1 General NMT model
2 Semi-supervised NMT model
3 Experiments and analysis
4 Conclusion

Fig.1 Schematic diagram of the encoder-decoder framework in natural language processing

Fig.2 Schematic diagram of the encoder-decoder framework with an attention mechanism

Fig.3 Simulation of the dual-learning process in machine translation

Fig.4 Semi-supervised NMT model

Fig.5 Five training processes of the semi-supervised model

Fig.6 Schematic diagram of the word mapping process using numeral alignment (modified from reference [11])

Fig.7 Perplexity versus number of iterations for different training sets

Tab.1 Experimental results of the three models

Tab.2 Experimental results of dual-learning English-French translation[16]

Tab.3 Experimental results of Tibetan-Chinese machine translation studies

Tab.4 Experimental results of Uyghur-Chinese machine translation studies

Tab.5 Experimental results of unsupervised machine translation studies

[1] BROWN P F, PIETRA V J D, PIETRA S A D, et al. The mathematics of statistical machine translation: parameter estimation[J]. Computational Linguistics, 1993, 19(2): 263-311.
[2] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[C]//Advances in Neural Information Processing Systems. [S.l.]: NIPS, 2014: 3104-3112.
[3] FORCADA M L, ÑECO R P. Recursive hetero-associative memories for translation[C]//International Work-Conference on Artificial Neural Networks. Berlin: Springer, 1997: 453-462.
[4] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1409.0473.
[5] KARAKANTA A, DEHDARI J, VAN GENABITH J. Neural machine translation for low-resource languages without parallel corpora[J]. Machine Translation, 2018, 32(1/2): 167-189.
[6] DU J H, ZHANG M, ZONG C Q, et al. Opportunities and challenges of machine translation research in China: summary and prospects of the 8th China Workshop on Machine Translation[J]. Journal of Chinese Information Processing, 2013, 27(4): 1-8.
[7] CHO K, VAN MERRIËNBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1406.1078.
[8] HE D, XIA Y, QIN T, et al. Dual learning for machine translation[C]//Advances in Neural Information Processing Systems. [S.l.]: NIPS, 2016: 820-828.
[9] SENNRICH R, HADDOW B, BIRCH A. Improving neural machine translation models with monolingual data[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1511.06709.
[10] SENNRICH R, HADDOW B, BIRCH A. Neural machine translation of rare words with subword units[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1508.07909.
[11] ARTETXE M, LABAKA G, AGIRRE E. Learning bilingual word embeddings with (almost) no bilingual data[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. [S.l.]: ACL, 2017, 1: 451-462.
[12] LI L Y, GONG Z X, ZHOU G D. A survey of automatic evaluation of machine translation[J]. Journal of Chinese Information Processing, 2014, 28(3): 81-91.
[13] LI Y C, JIANG J, JIA Y J, et al. TIP-LAS: an open-source Tibetan word segmentation and part-of-speech tagging system[J]. Journal of Chinese Information Processing, 2015, 29(6): 203-207.
[14] HAN D, LI J H, XIONG D Y, et al. Analysis of out-of-vocabulary word translation in neural machine translation based on subword units[J]. Journal of Chinese Information Processing, 2018, 32(4): 74-79, 119.
[15] ZOPH B, YURET D, MAY J, et al. Transfer learning for low-resource neural machine translation[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1604.02201.
[16] HE D, XIA Y C, QIN T, et al. Dual learning for machine translation[C]//Advances in Neural Information Processing Systems. [S.l.]: NIPS, 2016: 820-828.
[17] LI Y C, XIONG D Y, ZHANG M, et al. Research on Tibetan-Chinese neural machine translation[J]. Journal of Chinese Information Processing, 2017, 31(6): 103-109.
[18] WEI S D. Research on a phrase-based Tibetan-Chinese online translation system[D]. Lanzhou: Northwest Minzu University, 2015.
[19] LUO Y G, LI X, JIANG T H, et al. A Uyghur term normalization method based on word embeddings[J]. Computer Engineering, 2018, 44(2): 220-225.
[20] PAN Y R, LI X, YANG Y T, et al. A reordering table reconstruction model for Chinese-Uyghur machine translation[J]. Journal of Computer Applications, 2018, 38(5): 1283-1288.
[21] ABUDUKELIMU H, LIU Y, SUN M S. Performance comparison of neural machine translation systems in Uyghur-Chinese translation[J]. Journal of Tsinghua University (Science and Technology), 2017, 57(8): 878-883.
[22] ARTETXE M, LABAKA G, AGIRRE E, et al. Unsupervised neural machine translation[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1710.11041.
[23] LAMPLE G, CONNEAU A, DENOYER L, et al. Unsupervised machine translation using monolingual corpora only[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1711.00043.
[24] YANG Z, CHEN W, WANG F, et al. Unsupervised neural machine translation with weight sharing[EB/OL]. [2018-11-08]. https://arxiv.org/pdf/1804.09057.
[25] GUZMÁN F, CHEN P J, OTT M, et al. Two new evaluation datasets for low-resource machine translation: Nepali-English and Sinhala-English[EB/OL]. [2019-02-27]. https://arxiv.org/pdf/1902.01382.
