Journal of Xiamen University (Natural Science)

Existing machine translation models are usually trained on data sets segmented at word granularity. However, different segmentation granularities carry different grammatical and semantic features and information, so considering word granularity alone constrains the efficient training of neural machine translation (NMT) systems. This limitation is particularly prominent for Tibetan-related translation because of the linguistic characteristics of Tibetan. For bidirectional Tibetan-Chinese NMT, we therefore propose a multi-granularity training method that uses syllables, words, and a fusion of syllables and words. Building on the existing attention-based NMT framework, we also propose a new NMT model that incorporates a self-attention mechanism into the decoder to capture more target-side information. Experimental results on the CWMT2018 Tibetan-Chinese bilingual dataset show that the multi-granularity (syllable-word fusion) training method clearly outperforms baseline systems trained at the other segmentation granularities, and that introducing self-attention into the decoder significantly improves translation quality. Additional experiments on the WMT2017 German-English bilingual dataset further demonstrate that the method is applicable to other language pairs.
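The difference between the segmentation granularities can be illustrated with a minimal sketch. It assumes the Tibetan text has already been split into words by some external word segmenter, and it splits words into syllables on the tsheg (U+0F0B), the Tibetan intersyllabic mark. The function names and the `fused_corpus` scheme (each sentence contributes one syllable-level and one word-level copy) are hypothetical: the abstract does not specify how the syllable-word fusion is built, so this is only one plausible reading, not the authors' method.

```python
from typing import List

TSHEG = "\u0f0b"  # TIBETAN MARK INTERSYLLABIC TSHEG, separates syllables


def to_syllables(word: str) -> List[str]:
    """Split one Tibetan word into syllables on the tsheg, dropping empties."""
    return [s for s in word.split(TSHEG) if s]


def segment(words: List[str], granularity: str) -> List[str]:
    """Return the token sequence of a pre-segmented sentence at one granularity."""
    if granularity == "word":
        return list(words)
    if granularity == "syllable":
        return [s for w in words for s in to_syllables(w)]
    raise ValueError(f"unknown granularity: {granularity}")


def fused_corpus(sentences: List[List[str]]) -> List[List[str]]:
    """ASSUMPTION: fuse granularities by emitting every sentence twice,
    once as syllables and once as words. The source does not define this."""
    fused = []
    for words in sentences:
        fused.append(segment(words, "syllable"))
        fused.append(segment(words, "word"))
    return fused


if __name__ == "__main__":
    # Two words of two syllables each, joined explicitly with the tsheg.
    words = [TSHEG.join(["བཀྲ", "ཤིས"]), TSHEG.join(["བདེ", "ལེགས"])]
    for tokens in fused_corpus([words]):
        print(" ".join(tokens))
```

Under this reading, the training data for the fused setting contains both views of each sentence, so the model sees short, frequent syllable units alongside longer word units. The paired target side would need to be duplicated in the same way to keep the corpus parallel.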

Introduction
1 NMT model
2 Segmentation strategies at different granularities
3 Experiments and result analysis
4 Conclusion

Fig.1 Encoder-decoder network model of RNN*Self-Attention

Tab.1 Training corpus at different granularities

Tab.2 Test corpus

Tab.3 Test results of Tibetan-Chinese bidirectional translation models

Tab.4 Test results in German-English and English-German

