Journal of Xiamen University (Natural Science)

Neural machine translation is currently the most widely studied approach in machine translation, and introducing external knowledge into neural machine translation systems has become an active research topic in the field. To bring the rule information used in statistical machine translation (SMT) into an end-to-end neural network model, we propose a method that converts rule information into approximately equivalent sequence information. On this basis, we propose two neural machine translation models that incorporate rule information and validate them on an attention-based recurrent neural network (RNN) model. Compared with the baseline model without rule information, both models achieve higher bilingual evaluation understudy (BLEU) scores on the National Institute of Standards and Technology (NIST) evaluation sets. The experiments show that integrating external knowledge such as rules into a neural machine translation system is an effective way to improve translation quality.

Introduction
1 Baseline model
2 Methods for incorporating rule information
3 Experiments
4 Analysis
5 Conclusion

Fig.1 RNN model based on attention mechanism

Fig.2 Example of hierarchical phrase-based SMT and its conversion to sequences

Fig.3 Translation derivation tree and restoration tree

Fig.4 Double-ended integration model

Tab.1 Comparison of BLEU scores across translation systems

Fig.5 Comparison of translation quality on the generalized parts

Fig.6 Comparison of translation quality for different sentence lengths

Tab.2 Comparison of translation examples from different models

