Table of Contents

[1] WANG Qi, WANG Kun, DUAN Xiangyu*, et al. Dependency-based correlation guidance for neural machine translation [J]. Journal of Xiamen University (Natural Science), 2019, 58(02): 158-163. [doi:10.6043/j.issn.0438-0479.201811024]

Journal of Xiamen University (Natural Science) [ISSN: 0438-0479 / CN: 35-1070/N]

Volume: 58
Issue: 2019, No. 02
Pages: 158-163
Section: Machine translation models
Publication date: 2019-03-27

Article Info

Title: Dependency-based correlation guidance for neural machine translation
Article ID: 0438-0479(2019)02-0158-06
Author(s): WANG Qi (汪琪), WANG Kun (王坤), DUAN Xiangyu* (段湘煜), ZHANG Min (张民)
School of Computer Science and Technology, Soochow University, Suzhou 215000, China
Keywords: neural machine translation; dependency-based correlation guidance; dependency loss function
CLC number: TP 391.2
DOI: 10.6043/j.issn.0438-0479.201811024
Document code: A
Abstract:
The attention mechanism of existing neural machine translation (NMT) models considers only the correlations between the target side and the source side, and ignores the correlations among source-side words. We strengthen these source-side correlations, and thereby improve translation performance, by modeling correlations on the source side and incorporating dependency-based guidance: we first construct correlations among the source-side hidden states, and then build a dependency-based correlation loss function, which integrates the dependency guidance into baseline NMT systems. Experiments on a large-scale Chinese-to-English dataset with both an RNN-based baseline and a Transformer baseline show that incorporating dependency-based correlation guidance effectively improves translation quality over the baseline systems.
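The abstract describes the method only at a high level; the paper itself defines the exact formulation. As a rough, non-authoritative illustration of the idea, the following PyTorch sketch computes pairwise correlations among encoder hidden states and penalizes their divergence from dependency arcs produced by an external parser. The function name, the `dep_adj` adjacency input, the uniform-over-neighbors target, and the cross-entropy form are all assumptions made for illustration, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def dependency_correlation_loss(enc_hidden, dep_adj, src_mask=None):
    """Hypothetical dependency-based correlation loss (illustrative sketch).

    enc_hidden: (batch, src_len, dim) encoder hidden states.
    dep_adj:    (batch, src_len, src_len) float tensor, 1.0 where an
                external parser links source words i and j, else 0.0.
    src_mask:   (batch, src_len), 1 for real tokens, 0 for padding.
    """
    dim = enc_hidden.size(-1)
    # Pairwise source-side correlations via scaled dot product,
    # normalized per row like an attention distribution.
    scores = torch.bmm(enc_hidden, enc_hidden.transpose(1, 2)) / dim ** 0.5
    if src_mask is not None:
        scores = scores.masked_fill(src_mask.unsqueeze(1) == 0, float("-inf"))
    corr = F.softmax(scores, dim=-1)

    # Reference distribution: uniform over each word's dependency neighbors.
    target = dep_adj / dep_adj.sum(dim=-1, keepdim=True).clamp(min=1.0)

    # Cross-entropy between model correlations and parser-derived targets.
    loss = -(target * corr.clamp(min=1e-9).log()).sum(dim=-1)
    if src_mask is not None:
        loss = loss * src_mask  # ignore padding rows
        return loss.sum() / src_mask.sum()
    return loss.mean()
```

In training, such a term would typically be added to the standard translation loss with a tunable weight, so the encoder is nudged toward dependency-consistent correlations without changing the decoder.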


Memo:
Received: 2018-11-13; Accepted: 2019-01-10
Funding: National Key R&D Program of China (2016YFE0132100); National Natural Science Foundation of China (61673289, 61273319)
*Corresponding author: xiangyuduan@suda.edu.cn