Table of Contents

[1] HUANG Jiayue, XIONG Deyi*. Improve the performance of neural machine translation with co-training[J]. Journal of Xiamen University (Natural Science), 2019, 58(02): 176-183. [doi:10.6043/j.issn.0438-0479.201811011]


Journal of Xiamen University (Natural Science) [ISSN:0438-0479/CN:35-1070/N]

Volume:
58
Issue:
2019(02)
Pages:
176-183
Section:
Machine translation models
Publication date:
2019-03-27

Article Info

Title:
Improve the performance of neural machine translation with co-training
Article ID:
0438-0479(2019)02-0176-08
Author(s):
HUANG Jiayue, XIONG Deyi* (黄佳跃, 熊德意*)
School of Computer Science and Technology, Soochow University, Suzhou 215006, China
Keywords:
neural machine translation; co-training; semi-supervised learning
CLC number:
TP 391
DOI:
10.6043/j.issn.0438-0479.201811011
Document code:
A
Abstract:
Improving the performance of neural machine translation systems is a hot research topic in academia, especially for low-resource language translation tasks, where raising the translation quality of models trained on the available parallel corpora is a pressing problem. Co-training is a method that exploits a large amount of unlabeled data in addition to a small labeled data set: it labels unlabeled data with high quality, uses the newly labeled data to enlarge the original labeled set, and finally retrains the translation model to obtain a better one. In this paper we adapt the co-training method used in traditional statistical machine translation, propose a new co-training method to improve its practicality, and apply it to neural machine translation. Experimental results show that co-training in neural machine translation significantly improves translation performance, and the gains are most pronounced for low-resource language translation tasks.
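The abstract describes the classic co-training loop of Blum and Mitchell [4]: two learners trained on different views of the data each label the unlabeled examples they are most confident about, those examples are added to the labeled set, and the learners are retrained. A minimal, self-contained Python sketch of that loop on toy one-dimensional data (the centroid learners, confidence margin, and threshold are illustrative stand-ins, not the authors' NMT method):

```python
# Toy co-training (Blum & Mitchell style): two learners on different
# feature "views" of the same examples bootstrap each other by labeling
# the unlabeled points they are most confident about.

def train_centroids(data, view):
    """Per-class mean of one feature view; a stand-in for a real learner."""
    sums, counts = {}, {}
    for features, label in data:
        sums[label] = sums.get(label, 0.0) + features[view]
        counts[label] = counts.get(label, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in sums}

def predict(centroids, features, view):
    """Return (label, confidence): confidence is the margin between the
    nearest and second-nearest class centroid."""
    dists = sorted((abs(features[view] - c), lab) for lab, c in centroids.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

def co_train(labeled, unlabeled, rounds=3, threshold=1.0):
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        m0 = train_centroids(labeled, view=0)  # learner on view 0
        m1 = train_centroids(labeled, view=1)  # learner on view 1
        still_unlabeled = []
        for features in unlabeled:
            lab0, conf0 = predict(m0, features, 0)
            lab1, conf1 = predict(m1, features, 1)
            # Each learner contributes the examples it is confident about;
            # everything else stays unlabeled for the next round.
            if conf0 >= threshold:
                labeled.append((features, lab0))
            elif conf1 >= threshold:
                labeled.append((features, lab1))
            else:
                still_unlabeled.append(features)
        unlabeled = still_unlabeled
    return labeled, unlabeled

# Two well-separated classes; each example has two feature views.
seed = [((0.0, 0.1), "a"), ((5.0, 5.2), "b")]
pool = [(0.2, 0.0), (4.8, 5.1), (0.1, 0.3), (5.1, 4.9)]
grown, left = co_train(seed, pool)
```

In the paper's setting the "learners" are translation models and the "labels" are translations; the loop structure (confidently label, enlarge the training set, retrain) is the same.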

References:

[1] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[EB/OL]. [2018-10-01]. http://arxiv.org/abs/1409.0473.
[2] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[EB/OL]. [2018-10-01]. http://arxiv.org/abs/1409.3215.
[3] ZOPH B, YURET D, MAY J, et al. Transfer learning for low-resource neural machine translation[EB/OL]. [2018-10-01]. http://arxiv.org/abs/1604.02201.
[4] BLUM A, MITCHELL T. Combining labeled and unlabeled data with co-training[C]//Proceedings of the 11th Annual Conference on Computational Learning Theory (COLT). New York: ACM, 1998: 92-100.
[5] NIGAM K, GHANI R. Analyzing the effectiveness and applicability of co-training[C]//Proceedings of the 9th International Conference on Information and Knowledge Management (CIKM). McLean, Virginia: ACM, 2000: 86-93.
[6] CALLISON-BURCH C. Co-training for statistical machine translation[EB/OL]. [2018-10-01]. http://www.cis.upenn.edu/~ccb/publications/msc-thesis.pdf.
[7] LIANG P. Semi-supervised learning for natural language[D]. Cambridge: Massachusetts Institute of Technology, 2005: 1-86.
[8] SENNRICH R, HADDOW B, BIRCH A. Improving neural machine translation models with monolingual data[EB/OL]. [2018-10-01]. http://arxiv.org/abs/1511.06709.
[9] KAY M. Triangulation in translation[EB/OL]. [2018-10-01]. http://clu.uni.no/icame/corpora/2000-3/0119.
[10] ZOPH B, KNIGHT K. Multi-source neural translation[EB/OL]. [2018-10-01]. http://arxiv.org/abs/1601.00710.
[11] JOHNSON M, SCHUSTER M, LE Q V, et al. Google's multilingual neural machine translation system: enabling zero-shot translation[EB/OL]. [2018-10-01]. http://arxiv.org/abs/1611.04558.
[12] KOEHN P. Pharaoh: a beam search decoder for phrase-based statistical machine translation models[M]//Machine translation: from real users to research. Berlin Heidelberg: Springer, 2004: 115-124.
[13] ZEILER M D. ADADELTA: an adaptive learning rate method[EB/OL]. [2018-10-01]. http://arxiv.org/abs/1212.5701.
[14] SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15(1): 1929-1958.

Memo:
Received: 2018-11-09; Accepted: 2019-01-10
Funding: National Natural Science Foundation of China (61622209, 61861130364)
*Corresponding author: dyxiong@suda.edu.cn