Journal of Xiamen University (Natural Science)

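The main challenge in implicit discourse relation recognition is how to represent the semantic information of the two text units. Because the semantics of a sentence is often determined by the information focus (the predicate part) in its syntax tree, attending to the information focus can improve discourse relation recognition. To strengthen the role of the information focus, we introduce a tree-structured long short-term memory (Tree-LSTM) network and use its forget gates to treat the information from different child nodes differently. Finally, a neural tensor network (NTN) computes the relation between the semantic vectors of the two sentences. Experiments on the PDTB 2.0 (Penn Discourse Treebank) corpus show that the hybrid tree-structured neural network improves the F-score over a conventional RNN model by about 3.0% on most relations.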

The most critical challenge in implicit discourse relation recognition is how to represent the semantic information of each discourse argument. In linguistic terms, the semantic value of a sentence is mainly determined by its information focus, so the discourse relation may depend largely on the links between information focuses. Intuitively, the phrase branches should not all be treated equally when representations are composed up the syntactic parse tree. To address this, we introduce a tree-structured long short-term memory (Tree-LSTM) network that selectively incorporates information from each child when computing the distributed semantic representations of the two arguments. It can therefore emphasize the informative predicative branches that indicate the "focus" of a sentence. A neural tensor network (NTN) is then used to predict the semantic correlation between the two discourse arguments across multiple dimensions. Experimental results on the PDTB 2.0 corpus show that the model improves the F-score by about 3.0% over a conventional RNN model on most relations.
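As a rough illustration of the two components named in the abstract, the following pure-Python sketch composes each argument bottom-up with a binary Tree-LSTM unit (one forget gate per child, so the two branches can be weighted differently) and then scores the pair with an NTN-style bilinear layer. It is a minimal sketch, not the authors' implementation: the dimensions, the random untrained weights, the toy parse trees, the leaf initialisation, and the softmax over four relation classes are all assumptions made for the example.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def add(*vecs):
    """Element-wise sum of equally sized vectors."""
    return [sum(vals) for vals in zip(*vecs)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def tanh(v):
    return [math.tanh(x) for x in v]

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_params(d, rng):
    """One weight set per gate: input i, left/right forget fl/fr, output o, candidate u."""
    return {g: {"Ul": rand_matrix(d, d, rng),
                "Ur": rand_matrix(d, d, rng),
                "b": [0.0] * d}
            for g in ("i", "fl", "fr", "o", "u")}

def compose(params, left, right):
    """Combine two child states (h, c) into the parent state (h, c)."""
    (hl, cl), (hr, cr) = left, right

    def pre(g):
        p = params[g]
        return add(matvec(p["Ul"], hl), matvec(p["Ur"], hr), p["b"])

    i, f_left, f_right, o = (sigmoid(pre(g)) for g in ("i", "fl", "fr", "o"))
    u = tanh(pre("u"))
    # Each child's memory cell is scaled by its own forget gate.
    c = [iv * uv + fl * clv + fr * crv
         for iv, uv, fl, clv, fr, crv in zip(i, u, f_left, cl, f_right, cr)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c

def encode(tree, params, embed):
    """Encode a binary tree given as nested 2-tuples with word strings at the leaves."""
    if isinstance(tree, str):
        c = embed[tree]            # assumption: a leaf cell is the word embedding
        return tanh(c), c
    left, right = tree
    return compose(params, encode(left, params, embed), encode(right, params, embed))

def ntn_scores(a, b, T, V, bias, U):
    """NTN-style scoring: U * tanh(a^T T[k] b + V [a; b] + bias)."""
    bilinear = [sum(x * y for x, y in zip(a, matvec(Tk, b))) for Tk in T]
    hidden = tanh(add(bilinear, matvec(V, a + b), bias))  # a + b concatenates the lists
    return matvec(U, hidden)       # one raw score per relation class

rng = random.Random(0)
d, k, n_classes = 4, 3, 4          # hypothetical sizes; 4 classes is an assumption
words = ["the", "market", "fell", "investors", "sold", "shares"]
embed = {w: [rng.uniform(-1, 1) for _ in range(d)] for w in words}
params = init_params(d, rng)

arg1 = (("the", "market"), "fell")          # toy parse of the first argument
arg2 = ("investors", ("sold", "shares"))    # toy parse of the second argument
h1, _ = encode(arg1, params, embed)
h2, _ = encode(arg2, params, embed)

T = [rand_matrix(d, d, rng) for _ in range(k)]   # k bilinear slices
V = rand_matrix(k, 2 * d, rng)
U = rand_matrix(n_classes, k, rng)
probs = softmax(ntn_scores(h1, h2, T, V, [0.0] * k, U))
print(probs)
```

With trained weights, a forget gate close to 1 for the predicate branch and close to 0 for its sibling would let the parent state carry mostly the predicate's memory, which is the selective behaviour the abstract attributes to the Tree-LSTM; here the weights are random, so the printed probabilities carry no meaning.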

Introduction
1 The proposed model
2 Experiments
3 Discussion
4 Conclusion

Fig.1 An example of information focus in the parse tree

Fig.2 Framework of the hybrid tree-structured neural network for implicit discourse relation recognition

Fig.3 A binary Tree-LSTM unit

Fig.4 Visualization of the NTN for discourse relation classification

Tab.2 Comparison with various baseline systems

Tab.1 Data setup

Tab.3 Comparison with other supervised and semi-supervised learning models (%)

