Implicit Discourse Relation Recognition Based on a Hybrid Tree-Structured Neural Network

(School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005)

implicit discourse relation recognition; information focus; tree-structured long short-term memory network; neural tensor network

A Hybrid Tree Structured Neural Network for Implicit Discourse Relation Recognition
ZHENG Jianglong, CHEN Jingxiu*

(School of Information Science and Engineering, Xiamen University, Xiamen 361005, China)

implicit discourse relation recognition; specific information; tree-structured long short-term memory (Tree-LSTM); neural tensor network (NTN)

DOI: 10.6043/j.issn.0438-0479.201701010


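The main challenge in implicit discourse relation recognition is how to represent the semantic information of the two text units. Because the semantics of a sentence is often determined by the information focus (the predicate part) in its syntax tree, attending to the information focus can improve discourse relation recognition. To strengthen the role of the information focus, a tree-structured long short-term memory (Tree-LSTM) network is introduced, whose forget gates treat the information from different child nodes differently. Finally, a neural tensor network (NTN) is used to compute the relation between the semantic vectors of the two sentences. Experiments on the PDTB 2.0 (Penn Discourse Treebank) corpus show that the hybrid tree-structured neural network improves the F-score by about 3.0% over a traditional RNN model on most relations.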

The most critical challenge in implicit discourse relation recognition is how to represent the semantic information of each discourse argument. In linguistics, the semantic value of a sentence is determined mainly by its information focus, so the discourse relation may depend largely on the links between information focuses. Intuitively, the phrase branches should not all be treated equally during composition up the syntactic parse tree. To address this, we introduce the tree-structured long short-term memory (Tree-LSTM) network, which selectively incorporates information from each child when computing the distributed semantic representations of the two arguments. It can thereby emphasize the informative predicative branches that indicate the "focus" of a sentence. A neural tensor network (NTN) is then used to predict the semantic correlation between the two discourse arguments across multiple dimensions. Experimental results on the PDTB corpus show that our model improves the F-score by about 3.0% over a traditional RNN model on most relations.
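The two components named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the widely used Child-Sum form of the Tree-LSTM, in which each child gets its own forget gate, and the usual bilinear-plus-linear form of a neural tensor layer; the abstract does not state which variant the authors use, so the hidden size `D`, the slice count `K`, the random parameters, and every function name below are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4   # hidden size (illustrative assumption)
K = 3   # number of NTN tensor slices (illustrative assumption)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init(shape):
    # Small random values stand in for trained parameters and word vectors.
    return rng.normal(scale=0.1, size=shape)

# Tree-LSTM parameters: W* act on the node input, U* on child hidden states.
P = {name: init((D, D)) for name in
     ("Wi", "Ui", "Wf", "Uf", "Wo", "Uo", "Wu", "Uu")}

def tree_lstm_node(x, children):
    """Compose one node from its input x and a list of child (h, c) pairs."""
    h_sum = sum((h for h, _ in children), np.zeros(D))
    i = sigmoid(P["Wi"] @ x + P["Ui"] @ h_sum)   # input gate
    o = sigmoid(P["Wo"] @ x + P["Uo"] @ h_sum)   # output gate
    u = np.tanh(P["Wu"] @ x + P["Uu"] @ h_sum)   # candidate memory
    c = i * u
    forget_gates = []
    for h_k, c_k in children:
        # One forget gate per child, so each branch's memory is weighted
        # separately instead of all branches being treated equally.
        f_k = sigmoid(P["Wf"] @ x + P["Uf"] @ h_k)
        forget_gates.append(f_k)
        c = c + f_k * c_k
    h = o * np.tanh(c)
    return h, c, forget_gates

# Neural tensor layer parameters: K bilinear slices plus a linear term.
T = init((K, D, D))
V = init((K, 2 * D))
b = np.zeros(K)

def ntn(a1, a2):
    """Relate two argument vectors along K dimensions."""
    bilinear = np.einsum("i,kij,j->k", a1, T, a2)
    linear = V @ np.concatenate([a1, a2])
    return np.tanh(bilinear + linear + b)

def encode(leaf_vectors):
    """Encode a one-level toy tree: leaves first, then a root with no input."""
    leaves = [tree_lstm_node(x, [])[:2] for x in leaf_vectors]
    h, _, gates = tree_lstm_node(np.zeros(D), leaves)
    return h, gates

arg1, gates1 = encode([init(D), init(D)])
arg2, _ = encode([init(D), init(D), init(D)])
features = ntn(arg1, arg2)   # K-dimensional relation features
```

In a full model the `features` vector would feed a classifier over the discourse relation classes, and the trees would come from a syntactic parser rather than the one-level toy trees used here.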