Event detection based on language models and recurrent convolutional neural networks

(School of Information Science and Engineering, Xiamen University, Xiamen 361005, Fujian, China)

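event detection; language model word embeddings; long short-term memory network; dynamic multi-pooling convolutional neural network; attention mechanism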

Event detection via recurrent and convolutional networks based on language model
SHI Zheer, CHEN Jinxiu*

(School of Information Science and Engineering, Xiamen University, Xiamen 361005, China)

event detection; embeddings from language models (ELMo); long short-term memory neural network (LSTM); dynamic multi-pooling convolutional neural networks (DMCNN); attention mechanism

DOI: 10.6043/j.issn.0438-0479.201901008

Notes

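At present, the difficulties of event detection lie in polysemy and in detecting sentences that contain multiple events. To address these problems, a new recurrent and convolutional neural network with attention based on language models (LM-ARCNN) is proposed. The model uses a language model to compute word vectors for the input sentence, feeds these word vectors into a long short-term memory network to obtain sentence-level features, and uses an attention mechanism to capture the sentence-level features that are most relevant to the trigger word. Finally, both sets of features are fed into a convolutional neural network with multiple max-pooling layers to extract more of the useful context chunks. Experiments on the ACE2005 English corpus show that the model achieves an F1 score of 74.4%, which is 0.4 percentage points higher than the best existing model, the text-embedding-enhanced model (DEEB).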

The main difficulties of event detection currently lie in polysemy and in detecting multiple events within one sentence. To overcome these difficulties, we propose a novel recurrent and convolutional neural network with attention based on language models (LM-ARCNN). The model first obtains word embeddings from language models (ELMo) and feeds them into a long short-term memory neural network (LSTM), which captures sentence-level features. It then applies an attention mechanism to the learned sentence-level features to find those most closely related to the candidate trigger words. Finally, the sentence-level features and the attention features are fed into a dynamic multi-pooling convolutional neural network (DMCNN), whose dynamic multi-pooling layer pools according to the event trigger in order to retain more of the crucial context chunks. Experiments on the ACE2005 English corpus show that the model achieves state-of-the-art performance, with an F1 score of 74.4%.
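The dynamic multi-pooling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dynamic_multi_pooling`, the toy feature values, and the split rule (left segment up to and including the candidate trigger, right segment after it, empty segment pooled to zeros) are assumptions made here for illustration only.

```python
from typing import List


def dynamic_multi_pooling(feature_map: List[List[float]], trigger_idx: int) -> List[float]:
    """Max-pool each filter separately over the two segments around the trigger.

    feature_map: one row per token, one column per convolution filter.
    trigger_idx: position of the candidate trigger. Assumed split rule:
                 the left segment includes the trigger, the right segment
                 holds the remaining tokens.
    """
    if not 0 <= trigger_idx < len(feature_map):
        raise ValueError("trigger_idx must index a token in feature_map")
    n_filters = len(feature_map[0])
    left = feature_map[: trigger_idx + 1]
    right = feature_map[trigger_idx + 1:]

    def pool(segment: List[List[float]]) -> List[float]:
        # Assumption: an empty segment (trigger is the last token) pools to zeros.
        if not segment:
            return [0.0] * n_filters
        return [max(row[f] for row in segment) for f in range(n_filters)]

    # Concatenate the per-segment maxima instead of taking one global maximum.
    return pool(left) + pool(right)


# Toy feature map: 4 tokens, 2 filters; the candidate trigger is token 1.
features = [[0.1, 0.9], [0.5, 0.2], [0.3, 0.4], [0.8, 0.1]]
pooled = dynamic_multi_pooling(features, trigger_idx=1)
```

Compared with a single max over the whole sentence, pooling per segment keeps one maximum per filter on each side of the trigger, which is the sense in which more context chunks are retained.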