
JIANG Yufan, LI Bei, LIN Ye, et al. Automatic learning method of neuron connections for language models[J]. Journal of Xiamen University (Natural Science), 2019, 58(02): 225-230. [doi:10.6043/j.issn.0438-0479.201811032]

Journal of Xiamen University (Natural Science) [ISSN: 0438-0479 / CN: 35-1070/N]

Volume:
58
Issue:
2019, No. 02
Pages:
225-230
Section:
Computational Methods for Natural Language Processing
Publication date:
2019-03-27

Article Information

Title:
Automatic learning method of neuron connections for language models
Article number:
0438-0479(2019)02-0225-06
Author(s):
JIANG Yufan, LI Bei, LIN Ye, LI Yinqiao, XIAO Tong*, ZHU Jingbo
Natural Language Processing Laboratory, School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
Keywords:
language model; neuron connection; pruning
CLC number:
TP 391
DOI:
10.6043/j.issn.0438-0479.201811032
Document code:
A
Abstract:
In natural language processing, neural network architectures are typically designed by hand, which leaves a large amount of redundancy in complex network structures. To reduce this redundancy, model compression methods such as pruning are widely used, but because they cut the model directly according to criteria unrelated to the training process, they often cause a loss in performance. This paper therefore explores a method for automatically learning the neuron connections of a neural network: connections are dynamically grown and deleted during training, so that the network's connectivity can be adapted on the fly and a more compact, efficient structure is obtained. Applying this automatic growth and elimination to a neural language model, the network size can be reduced by 49% while the original performance is maintained.
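The abstract describes growing and deleting neuron connections dynamically during training rather than pruning a finished model with training-independent criteria. The sketch below is only a rough illustration of that idea, not the authors' actual algorithm (which the record does not detail): it keeps a 0/1 mask over a linear layer's weights, deletes the weakest active connections by magnitude, and regrows inactive connections where the loss gradient with respect to the effective (masked) weight is largest. The class name MaskedLinear, both criteria, and the ratios are assumptions made for illustration.

```python
# Rough illustration of dynamic connection growth and deletion during training.
# ASSUMPTIONS (not taken from the paper): the class name, the deletion criterion
# (smallest |weight| among active connections), the growth criterion
# (largest |dL/dW_eff| among inactive connections), and the default ratios.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedLinear(nn.Module):
    """Linear layer whose connections can be deleted and regrown via a 0/1 mask."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # mask[i, j] == 0 means the connection from input j to output i is removed
        self.register_buffer("mask", torch.ones(out_features, in_features))
        self._w_eff = None

    def forward(self, x):
        w_eff = self.weight * self.mask
        if w_eff.requires_grad:
            # Keep dL/dW_eff so growth can inspect gradients at masked positions,
            # where dL/dW itself is always zero.
            w_eff.retain_grad()
            self._w_eff = w_eff
        return F.linear(x, w_eff, self.bias)

    @torch.no_grad()
    def delete_connections(self, drop_ratio: float = 0.1):
        """Remove the weakest active connections (smallest absolute weight)."""
        active = self.mask == 1
        k = int(drop_ratio * int(active.sum()))
        if k == 0:
            return
        scores = torch.where(active, self.weight.abs(),
                             torch.full_like(self.weight, float("inf")))
        idx = torch.topk(scores.flatten(), k, largest=False).indices
        self.mask.view(-1)[idx] = 0.0

    @torch.no_grad()
    def grow_connections(self, grow_ratio: float = 0.05):
        """Re-enable removed connections where the loss gradient is largest."""
        if self._w_eff is None or self._w_eff.grad is None:
            return
        inactive = self.mask == 0
        k = min(int(grow_ratio * self.mask.numel()), int(inactive.sum()))
        if k == 0:
            return
        grad = self._w_eff.grad.abs()
        scores = torch.where(inactive, grad, torch.full_like(grad, -1.0))
        idx = torch.topk(scores.flatten(), k).indices
        self.mask.view(-1)[idx] = 1.0
        self.weight.data.view(-1)[idx] = 0.0  # regrown connections restart from zero
```

A possible usage pattern (again only a sketch): every few hundred training steps, after loss.backward() and before optimizer.step(), call delete_connections() and grow_connections() on each masked layer; the current sparsity is 1 - mask.mean(). Because the forward pass multiplies the weights by the mask, deleted connections receive zero gradient and stay frozen until they are regrown.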

References:

[1] BENGIO Y,DUCHARME R,VINCENT P,et al.A neural probabilistic language model[J].Journal of Machine Learning Research,2003,3:1137-1155.
[2] VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[EB/OL].[2018-10-22].https://arxiv.org/pdf/1706.03762.
[3] GRAVES A,MOHAMED A R,HINTON G.Speech recognition with deep recurrent neural networks[C]//IEEE International Conference on Acoustics,Speech and Signal Processing.Vancouver:IEEE,2013:6645-6649.
[4] HARALICK R M.Textural features for image classification[J].IEEE Transactions on Systems,Man,and Cybernetics,1973,3(6):610-621.
[5] DEAN J,CORRADO G S,MONGA R,et al.Large scale distributed deep networks[C]//International Conference on Neural Information Processing Systems.Lake Tahoe:Curran Associates Inc,2012:1223-1231.
[6] LECUN Y,DENKER J S,SOLLA S A.Optimal brain damage[C]//International Conference on Neural Information Processing Systems.Cambridge:MIT Press,1989:598-605.
[7] THODBERG H H.Improving generalization of neural networks through pruning[J].International Journal of Neural Systems,1991,1(4):317-326.
[8] HAN S,MAO H,DALLY W J.Deep compression:compressing deep neural networks with pruning,trained quantization and Huffman coding[EB/OL].[2018-10-22].https://arxiv.org/pdf/1510.00149.
[9] NABHAN T M,ZOMAYA A Y.Toward generating neural network structures for function approximation[J].Neural Networks,1994,7(1):89-99.
[10] KADETOTAD D,ARUNACHALAM S,CHAKRABARTI C,et al.Efficient memory compression in deep neural networks using coarse-grain sparsification for speech applications[C]//Proceedings of the 35th International Conference on Computer-Aided Design.Austin:ACM,2016:78.
[11] MÉZARD M,NADAL J P.Learning in feedforward layered networks:the tiling algorithm[J].Journal of Physics A:Mathematical and General,1989,22(12):2191-2203.
[12] DAI X,YIN H,JHA N K.NeST:a neural network synthesis tool based on a grow-and-prune paradigm[EB/OL].[2018-10-22].https://arxiv.org/pdf/1711.02017.

Memo:
Received: 2018-11-16; Accepted: 2019-01-12
Funding: National Natural Science Foundation of China (61432013, 61732005, 61876035); Fundamental Research Funds for the Central Universities (N161604007); Innovative Talents Support Program for Universities of Liaoning Province (LR20170606)
*Corresponding author: xiaotong@mail.neu.edu.cn