
[1] WU Jiawen, LIU Qinting, ZENG Delu, et al. Speech Enhancement Based on Nonparametric Bayesian Method[J]. Journal of Xiamen University (Natural Science), 2017, 56(03): 423-428. [doi:10.6043/j.issn.0438-0479.201702026]

Speech Enhancement Based on Nonparametric Bayesian Method

Journal of Xiamen University (Natural Science) [ISSN: 0438-0479 / CN: 35-1070/N]

Volume: 56
Issue: 2017, No. 03
Pages: 423-428
Section: Research Articles
Publication date: 2017-05-24

Article Info

Title: Speech Enhancement Based on Nonparametric Bayesian Method
Article ID: 0438-0479(2017)03-0423-06
Author(s): WU Jiawen¹, LIU Qinting¹, ZENG Delu², DING Xinghao¹, LI Lin¹*
1. School of Information Science and Engineering, Xiamen University, Xiamen 361005, China; 2. School of Mathematics, South China University of Technology, Guangzhou 510641, China
Keywords: sparse representation; nonparametric Bayesian estimation; spike-slab prior; dictionary learning; speech enhancement
CLC number: TN 912
DOI: 10.6043/j.issn.0438-0479.201702026
Document code: A
Abstract:
A speech enhancement algorithm based on a nonparametric Bayesian method with a spike-slab prior (NBSP) is proposed. Within a sparse representation framework, dictionary learning, sparse coefficient estimation, and noise variance estimation are merged into a single Bayesian posterior estimation procedure, and the spike-slab prior is used to strengthen sparsity. First, the noisy speech is decomposed into three components: clean speech, Gaussian noise, and residual noise, each modeled with its own prior distribution. Next, a Markov chain Monte Carlo sampling algorithm is used to compute the posterior distribution of every parameter in the three models. Finally, the clean speech is reconstructed within the sparse representation framework. Without prior knowledge of the noise variance, NBSP can be applied directly to the noisy speech to infer the sparsity of the speech. Experiments were carried out on noisy speech from the NOIZEUS database at input SNRs of 0, 5, and 10 dB under three noise types (white Gaussian, train, and street), with PESQ and SegSNR as quality measures. Compared with several commonly used speech enhancement methods, NBSP performs better, especially for nonstationary noise at low input SNR.
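The abstract names SegSNR as one of its two quality measures but does not define it. As a rough illustration, the sketch below computes a commonly used form of segmental SNR: the per-frame SNR in dB, clamped to a fixed range and averaged over frames. The function name `segmental_snr`, the frame length of 256 samples, and the clamping range of -10 to 35 dB are assumptions chosen for illustration; the paper's exact settings are not stated on this page and may differ.

```python
import numpy as np


def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean per-frame SNR in dB, each frame clamped to [min_db, max_db].

    frame_len, min_db and max_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    if clean.shape != enhanced.shape:
        raise ValueError("signals must have the same shape")
    n_frames = len(clean) // frame_len
    if n_frames == 0:
        raise ValueError("signal shorter than one frame")

    eps = 1e-12  # guards against division by zero and log of zero
    # Drop the incomplete trailing frame, then reshape to (n_frames, frame_len).
    c = clean[: n_frames * frame_len].reshape(n_frames, frame_len)
    e = enhanced[: n_frames * frame_len].reshape(n_frames, frame_len)

    signal_energy = np.sum(c ** 2, axis=1)
    error_energy = np.sum((c - e) ** 2, axis=1)
    snr_db = 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))
    return float(np.mean(np.clip(snr_db, min_db, max_db)))
```

Calling `segmental_snr(clean, enhanced)` on two time-aligned, equal-length waveforms returns a single value in dB; higher values indicate that the enhanced signal is closer to the clean reference. Clamping keeps silent or near-perfect frames from dominating the average.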


Memo

Received: 2017-02-17; Accepted: 2017-04-11
*Corresponding author: lilin@xmu.edu.cn
Citation: WU J W, LIU Q T, ZENG D L, et al. Speech enhancement based on nonparametric Bayesian method[J]. J Xiamen Univ Nat Sci, 2017, 56(3): 423-428. (in Chinese)