Alzheimer's Disease Diagnosis Model Based on t-Distributed Stochastic Neighbor Embedding

CHENG Chao, YANG Chenhui*

(School of Information Science and Engineering, Xiamen University, Xiamen 361005, China)

Keywords: support vector machine; t-distributed stochastic neighbor embedding; ensemble learning; Alzheimer's disease

DOI: 10.6043/j.issn.0438-0479.201601008

A model built on cortical thickness data is used to diagnose Alzheimer's disease (AD). When training samples are few and the data are complex and nonlinear, the support vector machine (SVM) algorithm performs better than algorithms such as the back-propagation (BP) neural network and k-nearest neighbors (kNN). Because the performance of SVM is affected by high data dimensionality, the t-distributed stochastic neighbor embedding (t-SNE) algorithm is introduced into the SVM model. t-SNE captures much of the local structure of the original high-dimensional data while also revealing global structure, such as the presence of clusters at several scales. t-SNE first reduces the nonlinear data to a low-dimensional space; SVM then maps the low-dimensional data into a new high-dimensional space and searches for the optimal separating hyperplane to obtain the best classification result. Finally, the idea of the ensemble learning algorithm AdaBoost is incorporated into the model, which improves its classification accuracy and makes it more robust.
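The AdaBoost step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their weak learner or settings, so simple threshold stumps are swapped in for the SVM weak learners, and a toy one-dimensional feature stands in for a t-SNE embedding. All names (`train_stump`, `adaboost_fit`, `adaboost_predict`) and the data are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best threshold stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                # The stump predicts `sign` when the feature is >= thr, else -sign.
                pred = [sign if row[f] >= thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def stump_predict(stump, row):
    _, f, thr, sign = stump
    return sign if row[f] >= thr else -sign

def adaboost_fit(X, y, rounds=10):
    """Fit a weighted ensemble of stumps; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                       # start from uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)          # guard against log(1/0) for a perfect stump
        if err >= 0.5:                      # weak learner is no better than chance
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest, then renormalize.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data (an assumption for illustration): no single threshold separates the classes.
X = [[float(v)] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(ensemble, row) for row in X]
print(predictions == y)  # prints True
```

On this toy set a single stump misclassifies three samples, while the weighted vote of three stumps fits all ten, which mirrors in miniature the accuracy gain the abstract attributes to the ensemble step.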