Adaptive sparse fuzzy clustering model

GAO Yunlong1, LAI Wenxin1, PAN Jinyan2*, KANG Liwen1

(1. School of Aerospace Engineering, Xiamen University, Xiamen 361102, China; 2. School of Information Engineering, Jimei University, Xiamen 361021, China)

fuzzy clustering; sparsity; regularization; soft thresholding

DOI: 10.6043/j.issn.0438-0479.202003053


The traditional C-means algorithm is a hard-partition clustering method that is sensitive to the choice of initial cluster centers and prone to cluster centers that coincide. The fuzzy C-means (FCM) clustering algorithm was proposed to overcome this defect. However, the trailing and tail-warping behavior of the fuzzy memberships in FCM introduces new problems: on the one hand, clustering results become more susceptible to noise and outliers; on the other hand, the separability of the clusters decreases and the clustering results generalize poorly. To address these problems, we propose a new adaptive fuzzy clustering algorithm that combines regularization with soft thresholding, so that the fuzzy memberships have a clearly sparse structure. By introducing a virtual class, the algorithm reduces the influence of anomalous points and outliers on the clustering results, resolves the tail-warping problem of FCM, and improves cluster separability and within-class cohesion. Experiments on synthetic datasets, UCI datasets, and image segmentation tasks, in comparison with related algorithms, verify the effectiveness of the proposed algorithm.
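The membership behavior described in the abstract can be illustrated with a minimal sketch. The code below is not the proposed algorithm: it shows only the textbook FCM membership update and a generic soft-threshold operator. The function names, the toy data, the fuzzifier `m = 2`, and the threshold `lam = 0.05` are all assumptions chosen for illustration.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Textbook FCM membership update: u_ij proportional to d_ij^(-2/(m-1)),
    normalized so that each row (one sample) sums to 1."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def soft_threshold(v, lam):
    """Generic soft-threshold operator: sign(v) * max(|v| - lam, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

# Toy 2-D data (hypothetical): two points near the first center,
# one point on the second center, and one far outlier.
X = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 0.0], [100.0, 0.0]])
centers = np.array([[0.0, 0.0], [4.0, 0.0]])

U = fcm_memberships(X, centers)

# Illustration only: thresholding zeroes small memberships but does not
# renormalize rows; the paper's actual update rule is not reproduced here.
S = soft_threshold(U, 0.05)

print(np.round(U, 4))
print(np.round(S, 4))
```

In this toy setting the point at (0.2, 0) keeps a small but nonzero membership in the distant cluster, and the far outlier at (100, 0) receives a membership close to 0.5 in both clusters, which is the kind of non-sparse behavior the abstract attributes to FCM. Applying the soft threshold sets the small membership exactly to zero, which hints at how thresholding can produce sparse memberships.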