Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering

Xiaofei Zhu; Khoi Duy Do; Jiafeng Guo; Jun Xu; Stefan Dietze

doi:10.1007/s11063-020-10375-9

Details

Original language	English
Number of pages	16
Journal	Neural processing letters
Volume	53
Publication status	Published - Feb 2021
Externally published	Yes

Abstract

Clustering is an essential data analysis technique and has been studied extensively over the past decades. Previous studies have shown that data representation and data structure information are two critical factors for improving clustering performance, and they form two important lines of research. The first line of research attempts to learn representative features for clustering problems, especially by using deep neural networks. The second concerns exploiting the geometric structure information within data for clustering. Although both have achieved promising performance in many clustering tasks, few efforts have been dedicated to combining them in a unified deep clustering framework, which is the research gap we aim to bridge in this work. In this paper, we propose a novel approach, Manifold regularized Deep Embedded Clustering (MDEC), to address this gap. It simultaneously models the data-generating distribution, cluster assignment consistency, and the geometric structure of the data in a unified framework. The proposed method can be optimized with mini-batch stochastic gradient descent and back-propagation. We evaluate MDEC on three real-world datasets (USPS, REUTERS-10K, and MNIST), where experimental results demonstrate that our model outperforms baseline models and achieves state-of-the-art performance.
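The abstract names three ingredients (data-generating distribution, cluster assignment consistency, geometric structure) but does not state the objective itself. The following is only a rough sketch of how such a combined loss might look, not the authors' formulation. It assumes a reconstruction error for the autoencoder, a DEC-style Student-t soft assignment with a KL term for assignment consistency, and a k-nearest-neighbour graph penalty for the manifold constraint. All function names, the weights `gamma` and `lam`, and the graph construction are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors given as lists."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def soft_assign(z, centers, alpha=1.0):
    """Student-t soft assignment q_ij of embedding i to cluster j (assumed, DEC-style)."""
    q = []
    for zi in z:
        row = [(1.0 + sq_dist(zi, c) / alpha) ** (-(alpha + 1.0) / 2.0) for c in centers]
        total = sum(row)
        q.append([v / total for v in row])
    return q

def target_distribution(q):
    """Sharpened target p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    freq = [sum(col) for col in zip(*q)]
    p = []
    for row in q:
        w = [v * v / f for v, f in zip(row, freq)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def knn_affinity(x, k):
    """Symmetric 0/1 affinity matrix linking each point to its k nearest neighbours."""
    n = len(x)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: sq_dist(x[i], x[j]))
        for j in others[:k]:
            w[i][j] = w[j][i] = 1.0
    return w

def manifold_penalty(z, w):
    """0.5 * sum_ij w_ij * ||z_i - z_j||^2: neighbours in input space stay close in the embedding."""
    n = len(z)
    return 0.5 * sum(w[i][j] * sq_dist(z[i], z[j]) for i in range(n) for j in range(n))

def combined_loss(x, x_rec, z, centers, w, gamma=0.1, lam=0.01):
    """Reconstruction + gamma * KL(P || Q) + lam * manifold penalty (hypothetical weighting)."""
    rec = sum(sq_dist(a, b) for a, b in zip(x, x_rec)) / len(x)
    q = soft_assign(z, centers)
    p = target_distribution(q)
    kl = sum(pv * math.log(pv / qv)
             for p_row, q_row in zip(p, q) for pv, qv in zip(p_row, q_row))
    return rec + gamma * kl + lam * manifold_penalty(z, w)

# Toy data: two well-separated groups in input space and in a 1-D embedding.
x = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
z = [[0.0], [0.1], [2.0], [2.1]]
centers = [[0.0], [2.0]]
w = knn_affinity(x, k=1)
loss = combined_loss(x, x, z, centers, w)
```

In a real training loop the embeddings `z` and reconstructions `x_rec` would come from the encoder and decoder for each mini-batch, and the loss would be differentiated with an autodiff framework rather than evaluated on plain lists as here.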

Keywords

Clustering, Deep neural networks, Manifold constraint, Stacked autoencoder

ASJC Scopus subject areas

Computer Science(all) › Software
Neuroscience(all) › General Neuroscience
Computer Science(all) › Computer Networks and Communications
Computer Science(all) › Artificial Intelligence

Cite this

Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering. / Zhu, Xiaofei; Do, Khoi Duy; Guo, Jiafeng et al.
In: Neural processing letters, Vol. 53, 02.2021.

Research output: Contribution to journal › Article › Research › peer review

Zhu, X, Do, KD, Guo, J, Xu, J & Dietze, S 2021, 'Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering', Neural processing letters, vol. 53. https://doi.org/10.1007/s11063-020-10375-9

Zhu, X., Do, K. D., Guo, J., Xu, J., & Dietze, S. (2021). Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering. Neural processing letters, 53. https://doi.org/10.1007/s11063-020-10375-9

Zhu X, Do KD, Guo J, Xu J, Dietze S. Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering. Neural processing letters. 2021 Feb;53. doi: 10.1007/s11063-020-10375-9

Zhu, Xiaofei ; Do, Khoi Duy ; Guo, Jiafeng et al. / Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering. In: Neural processing letters. 2021 ; Vol. 53.

Download

@article{7e496f6d47754c99899fdd1001d5a1af,

title = "Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering",

abstract = "Clustering is an essential data analysis technique and has been studied extensively over the last decades. Previous studies have shown that data representation and data structure information are two critical factors for improving clustering performance, and it forms two important lines of research. The first line of research attempts to learn representative features, especially utilizing the deep neural networks, for handling clustering problems. The second concerns exploiting the geometric structure information within data for clustering. Although both of them have achieved promising performance in lots of clustering tasks, few efforts have been dedicated to combine them in a unified deep clustering framework, which is the research gap we aim to bridge in this work. In this paper, we propose a novel approach, Manifold regularized Deep Embedded Clustering (MDEC), to deal with the aforementioned challenge. It simultaneously models data generating distribution, cluster assignment consistency, as well as geometric structure of data in a unified framework. The proposed method can be optimized by performing mini-batch stochastic gradient descent and back-propagation. We evaluate MDEC on three real-world datasets (USPS, REUTERS-10K, and MNIST), where experimental results demonstrate that our model outperforms baseline models and obtains the state-of-the-art performance.",

keywords = "Clustering, Deep neural networks, Manifold constraint, Stacked autoencoder",

author = "Xiaofei Zhu and Do, {Khoi Duy} and Jiafeng Guo and Jun Xu and Stefan Dietze",

note = "Funding Information: The work was partially supported by the National Natural Science Foundation of China (No. 61722211), the Federal Ministry of Education and Research (No. 01LE1806A), the Natural Science Foundation of Chongqing (No. cstc2017jcyjBX0059), and the Beijing Academy of Artificial Intelligence (No. BAAI2019ZD0306). ",

year = "2021",

month = feb,

doi = "10.1007/s11063-020-10375-9",

language = "English",

volume = "53",

journal = "Neural processing letters",

issn = "1370-4621",

publisher = "Springer Netherlands",

}

Download

TY - JOUR

T1 - Exploring Implicit and Explicit Geometrical Structure of Data for Deep Embedded Clustering

AU - Zhu, Xiaofei

AU - Do, Khoi Duy

AU - Guo, Jiafeng

AU - Xu, Jun

AU - Dietze, Stefan

N1 - Funding Information: The work was partially supported by the National Natural Science Foundation of China (No. 61722211), the Federal Ministry of Education and Research (No. 01LE1806A), the Natural Science Foundation of Chongqing (No. cstc2017jcyjBX0059), and the Beijing Academy of Artificial Intelligence (No. BAAI2019ZD0306).

PY - 2021/2

Y1 - 2021/2

N2 - Clustering is an essential data analysis technique and has been studied extensively over the last decades. Previous studies have shown that data representation and data structure information are two critical factors for improving clustering performance, and it forms two important lines of research. The first line of research attempts to learn representative features, especially utilizing the deep neural networks, for handling clustering problems. The second concerns exploiting the geometric structure information within data for clustering. Although both of them have achieved promising performance in lots of clustering tasks, few efforts have been dedicated to combine them in a unified deep clustering framework, which is the research gap we aim to bridge in this work. In this paper, we propose a novel approach, Manifold regularized Deep Embedded Clustering (MDEC), to deal with the aforementioned challenge. It simultaneously models data generating distribution, cluster assignment consistency, as well as geometric structure of data in a unified framework. The proposed method can be optimized by performing mini-batch stochastic gradient descent and back-propagation. We evaluate MDEC on three real-world datasets (USPS, REUTERS-10K, and MNIST), where experimental results demonstrate that our model outperforms baseline models and obtains the state-of-the-art performance.

KW - Clustering

KW - Deep neural networks

KW - Manifold constraint

KW - Stacked autoencoder

UR - http://www.scopus.com/inward/record.url?scp=85092801329&partnerID=8YFLogxK

U2 - 10.1007/s11063-020-10375-9

DO - 10.1007/s11063-020-10375-9

M3 - Article

AN - SCOPUS:85092801329

VL - 53

JO - Neural processing letters

JF - Neural processing letters

SN - 1370-4621

ER -
