Details (Don't) Matter: Isolating Cluster Information in Deep Embedded Spaces

Abstract

Deep clustering techniques combine representation learning with clustering objectives to improve clustering performance. Among existing deep clustering techniques, autoencoder-based methods are the most prevalent. While they achieve promising clustering results, they suffer from an inherent conflict between preserving details, as expressed by the reconstruction loss, and finding similar groups by ignoring details, as expressed by the clustering loss. This conflict leads to brittle training procedures, dependence on trade-off hyperparameters, and less interpretable results. We propose ACe/DeC, a framework that is compatible with Autoencoder Centroid-based Deep Clustering methods and automatically learns a latent representation consisting of two separate spaces. The clustering space captures all cluster-specific information, and the shared space explains general variation in the data. This separation resolves the above-mentioned conflict and allows our method to learn both detailed reconstructions and cluster-specific abstractions. We evaluate our framework with extensive experiments to show several benefits: (1) cluster performance: on various data sets we outperform relevant baselines; (2) no hyperparameter tuning: this improved performance is achieved without introducing new clustering-specific hyperparameters; (3) interpretability: isolating the cluster-specific information in a separate space is advantageous for data exploration and for interpreting the clustering results; and (4) dimensionality of the embedded space: we automatically learn a low-dimensional space for clustering. Our ACe/DeC framework isolates cluster information and increases stability and interpretability while improving cluster performance.
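
The idea of separating the latent representation into a clustering space and a shared space can be illustrated with a minimal sketch. It is not the ACe/DeC implementation: the abstract does not describe how the split is learned, so the sketch assumes a fixed, hand-chosen split of the latent dimensions, a toy linear encoder and decoder, and a k-means-style clustering loss. All names (`split_latent`, `clustering_loss`, `reconstruction_loss`, `n_cluster_dims`) are hypothetical.

```python
import numpy as np


def split_latent(z, n_cluster_dims):
    """Split latent codes into a clustering part and a shared part.

    Assumption: a hard split by column index, chosen by hand.
    """
    return z[:, :n_cluster_dims], z[:, n_cluster_dims:]


def clustering_loss(z_cluster, centroids):
    """Mean squared distance to the nearest centroid, using clustering dims only."""
    # Pairwise squared distances, shape (n_samples, n_centroids).
    d2 = ((z_cluster[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean(), d2.argmin(axis=1)


def reconstruction_loss(x, x_hat):
    """Mean squared reconstruction error over all input features."""
    return ((x - x_hat) ** 2).mean()


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))      # toy data: 6 samples, 8 features
W_enc = rng.normal(size=(8, 4))  # toy linear encoder (assumption)
W_dec = rng.normal(size=(4, 8))  # toy linear decoder (assumption)

z = x @ W_enc
z_cluster, z_shared = split_latent(z, n_cluster_dims=2)
centroids = rng.normal(size=(3, 2))

# The decoder sees the full latent code, so details can be reconstructed,
# while the clustering loss only constrains the clustering dimensions.
l_rec = reconstruction_loss(x, z @ W_dec)
l_clu, labels = clustering_loss(z_cluster, centroids)
total = l_rec + l_clu
```

In this sketch, changing the values in `z_shared` leaves `l_clu` and `labels` untouched, which is the property the abstract attributes to the separation: general variation can be kept for reconstruction without disturbing the cluster structure.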

Authors
  • Miklautz, Lukas
  • Bauer, Lena G. M.
  • Mautz, Dominik
  • Tschiatschek, Sebastian
  • Böhm, Christian
  • Plant, Claudia
Shortfacts
Category
Paper in Conference Proceedings or in Workshop Proceedings (Paper)
Event Title
Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21)
Divisions
Data Mining and Machine Learning
Event Location
Montreal, Canada
Event Type
Conference
Event Dates
19-27 August 2021
Series Name
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence Survey Track
ISSN/ISBN
978-0-9992411-9-6
Page Range
pp. 2826-2832
Date
19 August 2021