Published in

2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton)

DOI: 10.1109/allerton.2016.7852287

Links

Tools

Export citation

Search in Google Scholar

Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

This paper is available in a repository.
This paper is available in a repository.

Full text: Download

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Red circle
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n,m → ∞$ and $α = m/n$ stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of $α$ and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine the accuracy achievable by the Bayes-optimal estimation algorithm. In particular, we find that when the number of clusters is sufficiently large, $r > 4 + 2 \sqrt{\alpha}$, there is a gap between the threshold for information-theoretically optimal performance and the threshold at which known algorithms succeed. ; Comment: 8 pages, 3 figures, conference