Published in

Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium

DOI: 10.1109/ijcnn.2000.860735

A Model-Based Distance for Clustering

Journal article published in 2000 by Magnus Rattray
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

A Riemannian distance is defined which is appropriate for clustering multivariate data. This distance requires that data is first fitted with a differentiable density model, allowing the definition of an appropriate Riemannian metric. A tractable approximation is developed for the case of a Gaussian mixture model and the distance is tested on artificial data, demonstrating an ability to deal with differing length scales and linearly inseparable data clusters. Further work is required to investigate performance on larger data sets.

1 Introduction

Finding clusters in multivariate data is a difficult task in general, perhaps not least because the very definition of a cluster is ambiguous. A number of clustering algorithms have been developed in the literature which aim to minimise some sort of partitioning error measure (see, for example, [1]). Some notion of distance is required for these algorithms and typically the Euclidean distance is used. Recently, Tipping introduced a novel Rieman...