
Published in

Elsevier, Information Sciences, (257), p. 369-387, 2014

DOI: 10.1016/j.ins.2013.05.038

Subtractive clustering for seeding non-negative matrix factorizations

Journal article published in 2013 by Gabriella Casalino, Nicoletta Del Buono, Corrado Mencar
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Non-negative matrix factorization (NMF) is a multivariate analysis method that has proven useful in many areas, such as bioinformatics, molecular pattern discovery, pattern recognition, and document clustering. It seeks a reduced representation of a multivariate data matrix as the product of a basis matrix and an encoding matrix, both with only non-negative elements, in order to learn so-called part-based representations of the data. All algorithms for computing an NMF are iterative and converge only locally, so particular emphasis must be placed on proper initialization. Selecting appropriate starting matrices becomes more complex when the data carry special meaning, as in document clustering. In this paper, we propose adopting the subtractive clustering algorithm as a scheme for generating initial matrices for NMF algorithms. Comparisons with other commonly adopted NMF initializations have been performed, and the proposed scheme proves to be a good trade-off between effectiveness and speed. Moreover, we illustrate how the proposed initialization can suggest the number of basis vectors for NMF, when data distances are estimated, in clustering problems where the number of groups is not known a priori. The influence of a proper rank factor on the interpretability and effectiveness of the results is also discussed.
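
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Chiu-style subtractive clustering with a simplified stopping rule, uses the selected data points as the initial basis columns, fills the initial encoding matrix with Gaussian memberships (an assumption made here for illustration), and refines both with the standard Lee-Seung multiplicative updates for the Frobenius objective. The function names (`subtractive_clustering`, `seed_nmf`, `nmf_multiplicative`) and the defaults `ra`, `rb_factor`, and `stop_ratio` are hypothetical.

```python
import numpy as np


def subtractive_clustering(X, ra=0.5, rb_factor=1.5, stop_ratio=0.15):
    """Return indices of rows of X selected as cluster centers.

    X has shape (n_samples, n_features) and is assumed scaled to [0, 1].
    The parameter defaults are illustrative assumptions, not paper values.
    """
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (rb_factor * ra) ** 2
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Potential of each point: high when many neighbours are close.
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()
    centers = []
    while len(centers) < X.shape[0]:
        k = int(np.argmax(potential))
        peak = potential[k]
        # Simplified stopping rule: quit once the best remaining
        # potential is small relative to the first peak.
        if peak < stop_ratio * first_peak:
            break
        centers.append(k)
        # Subtract the influence of the new center from every point.
        potential = potential - peak * np.exp(-beta * d2[k])
    return np.array(centers)


def seed_nmf(V, ra=0.5, eps=1e-9):
    """Build starting matrices (W0, H0) for V ~ W @ H.

    V has shape (n_features, n_samples), is non-negative, and is
    assumed to contain at least one positive entry.
    """
    samples = V.T / V.max()          # scale samples into [0, 1]
    centers = subtractive_clustering(samples, ra=ra)
    # Basis: the data columns chosen as centers (eps avoids locked zeros).
    W0 = V[:, centers] + eps
    # Encoding: Gaussian membership of each sample to each center
    # (an assumption of this sketch, not necessarily the paper's choice).
    diff = samples[centers][:, None, :] - samples[None, :, :]
    H0 = np.exp(-(4.0 / ra ** 2) * np.sum(diff ** 2, axis=2))
    return W0, H0


def nmf_multiplicative(V, W, H, n_iter=200, eps=1e-9):
    """Lee-Seung multiplicative updates for the Frobenius objective."""
    W, H = W.copy(), H.copy()
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

In this sketch the number of centers returned by the clustering step also fixes the factorization rank, which mirrors the abstract's point that the initialization can suggest the number of basis vectors when it is not known in advance. The rank obtained depends strongly on the radius `ra`, so it should be treated as a starting estimate rather than a fixed answer.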