Dissemin is shutting down on January 1st, 2025

Published in

Oxford University Press, Bioinformatics, 1(39), 2022

DOI: 10.1093/bioinformatics/btac751

Links

Tools

Export citation

Search in Google Scholar

GeoWaVe: geometric median clustering with weighted voting for ensemble clustering of cytometry data

This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Orange circle
Postprint: archiving restricted
Red circle
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Abstract Motivation Clustering is an unsupervised method for identifying structure in unlabelled data. In the context of cytometry, it is typically used to categorize cells into subpopulations of similar phenotypes. However, clustering is greatly dependent on hyperparameters and the data to which it is applied as each algorithm makes different assumptions and generates a different ‘view’ of the dataset. As such, the choice of clustering algorithm can significantly influence results, and there is often not one preferred method but different insights to be obtained from different methods. To overcome these limitations, consensus approaches are needed that directly address the effect of competing algorithms. To the best of our knowledge, consensus clustering algorithms designed specifically for the analysis of cytometry data are lacking. Results We present a novel ensemble clustering methodology based on geometric median clustering with weighted voting (GeoWaVe). Compared to graph ensemble clustering methods that have gained popularity in single-cell RNA sequencing analysis, GeoWaVe performed favourably on different sets of high-dimensional mass and flow cytometry data. Our findings provide proof of concept for the power of consensus methods to make the analysis, visualization and interpretation of cytometry data more robust and reproducible. The wide availability of ensemble clustering methods is likely to have a profound impact on our understanding of cellular responses, clinical conditions and therapeutic and diagnostic options. Availability and implementation GeoWaVe is available as part of the CytoCluster package https://github.com/burtonrj/CytoCluster and published on the Python Package Index https://pypi.org/project/cytocluster. Benchmarking data described are available from https://doi.org/10.5281/zenodo.7134723. Supplementary information Supplementary data are available at Bioinformatics online.