JIND: joint integration and discrimination for automated single-cell annotation

Goyal, Mohit; Serrano, Guillermo; Argemi, Josepmaria; Shomorony, Ilan; Hernaez, Mikel; Ochoa, Idoia

Published in

Oxford University Press, Bioinformatics, 9(38), p. 2488-2495, 2022

DOI: 10.1093/bioinformatics/btac140


Journal article published in 2022 by Mohit Goyal, Guillermo Serrano, Josepmaria Argemi, Ilan Shomorony, Mikel Hernaez, Idoia Ochoa


Abstract

Motivation

An important step in the transcriptomic analysis of individual cells involves manually determining the cellular identities. To ease this labor-intensive annotation of cell types, there has been growing interest in automated cell annotation, which can be achieved by training classification algorithms on previously annotated datasets. Existing pipelines employ dataset integration methods to remove potential batch effects between source (annotated) and target (unannotated) datasets. However, the integration and classification steps are usually independent of each other and performed by different tools. We propose JIND (joint integration and discrimination for automated single-cell annotation), a neural-network-based framework for automated cell-type identification that performs integration in a space suitably chosen to facilitate cell classification. To account for batch effects, JIND performs a novel asymmetric alignment in which unseen cells are mapped onto the previously learned latent space, avoiding the need to retrain the classification model for new datasets. JIND also learns cell-type-specific confidence thresholds to identify cells that cannot be reliably classified.

Results

We show on several batched datasets that JIND's joint approach to integration and classification outperforms existing pipelines in accuracy, and that a smaller fraction of cells is rejected as unlabeled as a result of the cell-type-specific confidence thresholds. Moreover, we investigate cells misclassified by JIND and provide evidence suggesting that these misclassifications could be due to outliers in the annotated datasets or errors in the original approach used for annotation of the target batch.

Availability and implementation

Implementation for JIND is available at https://github.com/mohit1997/JIND and the data underlying this article can be accessed at https://doi.org/10.5281/zenodo.6246322.

Supplementary information

Supplementary data are available at Bioinformatics online.
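The idea of cell-type-specific confidence thresholds mentioned in the abstract can be illustrated with a minimal sketch. This is not JIND's implementation: the function names, the `target_recall` parameter, and the quantile-based rule for choosing thresholds are assumptions made for illustration only. The sketch assumes a classifier has already produced per-class probabilities for each cell, fits one threshold per cell type on validation data, and then marks low-confidence cells as unassigned.

```python
import numpy as np


def fit_class_thresholds(probs, labels, target_recall=0.95):
    """Pick one confidence threshold per class (hypothetical rule).

    probs:  (n_cells, n_classes) predicted probabilities on validation cells.
    labels: (n_cells,) true integer class labels.
    For each class, the threshold is the (1 - target_recall) quantile of the
    confidences of correctly classified cells, so that roughly target_recall
    of them stay above it.
    """
    n_classes = probs.shape[1]
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    thresholds = np.zeros(n_classes)
    for c in range(n_classes):
        correct = (preds == c) & (labels == c)
        if correct.any():
            thresholds[c] = np.quantile(conf[correct], 1.0 - target_recall)
    return thresholds


def predict_with_rejection(probs, thresholds, unassigned=-1):
    """Return the predicted class, or `unassigned` when the confidence is
    below the threshold of the predicted class."""
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    return np.where(conf >= thresholds[preds], preds, unassigned)


# Toy usage with two cell types and hand-picked thresholds.
toy_probs = np.array([[0.90, 0.10],
                      [0.55, 0.45],
                      [0.20, 0.80],
                      [0.40, 0.60]])
toy_thresholds = np.array([0.7, 0.5])
toy_calls = predict_with_rejection(toy_probs, toy_thresholds)
```

In this toy example the second cell is predicted as type 0 with confidence 0.55, which falls below that type's threshold of 0.7, so it is left unassigned, while a type 1 prediction with confidence 0.6 is accepted because type 1 has a lower threshold.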
