Data Availability of Open T-Cell Receptor Repertoire Data, a Systematic Assessment

Huang, Yu-Ning; Patel, Naresh Amrat; Mehta, Jay Himanshu; Ginjala, Srishti; Brodin, Petter; Gray, Clive M.; Patel, Yesha M.; Cowell, Lindsay G.; Burkhardt, Amanda M.; Mangul, Serghei

Published in

Frontiers Media, Frontiers in Systems Biology, (2), 2022

DOI: 10.3389/fsysb.2022.918792

This paper is made freely available by the publisher.

Abstract

Modern data-driven research can drive novel biomedical discoveries through secondary analyses of raw data. Such research must therefore be reproducible and robust if secondary analyses of immunogenomics data are to be precise and accurate. Scientific research requires rigor in designing and conducting experiments, and equally in writing and reporting results. It is also crucial to make raw data available, discoverable, and well described or annotated in order to support future re-analysis. To assess the data availability of published T cell receptor (TCR) repertoire data, we examined 11,918 TCR-Seq samples corresponding to 134 TCR-Seq studies ranging from 2006 to 2022. Among the 134 studies, only 38.1% had raw TCR-Seq data publicly available in public repositories. We also found a statistically significant association between the presence of data availability statements and increased raw data availability (p = 0.014). Yet 46.8% of studies with data availability statements failed to share the raw TCR-Seq data. There is a pressing need for the biomedical community to raise awareness of the importance of raw data availability in scientific research and to take immediate action to improve it, enabling cost-effective secondary analysis of existing immunogenomics data by the larger scientific community.
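As an illustration of how an association like the one reported above could be tested, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table (data availability statement present or absent, raw data shared or not shared). The abstract does not name the test that was used, so the choice of Fisher's exact test is an assumption, and the function name `fisher_exact_two_sided` and the counts in the usage lines are hypothetical placeholders, not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical, for illustration only):
        rows    = data availability statement present / absent
        columns = raw data shared / not shared
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely
    # than the observed one (small tolerance for float comparison).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Placeholder counts, NOT taken from the paper.
    p_value = fisher_exact_two_sided(8, 2, 1, 5)
    print(f"two-sided p-value: {p_value:.4f}")
```

A small p-value from such a test would indicate that sharing raw data and having a data availability statement are not independent in the sampled studies; it says nothing about the direction of causation.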
