Published in

American Society of Clinical Oncology, JCO Clinical Cancer Informatics, 6, 2022

DOI: 10.1200/cci.22.00014

Natural Language Processing of Computed Tomography Reports to Label Metastatic Phenotypes With Prognostic Significance in Patients With Colorectal Cancer

This paper is made freely available by the publisher.

Preprint: archiving forbidden
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

PURPOSE Natural language processing (NLP) applied to radiology reports can help identify clinically relevant M1 subcategories of patients with colorectal cancer (CRC). The primary purpose was to compare the overall survival (OS) of patients with CRC according to American Joint Committee on Cancer (AJCC) TNM staging and to explore an alternative classification. The secondary objective was to estimate the frequency of metastasis for each organ.

METHODS This retrospective study included patients with CRC who underwent computed tomography (CT) of the chest, abdomen, and pelvis between July 1, 2009, and March 26, 2019, at a tertiary cancer center, and whose reports had previously been labeled for the presence or absence of metastasis by an NLP prediction model. Patients were classified as M0, M1a, M1b, or M1c (AJCC), or by an alternative classification based on the number of organs with metastasis: M1, a single organ; M2, two organs; M3, three or more organs. Cox regression models were used to estimate hazard ratios, and Kaplan-Meier curves were used to visualize survival under the two M1 subclassifications.

RESULTS A total of 9,928 patients with 48,408 CT chest, abdomen, and pelvis reports were included. On the basis of NLP prediction, the median OS of M1a, M1b, and M1c was 4.47, 1.72, and 1.52 years, respectively. The median OS of M1, M2, and M3 was 4.24, 2.05, and 1.04 years, respectively. Metastases occurred most often in the liver (35.8%), abdominopelvic lymph nodes (32.9%), lungs (29.3%), peritoneum (22.0%), thoracic nodes (19.9%), bones (9.2%), and pelvic organs (7.5%). Spleen and adrenal metastases occurred in < 5%.

CONCLUSION NLP applied to a large radiology report database can identify clinically relevant metastatic phenotypes and can be used to investigate new M1 substaging for CRC. Patients with metastatic disease in three or more organs have the worst prognosis, with a median OS of about 1 year.
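The alternative classification described in the methods, which groups patients by the number of organs with metastasis, can be illustrated with a minimal sketch. This is not the authors' code. The function name `alt_m_category` and the organ labels are hypothetical, and mapping zero involved organs to M0 is an assumption, since the abstract defines only M1 to M3 for this scheme.

```python
def alt_m_category(organs_with_metastasis):
    """Return the alternative M category from the organs flagged as metastatic.

    Hypothetical helper for illustration only. The input is any iterable of
    organ labels (for example, NLP-derived labels from CT reports).
    """
    # Count distinct organs so that repeated labels do not inflate the count.
    n_organs = len(set(organs_with_metastasis))
    if n_organs == 0:
        return "M0"  # assumption: no metastatic organ maps to M0
    if n_organs == 1:
        return "M1"  # single organ
    if n_organs == 2:
        return "M2"  # two organs
    return "M3"      # three or more organs


# Illustrative inputs only, not study data.
print(alt_m_category(["liver"]))
print(alt_m_category(["liver", "lungs", "peritoneum"]))
```

Counting distinct organs rather than individual lesions reflects the abstract's wording ("single; two; three or more organs"). How repeated reports for the same patient are combined over time is not specified in the abstract and is left out of this sketch.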