Developing a cardiovascular disease risk-factors annotated corpus of Chinese electronic medical records

Published in 2016 by Jia Su, Bin He, Yi Guan, Jingchi Jiang, Jinfeng Yang
This paper is available in a repository.

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

Objective: The goal of this study was to build a corpus of cardiovascular disease (CVD) risk-factor annotations based on Chinese electronic medical records (CEMRs). The corpus is intended to support the development of a risk-factor information extraction system, which in turn can serve as a foundation for further study of the progression of risk factors and CVD.

Materials and Methods: We designed a light annotation task to capture CVD risk factors, together with the indicators, temporal attributes, and assertions explicitly stated in the records. The task included: 1) preparing the data; 2) creating guidelines for capturing annotations, with the help of clinicians; and 3) proposing an annotation method, which covered drafting the guidelines, training the annotators, updating the guidelines, and constructing the corpus.

Results: The outcome of this study was a risk-factor-annotated corpus based on de-identified discharge summaries and progress notes from 600 patients. Built with the help of specialists, the corpus has an inter-annotator agreement (IAA) F1-measure of 0.968, indicating high reliability.

Discussion: Our annotations cover 12 CVD risk factors, such as hypertension and diabetes. The annotations can be applied to the management of these chronic diseases and to the prediction of CVD.

Conclusion: Guidelines for capturing CVD risk-factor annotations from CEMRs were proposed, and an annotated corpus was established. The resulting document-level annotations can be used in future studies to monitor risk factors and CVD over the long term.

Comment: 29 pages, 3 figures, 3 tables
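
The inter-annotator agreement reported above is an F1-measure. As a minimal sketch of how such a score is commonly computed, the example below treats one annotator's document-level annotations as the reference and the other's as the response, and counts only exact matches. The abstract does not state the matching criteria the authors used, so the tuple layout (document id, risk factor, assertion), the function name `iaa_f1`, and the sample data are all hypothetical and for illustration only.

```python
from typing import Hashable, Set


def iaa_f1(annotator_a: Set[Hashable], annotator_b: Set[Hashable]) -> float:
    """Exact-match F1 agreement between two annotation sets.

    Annotator A is treated as the reference and annotator B as the response.
    This is an illustrative assumption, not the paper's documented procedure.
    """
    if not annotator_a and not annotator_b:
        return 1.0  # both annotators marked nothing: trivially in agreement
    matched = len(annotator_a & annotator_b)
    if matched == 0:
        return 0.0
    precision = matched / len(annotator_b)
    recall = matched / len(annotator_a)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: (document id, risk factor, assertion).
a = {
    ("doc1", "Hypertension", "present"),
    ("doc1", "Diabetes", "present"),
    ("doc2", "Smoking", "absent"),
}
b = {
    ("doc1", "Hypertension", "present"),
    ("doc1", "Diabetes", "present"),
    ("doc2", "Smoking", "present"),
}

score = iaa_f1(a, b)
print(f"IAA F1 = {score:.3f}")
```

With exact matching, the score is symmetric in the two annotators, so the choice of which one counts as the reference does not change the result.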