Towards a Framework for Developing Semantic Relatedness Reference Standards

Pakhomov, Serguei V. S.; Pedersen, Ted; McInnes, Bridget; Melton, Genevieve B.; Ruggieri, Alexander; Chute, Christopher G.

Published in

Elsevier, Journal of Biomedical Informatics, 2(44), p. 251-265, 2011

DOI: 10.1016/j.jbi.2010.10.004


Journal article published in 2010 by Serguei V. S. Pakhomov, Ted Pedersen, Bridget McInnes, Genevieve B. Melton, Alexander Ruggieri, Christopher G. Chute

This paper is made freely available by the publisher.


Abstract

Our objective is to develop a framework for creating reference standards for functional testing of computerized measures of semantic relatedness. Currently, research on computerized approaches to semantic relatedness between biomedical concepts relies on reference standards that were created for specific purposes and analyzed using a variety of methods. In most cases, these reference standards are not publicly available, and the information published in manuscripts that evaluate computerized semantic relatedness measures is not sufficient to reproduce the results. Our proposed framework is based on the experiences of the medical informatics and computational linguistics communities and addresses practical and theoretical issues with creating reference standards for semantic relatedness. We demonstrate the use of the framework on a pilot set of 101 medical term pairs rated for semantic relatedness by 13 medical coding experts. While the reliability of this particular reference standard is in the "moderate" range, we show that clustering and factor analyses offer a data-driven approach to finding systematic differences among raters and identifying groups of potential outliers. We test two ontology-based measures of relatedness and release both the reference standard, including individual ratings, and the R program used to analyze the ratings as open-source resources. Currently, these resources are intended to be used to reproduce and compare results of studies involving computerized measures of semantic relatedness. Our framework may be extended to the development of reference standards in other research areas in medical informatics, including automatic classification, information retrieval from medical records, and vocabulary/ontology development.
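To make the idea of a data-driven look at rater differences concrete, the following is a minimal Python sketch. It is not the authors' R program and does not reproduce their clustering or factor analyses. As a simpler stand-in, it computes pairwise Pearson correlations between raters and groups raters whose ratings correlate above a threshold. The function names, the threshold of 0.7, and the toy ratings are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant rating vector has no defined correlation")
    return cov / (var_x * var_y) ** 0.5


def rater_agreement(ratings):
    """Mean correlation of each rater with every other rater.

    `ratings` maps a rater id to a list of scores, one per term pair,
    with all raters scoring the term pairs in the same order.
    """
    per_rater = {rater: [] for rater in ratings}
    for a, b in combinations(ratings, 2):
        r = pearson(ratings[a], ratings[b])
        per_rater[a].append(r)
        per_rater[b].append(r)
    return {rater: mean(values) for rater, values in per_rater.items()}


def group_raters(ratings, threshold=0.7):
    """Group raters linked by a correlation of at least `threshold`.

    This is a simple threshold-linkage grouping (hypothetical stand-in
    for a proper clustering method): a rater joins, and merges, every
    existing group that contains at least one rater they correlate with.
    """
    groups = []
    for rater in ratings:
        linked = [
            g for g in groups
            if any(pearson(ratings[rater], ratings[m]) >= threshold for m in g)
        ]
        merged = {rater}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


# Toy data (invented): three raters scoring five term pairs.
toy_ratings = {
    "rater_a": [1, 2, 3, 4, 5],
    "rater_b": [2, 3, 4, 5, 6],
    "rater_c": [5, 4, 3, 2, 1],
}

print(rater_agreement(toy_ratings))
print(group_raters(toy_ratings))
```

A rater with a low mean correlation, or one who lands in a small group of their own, would be a candidate for closer inspection rather than automatic exclusion.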
