Semantically linking molecular entities in literature through entity relationships

Van Landeghem, Sofie; Van de Peer, Yves; Björne, Jari; Abeel, Thomas; De Baets, Bernard; Salakoski, Tapio

Published in

BioMed Central, BMC Bioinformatics, 13(Suppl 11), 2012

DOI: 10.1186/1471-2105-13-s11-s6

This paper is made freely available by the publisher.

Abstract

Background

Text mining tools have gained popularity for processing the vast number of research articles available in the biomedical literature. It is crucial that such tools extract information with a sufficient level of detail to be applicable in real-life scenarios. Studies of mining non-causal molecular relations contribute to this goal by formally identifying the relations between genes, promoters, complexes and various other molecular entities found in text. More importantly, these studies help to enhance the integration of text mining results with database facts.

Results

We describe, compare and evaluate two frameworks developed for the prediction of non-causal or 'entity' relations (REL) between gene symbols and domain terms. For the corresponding REL challenge of the BioNLP Shared Task of 2011, these systems ranked first (57.7% F-score) and second (41.6% F-score). In this paper, we investigate the performance discrepancy of 16 percentage points by benchmarking on a related and more extensive dataset, analysing the contribution of both the term detection and relation extraction modules. We further construct a hybrid system combining the two frameworks and experiment with intersection and union combinations, which achieve high-precision and high-recall results, respectively. Finally, we highlight extremely high-performance results (F-score > 90%) obtained for the specific subclass of embedded entity relations that are essential for integrating text mining predictions with database facts.

Conclusions

The results from this study will enable us in the near future to annotate semantic relations between molecular entities in the entire scientific literature available through PubMed. The recent release of the EVEX dataset, containing biomolecular event predictions for millions of PubMed articles, is an interesting and exciting opportunity to overlay these entity relations with event predictions on a literature-wide scale.
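
The intersection and union combinations mentioned in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the relation tuples, the names `system_a` and `system_b`, and the toy gold standard are all hypothetical, and the F-score is the usual balanced harmonic mean of precision and recall. The point is only to show why intersecting two systems' predictions tends to favour precision while taking their union tends to favour recall.

```python
from typing import Set, Tuple

# A relation is modelled here as a (gene symbol, domain term) pair.
# This representation is an assumption made for illustration only.
Relation = Tuple[str, str]


def precision_recall_f(predicted: Set[Relation], gold: Set[Relation]):
    """Return (precision, recall, F-score) of predictions against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Toy data, entirely hypothetical.
gold = {("geneA", "promoter"), ("geneB", "complex"),
        ("geneC", "domain"), ("geneD", "subunit")}
system_a = {("geneA", "promoter"), ("geneB", "complex"), ("geneE", "promoter")}
system_b = {("geneB", "complex"), ("geneC", "domain"), ("geneF", "complex")}

# Intersection: keep only relations predicted by both systems.
intersection = system_a & system_b
# Union: keep relations predicted by either system.
union = system_a | system_b

for name, predictions in [("intersection", intersection), ("union", union)]:
    p, r, f = precision_recall_f(predictions, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

On this toy data the intersection keeps a single relation that both systems agree on, so every kept prediction is correct but most gold relations are missed; the union recovers more gold relations at the cost of carrying along both systems' false positives.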
