Published in

F1000Research, (3), p. 96, 2014

DOI: 10.12688/f1000research.3216.1

F1000Research

DOI: 10.12688/f1000research.3456

An analysis on the entity annotations in biological corpora

Journal article published in 2014 by Mariana Neves
This paper is made freely available by the publisher.

Preprint: archiving forbidden
Postprint: archiving forbidden
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Collections of documents annotated with semantic entities and relationships are crucial resources for developing and evaluating text mining solutions in the biomedical domain. Here I present an overview of 36 corpora and an analysis of the semantic annotations they contain. Annotations for entity types were classified into six semantic groups, and an overview of the semantic entities found in each corpus is shown. The results show that while some semantic entities, such as genes, proteins, and chemicals, are consistently annotated in many collections, corpora for diseases, variations, and mutations are still few, despite their importance in the biological domain.