Published in

Elsevier, Current Plant Biology, (2), p. 1-11, 2015

DOI: 10.1016/j.cpb.2014.12.002

The KnownLeaf literature curation system captures knowledge about Arabidopsis leaf growth and development and facilitates integrated data mining

This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving forbidden
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

The information that connects genotypes and phenotypes is essentially embedded in research articles written in natural language. To facilitate access to this knowledge, we constructed a framework for the curation of the scientific literature studying the molecular mechanisms that control leaf growth and development in Arabidopsis thaliana (Arabidopsis). Standard structured statements, called relations, were designed to capture diverse data types, including phenotypes and gene expression linked to genotype description, growth conditions, genetic and molecular interactions, and details about molecular entities. Relations were then annotated from the literature, defining the relevant terms according to standard biomedical ontologies. This curation process was supported by a dedicated graphical user interface, called Leaf Knowtator. A total of 283 primary research articles were curated by a community of annotators, yielding 9947 relations monitored for consistency and over 12,500 references to Arabidopsis genes. This information was converted into a relational database (KnownLeaf) and merged with other public Arabidopsis resources relative to transcriptional networks, protein–protein interaction, gene co-expression, and additional molecular annotations. Within KnownLeaf, leaf phenotype data can be searched together with molecular data originating either from this curation initiative or from external public resources. Finally, we built a network (LeafNet) with a portion of the KnownLeaf database content to graphically represent the leaf phenotype relations in a molecular context, offering an intuitive starting point for knowledge mining. Literature curation efforts such as ours provide high quality structured information accessible to computational analysis, and thereby to a wide range of applications.

Integrated Project AGRON-OMICS, in the Sixth Framework Programme of the European Commission (LSHG-CT-2006-037704). TiMet - Linking the Clock to Metabolism, a Collaborative Project (Grant Agreement 245143) funded by the European Commission FP7, in response to call FP7-KBBE-2009-3. Research Foundation Flanders (FWO). TransPLANT project (funded by the European Commission within its 7th Framework Programme under the thematic area ‘Infrastructures’, contract number 283496). Agency for Innovation by Science and Technology in Flanders (IWT-Vlaanderen) (project no. 111164).

http://www.elsevier.com/locate/cpb