Published in

Elsevier, Zoologischer Anzeiger, (256), p. 54-60, 2015

DOI: 10.1016/j.jcz.2015.03.004

Peeking behind the page: using Natural Language Processing to identify and explore the characters used to classify sea anemones

Journal article published in 2015 by Marymegan Daly, Lorena A. Endara, John Gordon Burleigh
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Although most phylogenetic investigations are motivated by questions about the evolution of morphological attributes, morphological data are increasingly rare as a source of characters for reconstructing phylogeny, in part because these attributes are time-consuming to collect. Here we describe methods to mine the information contained in classifications as a source of phylogenetic characters, using the classification of actiniarian sea anemones (Cnidaria: Anthozoa) as our exemplar system. Our Natural Language Processing pipeline recovers more than 400 characters in the most widely used classification of sea anemones. However, the majority of these are problematic, reflecting semantic or logical inconsistencies or being scored for only a single taxon and thus inappropriate for phylogenetic reconstruction. Although the classification cannot be directly translated into a phylogenetic matrix, the exposure of the characters that underlie a classification provides important perspective into the basis and limits of a classification system and offers a valuable starting point for the creation of a phylogenetic matrix.