Links

Tools

Export citation

Search in Google Scholar

Representing nested semantic information in a linear string of text using XML.

This paper is available in a repository.
This paper is available in a repository.

Full text: Download

Question mark in circle
Preprint: policy unknown
Question mark in circle
Postprint: policy unknown
Question mark in circle
Published version: policy unknown

Abstract

XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We encountered this problem while marking up medical text with structured semantic information from a Natural Language Processor. Traditional approaches to this problem separate the structured information from the actual text mark up. This paper introduces an alternative solution, which tightly integrates the semantic structure with the text. The resulting XML markup preserves the linearity of the medical texts and can therefore be easily expanded with additional types of information.