Links

Tools

Export citation

Search in Google Scholar

The Pitfalls of Thesaurus Ontologization – the Case of the NCI Thesaurus

Journal article published in 2010 by Stefan Schulz, Daniel Schober ORCID, Ilinca Tudose, Holger Stenzhorn
This paper is available in a repository.
This paper is available in a repository.

Full text: Download

Question mark in circle
Preprint: policy unknown
Question mark in circle
Postprint: policy unknown
Question mark in circle
Published version: policy unknown

Abstract

Thesauri that are “ontologized” into OWL-DL semantics are highly amenable to modeling errors resulting from falsely interpreting existential restrictions. We investigated the OWL-DL representation of the NCI Thesaurus (NCIT) in order to assess the correctness of existential restrictions. A random sample of 354 axioms using the someValuesFrom operator was taken. According to a rating performed by two domain experts, roughly half of these examples, and in consequence more than 76,000 axioms in the OWL-DL version, make incorrect assertions if interpreted according to description logics semantics. These axioms therefore constitute a huge source for unintended models, rendering most logic-based reasoning unreliable. After identifying typical error patterns we discuss some possible improvements. Our recommendation is to either amend the problematic axioms in the OWL-DL formalization or to consider some less strict representational format.