Published in

ACM SIGAPP Applied Computing Review, 2(12), p. 64-77

DOI: 10.1145/2340416.2340422

Querying RDF dictionaries in compressed space

Journal article published in 2012 by Miguel A. Martínez Prieto, Javier D. Fernández, Rodrigo Cánovas
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Dictionaries are commonly used by applications that operate on huge RDF datasets. A dictionary allows the long terms occurring in RDF triples to be replaced by short IDs that reference them. This greatly compacts the dataset and mitigates the scalability issues underlying its management. However, the dictionary size is not negligible, and the techniques used for its representation also suffer from scalability limitations. This paper addresses this scenario by adapting compression techniques for string dictionaries to the case of RDF. We propose a novel technique, Dcomp, which can be tuned to represent the dictionary in compressed space (22–64%) and to perform basic lookup operations in a few microseconds (1–50 μs). In addition, we propose Dcomp as a basis for specific SPARQL query optimizations that leverage its ability to resolve FILTER expressions early.
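
To make the dictionary idea concrete, the following is a minimal sketch of term-to-ID encoding for RDF triples. It is not Dcomp: it uses a plain, uncompressed hash map and list, and the names `TermDictionary`, `locate`, `extract`, and `encode`, as well as the example IRIs, are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, ...]


class TermDictionary:
    """Plain, uncompressed two-way mapping between RDF terms and integer IDs.

    Hypothetical helper for illustration; it does not implement Dcomp.
    """

    def __init__(self) -> None:
        self._id_of: Dict[str, int] = {}   # term -> ID
        self._term_of: List[str] = []      # ID -> term

    def locate(self, term: str) -> int:
        """Return the ID of a term, assigning the next free ID on first sight."""
        if term not in self._id_of:
            self._id_of[term] = len(self._term_of)
            self._term_of.append(term)
        return self._id_of[term]

    def extract(self, term_id: int) -> str:
        """Return the term stored under the given ID."""
        return self._term_of[term_id]


def encode(triples: List[Triple], dictionary: TermDictionary) -> List[EncodedTriple]:
    """Replace every term of every triple by its dictionary ID."""
    return [tuple(dictionary.locate(term) for term in triple) for triple in triples]


# Example IRIs are made up for this sketch.
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/knows", "http://example.org/alice"),
]

dictionary = TermDictionary()
encoded = encode(triples, dictionary)
print(encoded)  # [(0, 1, 2), (2, 1, 0)]
```

Each long IRI is stored once in the dictionary, and the triples shrink to small integer tuples. The dictionary itself still holds every string in full, which is the cost that the compressed representation described in the abstract aims to reduce.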