TripleProv: Efficient processing of lineage queries in a native RDF store

Wylot, Marcin; Cudre-Mauroux, Philippe; Groth, Paul

Published in: Proceedings of the 23rd international conference on World wide web - WWW '14

DOI: 10.1145/2566486.2568014

Proceedings article published in 2014 by Marcin Wylot, Philippe Cudre-Mauroux, Paul Groth

Abstract

Given the heterogeneity of the data one can find on the Linked Data cloud, being able to trace back the provenance of query results is rapidly becoming a must-have feature of RDF systems. While provenance models have been extensively discussed in recent years, little attention has been given to the efficient implementation of provenance-enabled queries inside data stores. This paper introduces TripleProv: a new system extending a native RDF store to efficiently handle such queries. TripleProv implements two different storage models to physically co-locate lineage and instance data, and for each of them implements algorithms for tracing provenance at two granularity levels. In the following, we present the overall architecture of our system, its different lineage storage models, and the various query execution strategies we have implemented to efficiently answer provenance-enabled queries. In addition, we present the results of a comprehensive empirical evaluation of our system over two different datasets and workloads.
