DAW: Duplicate-AWare Federated Query Processing over the Web of Data

Saleem, Muhammad; Ngonga Ngomo, Axel-Cyrille; Xavier Parreira, Josiane; Deus, Helena F.; Hauswirth, Manfred

Published in

Springer Verlag, Lecture Notes in Computer Science, p. 574-590

DOI: 10.1007/978-3-642-41335-3_36

Proceedings article published in 2013 by Muhammad Saleem, Axel-Cyrille Ngonga Ngomo, Josiane Xavier Parreira, Helena F. Deus, Manfred Hauswirth

This paper is available in a repository.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving forbidden

Abstract

Over the last years, the Web of Data has developed into a large compendium of interlinked datasets from multiple domains. Due to the decentralised architecture of this compendium, several of these datasets contain duplicated data. Yet, so far, little attention has been paid to the effect of duplicated data on federated querying. This work presents DAW, a novel duplicate-aware approach to federated querying over the Web of Data. DAW is based on a combination of min-wise independent permutations and compact data summaries. It can be directly combined with existing federated query engines in order to achieve the same query recall values while querying fewer data sources. We extend three well-known federated query processing engines (DARQ, SPLENDID, and FedX) with DAW and compare our extensions with the original approaches. The comparison shows that DAW can greatly reduce the number of queries sent to the endpoints while keeping high query recall values. Therefore, it can significantly improve the performance of federated query processing engines. Moreover, DAW provides a source selection mechanism that maximises the query recall when query processing is limited to a subset of the sources.

Funding: Science Foundation Ireland, Grant No. SFI/08/CE/I1380 (Lion-II) and Grant No. SFI/12/RC/2289 (INSIGHT). Peer-reviewed conference paper.
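The abstract names min-wise independent permutations and compact data summaries without giving details. The Python sketch below illustrates the general idea only and is not the DAW implementation: it approximates min-wise permutations with seeded hash functions (a common MinHash simplification), and every name in it (`minhash_signature`, `estimate_jaccard`, `merge_signatures`, `rank_sources`, the `min_new` threshold) is hypothetical. It assumes each source publishes a signature and a result-size estimate, and it greedily skips sources whose estimated contribution of new results is too small.

```python
import hashlib


def _hash(item: str, seed: int) -> int:
    # Seeded 64-bit hash standing in for one random permutation (assumption).
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=64):
    """Compact summary of a set: the minimum hash value per seed."""
    items = list(items)
    if not items:
        raise ValueError("cannot summarise an empty set")
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where two signatures agree."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def merge_signatures(sig_a, sig_b):
    """Signature of the union of two sets (element-wise minimum)."""
    return [min(a, b) for a, b in zip(sig_a, sig_b)]


def rank_sources(summaries, min_new=1.0):
    """Greedy, duplicate-aware source ordering (illustrative only).

    summaries maps a source name to (signature, estimated_result_count).
    Sources whose estimated number of new results is below min_new are skipped.
    """
    remaining = dict(summaries)
    if not remaining:
        return []
    # Start with the source that has the largest estimated result count.
    first = max(remaining, key=lambda name: remaining[name][1])
    union_sig, union_size = remaining.pop(first)
    selected = [first]
    while remaining:
        gains = {}
        for name, (sig, size) in remaining.items():
            j = estimate_jaccard(union_sig, sig)
            # From J = |A n B| / |A u B|: |A n B| = J * (|A| + |B|) / (1 + J).
            overlap = j * (union_size + size) / (1 + j)
            gains[name] = size - overlap
        best = max(gains, key=gains.get)
        if gains[best] < min_new:
            break
        sig, _ = remaining.pop(best)
        union_sig = merge_signatures(union_sig, sig)
        union_size += gains[best]
        selected.append(best)
    return selected
```

As a usage example under the same assumptions, two endpoints holding identical triples produce identical signatures, so `rank_sources` keeps only one of them, while an endpoint with disjoint data is retained. Longer signatures reduce the variance of the overlap estimate at the cost of larger summaries.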
