Contribution Details

Type Conference or Workshop Paper
Scope Discipline-based scholarship
Published in Proceedings Yes
Title Provenance for Nested Subqueries
Organization Unit
Authors
  • B Glavic
  • G Alonso
Editors
  • M L Kersten
  • B Novikov
  • J Teubner
  • V Polutin
  • S Manegold
Presentation Type paper
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
ISBN 978-1-60558-422-5
Page Range 982 - 993
Event Title 12th International Conference on Extending Database Technology
Event Type conference
Event Location Saint Petersburg, Russia
Event Start Date March 24, 2009
Event End Date March 26, 2009
Series Name ACM International Conference Proceeding Series (AICPS)
Number 12
Place of Publication Saint Petersburg, Russia
Publisher ACM
Abstract Text Data provenance is essential in applications such as scientific computing, curated databases, and data warehouses. Several systems have been developed that provide provenance functionality for the relational data model. These systems support only a subset of SQL, a severe limitation in practice since most of the application domains that benefit from provenance information use complex queries. Such queries typically involve nested subqueries, aggregation, and/or user-defined functions. Without support for these constructs, a provenance management system is of limited use. In this paper, we address this limitation by exploring the problem of provenance derivation when complex queries are involved. More precisely, we demonstrate that the widely used definition of Why-provenance fails in the presence of nested subqueries, and show how the definition can be modified to produce meaningful results for nested subqueries. We further present query rewrite rules to transform an SQL query into a query propagating provenance. The solution introduced in this paper allows us to track provenance information for a far wider subset of SQL than any of the existing approaches. We have incorporated these ideas into the Perm provenance management system engine and used it to evaluate the feasibility and performance of our approach.
Official URL http://dblp.uni-trier.de/db/conf/edbt/edbt2009.html
Digital Object Identifier 10.1145/1516360.1516472
Other Identification Number merlin-id:257
Keywords provenance, query rewrite, nested subqueries