Not logged in.

Contribution Details

Type Journal Article
Scope Discipline-based scholarship
Title ABC of order dependencies
Organization Unit
Authors
  • Pei Li
  • Jaroslaw Szlichta
  • Michael Böhlen
  • Divesh Srivastava
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
Journal Title The VLDB Journal
Publisher Springer
Geographical Reach international
ISSN 1066-8888
Volume 31
Number 5
Page Range 825 - 849
Date 2022
Abstract Text Band order dependencies (ODs) enhance constraint-based data quality by modeling the semantics of attributes that are monotonically related to small variations without an intrinsic violation of semantics. The class of approximate band conditional ODs (abcODs) generalizes band ODs to make them more relevant to real-world applications by relaxing them to hold approximately with some exceptions (abODs) and conditionally on subsets of the data. We study the automatic dependency discovery of abcODs to avoid human burden. First, we propose a more efficient algorithm to discover abODs than in recent prior work that is based on a new optimization to compute a longest monotonic band via dynamic programming and decreases the runtime from O(n2) to O(nlogn). We then devise a dynamic programming algorithm for abcOD discovery that determines the optimal solution in polynomial time. To optimize the performance (without losing optimality), we adapt the algorithm to cheaply identify consecutive tuples that are guaranteed to belong to the same band. For generality, we extend our algorithms to discover bidirectional abcODs. Finally, we perform a thorough experimental evaluation of our techniques over real-world and synthetic datasets.
Official URL https://link.springer.com/article/10.1007/s00778-021-00696-z
Digital Object Identifier 10.1007/s00778-021-00696-z
Other Identification Number merlin-id:23074
PDF File Download from ZORA
Export BibTeX
EP3 XML (ZORA)
Keywords Data quality, Data profiling, Data discovery