Not logged in.
Quick Search - Contribution
Contribution Details
Type | Journal Article |
Scope | Discipline-based scholarship |
Title | ABC of order dependencies |
Organization Unit | |
Authors |
|
Item Subtype | Original Work |
Refereed | Yes |
Status | Published in final form |
Language |
|
Journal Title | The VLDB Journal |
Publisher | Springer |
Geographical Reach | international |
ISSN | 1066-8888 |
Volume | 31 |
Number | 5 |
Page Range | 825 - 849 |
Date | 2022 |
Abstract Text | Band order dependencies (ODs) enhance constraint-based data quality by modeling the semantics of attributes that are monotonically related to small variations without an intrinsic violation of semantics. The class of approximate band conditional ODs (abcODs) generalizes band ODs to make them more relevant to real-world applications by relaxing them to hold approximately with some exceptions (abODs) and conditionally on subsets of the data. We study the automatic dependency discovery of abcODs to avoid human burden. First, we propose a more efficient algorithm to discover abODs than in recent prior work that is based on a new optimization to compute a longest monotonic band via dynamic programming and decreases the runtime from O(n2) to O(nlogn). We then devise a dynamic programming algorithm for abcOD discovery that determines the optimal solution in polynomial time. To optimize the performance (without losing optimality), we adapt the algorithm to cheaply identify consecutive tuples that are guaranteed to belong to the same band. For generality, we extend our algorithms to discover bidirectional abcODs. Finally, we perform a thorough experimental evaluation of our techniques over real-world and synthetic datasets. |
Official URL | https://link.springer.com/article/10.1007/s00778-021-00696-z |
Digital Object Identifier | 10.1007/s00778-021-00696-z |
Other Identification Number | merlin-id:23074 |
PDF File | Download from ZORA |
Export |
BibTeX
EP3 XML (ZORA) |
Keywords | Data quality, Data profiling, Data discovery |