Contribution Details

Type Conference or Workshop Paper
Scope Discipline-based scholarship
Published in Proceedings Yes
Title Continuous Imputation of Missing Values in Streams of Pattern-Determining Time Series
Organization Unit
Authors
  • Kevin Wellenzohn
  • Michael Hanspeter Böhlen
  • Anton Dignös
  • Johann Gamper
  • Hannes Mitterer
Presentation Type paper
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
ISBN 978-3-89318-073-8
Page Range 330 - 341
Event Title Proceedings of the 20th International Conference on Extending Database Technology, EDBT 2017
Event Type conference
Event Location Venice, Italy
Event Start Date March 21, 2017
Event End Date March 24, 2017
Abstract Text Time series data is ubiquitous but often incomplete, e.g., due to sensor failures and transmission errors. Since many applications require complete data, missing values must be imputed before further data processing is possible. We propose Top-k Case Matching (TKCM) to impute missing values in streams of time series data. TKCM defines for each time series a set of reference time series and exploits similar historical situations in the reference time series for the imputation. A situation is characterized by the anchor point of a pattern that consists of l consecutive measurements over the reference time series. A missing value in a time series s is derived from the values of s at the anchor points of the k most similar patterns. We show that TKCM imputes missing values consistently if the reference time series pattern-determine time series s, i.e., the pattern of length l at time t_n is repeated at least k times in the reference time series and the corresponding values of s at the anchor time points are similar to each other. In contrast to previous work, we support time series that are not linearly correlated but, e.g., phase shifted. TKCM is resilient to consecutively missing values, and the accuracy of the imputed values does not decrease if blocks of values are missing. The results of an exhaustive experimental evaluation using real-world and synthetic data show that we outperform the state-of-the-art solutions.
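Illustrative sketch (not part of the record). The idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name tkcm_impute, the Euclidean pattern distance, and the plain average over the k anchor values are assumptions, since the abstract only says the missing value is "derived from" those values. The sketch also allows matched patterns to overlap and assumes the reference series are complete over the query window.

```python
from math import sqrt


def tkcm_impute(s, refs, n, l=3, k=2):
    """Estimate the missing value s[n] (hypothetical helper).

    s    : list of floats, with None marking missing values
    refs : list of reference series, each as long as s
    n    : index of the missing value (the current time point)
    l    : pattern length (number of consecutive measurements)
    k    : number of most similar historical patterns to use
    """
    def pattern(t):
        # Values of every reference series over the l points ending at anchor t.
        return [r[i] for r in refs for i in range(t - l + 1, t + 1)]

    query = pattern(n)
    candidates = []
    for t in range(l - 1, n):
        if s[t] is None:
            continue  # no value of s to borrow at this anchor point
        p = pattern(t)
        if any(v is None for v in p):
            continue  # skip patterns that are themselves incomplete
        # Assumption: Euclidean distance between patterns.
        dist = sqrt(sum((a - b) ** 2 for a, b in zip(query, p)))
        candidates.append((dist, t))

    candidates.sort()  # most similar patterns first
    top = candidates[:k]
    if not top:
        raise ValueError("no usable historical pattern found")
    # Assumption: impute with the mean of s at the k best anchor points.
    return sum(s[t] for _, t in top) / len(top)
```

Because matching is done on patterns of the reference series rather than on a linear relation between values, a phase-shifted series s can still be imputed when the same reference pattern recurs with similar values of s at its anchor points.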
Digital Object Identifier 10.5441/002/edbt.2017.30
Other Identification Number merlin-id:14742