Contribution Details

Type Conference or Workshop Paper
Scope Discipline-based scholarship
Published in Proceedings Yes
Title “Semantics Inside!” But let’s not tell the Data Miners: Intelligent Support for Data Mining
Organization Unit
Authors
  • Jörg-Uwe Kietz
  • Floarea Serban
  • Simon Fischer
  • Abraham Bernstein
Editors
  • Fabien Gandon
  • Claudia D'Amato
Presentation Type paper
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
ISBN 978-3-319-07443-6
ISSN 0302-9743
Page Range 706 - 720
Event Title European Semantic Web Conference ESWC 2014
Event Type conference
Event Location Crete, Greece
Event Start Date May 25, 2014
Event End Date May 29, 2014
Series Name Lecture Notes in Computer Science
Number 8465
Publisher Springer
Abstract Text Knowledge Discovery in Databases (KDD) has evolved significantly over the past years and reached a mature stage, offering plenty of operators to solve complex data analysis tasks. User support for building data analysis workflows, however, has not progressed sufficiently: the large number of operators currently available in KDD systems and the interactions between these operators complicate successful data analysis. To help Data Miners, we enhanced one of the most used open source data mining tools, RapidMiner, with semantic technologies. Specifically, we first semantically annotated all elements involved in the Data Mining (DM) process (the data, the operators, models, data mining tasks, and KDD workflows) using our eProPlan modelling tool, which allows users to describe operators and build a task/method decomposition grammar to specify the desired workflows embedded in an ontology. Second, we enhanced RapidMiner to employ these semantic annotations to actively support data analysts. Third, we built an Intelligent Discovery Assistant, eIda, that leverages the semantic annotation as well as HTN planning to automatically support KDD process generation. We found that the use of Semantic Web approaches and technologies in the KDD domain helped us to lower the barrier to data analysis. We also found that using a generic ontology editor overwhelmed KDD-centric users. We therefore provided them with problem-centric extensions to Protégé. Last and most surprising, we found that our semantic modeling of the KDD domain served as a rapid prototyping approach for several hard-coded improvements of RapidMiner, namely correctness checking of workflows and quick-fixes, reinforcing the finding that even a little semantic modeling can go a long way in improving the understanding of a domain, even for domain experts.
Related URLs
Digital Object Identifier 10.1007/978-3-319-07443-6_47
Other Identification Number merlin-id:9300
PDF File Download from ZORA
Additional Information The original publication is available at