Contribution Details

Type Conference or Workshop Paper
Scope Discipline-based scholarship
Published in Proceedings Yes
Title Replicating Mining Studies with SOFAS
Organization Unit
Authors
  • Giacomo Ghezzi
  • Harald Gall
Presentation Type paper
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
Event Title 10th Working Conference on Mining Software Repositories
Event Type conference
Event Location San Francisco
Event Start Date May 18, 2013
Event End Date May 19, 2013
Place of Publication Washington, DC
Publisher IEEE Computer Society
Abstract Text The replication of studies in mining software repositories (MSR) is essential to compare different mining techniques or to assess their findings across many projects. However, it has been shown that very few of these studies can be easily replicated. Replication is just as fundamental as the studies themselves, and the lack of it is one of the main threats to validity that they suffer from. In this paper, we show how this problem can be alleviated with our SOFAS framework. SOFAS is a platform that enables systematic and repeatable analysis of software projects by providing extensible and composable analysis workflows. These workflows can be applied to a multitude of software projects, facilitating the replication and scaling of mining studies. We show how and to what degree replication can be achieved. We investigated the mining studies of MSR from 2004 to 2011 and found that, of the 88 studies published in the MSR proceedings so far, we can fully replicate 25 empirical studies and replicate a further 27 to a large extent. These studies account for 30% and 32%, respectively, of the mining studies published. To support our claim, we describe in detail one large study that we replicated and discuss how replication with SOFAS works for the other studies investigated. To illustrate the potential of our platform, we also characterise how studies can be easily enriched to deliver even more comprehensive answers by extending the analysis workflows provided by the platform.