Contribution Details

Type Conference or Workshop Paper
Scope Discipline-based scholarship
Published in Proceedings Yes
Title Machines Tuning Machines: Configuring Distributed Stream Processors with Bayesian Optimization
Organization Unit
Authors
  • Lorenz Fischer
  • Shen Gao
  • Abraham Bernstein
Presentation Type paper
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
Event Title 2015 IEEE International Conference on Cluster Computing (CLUSTER 2015)
Event Type conference
Event Location Chicago, Illinois, USA
Event Start Date September 8, 2015
Event End Date September 11, 2015
Publisher IEEE Computer Society
Abstract Text Modern distributed computing frameworks such as Apache Hadoop, Spark, or Storm distribute the workload of applications across a large number of machines. Whilst they abstract the details of distribution they do require the programmer to set a number of configuration parameters before deployment. These parameter settings (usually) have a substantial impact on execution efficiency. Finding the right values for these parameters is considered a difficult task and requires domain, application, and framework expertise. In this paper, we propose a machine learning approach to the problem of configuring a distributed computing framework. Specifically, we propose using Bayesian Optimization to find good parameter settings. In an extensive empirical evaluation, we show that Bayesian Optimization can effectively find good parameter settings for four different stream processing topologies implemented in Apache Storm resulting in significant gains over a parallel linear approach.
Digital Object Identifier 10.1109/CLUSTER.2015.13
Other Identification Number merlin-id:12241