Contribution Details

Type Conference or Workshop Paper
Scope Discipline-based scholarship
Published in Proceedings Yes
Title Random Walk TripleRush: asynchronous graph querying and sampling
Organization Unit
Authors
  • Philip Stutz
  • Bibek Paudel
  • Coralia-Mihaela Verman
  • Abraham Bernstein
Presentation Type paper
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
ISBN 978-1-4503-3469-3
Page Range 1034 - 1044
Event Title 24th International World Wide Web Conference (WWW 2015)
Event Type conference
Event Location Florence, Italy
Event Start Date May 18, 2015
Event End Date May 22, 2015
Publisher International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva
Abstract Text Most Semantic Web applications rely on querying graphs, typically by using SPARQL with a triple store. Increasingly, applications also analyze properties of the graph structure to compute statistical inferences. The current Semantic Web infrastructure, however, does not efficiently support such operations. Hence, developers have to painstakingly retrieve the relevant data for statistical post-processing. In this paper, we propose to rethink query execution in a triple store as a highly parallelized asynchronous graph exploration on an active index data structure. This approach also makes it possible to integrate SPARQL querying with the sampling of graph properties. To evaluate this architecture, we implemented Random Walk TripleRush, which is built on a distributed graph processing system and operates by routing query and path descriptions through a novel active index data structure. In experiments, we find that our architecture can be used to build a competitive distributed graph store. It can often return first results quickly, thanks to its asynchronous architecture. We show that our architecture supports the execution of various types of random walks with restarts that sample interesting graph properties. We also evaluate the scalability and show that the architecture supports fast answer times even on a dataset with more than a billion triples.
Related URLs
Digital Object Identifier 10.1145/2736277.2741687
Other Identification Number merlin-id:11663