Peter Gloor, Patrick De Boer, Wei Lo, Stefan Wagner, Keiichi Nemoto, Cultural anthropology through the lens of Wikipedia - A comparison of historical leadership networks in the English, Chinese, and Japanese Wikipedia, In: COINs15, Collaborative Innovation Networks, Keio University, Japan, 2015-03-12. (Conference or Workshop Paper published in Proceedings)
 
In this paper we study the differences in historical worldviews between Western and Eastern cultures, represented through the English, Chinese and Japanese Wikipedia. In particular, we analyze the historical networks of the world’s leaders since the beginning of written history, comparing them in the three different language versions of Wikipedia.
Manuel Gugger, CrowdProcessDesigner. A Visual Design Interface for Crowd Computing, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2015. (Master's Thesis)
 
This work contributes a graphical notation for crowd processes and a proof-of-concept implementation of an IDE for dynamically composing processes and running them with graphical feedback. Two sample processes, Find-Fix-Verify and an image-labeling algorithm, are implemented to show the capabilities of CrowdProcessDesigner. In addition to basic capabilities such as divide-and-conquer, it supports iteration through loop constructs. Further, the tool supports parameter recombination to facilitate the evaluation of several process prototype variants. The tool is extensible via OSGi modules and initially supports processes provided by PPLib, with Mechanical Turk and CrowdFlower as available portals. A preliminary evaluation of the tool was conducted with six software engineers.
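To make the composed processes concrete, here is a minimal, self-contained Scala sketch of the Find-Fix-Verify pattern mentioned above. The `ask` function stands in for a crowd portal call (which PPLib would route to Mechanical Turk or CrowdFlower); it and all other names are illustrative assumptions, not the actual PPLib or CrowdProcessDesigner API.

```scala
// A sketch of Find-Fix-Verify, assuming a hypothetical portal call `ask`
// (instruction, number of workers) => worker answers. Not the PPLib API.
object FindFixVerifySketch {
  def findFixVerify(text: String, ask: (String, Int) => Seq[String]): String = {
    // FIND: several workers independently flag a problematic passage
    val finds = ask(s"Mark one problem in: $text", 5)
    // keep only passages flagged by at least two workers (simple agreement filter)
    val agreed = finds.groupBy(identity).collect { case (p, v) if v.size >= 2 => p }
    agreed.foldLeft(text) { (t, passage) =>
      // FIX: collect candidate rewrites for the flagged passage
      val fixes = ask(s"Rewrite: $passage", 3)
      // VERIFY: majority vote picks the rewrite that gets applied
      val best = fixes.groupBy(identity).maxBy(_._2.size)._1
      t.replace(passage, best)
    }
  }
}
```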
Daniel Hegglin, Distributed scheduling using DCOPs in Signal/Collect, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2015. (Master's Thesis)
 
Distributed constraint optimization makes it possible to solve problems in domains such as scheduling, traffic flow management, or sensor network management. It is a well-researched field and various algorithms have been proposed. However, the dynamic nature of some of these real-world problems has been overlooked by researchers, and problems are often assumed to be static during the course of the computation. The benchmarking of algorithms for distributed constraint optimization problems (DCOPs) with changing problem definitions currently lacks a solid theoretical foundation and standardized protocols. This thesis aimed to measure the performance of different types of DCOP algorithms on dynamic problems, with a focus on local-iterative algorithms and especially on the MaxSum algorithm, and thereby to contribute to the field. A complete, a local-iterative message-passing, and a local-iterative approximate best-response algorithm for distributed constraint optimization were implemented for comparison. In the implementation of the MaxSum algorithm, a variation of the usual graph structure was attempted. As a real-world use case for benchmarking, the meeting scheduling problem was mapped to a distributed constraint optimization problem. A framework was designed that allows dynamic changes to constraints, variables, and the problem domain at run-time. The algorithms were benchmarked in a static as well as a dynamic environment with various parameters and with a focus on solution quality over time. This thesis further proposes a solution for storing, further processing, and monitoring the results of the computation in real time without affecting the performance of the algorithms.
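As an illustration of the meeting-scheduling mapping described above, the following is a hedged Scala sketch of one common DCOP encoding: one variable per participant-meeting pair, equality constraints within a meeting, and inequality constraints per participant. All names are hypothetical; the thesis's actual encoding and cost semantics may differ.

```scala
// A toy DCOP encoding of meeting scheduling (illustrative, not the thesis code).
object MeetingSchedulingDcop {
  type Slot = Int
  // one variable per (participant, meeting) pair; domain = available time slots
  case class Variable(participant: String, meeting: String, domain: Seq[Slot])

  // equality constraint: all participants of a meeting must pick the same slot
  def meetingCost(a: Slot, b: Slot): Double = if (a == b) 0.0 else 1.0
  // inequality constraint: a participant cannot attend two meetings at once
  def participantCost(a: Slot, b: Slot): Double = if (a != b) 0.0 else 1.0

  // total cost of an assignment = number of violated constraints (lower is better)
  def cost(vars: Seq[Variable], assign: Map[Variable, Slot]): Double =
    vars.combinations(2).toSeq.map { case Seq(x, y) =>
      if (x.meeting == y.meeting) meetingCost(assign(x), assign(y))
      else if (x.participant == y.participant) participantCost(assign(x), assign(y))
      else 0.0
    }.sum
}
```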
Markus Christen, Rezension: Birgit Beck (2013) Ein neues Menschenbild? Der Anspruch der Neurowissenschaften auf Revision unseres Selbstverständnisses, Ethik in der Medizin, Vol. 27 (3), 2015. (Journal Article)
 
Markus Christen, Sohaila Bastami, Martina Gloor, Tanja Krones, Resolving some, but not all informed consent issues in DCDD—the Swiss experiences, The American Journal of Bioethics, Vol. 15 (8), 2015. (Journal Article)
 
Markus Christen, Sabine Müller, Effects of brain lesions on moral agency: Ethical dilemmas in investigating moral behavior, In: Ethical Issues in Behavioural Neuroscience, Springer, Berlin, p. 1 - 30, 2015. (Book Chapter)
 
Sabine Müller, Rita Riedmüller, Henrik Walter, Markus Christen, An ethical evaluation of stereotactic neurosurgery for anorexia nervosa, AJOB Neuroscience, Vol. 6 (4), 2015. (Journal Article)
 
Anorexia nervosa (AN) is one of several neuropsychiatric disorders that are increasingly tackled experimentally using stereotactic neurosurgery (deep brain stimulation (DBS) and ablative procedures). We analyze all 27 such cases published between 1990 and 2014. The majority of the patients benefited significantly from neurosurgical treatments, in terms of both weight restoration and psychiatric morbidity. A remission of AN was reported in 61% of patients treated with DBS and 100% of patients treated with ablative surgery. Unfortunately, information on side effects is insufficient, and after DBS, severe side effects occurred in some cases. Altogether, the risk–benefit evaluation is positive, particularly for ablative stereotactic procedures. However, fundamental ethical issues are raised. We discuss whether neurosurgery can be justified for treating psychiatric disorders of the will that are seemingly self-inflicted, such as addiction or AN, and to whose development cultural factors contribute significantly. We suggest that although psychosocial factors determine the onset of AN, this is not a legitimate argument for banning neurosurgical treatments, since in AN a vicious circle develops that deeply affects the brain, undermines the will, and prevents patients from ceasing their self-destructive behavior. Three confounding issues pose ethical challenges for research in neurosurgery for AN: first, a scarce information base regarding risks and benefits of the intervention; second, doubtful capabilities for autonomous decision making; and third, the minor age of many patients. We recommend protective measures to ensure that stereotactic neurosurgery research can proceed with respect for the patients' autonomy and orientation to the beneficence principle.
Steffen Hölldobler, Ausgezeichnete Informatikdissertationen 2014, Köllen Druck + Verlag GmbH, Bonn, 2015. (Book/Research Monograph)

Together with the Schweizer Informatik Gesellschaft (SI), the Österreichische Computergesellschaft (OCG), and the German Chapter of the ACM (GChACM), the Gesellschaft für Informatik e.V. (GI) annually awards a prize for an outstanding dissertation in the field of computer science. Eligible are not only dissertations that represent an advance within computer science itself, but also work on applications in other disciplines and work examining the interplay between computer science and society. The selection is based on the dissertations nominated for this prize by the universities; each institution may nominate only one dissertation per year. The candidates entering the selection process are thus already "award winners" of their respective institutions.
Philip Stutz, Scalable Graph Processing With SIGNAL/COLLECT, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2015. (Dissertation)
 
Our ability to process large amounts of data and the size and number of data sets are growing at an incredible pace. This development presents us with the opportunity to build systems that perform complex analyses of increasingly dense networks of data. These opportunities include computing recommendations, analysing social networks, finding patterns in transaction networks, scheduling tasks, or performing inference over probabilistic models. Many of these tasks involve processing data that has a natural graph representation.
Whilst the opportunities are there in the form of access to processing resources and data sets, the way we write software has largely not caught up. Many use MapReduce for scalable processing, but this abstraction has shortcomings with regard to processing graph-structured data, especially for iterative and asynchronous processing.
This thesis introduces the SIGNAL/COLLECT programming model and framework for efficient parallel and distributed large-scale graph processing. We show that this abstraction captures the essence of many algorithms on graphs in a concise and elegant way. Beyond that, we also show implementations of two complex systems built on SIGNAL/COLLECT: The first system is TripleRush, a distributed in-memory triple store with a novel architecture. The second system is foxPSL, a distributed probabilistic inferencing system. Our evaluations show that the SIGNAL/COLLECT framework can efficiently execute simple graph algorithms such as PageRank and that the two complex systems also have competitive performance relative to the respective state-of-the-art.
For this reason, we believe that SIGNAL/COLLECT is more generally suitable for designing scalable, dynamic, and complex systems that process large networks of data.
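For readers unfamiliar with the programming model, the following self-contained Scala toy mimics the signal/collect abstraction for PageRank: vertices emit signals along their edges, then collect incoming signals into a new state. This is deliberately not the real SIGNAL/COLLECT API, just a sketch of the model's essence.

```scala
// A toy signal/collect loop for PageRank (0.15/0.85 are the usual damping
// constants). Illustrative only; the actual framework API differs.
object SignalCollectToy {
  def pageRank(edges: Map[Int, Seq[Int]], steps: Int): Map[Int, Double] = {
    val vertices = (edges.keys ++ edges.values.flatten).toSet
    var state = vertices.map(v => v -> 0.15).toMap
    for (_ <- 1 to steps) {
      // SIGNAL: each vertex sends a share of its state along outgoing edges
      val signals = for {
        (src, targets) <- edges.toSeq
        t <- targets
      } yield t -> state(src) / targets.size
      // COLLECT: each vertex folds its incoming signals into a new state
      val incoming = signals.groupBy(_._1).map { case (v, s) => v -> s.map(_._2).sum }
      state = vertices.map(v => v -> (0.15 + 0.85 * incoming.getOrElse(v, 0.0))).toMap
    }
    state
  }
  def main(args: Array[String]): Unit =
    println(pageRank(Map(1 -> Seq(2), 2 -> Seq(1, 3), 3 -> Seq(1)), steps = 20))
}
```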
Lorenz Fischer, Efficient Distributed Stream Processing: Optimization Approaches and Applications, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2015. (Dissertation)
 
As more aspects of our daily lives are being computerized, ever larger amounts of data are being produced at ever greater speeds. In this data lies great value, and we need technologies that enable us to extract this value. This thesis is concerned with one type of technology that allows us to do this: Distributed Stream Processing Systems (DSPS) are systems consisting of many computers that jointly process, and hence extract value from, large amounts of data at high speeds.
This dissertation consists of three research projects that investigate two aspects of DSPS: two projects studied different approaches to increasing the efficiency of DSPS, and one project evaluated the value of increased efficiency in stream processing. All of these projects were conducted on real computer systems, and all are of a quantitative nature. In the first study, a graph partitioning algorithm was leveraged to schedule the workload within a DSPS. This reduced the communication load between hosts while maintaining or increasing the throughput of the system. The second study was concerned with the auto-configuration of DSPS. We used a probabilistic black-box optimization strategy called Bayesian Optimization to increase the throughput of DSPS through configuration. In the third study, we investigated the value of increased efficiency of a DSPS. This was done by building a DSPS-based entity ranking system and by evaluating the effect of timely data processing on the quality of the generated rankings.
Cosmin Basca, Federated SPARQL Query Processing: Reconciling Diversity, Flexibility and Performance on the Web of Data, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2015. (Dissertation)
 
Querying the ever-growing Web of Data poses a significant challenge in today’s Semantic Web. The complete lack of any centralised control leads to potentially arbitrary data distribution, high variability of latency between hosts participating in query answering, and, in the extreme, even the (sudden) unavailability of some hosts during query execution. In this thesis we address the question of how to efficiently query the Web of Data while taking into account its scale, diversity, and unreliable and uncontrollable nature. We begin by introducing Avalanche, a federated SPARQL engine which: 1) makes no assumptions about the distribution of RDF data across SPARQL endpoints, 2) is adaptive to changing network conditions, i.e., can adapt to slow network connections or endpoint unavailability, 3) retrieves up-to-date results from SPARQL endpoints, and 4) remains flexible by making no limiting assumptions about the structure of participating triple stores.
Tailored to address the semantic heterogeneity arising from the Web of Data’s rich and broad semantic diversity, coupled with its characteristic lack of guarantees, Avalanche employs a fragmented query-planning approach under a concurrent and parallel execution model. By fragmented execution, we refer to the fact that the original SPARQL query is rewritten as the union of all fragments that comprise it. A query fragment is defined as the conjunction of all query triple patterns in which each triple pattern is resolved by exactly one endpoint.
As the Web of Data continues to grow, we postulate that so does the likelihood that large numbers of endpoints will index data sharing the same vocabularies, thus forming semantically homogeneous partitions of the Semantic Web. Focusing on this scenario, and in order to address some of Avalanche’s limitations, we introduce x-Avalanche, an extension of our original system. Here, we add support for disjunctions by using a distributed union operator capable of scaling to hundreds or thousands of endpoints. Furthermore, we enhance distributed state management with: a) remote caches aimed at reducing the high latency typical of SPARQL endpoints, b) multicast parallel bind-joins exploiting the SPARQL 1.1 VALUES clause, and c) proxy-based execution of x-Avalanche operators.
Finally, in x-Avalanche, we introduce a novel and parallel-friendly optimisation paradigm designed not only to offer an optimal trade-off between total query execution time and fast first results, but also to consider an extended planning space unexplored so far, thus taking the fragmented execution model first introduced in Avalanche to its logical conclusion. Combined, x-Avalanche’s enhancements and optimisations lead to dramatic performance improvements over the top-performing state-of-the-art federated SPARQL engines. To conclude, our results show that on average x-Avalanche can be more than one order of magnitude faster when executing SPARQL queries.
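The fragmented planning idea can be illustrated with a small Scala sketch: given the candidate endpoints for each triple pattern, the fragments are the Cartesian product of one-endpoint-per-pattern assignments, and the original query is answered by their union. The data structures and the example patterns and endpoints below are made up for illustration; this is not Avalanche's actual code.

```scala
// Illustrative enumeration of query fragments (hypothetical data structures).
object FragmentSketch {
  type Pattern = String
  type Endpoint = String

  // all fragments = Cartesian product of the candidate endpoints per pattern
  def fragments(candidates: Map[Pattern, Seq[Endpoint]]): Seq[Map[Pattern, Endpoint]] =
    candidates.foldLeft(Seq(Map.empty[Pattern, Endpoint])) {
      case (acc, (pattern, endpoints)) =>
        for (frag <- acc; e <- endpoints) yield frag + (pattern -> e)
    }

  def main(args: Array[String]): Unit =
    fragments(Map(
      "?drug :interactsWith ?other" -> Seq("endpointA", "endpointB"),
      "?drug rdfs:label ?name"      -> Seq("endpointA")
    )).foreach(println) // two fragments; their union answers the query
}
```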
Cristina Sarasua, Elena Simperl, Natasha Noy, Abraham Bernstein, Jan Marco Leimeister, Crowdsourcing and the semantic web: a research manifesto, Human Computation, Vol. 2 (1), 2015. (Journal Article)
 
Our goal with this research manifesto is to define a roadmap to guide the evolution of the new research field that is emerging at the intersection between crowdsourcing and the Semantic Web. We analyze the confluence of these two disciplines by exploring their relationship. First, we focus on how the application of crowdsourcing techniques can enhance the machine-driven execution of Semantic Web tasks. Second, we look at the ways in which machine-processable semantics can benefit the design and management of crowdsourcing projects. As a result, we are able to describe a list of successful or promising scenarios for both perspectives, identify scientific and technological challenges, and compile a set of recommendations to realize these scenarios effectively. This research manifesto is an outcome of the Dagstuhl Seminar 14282: Crowdsourcing and the Semantic Web.
Sara Magliacane, Philip Stutz, Paul Groth, Abraham Bernstein, FoxPSL: a fast, optimized and extended PSL implementation, International Journal of Approximate Reasoning, Vol. 67, 2015. (Journal Article)
 
In this paper, we describe foxPSL, a fast, optimized and extended implementation of Probabilistic Soft Logic (PSL) based on the distributed graph processing framework Signal/Collect. PSL is one of the leading formalisms of statistical relational learning, a recently developed field of machine learning that aims at representing both uncertainty and rich relational structures, usually by combining logical representations with probabilistic graphical models. PSL can be seen as both a probabilistic logic and a template language for hinge-loss Markov random fields, a type of continuous Markov random field (MRF) in which maximum a posteriori (MAP) inference is very efficient, since it can be formulated as a constrained convex minimization problem, as opposed to a discrete optimization problem for standard MRFs. From the logical perspective, a key feature of PSL is the capability to represent soft truth values, allowing the expression of complex domain knowledge, like degrees of truth, in parallel with uncertainty.
foxPSL supports the full PSL pipeline, from problem definition to a distributed solver that implements the Alternating Direction Method of Multipliers (ADMM) consensus optimization. It provides a domain-specific language that extends standard PSL with a class system and existential quantifiers, allowing for efficient grounding. Moreover, it implements a series of configurable optimizations, such as optimized grounding of constraints and lazy inference, that improve grounding and inference time.
We perform an extensive evaluation, comparing the performance of foxPSL to a state-of-the-art implementation of ADMM consensus optimization in GraphLab, and show an improvement in both inference time and solution quality. Moreover, we evaluate the impact of the optimizations on the execution time and discuss the trade-offs related to each optimization.
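For reference, this is the standard hinge-loss MRF formulation that PSL templates (from the HL-MRF literature; the paper's notation may differ in detail), shown in LaTeX together with the MAP objective that the ADMM solver minimizes:

```latex
% Hinge-loss MRF over continuous truth values y in [0,1]: each weighted rule r
% grounds to a hinge-loss potential with a linear function \ell_r of (y, x).
P(\mathbf{y} \mid \mathbf{x}) \;\propto\;
  \exp\!\Big(-\sum_{r=1}^{m} \lambda_r\, \phi_r(\mathbf{y}, \mathbf{x})\Big),
\qquad
\phi_r(\mathbf{y}, \mathbf{x}) = \big(\max\{\ell_r(\mathbf{y}, \mathbf{x}),\, 0\}\big)^{p_r},
\quad p_r \in \{1, 2\}

% MAP inference is the constrained convex program the abstract refers to:
\arg\min_{\mathbf{y} \in [0,1]^n} \; \sum_{r=1}^{m} \lambda_r\, \phi_r(\mathbf{y}, \mathbf{x})
```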
Khadija Elbedweihy, Fabio Ciravegna, Dorothee Reinhard, Abraham Bernstein, Evaluating Semantic Search Systems to Identify Future Directions of Research, In: The Semantic Web: ESWC 2012 Satellite Events, Springer, Heidelberg, p. 148 - 162, 2015. (Book Chapter)

Lucas Jacques, Implementing Support for SPARQL Filters in TripleRush, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2015. (Bachelor's Thesis)
 
The Semantic Web - a web of data formed by interlinked RDF data - has seen a steady increase in size. Triple stores are data management systems for RDF data and offer support for the SPARQL Protocol and RDF Query Language (SPARQL). With SPARQL, RDF data that satisfies user-defined criteria can be retrieved.
TripleRush is such a triple store, using a graph-based architecture to efficiently answer SPARQL queries. This thesis discusses the implementation of SPARQL filter support in TripleRush. We discuss how filters are represented after they have been parsed and describe how they are checked during query execution.
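A toy illustration of the idea in Scala: a parsed FILTER becomes a predicate over variable bindings, and candidate bindings produced during query execution are kept only if the predicate holds. Types and the example query are hypothetical simplifications, not TripleRush internals.

```scala
// Sketch of filter checking during query execution (illustrative only).
object FilterSketch {
  val query =
    """SELECT ?name ?age WHERE {
      |  ?p foaf:name ?name .
      |  ?p foaf:age ?age .
      |  FILTER (?age >= 18)
      |}""".stripMargin

  // a parsed filter is modeled here as a predicate over variable bindings
  val filter: Map[String, Int] => Boolean = b => b("age") >= 18

  def main(args: Array[String]): Unit = {
    // bindings are simplified to the one variable the filter constrains
    val bindings = Seq(Map("age" -> 25), Map("age" -> 12))
    println(bindings.filter(filter)) // keeps only the first binding
  }
}
```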
Michael Feldman, Abraham Bernstein, Cognition-based Task Routing: Towards Highly-Effective Task-Assignments in Crowdsourcing Settings, In: 35th International Conference on Information Systems (ICIS 2014), s.n., Auckland, New Zealand, 2014-12-14. (Conference or Workshop Paper published in Proceedings)
 
In recent years, the rising popularity of outsourcing work to crowds has made it increasingly important to find an effective assignment of suitable workers to tasks. Even though attempts have been made in related areas such as expertise identification, most crowdsourcing jobs today are assigned without any predefined policy. Whilst some have investigated assigning jobs based on availability or experience, no dominant method has been identified so far. We propose assigning tasks to crowd workers based on their cognitive capability, by conducting a set of cognitive tests and comparing the results with performance on typical crowd tasks. Moreover, we examine different setups for predicting task performance where a) cognitive abilities, b) performance on previous crowd tasks, or c) both are partially known. Preliminary results show that cognition-based task assignment leads to an improvement in task performance prediction and may pave the way to more intelligent crowd-worker recruitment.
Florian Schüpfer, Linked Raster Data, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2014. (Bachelor's Thesis)
 
The Semantic Web and Linked Data open up vast possibilities for the integration of knowledge from different domains. In the spatial domain, there are already approaches such as linkedgeodata.org and GeoSPARQL, which integrate spatial data with georeferenced entities. These projects operate on vector data such as polygons. In this explorative work, we discuss the differences between integrating vector and raster data into the Semantic Web. Further, we identify, discuss, and implement a method for linking raster data to georeferenced entities in the SPARQL query language. We show how geographic operations on raster data can be described in RDF and how raster files can be loaded from remote servers by implementing service calls using the WMS protocol. We evaluate our approach by measuring and comparing the execution time of different queries in different configurations, and find that the largest bottleneck of Linked Raster Data queries is the remote endpoint: as few results as possible should be fetched from it to reduce query execution time. At the end of this thesis, we conclude that we achieved the goals defined at the beginning, although we had to find some workarounds because of the SPARQL engine we used.
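As a rough illustration of the kind of service call involved, here is a hedged Scala sketch that builds a WMS 1.1.1 GetMap request URL for a raster tile. The endpoint, layer name, and bounding box are made up; only the general GetMap parameter scheme follows the WMS specification, and the thesis's actual integration may differ.

```scala
import java.net.URL

// Sketch of a WMS GetMap request (hypothetical endpoint and layer).
object WmsFetchSketch {
  def getMapUrl(endpoint: String, layer: String, bbox: String): URL =
    new URL(endpoint +
      "?service=WMS&version=1.1.1&request=GetMap" +
      s"&layers=$layer&styles=&bbox=$bbox" +
      "&width=256&height=256&srs=EPSG:4326&format=image/png")

  def main(args: Array[String]): Unit =
    // bbox order for WMS 1.1.1 is minx,miny,maxx,maxy in the given SRS
    println(getMapUrl("http://example.org/wms", "elevation", "7.0,46.0,9.0,48.0"))
}
```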
Flavio Keller, Social Network Analysis with Signal/Collect, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2014. (Bachelor's Thesis)
 
The Signal/Collect framework, developed at the University of Zurich, is an approach to the challenge of handling and passing information in large graphs. Its main strength lies in its ability to run distributed across multiple machines. The main goal of this thesis is to implement Social Network Analysis measures on top of the Signal/Collect framework. The focus lies on centrality measures and network properties. These measures reveal which parts of a network influence the network as a whole, or help to find communities. The implemented solution extends an existing graph tool with a plugin from which all these Social Network Analysis measures can be executed. Furthermore, a more advanced method called "label propagation" was implemented, which detects communities in a network and makes it possible to observe how these communities change over time. The implemented functionality was evaluated on a cluster of computers, both for correctness of the results and for computation time.
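A compact Scala sketch of label propagation as commonly formulated: each node repeatedly adopts the most frequent label among its neighbours, and nodes sharing a label at the end form a community. This is plain Scala for illustration, not the Signal/Collect implementation from the thesis.

```scala
// Toy synchronous label propagation over an undirected adjacency map.
object LabelPropagationSketch {
  def propagate(neighbours: Map[Int, Seq[Int]], rounds: Int): Map[Int, Int] = {
    var labels = neighbours.keys.map(v => v -> v).toMap // start: own id as label
    for (_ <- 1 to rounds)
      labels = neighbours.map { case (v, ns) =>
        val counts = ns.map(labels).groupBy(identity).map { case (l, g) => l -> g.size }
        v -> counts.maxBy { case (l, c) => (c, -l) }._1 // majority label, ties by id
      }
    labels // nodes sharing a label form a community
  }

  def main(args: Array[String]): Unit =
    println(propagate(Map(1 -> Seq(2), 2 -> Seq(1, 3), 3 -> Seq(2)), rounds = 5))
}
```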
Fabian Christoffel, Recommending Long-Tail Items with Short Random Walks over the User-Item-Feedback Graph, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2014. (Master's Thesis)
 
We study graph vertex ranking algorithms for use in collaborative filtering-based recommender systems. In this work we evaluate the performance of previously presented ranking algorithms in an off-line study with four different positive-only feedback datasets. Besides measuring the power to predict future user behavior (accuracy), we also consider four non-accuracy performance dimensions: intra-list diversity, item space or catalog coverage, personalization, and novelty/surprisal. We found that most recommendation lists of vertex ranking algorithms are dominated by high-popularity items and yield lower accuracy, coverage, personalization, and novelty/surprisal scores than lists from nearest-neighbor or latent factor model-based recommenders.
By applying a parametrized popularity-penalizing recommendation list re-ranking procedure to random walk vertex transition probability-based ranking algorithms (i.e., P3 and P5 [Cooper et al., 2014]), we observed a positive impact on coverage, personalization, and novelty/surprisal. For small degrees of popularity penalization, the recommender's accuracy improved or remained constant, and in most experiments reached levels comparable to state-of-the-art non-graph-based recommenders. The re-ranking procedure reduces the dominance of high-popularity items in the recommendation list and makes it possible to optimize the trade-off between accuracy and non-accuracy performance dimensions.
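The general shape of such a popularity-penalizing re-ranking can be written as follows (my notation, in LaTeX; the thesis's exact parametrization may differ): the graph-based score of an item is discounted by a power of the item's popularity, with the exponent controlling the accuracy/novelty trade-off.

```latex
% Assumed general form of the re-ranking (notation is illustrative):
% s_{ui}: random-walk score of item i for user u; d_i: popularity (degree) of i.
% \beta = 0 recovers the original ranking; larger \beta penalizes popular items.
\tilde{s}_{ui} = \frac{s_{ui}}{d_i^{\beta}}, \qquad \beta \ge 0
```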
Markus Christen, Mark Alfano, Brian Robinson, The Semantic Space of Intellectual Humility, In: European Conference on Social Intelligence, s.n., 2014-11-03. (Conference or Workshop Paper published in Proceedings)
 