Michael Luggen, Djellel Difallah, Cristina Sarasua, Gianluca Demartini, Philippe Cudré-Mauroux, Non-Parametric Class Completeness Estimators for Collaborative Knowledge Graphs — The Case of Wikidata, In: International Semantic Web Conference (ISWC), ISWC, Springer, 2019-10-28. (Conference or Workshop Paper published in Proceedings)
|
|
Romana Pernisch, The Butterfly Effect in Knowledge Graphs: Predicting the Impact of Changes in the Evolving Web of Data, In: Doctoral Consortium at ISWC 2019, ISWC, CEUR-WS.org, 2019-10-26. (Conference or Workshop Paper published in Proceedings)
Knowledge graphs (KGs) are at the core of numerous applications and their importance is increasing. Yet, knowledge evolves and so do KGs. PubMed, a search engine that primarily provides access to medical publications, adds an estimated 500'000 new records per year - each having the potential to require updates to a medical KG, like the National Cancer Institute Thesaurus. Depending on the applications that use such a medical KG, some of these updates have possibly wide-ranging impact, while others have only local effects. Estimating the impact of a change ex-ante is highly important, as it might make KG-engineers aware of the consequences of their actions during editing or may be used to highlight the importance of a new fragment of knowledge to be added to the KG for some application. This research description proposes a unified methodology for predicting the impact of changes in evolving KGs and introduces an evaluation framework to assess the quality of these predictions. |
|
Romana Pernisch, Daniele Dell'Aglio, Matthew Horridge, Matthias Baumgartner, Abraham Bernstein, Toward Predicting Impact of Changes in Evolving Knowledge Graphs, In: ISWC 2019 Posters & Demonstrations, ISWC, CEUR-WS.org, 2019-10-25. (Conference or Workshop Paper)
The updates on knowledge graphs (KGs) affect the services built on top of them. However, changes are not all the same: some updates drastically change the result of operations based on knowledge graph content; others do not lead to any variation. Estimating the impact of a change ex-ante is highly important, as it might make KG engineers aware of the consequences of their action during KG editing or may be used to highlight the importance of a new fragment of knowledge to be added to the KG for some application.
The main goal of this contribution is to offer a formalization of the problem. Additionally, it presents some preliminary experiments on three different datasets, considering embeddings as the operation. Results show that the estimation can reach AUCs of 0.85, suggesting the feasibility of this research. |
|
Luca Rossetto, Ralph Gasser, Heiko Schuldt, Query by Semantic Sketch, In: ArXiv.org, No. :1909.1252, 2019. (Working Paper)
Sketch-based query formulation is very common in image and video retrieval as these techniques often complement textual retrieval methods that are based on either manual or machine generated annotations. In this paper, we present a retrieval approach that allows users to query visual media collections by sketching concept maps, thereby merging sketch-based retrieval with the search for semantic labels. Users can draw a spatial distribution of different concept labels, such as "sky", "sea" or "person" and then use these sketches to find images or video scenes that exhibit a similar distribution of these concepts. Hence, this approach takes into account not only the semantic concepts themselves, but also their semantic relations and their spatial context. The compact vector representation enables efficient retrieval even in large multimedia collections. We have integrated the semantic sketch query mode into our retrieval engine vitrivr and demonstrated its effectiveness. |
|
Katrin Affolter, Kurt Stockinger, Abraham Bernstein, A comparative survey of recent natural language interfaces for databases, VLDB Journal, Vol. 28 (5), 2019. (Journal Article)
Over the last few years, natural language interfaces (NLI) for databases have gained significant traction both in academia and industry. These systems use very different approaches as described in recent survey papers. However, these systems have not been systematically compared against a set of benchmark questions in order to rigorously evaluate their functionalities and expressive power. In this paper, we give an overview of 24 recently developed NLIs for databases. Each of the systems is evaluated using a curated list of ten sample questions to show their strengths and weaknesses. We categorize the NLIs into four groups based on the methodology they are using: keyword-, pattern-, parsing- and grammar-based NLI. Overall, we learned that keyword-based systems are enough to answer simple questions. To solve more complex questions involving subqueries, the system needs to apply some sort of parsing to identify structural dependencies. Grammar-based systems are overall the most powerful ones, but are highly dependent on their manually designed rules. In addition to providing a systematic analysis of the major systems, we derive lessons learned that are vital for designing NLIs that can answer a wide range of user questions. |
|
Clara-Maria Barth, Visualisation of Temporal Networks, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Bachelor's Thesis)
The propagation of animal diseases has been shown to be strongly affected by animal transportation networks, and therefore livestock movement databases have been created worldwide that can now be analysed. To analyse such large datasets, they can be modelled and visualised as networks. A visualisation of such a large multivariate and temporal network can be challenging and is the focus of this bachelor thesis. We explore the possibility of generating a focused subgraph of the network that helps users to understand, explore and analyse possible disease spreading paths within the network. Furthermore, we create an interactive visualisation that can be explored by the users and help them understand how the animal transports connect the different vertices/farms and highlight interactively the subsections that are interesting for them. |
|
Lucien Heitz, Diverse Political News Recommendations: Design and Implementation of an Algorithm for Diverse Political News Recommendations, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Master's Thesis)
Many people nowadays read news on the Internet. The selection of available articles is often personalized and matches the interests of their respective readers. So-called recommender systems are used for this. When primarily focusing on the interests of their readers, however, these systems can lead to people receiving only one-sided news about recent events. Filter bubbles are a possible consequence of this. An algorithm for a recommender system is developed in this thesis, one that optimizes for diversity, in order to counteract this development. The focus lies on creating recommendation lists, which focus on political diversity of news articles. |
|
Christoph Weber, Online Anomaly Detection on Multivariate Data Streams, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Master's Thesis)
The number of data sources continuously producing fast-changing data streams and needing tailor-made solutions to detect unexpected events increases rapidly. Outlier detection in univariate data streams already receives considerable attention, mainly in financial data, while multivariate anomaly detection, especially without ground truth, is less explored. We present the state of the art in anomaly detection in general, its adoption for data streams and techniques for evaluation without ground truth. We implement a density-based clustering algorithm that summarizes multivariate data streams with micro clusters, and we evaluate it on synthetic and real-world data sets. We propose an extension of the algorithm to incorporate data drift to distinguish between pioneers and outliers correctly. The performed experiments show a performance improvement caused by the proposed drift-influence hyperparameters and reveal a correlation between an intrinsic data property and the anomaly detection performance, which allows hyperparameter tuning without ground truth. |
|
Florian Ruosch, When the Turing Test Meets Trust: Comparing Human and AI Explanations, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Master's Thesis)
With the rise of AI, smart technology is taking over many aspects of our lives. We rely on it increasingly for both simple and complex tasks. But do people really trust these smart systems or do they still prefer the old-fashioned human? To answer this question, this work explores trust in AI. We used a neural network as a representative and image classification as an example task that can be performed by a smart system. Is a user's trust in an answer influenced by knowing whether it was given by another human or by an AI? To check for a possible bias, we conducted an experiment in the form of a survey with 900 participants on the crowd-sourcing platform Amazon Mechanical Turk. It pitted labels for images and their visually represented explanations obtained from the neural network against those produced by humans. Using a multi-dimensional scale to measure trust, we gained insights for different settings. They varied regarding the available information: giving the origin of label and explanation versus withholding or disguising sources, e.g. a human-generated label and explanation is presented as coming from AI. We compared the results and found few statistically significant differences between the various setups. This led us to conclude that no clear bias exists toward AI- or human-produced results and that knowledge about the source and the availability thereof does not exhibit a distinct influence on trust of humans in AI. |
|
Jakub Lokoč, Klaus Schoeffmann, Werner Bailer, Luca Rossetto, Cathal Gurrin, Interactive Video Retrieval in the Age of Deep Learning, In: ACM International Conference on Multimedia Retrieval, ACM Press, New York, New York, USA, 2019-07-10. (Conference or Workshop Paper)
We present a tutorial focusing on video retrieval tasks, where state-of-the-art deep learning approaches still benefit from interactive decisions of users. The tutorial covers a general introduction to the interactive video retrieval research area, state-of-the-art video retrieval systems, evaluation campaigns and recently observed results. Moreover, a significant part of the tutorial is dedicated to a practical exercise with three selected state-of-the-art systems in the form of an interactive video retrieval competition. Participants of this tutorial will gain practical experience and general insight into the interactive video retrieval topic, which is a good starting point for focusing their research on unsolved challenges in this area. |
|
Ralph Gasser, Luca Rossetto, Heiko Schuldt, Multimodal Multimedia Retrieval with vitrivr, In: ACM International Conference on Multimedia Retrieval, ACM Press, New York, New York, USA, 2019-07-10. (Conference or Workshop Paper)
|
|
Luca Rossetto, Ralph Gasser, Silvan Heller, Mahnaz Amiri Parian, Heiko Schuldt, Retrieval of Structured and Unstructured Data with vitrivr, In: ACM Workshop Lifelog Search Challenge, ACM Press, New York, New York, USA, 2019-07-10. (Conference or Workshop Paper)
|
|
Fabian Berns, Luca Rossetto, Klaus Schoeffmann, Christian Beecks, George Awad, V3C1 Dataset: An Evaluation of Content Characteristics, In: ACM International Conference on Multimedia Retrieval, ACM Press, New York, New York, USA, 2019-07-10. (Conference or Workshop Paper)
|
|
Céline Faverjon, Abraham Bernstein, Rolf Grütter, Christina Nathues, Heiko Nathues, Cristina Sarasua, Martin Sterchi, Maria-Elena Vargas, John Berezowski, A Transdisciplinary Approach Supporting the Implementation of a Big Data Project in Livestock Production: An Example From the Swiss Pig Production Industry, Frontiers in Veterinary Science, Vol. 6, 2019. (Journal Article)
Big Data approaches offer potential benefits for improving animal health, but they have not been broadly implemented in livestock production systems. Privacy issues, the large number of stakeholders, and the competitive environment all make data sharing and integration a challenge in livestock production systems. The Swiss pig production industry illustrates these and other Big Data issues. It is a highly decentralized and fragmented complex network made up of a large number of small independent actors collecting a large amount of heterogeneous data. Transdisciplinary approaches hold promise for overcoming some of the barriers to implementing Big Data approaches in livestock production systems. The purpose of our paper is to describe the use of a transdisciplinary approach in a Big Data research project in the Swiss pig industry. We provide a brief overview of the research project named “Pig Data,” describing the structure of the project, the tools developed for collaboration and knowledge transfer, the data received, and some of the challenges. Our experience provides insight and direction for researchers looking to use similar approaches in livestock production system research.
|
|
Deniz Sarici, Creation of a Catalog of Web Streams, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Bachelor's Thesis)
Data is increasingly published as a stream of data. The TripleWave framework was developed to facilitate the publication of such data, following Linked Data principles. Because TripleWave lacked a standard vocabulary for describing its metadata, we extend TripleWave to use VoCaLS. VoCaLS proposes a standard for describing streams on the web. With the technology for streaming Linked Data being mature enough, we develop a catalog for discovering published streams on the web. The catalog of web streams collects and stores information about web streams and explains how to connect to such a web stream. In order to fill the catalog, we stream several datasets with TripleWave and extend TripleWave to register itself at the catalog.
|
|
Felix Kieber, IncVer - An Incremental Versioning System for OBO Ontologies, University of Zurich, Faculty of Business, Economics and Informatics, 2019. (Master's Thesis)
This master's thesis contains an introduction to and overview of the field of ontology evolution and ontology versioning, an inspection of the ontology change detection tool COntoDiff and an implementation of the incremental version generation tool IncVer. The fields of ontology evolution and impact analysis are interested in the changes that occur in an ontology. As such, snapshots in time, or versions, are of great interest to researchers. Many ontologies, however, provide only a few versions, if any, and these are often far apart in time and contain hundreds to thousands of changes. These large changes only allow rough analysis of their nature and impact. IncVer is a tool which allows the generation of detailed evolution datasets, taking two input ontology versions and detecting and grouping the changes between these versions. Then, incremental versions are built, one per change action, building from the old version to the new version. IncVer is built on top of COntoDiff and so far supports the OBO ontology format, but is designed to be extensible at its core. In order to achieve this, the IncVer architecture is separated into three components forming a pipeline: the Diff Calculator, the Ordering and the Applying component, responsible for calculating a diff, sorting the resulting diff and applying the changes in that diff, respectively. A base implementation is provided for all three components. To ensure correctness of the results, three conditions were formulated which need to be met for the generated versions to be considered correct. Applying these conditions as metrics, I was able to achieve promising results, demonstrating the applicability of IncVer to ontology versioning and its potential use to the fields of ontology evolution and impact analysis. A Jar distribution of IncVer is provided, encapsulating the base implementation of the pipeline, as well as the evaluation functionality. |
|
Martin Sterchi, Céline Faverjon, Cristina Sarasua, Maria Elena Vargas, John Berezowski, Abraham Bernstein, Rolf Grütter, Heiko Nathues, The pig transport network in Switzerland: Structure, patterns, and implications for the transmission of infectious diseases between animal holdings, PLoS ONE, Vol. 14 (5), 2019. (Journal Article)
The topology of animal transport networks contributes substantially to how fast and to what extent a disease can transmit between animal holdings. Therefore, public authorities in many countries mandate livestock holdings to report all movements of animals. However, the reported data often does not contain information about the exact sequence of transports, making it impossible to assess the effect of truck sharing and truck contamination on disease transmission. The aim of this study was to analyze the topology of the Swiss pig transport network by means of social network analysis and to assess the implications for disease transmission between animal holdings. In particular, we studied how additional information about transport sequences changes the topology of the contact network. The study is based on the official animal movement database in Switzerland and a sample of transport data from one transport company. The results show that the Swiss pig transport network is highly fragmented, which mitigates the risk of a large-scale disease outbreak. By considering the time sequence of transports, we found that even in the worst case, only 0.34% of all farm-pairs were connected within one month. However, both network connectivity and individual connectedness of farms increased if truck sharing and especially truck contamination were considered. Therefore, the extent to which a disease may be transmitted between animal holdings may be underestimated if we only consider data from the official animal movement database. Our results highlight the need for a comprehensive analysis of contacts between farms that includes indirect contacts due to truck sharing and contamination. As the nature of animal transport networks is inherently temporal, we strongly suggest the use of temporal network measures in order to evaluate individual and overall risk of disease transmission through animal transportation. |
|
Wen Zhang, Bibek Paudel, Liang Wang, Jiaoyan Chen, Hai Zhu, Wei Zhang, Abraham Bernstein, Huajun Chen, Iteratively Learning Embeddings and Rules for Knowledge Graph Reasoning, In: The Web Conference, ACM Press, New York, New York, 2019-05-13. (Conference or Workshop Paper published in Proceedings)
|
|
Suzanne Tolmeijer, Markus Kneer, Markus Christen, Trust in human-AI interaction: an empirical exploration, In: Ethical and Legal Aspects of Autonomous Security Systems Conference 2019. 2019. (Conference Presentation)
Technological advances allow progressively more autonomous systems to become part of our society. Such systems can be especially useful when time pressure and uncertainty are part of a decision-making process, e.g. in a security context.
However, by using such a system, there is a risk that the output of the system does not match ethical expectations, e.g. because a suboptimal solution is selected or collateral damage occurs. This has two implications. Firstly, the actual advice or action the system performs should be as we prefer it to be. Secondly, the user needs to perceive the system as an ethical and trustworthy partner in the decision-making process, to ensure the system is actually used. This project focuses on the latter, and contributes to the further elaboration of empirical issues raised by the White Paper “Evaluation Schema for the Ethical Use of Autonomous Robotic Systems in Security Applications”.
While there has been research on autonomous systems and ethics, the field is still very much developing. To our knowledge, the following specific factors in this research have not been combined before: different levels of autonomy in search and rescue scenarios, uncertainty and time pressure in ethical decision-making, and trust.
In order to investigate the interplay of those factors, we use a multidisciplinary and experimental approach. In contrast to standard experimental ethics, which is usually vignette-based, we will present morally challenging scenarios to participants in a simulation. This setting allows more immersion into the ethical scenario and adds the human interaction component, which is important to research the perception and expectations of the user. Currently, an experimental setup is being designed together with a simulation prototype; the experiment is going to take place with search and rescue recruits of the Swiss army. They will participate in simulations involving the use of drones controlled by the participants in two settings: a rescue mission where only a limited number of people can be saved and a prevention mission (bringing down a terror drone) where there will be some casualties. The system will either provide decision support for a given scenario or autonomously take a decision on what to do, in which case the user only has a veto option. After each scenario, questions will be asked on ethical acceptability, ethical responsibility and trust. At the conference, we will present results of pretesting of different scenarios and we will further outline our research program.
The results of this research should ultimately shape guidelines on how to build ethically trustworthy autonomous systems.
|
|
Ivan Giangreco, Loris Sauter, Mahnaz Amiri Parian, Ralph Gasser, Silvan Heller, Luca Rossetto, Heiko Schuldt, VIRTUE - a virtual reality museum Experience, In: the 24th International Conference, ACM Press, New York, New York, USA, 2019-04-16. (Conference or Workshop Paper)
|
|