Suzanne Tolmeijer, Vicky Arpatzoglou, Luca Rossetto, Abraham Bernstein, Trolleys, crashes, and perception - a survey on how current autonomous vehicles debates invoke problematic expectations, AI and Ethics, Vol. 4 (2), 2024. (Journal Article)
Ongoing debates about ethical guidelines for autonomous vehicles mostly focus on variations of the ‘Trolley Problem’. Using variations of this ethical dilemma in preference surveys, possible implications for autonomous vehicles policy are discussed. In this work, we argue that the lack of realism in such scenarios leads to limited practical insights. We run an ethical preference survey for autonomous vehicles by including more realistic features, such as time pressure and a non-binary decision option. Our results indicate that such changes lead to different outcomes, calling into question how the current outcomes can be generalized. Additionally, we investigate the framing effects of the capabilities of autonomous vehicles and indicate that ongoing debates need to set realistic expectations on autonomous vehicle challenges. Based on our results, we call upon the field to re-frame the current debate towards more realistic discussions beyond the Trolley Problem and focus on which autonomous vehicle behavior is considered not to be acceptable, since a consensus on what the right solution is, is not reachable. |
|
Francesco Barile, Tim Draws, Oana Inel, Alisa Rieger, Shabnam Najafian, Amir Ebrahimi Fard, Rishav Hada, Nava Tintarev, Evaluating explainable social choice-based aggregation strategies for group recommendation, User modeling and user-adapted interaction, Vol. 34 (1), 2024. (Journal Article)
Social choice aggregation strategies have been proposed as an explainable way to generate recommendations to groups of users. However, it is not trivial to determine the best strategy to apply for a specific group. Previous work highlighted that the performance of a group recommender system is affected by the internal diversity of the group members’ preferences. However, few of them have empirically evaluated how the specific distribution of preferences in a group determines which strategy is the most effective. Furthermore, only a few studies evaluated the impact of providing explanations for the recommendations generated with social choice aggregation strategies, by evaluating explanations and aggregation strategies in a coupled way. To fill these gaps, we present two user studies (N=399 and N=288) examining the effectiveness of social choice aggregation strategies in terms of users’ fairness perception, consensus perception, and satisfaction. We study the impact of the level of (dis-)agreement within the group on the performance of these strategies. Furthermore, we investigate the added value of textual explanations of the underlying social choice aggregation strategy used to generate the recommendation. The results of both user studies show no benefits in using social choice-based explanations for group recommendations. However, we find significant differences in the effectiveness of the social choice-based aggregation strategies in both studies. Furthermore, the specific group configuration (i.e., various scenarios of internal diversity) seems to determine the most effective aggregation strategy. These results provide useful insights on how to select the appropriate aggregation strategy for a specific group based on the level of (dis-)agreement within the group members’ preferences. |
|
Jakub Lokoč, Stelios Andreadis, Werner Bailer, Aaron Duane, Cathal Gurrin, Zhixin Ma, Nicola Messina, Thao-Nhu Nguyen, Ladislav Peška, Luca Rossetto, Loris Sauter, Konstantin Schall, Klaus Schoeffmann, Omar Shahbaz Khan, Florian Spiess, Lucia Vadicamo, Stefanos Vrochidis, Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS, Multimedia Systems, Vol. 29 (6), 2023. (Journal Article)
This paper presents findings of the eleventh Video Browser Showdown competition, where sixteen teams competed in known-item and ad-hoc search tasks. Many of the teams utilized state-of-the-art video retrieval approaches that demonstrated high effectiveness in challenging search scenarios. In this paper, a broad survey of all utilized approaches is presented in connection with an analysis of the performance of participating teams. Specifically, both high-level performance indicators are presented with overall statistics as well as in-depth analysis of the performance of selected tools implementing result set logging. The analysis reveals evidence that the CLIP model represents a versatile tool for cross-modal video retrieval when combined with interactive search capabilities. Furthermore, the analysis investigates the effect of different users and text query properties on the performance in search tasks. Last but not least, lessons learned from search task preparation are presented, and a new direction for ad-hoc search based tasks at Video Browser Showdown is introduced. |
|
Oana Inel, Tim Draws, Lora Aroyo, Collect, measure, repeat: Reliability factors for responsible AI data collection, In: Eleventh AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2023), Association for the Advancement of Artificial Intelligence, Delft, the Netherlands, 2023-11-06. (Conference or Workshop Paper published in Proceedings)
The rapid entry of machine learning approaches in our daily activities and high-stakes domains demands transparency and scrutiny of their fairness and reliability. To help gauge machine learning models’ robustness, research typically focuses on the massive datasets used for their deployment, e.g., creating and maintaining documentation to understand their origin, process of development, and ethical considerations. However, data collection for AI is still typically a one-off practice, and oftentimes datasets collected for a certain purpose or application are reused for a different problem. Additionally, dataset annotations may not be representative over time, contain ambiguous or erroneous annotations, or be unable to generalize across domains. Recent research has shown these practices might lead to unfair, biased, or inaccurate outcomes. We argue that data collection for AI should be performed in a responsible manner where the quality of the data is thoroughly scrutinized and measured through a systematic set of appropriate metrics. In this paper, we propose a Responsible AI (RAI) methodology designed to guide the data collection with a set of metrics for an iterative in-depth analysis of the factors influencing the quality and reliability of the generated data. We propose a granular set of measurements to inform on the internal reliability of a dataset and its external stability over time. We validate our approach across nine existing datasets and annotation tasks and four input modalities. This approach impacts the assessment of data robustness used in real world AI applications, where diversity of users and content is eminent. Furthermore, it deals with fairness and accountability aspects in data collection by providing systematic and transparent quality analysis for data collections. |
|
Loris Sauter, Tim Bachmann, Luca Rossetto, Heiko Schuldt, Spatially Localised Immersive Contemporary and Historic Photo Presentation on Mobile Devices in Augmented Reality, In: MM '23: The 31st ACM International Conference on Multimedia, ACM Digital Library, New York, NY, USA, 2023-11-02. (Conference or Workshop Paper published in Proceedings)
These days, taking a photo is the most common way of capturing a moment. Some of these photos captured in the moment are never to be seen again. Others are almost immediately shared with the world. Yet, the context of the captured moment can only be shared to a limited extent. The continuous improvement of mobile devices has not only led to higher resolution cameras and, thus, visually more appealing pictures but also to a broader and more precise range of accompanying sensor metadata. Positional and bearing information can provide context for photos and is thus an integral aspect of the captured moment. However, it is commonly only used to sort photos by time and possibly group by place. Such more precise sensor metadata, combined with the increased computing power of mobile devices, can enable more and more powerful Augmented Reality (AR) capabilities, especially for communicating the context of a captured photo. Users can thereby witness the captured moment in its real location and also experience its spatial contextualization. With the help of a suitable data augmentation, such context-preserving presentation can be extended even to non-digitally born content, including historical images. This offers new immersive ways to experience the cultural history of one's current location. In this paper, we present an approach for location-based image presentation in AR on mobile devices. With this approach, users can experience captured moments in their physical context. We demonstrate the power of this approach based on a prototype implementation and evaluate it in a user study. |
|
Oana Inel, Nicolas Mattis, Milda Norkute, Alessandro Piscopo, Timothée Schmude, Sanne Vrijenhoek, Krisztian Balog, QUARE: 2nd Workshop on Measuring the Quality of Explanations in Recommender Systems, In: Proceedings of the 17th ACM Conference on Recommender Systems, RecSys 2023, Singapore, Singapore, September 18-22, 2023, ACM, 2023. (Conference or Workshop Paper)
|
|
Mahnaz Parian-Scherb, Peter Uhrig, Luca Rossetto, Stephane Dupont, Heiko Schuldt, Gesture retrieval and its application to the study of multimodal communication, International journal on digital libraries, 2023. (Journal Article)
Comprehending communication is dependent on analyzing the different modalities of conversation, including audio, visual, and others. This is a natural process for humans, but in digital libraries, where preservation and dissemination of digital information are crucial, it is a complex task. A rich conversational model, encompassing all modalities and their co-occurrences, is required to effectively analyze and interact with digital information. Currently, the analysis of co-speech gestures in videos is done through manual annotation by linguistic experts based on textual searches. However, this approach is limited and does not fully utilize the visual modality of gestures. This paper proposes a visual gesture retrieval method using a deep learning architecture to extend current research in this area. The method is based on body keypoints and uses an attention mechanism to focus on specific groups. Experiments were conducted on a subset of the NewsScape dataset, which presents challenges such as multiple people, camera perspective changes, and occlusions. A user study was conducted to assess the usability of the results, establishing a baseline for future gesture retrieval methods in real-world video collections. The results of the experiment demonstrate the high potential of the proposed method in multimodal communication research and highlight the significance of visual gesture retrieval in enhancing interaction with video content. The integration of visual similarity search for gestures in the open-source multimedia retrieval stack, vitrivr, can greatly contribute to the field of computational linguistics. This research advances the understanding of the role of the visual modality in co-speech gestures and highlights the need for further development in this area. |
|
Martin Sterchi, Lorenz Hilfiker, Rolf Grütter, Abraham Bernstein, Active querying approach to epidemic source detection on contact networks, Scientific Reports, Vol. 13 (1), 2023. (Journal Article)
The problem of identifying the source of an epidemic (also called patient zero) given a network of contacts and a set of infected individuals has attracted interest from a broad range of research communities. The successful and timely identification of the source can prevent a lot of harm as the number of possible infection routes can be narrowed down and potentially infected individuals can be isolated. Previous research on this topic often assumes that it is possible to observe the state of a substantial fraction of individuals in the network before attempting to identify the source. We, on the contrary, assume that observing the state of individuals in the network is costly or difficult and, hence, only the state of one or few individuals is initially observed. Moreover, we presume that not only the source is unknown, but also the duration for which the epidemic has evolved. From this more general problem setting a need to query the state of other (so far unobserved) individuals arises. In analogy with active learning, this leads us to formulate the active querying problem. In the active querying problem, we alternate between a source inference step and a querying step. For the source inference step, we rely on existing work but take a Bayesian perspective by putting a prior on the duration of the epidemic. In the querying step, we aim to query the states of individuals that provide the most information about the source of the epidemic, and to this end, we propose strategies inspired by the active learning literature. Our results are strongly in favor of a querying strategy that selects individuals for whom the disagreement between individual predictions, made by all possible sources separately, and a consensus prediction is maximal. Our approach is flexible and, in particular, can be applied to static as well as temporal networks. 
To demonstrate our approach’s practical importance, we experiment with three empirical (temporal) contact networks: a network of pig movements, a network of sexual contacts, and a network of face-to-face contacts between residents of a village in Malawi. The results show that active querying strategies can lead to substantially improved source inference results as compared to baseline heuristics. In fact, querying only a small fraction of nodes in a network is often enough to achieve a source inference performance comparable to a situation where the infection states of all nodes are known. |
|
Luca Rossetto, Oana Inel, Svenja Lange, Florian Ruosch, Ruijie Wang, Abraham Bernstein, Multi-Mode Clustering for Graph-Based Lifelog Retrieval, In: ICMR '23: International Conference on Multimedia Retrieval, ACM Digital library, New York, NY, USA, 2023-07-12. (Conference or Workshop Paper published in Proceedings)
As part of the 6th Lifelog Search Challenge, this paper presents an approach to arrange Lifelog data in a multi-modal knowledge graph based on cluster hierarchies. We use multiple sequence clustering approaches to address the multi-modal nature of Lifelogs in relation to temporal, spatial, and visual factors. The resulting clusters, along with semantic metadata captions and augmentations based on OpenCLIP, provide for the semantic structure of a graph including all Lifelogs as entries. Textual queries on this hierarchical graph can be expressed to retrieve individual Lifelogs, as well as clusters of Lifelogs. |
|
Florian Spiess, Ralph Gasser, Heiko Schuldt, Luca Rossetto, The Best of Both Worlds: Lifelog Retrieval with a Desktop-Virtual Reality Hybrid System, In: ICMR '23: International Conference on Multimedia Retrieval, ACM Digital library, New York, NY, USA, 2023-07-12. (Conference or Workshop Paper published in Proceedings)
Personal lifelog data collections are becoming more common as a memory aid, as well as for analytical tasks, such as health and fitness analysis. Due to the multimodal and personal nature of lifelog data, interactive multimedia retrieval approaches are required to facilitate flexible and iterative query formulation and result exploration for retrieval and analysis. In recent years, novel user interface modalities have emerged, that allow new ways for users to interact with a retrieval system. Virtual reality, one such new modality, provides advantages as well as challenges for interactive multimedia retrieval in comparison to conventional desktop-based interfaces.
This paper describes a novel desktop-virtual reality hybrid system participating in the Lifelog Search Challenge 2023. The system, which is based on the components of the vitrivr stack, is described with a focus on query formulation in the web-based desktop user interface vitrivr-ng, and result exploration in the virtual reality-based vitrivr-VR. |
|
Cataldo Musto, Amra Delic, Oana Inel, Marco Polignano, Amon Rapp, Giovanni Semeraro, Jürgen Ziegler, 5th Workshop on Explainable User Models and Personalised Systems (ExUM), In: Adjunct Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization, 2023. (Conference or Workshop Paper)
|
|
Dhivyabharathi Ramasamy, Cristina Sarasua, Alberto Bacchelli, Abraham Bernstein, Visualising data science workflows to support third-party notebook comprehension: an empirical study, Empirical Software Engineering, Vol. 28 (3), 2023. (Journal Article)
Data science is an exploratory and iterative process that often leads to complex and unstructured code. This code is usually poorly documented and, consequently, hard to understand by a third party. In this paper, we first collect empirical evidence for the non-linearity of data science code from real-world Jupyter notebooks, confirming the need for new approaches that aid in data science code interaction and comprehension. Second, we propose a visualisation method that elucidates implicit workflow information in data science code and assists data scientists in navigating the so-called garden of forking paths in non-linear code. The visualisation also provides information such as the rationale and the identification of the data science pipeline step based on cell annotations. We conducted a user experiment with data scientists to evaluate the proposed method, assessing the influence of (i) different workflow visualisations and (ii) cell annotations on code comprehension. Our results show that visualising the exploration helps the users obtain an overview of the notebook, significantly improving code comprehension. Furthermore, our qualitative analysis provides more insights into the difficulties faced during data science code comprehension. |
|
Abraham Bernstein, Anita Gohdes, Cristina Sarasua, Steffen Staab, Beth Simone Noveck, Challenges and opportunities of democracy in the digital society: report from Dagstuhl Seminar 22361, Dagstuhl Manifestos, Vol. 12 (9), 2023. (Journal Article)
Digital technologies amplify and change societal processes. So far, society and intellectuals have painted two extremes of viewing the effects of the digital transformation on democratic life. While the early 2000s to mid-2010s declared the "liberating" aspects of digital technology, the post-Brexit events and the 2016 US elections have emphasized the "dark side" of the digital revolution. Now, explicit effort is needed to go beyond tech saviorism or doom scenarios.
To this end, we organized the Dagstuhl Seminar 22361 "Challenges and Opportunities of Democracy in the Digital Society" to discuss the future of digital democracy.
This report presents a summary of the seminar, which took place in Dagstuhl in September 2022. The seminar attracted scientific scholars from various disciplines, including political science, computer science, jurisprudence, and communication science, as well as civic technology practitioners. |
|
Tim Draws, Nirmal Roy, Oana Inel, Alisa Rieger, Rishav Hada, Mehmet Orcun Yalcin, Benjamin Timmermans, Nava Tintarev, Viewpoint diversity in search results, In: European Conference on Information Retrieval, 2023. (Conference or Workshop Paper published in Proceedings)
|
|
Loris Sauter, Ralph Gasser, Silvan Heller, Luca Rossetto, Colin Saladin, Florian Spiess, Heiko Schuldt, Exploring Effective Interactive Text-Based Video Search in vitrivr, In: MultiMedia Modeling, Springer, Cham, p. 646 - 651, 2023-03-29. (Book Chapter)
vitrivr is a general purpose retrieval system that supports a wide range of query modalities. In this paper, we briefly introduce the system and describe the changes and adjustments made for the 2023 iteration of the video browser showdown. These focus primarily on text-based retrieval schemes and corresponding user-feedback mechanisms. |
|
Fynn Bachmann, Philipp Hennig, Dmitry Kobak, Wasserstein t-SNE, In: Machine Learning and Knowledge Discovery in Databases, Springer, Switzerland, p. 104 - 120, 2023-03-16. (Book Chapter)
Scientific datasets often have hierarchical structure: for example, in surveys, individual participants (samples) might be grouped at a higher level (units) such as their geographical region. In these settings, the interest is often in exploring the structure on the unit level rather than on the sample level. Units can be compared based on the distance between their means, however this ignores the within-unit distribution of samples. Here we develop an approach for exploratory analysis of hierarchical datasets using the Wasserstein distance metric that takes into account the shapes of within-unit distributions. We use t-SNE to construct 2D embeddings of the units, based on the matrix of pairwise Wasserstein distances between them. The distance matrix can be efficiently computed by approximating each unit with a Gaussian distribution, but we also provide a scalable method to compute exact Wasserstein distances. We use synthetic data to demonstrate the effectiveness of our Wasserstein t-SNE, and apply it to data from the 2017 German parliamentary election, considering polling stations as samples and voting districts as units. The resulting embedding uncovers meaningful structure in the data. |
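The Gaussian approximation mentioned in the abstract can be illustrated with a minimal sketch. It assumes one-dimensional samples, for which the 2-Wasserstein distance between two Gaussians has the closed form sqrt((m1 - m2)^2 + (s1 - s2)^2). The function names `fit_gaussian`, `gaussian_w2`, and `pairwise_distances` are hypothetical and are not taken from the authors' code, and the multivariate case treated in the paper needs a covariance-based formula instead.

```python
import math


def fit_gaussian(samples):
    """Approximate one unit by a 1D Gaussian: return (mean, standard deviation)."""
    mean = sum(samples) / len(samples)
    variance = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean, math.sqrt(variance)


def gaussian_w2(mean_a, std_a, mean_b, std_b):
    """Closed-form 2-Wasserstein distance between two 1D Gaussians."""
    return math.sqrt((mean_a - mean_b) ** 2 + (std_a - std_b) ** 2)


def pairwise_distances(units):
    """Build the unit-by-unit distance matrix from per-unit sample lists."""
    params = [fit_gaussian(samples) for samples in units]
    n = len(params)
    return [
        [gaussian_w2(*params[i], *params[j]) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Three illustrative units; the first two have identical sample distributions.
    units = [[0.0, 2.0], [0.0, 2.0], [1.0, 5.0]]
    for row in pairwise_distances(units):
        print(row)
```

The resulting matrix could then be handed to any t-SNE implementation that accepts precomputed pairwise distances. Unlike a comparison of unit means alone, the second term also separates units whose means agree but whose spreads differ.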
|
Ly-Duyen Tran, Manh-Duy Nguyen, Duc-Tien Dang-Nguyen, Silvan Heller, Florian Spiess, Jakub Lokoc, Ladislav Peska, Thao-Nhu Nguyen, Omar Shahbaz Khan, Aaron Duane, Bjorn Tor Jonsson, Luca Rossetto, An-Zi Yen, Ahmed Alateeq, Naushad Alam, Minh-Triet Tran, Graham Healy, Klaus Schoeffmann, Cathal Gurrin, Comparing Interactive Retrieval Approaches at the Lifelog Search Challenge 2021, IEEE Access, Vol. 11, 2023. (Journal Article)
The Lifelog Search Challenge (LSC) is an interactive benchmarking evaluation workshop for lifelog retrieval systems. The challenge was first organised in 2018 aiming to find the system that can quickly retrieve relevant lifelog images for a given semantic query. This paper provides an analysis of the performance of all 17 systems participating in the 4th LSC workshop held at the 2021 Annual ACM International Conference on Multimedia Retrieval (ICMR). LSC’21 was the largest effort at comparing different approaches to interactive lifelog retrieval systems seen thus far. Findings from the challenge suggest that many different interactive factors contribute to the success (or otherwise) of participating teams. In this paper, we provide an overview of the LSC’21 challenge, introduce each team’s approach and explore these factors in depth and offer clues on how to develop a high-performing interactive lifelog search engine. |
|
Chandrayee Basu, Rosni Vasu, Michihiro Yasunaga, Qian Yang, Med-easi: Finely annotated dataset and models for controllable simplification of medical texts, arXiv preprint arXiv:2302.09155, 2023. (Journal Article)
Automatic medical text simplification can assist providers with patient-friendly communication and make medical texts more accessible, thereby improving health literacy. But curating a quality corpus for this task requires the supervision of medical experts. In this work, we present MedEASi (Medical dataset for Elaborative and Abstractive Simplification), a uniquely crowdsourced and finely annotated dataset for supervised simplification of short medical texts. Its expert-layman-AI collaborative annotations facilitate controllability over text simplification by marking four kinds of textual transformations: elaboration, replacement, deletion, and insertion. To learn medical text simplification, we fine-tune T5-large with four different styles of input-output combinations, leading to two control-free and two controllable versions of the model. We add two types of controllability into text simplification, by using a multi-angle training approach: position-aware, which uses in-place annotated inputs and outputs, and position-agnostic, where the model only knows the contents to be edited, but not their positions. Our results show that our fine-grained annotations improve learning compared to the unannotated baseline. Furthermore, position-aware control generates better simplification than the position-agnostic one. The data and code are available at https://github.com/Chandrayee/CTRL-SIMP. |
|
Viktor Lakics, Luca Rossetto, Abraham Bernstein, Link-Rot in Web-Sourced Multimedia Datasets, In: MultiMedia Modeling, Springer, Cham, p. 476 - 488, 2023. (Book Chapter)
The Web is increasingly used as a source for content of datasets of various types, especially multimedia content. These datasets are then often distributed as a collection of URLs, pointing to the original sources of the elements. As these sources go offline over time, the datasets experience decay in the form of link-rot. In this paper, we analyze 24 Web-sourced datasets with a combined total of over 270 million URLs and find that over 20% of the content is no longer available. We discuss the adverse effects of this decay on the reproducibility of work based on such data and make some recommendations on how they could be mitigated in the future. |
|
Florian Spiess, Silvan Heller, Luca Rossetto, Loris Sauter, Philipp Weber, Heiko Schuldt, Traceable Asynchronous Workflows in Video Retrieval with vitrivr-VR, In: MultiMedia Modeling, Springer, Cham, p. 622 - 627, 2023. (Book Chapter)
Virtual reality (VR) interfaces allow for entirely new modes of user interaction with systems and interfaces. Much like in physical workspaces, documents, tools, and interfaces can be used, put aside, and used again later. Such asynchronous workflows are a great advantage of virtual environments, as they enable users to perform multiple tasks in an interleaved manner. However, VR interfaces also face new challenges, such as text input without physical keyboards, and the analysis of such asynchronous workflows. In this paper we present the version of vitrivr-VR participating in the Video Browser Showdown (VBS) 2023. We describe the current state of our system, with a focus on improvements in text input methods and logging of asynchronous workflows. |
|