Luca Rossetto, Matthias Baumgartner, Ralph Gasser, Lucien Heitz, Ruijie Wang, Abraham Bernstein, Exploring Graph-querying approaches in LifeGraph, In: ICMR '21: International Conference on Multimedia Retrieval, ACM, New York, NY, USA, 2021-09-21. (Conference or Workshop Paper published in Proceedings)
The multi-modal and interrelated nature of lifelog data makes it well suited for graph-based representations. In this paper, we present the second iteration of LifeGraph, a Knowledge Graph for Lifelog Data, initially introduced during the 3rd Lifelog Search Challenge in 2020. This second iteration incorporates several lessons learned from the previous version. While the actual graph has undergone only small changes, the mechanisms by which it is traversed during querying as well as the underlying storage system which performs the traversal have been changed. The means for query formulation have also been slightly extended in capability and made more efficient and intuitive. All these changes have the aim of improving result quality and reducing query time. |
|
Silvan Heller, Ralph Gasser, Mahnaz Parian-Scherb, Sanja Popovic, Luca Rossetto, Loris Sauter, Florian Spiess, Heiko Schuldt, Interactive Multimodal Lifelog Retrieval with vitrivr at LSC 2021, In: ICMR '21: International Conference on Multimedia Retrieval, ACM, New York, NY, USA, 2021-09-21. (Conference or Workshop Paper published in Proceedings)
|
|
Florian Spiess, Ralph Gasser, Silvan Heller, Luca Rossetto, Loris Sauter, Milan van Zanten, Heiko Schuldt, Exploring Intuitive Lifelog Retrieval and Interaction Modes in Virtual Reality with vitrivr-VR, In: ICMR '21: International Conference on Multimedia Retrieval, ACM, New York, NY, USA, 2021-09-21. (Conference or Workshop Paper published in Proceedings)
|
|
Nick R. Kipfer, Automatic Selection of Illustrative Pictures for News Articles, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Bachelor's Thesis)
In this thesis, two different models were implemented for the selection of illustrative images for news articles: the MUSE model and the Xception model. The MUSE model is based on the Multilingual Universal Sentence Encoder, while the Xception model is based on a multi-modal embedding structure building upon the MUSE model. The two models were compared, and the MUSE model performed better in terms of creating useful image recommendations. A user study was conducted for the MUSE model, which produced mixed results. From a developer perspective, the DDIS use case requirements were missed when only a single image recommendation is considered. This is due to the high variance in quality between the MUSE model's image recommendations. If the requirements are softened slightly, such that a small range of images could be recommended instead of a single one, the MUSE model is almost guaranteed to give at least one useful prediction. |
|
Lawand Muhamad, Approximate Boolean Retrieval, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
The standard interpretation of the logical operators in the Boolean model is often either too strict or too open. A query containing several terms connected with AND is often too narrow, while a query containing several terms connected with OR is often too broad. As such, if the descriptors of the entries are incomplete or information is missing beforehand, the traditional Boolean query rarely comes close to retrieving all and only those items which are relevant to the user.
To address the limitations of the traditional Boolean model, this work presents the design and implementation of an extended Boolean model in vitrivr, a multimedia retrieval system supporting the vector space and the traditional Boolean model. Besides UI improvements, the additions made to the model consist of (i) weighted query terms, adding the possibility to weight terms connected with OR, (ii) term preferences, a functionality to set additional terms only as soft preferences rather than hard requirements, and (iii) late stage weighting, a mechanism for increasing or decreasing the weight of the Boolean score relative to other vector space features in vitrivr. Based on the HAM10000 data set consisting of dermatoscopic images with associated metadata, the extended model was evaluated by measuring the precision in retrieving the relevant results. It could be shown that the model addresses many drawbacks of the traditional Boolean model and that the retrieval of relevant results from a Boolean query can be improved. |
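The three additions listed in the abstract can be illustrated with a minimal Python sketch. This is not the scoring used in the thesis or in vitrivr: the function names, the normalised OR score, the fixed `preference_bonus`, the linear `late_stage_weight` mix, and the example metadata terms are all assumptions made for illustration.

```python
def extended_boolean_score(item_terms, required, weighted_or, preferences,
                           preference_bonus=0.1):
    """Score one item against an extended Boolean query (hypothetical scheme).

    item_terms:  set of metadata terms describing the item
    required:    set of terms that must all be present (hard AND)
    weighted_or: dict mapping OR-connected terms to their weights
    preferences: set of soft-preference terms that only add a bonus
    """
    # Hard requirements behave like the traditional Boolean model.
    if not required <= item_terms:
        return 0.0
    # Weighted OR: fraction of the total weight that the item satisfies.
    total = sum(weighted_or.values())
    if total:
        or_score = sum(w for t, w in weighted_or.items() if t in item_terms) / total
    else:
        or_score = 1.0
    # Soft preferences raise the score but never exclude an item.
    bonus = preference_bonus * sum(1 for t in preferences if t in item_terms)
    return or_score + bonus


def late_stage_weight(boolean_score, vector_score, boolean_weight=0.5):
    """Mix the Boolean score with another feature score (assumed linear mix)."""
    return boolean_weight * boolean_score + (1.0 - boolean_weight) * vector_score


# Hypothetical metadata terms, loosely in the style of image metadata.
item = {"melanoma", "male", "back"}
score = extended_boolean_score(
    item,
    required={"melanoma"},
    weighted_or={"back": 2.0, "face": 1.0},
    preferences={"male"},
)
```

With this scheme an item that fails a hard requirement scores zero, while a missed preference only costs the bonus, which is the distinction between requirements and preferences that the abstract describes.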
|
Amos-Madalin Neculau, Multi-Domain Media Segmentation, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
When analyzing multimedia materials, particularly audio and video, it is uncommon for a produced annotation to make reference to the whole content. Frequently, it is much more beneficial to refer to a particular section of the text. Segmentation may take place in a variety of domains, including spatial, temporal, frequency, and any combination thereof. While many segmentation methods are used in isolation for diverse purposes, there is currently no uniform representation that enables the concurrent use and mixing of various segmentation methodologies inside the same application. As a result, the present effort is focused on developing a model that "fits everything".
The thesis makes the following contributions: it studies segmentation methods in the context of several modalities (video, audio, multi-modal). Additionally, it provides an abstract segmentation paradigm that is applicable regardless of the modality utilized. Moreover, it offers a new technique of multimedia retrieval, mostly tested on video, that is based on areas of interest identified using a multitude of segmentation algorithms that are explained in detail. |
|
Romana Pernisch, Daniele Dell’Aglio, Abraham Bernstein, Beware of the hierarchy — An analysis of ontology evolution and the materialisation impact for biomedical ontologies, Journal of Web Semantics, Vol. 70, 2021. (Journal Article)
Ontologies are becoming a key component of numerous applications and research fields. But knowledge captured within ontologies is not static. Some ontology updates potentially have a wide ranging impact; others only affect very localised parts of the ontology and their applications. Investigating the impact of the evolution gives us insight into the editing behaviour but also signals ontology engineers and users how the ontology evolution is affecting other applications. However, such research is in its infancy. Hence, we need to investigate the evolution itself and its impact on the simplest of applications: the materialisation.
In this work, we define impact measures that capture the effect of changes on the materialisation. In the future, the impact measures introduced in this work can be used to investigate how aware the ontology editors are about consequences of changes. By introducing five different measures, which focus either on the change in the materialisation with respect to the size or on the number of changes applied, we are able to quantify the consequences of ontology changes. To see these measures in action, we investigate the evolution and its impact on materialisation for nine open biomedical ontologies, most of which adhere to the EL++ description logic.
Our results show that these ontologies evolve at varying paces but no statistically significant difference between the ontologies with respect to their evolution could be identified. We identify three types of ontologies based on the types of complex changes which are applied to them throughout their evolution. The impact on the materialisation is the same for the investigated ontologies, bringing us to the conclusion that the effect of changes on the materialisation can be generalised to other similar ontologies. Further, we found that the materialised concept inclusion axioms experience most of the impact induced by changes to the class inheritance of the ontology and other changes only marginally touch the materialisation. |
|
Terézia Bucková, Supervised and Unsupervised Alignment of Knowledge Graphs with pre-trained embeddings, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
Knowledge Graphs (KGs), directed graphs representing real-world objects and relations between them, have gained significant attention in the past few years, and progress has been made to construct such KGs in various contexts. However, no current KG holds the complete knowledge, and in order to obtain a holistic view of an entity of interest, one must therefore gather data from multiple KGs. This usually means aligning different KGs and figuring out which entities refer to the same real-world objects. The alignment algorithms often benefit from aligning KG embeddings, in which case every entity, and possibly every relation, is represented by an embedding vector. Embedding methods and embedding-based word alignment techniques in language processing have been researched for a longer period of time. This effort has led to more accurate assumptions about embedding spaces and high performance in alignment tasks in both supervised and unsupervised scenarios. In our work, we test state-of-the-art word embedding alignment methods using KG embedding spaces as input data. We show that typical word alignment methods are on par with typical KG alignment methods in terms of their hits@k score. Moreover, word alignment methods balance the results so that correctly aligned entities are mutual nearest neighbours in the aligned embedding spaces. In addition, we investigate the effect of various embedding models on KG alignment and conclude that the choice of the embedding model has a large impact on the final alignment results. At the same time, we challenge the assumption that both KGs have to be embedded by two instances of the same embedding model and show that embedding them with different models yields results up to 20 percentage points worse at hits@k. |
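The hits@k score mentioned in the abstract is the fraction of source entities whose true counterpart appears among the k most similar target entities. The following Python sketch shows one common way to compute it; the use of cosine similarity, the convention that `source[i]` aligns with `target[i]`, and the toy vectors are assumptions for illustration and are not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hits_at_k(source, target, k):
    """Fraction of source entities whose true match ranks in the top k.

    Assumes source[i] and target[i] embed the same real-world entity and
    that both embedding spaces have already been aligned.
    """
    hits = 0
    for i, vec in enumerate(source):
        sims = [cosine(vec, t) for t in target]
        # Zero-based rank: number of targets strictly more similar than the true one.
        rank = sum(1 for s in sims if s > sims[i])
        if rank < k:
            hits += 1
    return hits / len(source)


# Toy example: the first two target vectors are swapped, the third is correct.
source = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
target = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
```

In the toy data only the third entity has its counterpart as nearest neighbour, so hits@1 is one third, while hits@2 already covers all three entities.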
|
Mahnaz Parian, Claire Walzer, Luca Rossetto, Silvan Heller, Stephane Dupont, Heiko Schuldt, Gesture of Interest: Gesture Search for Multi-Person, Multi-Perspective TV Footage, In: 2021 International Conference on Content-Based Multimedia Indexing (CBMI), IEEE, 2021-07-28. (Conference or Workshop Paper published in Proceedings)
In real-world datasets, specifically in TV recordings, videos are often multi-person and multi-angle, which poses significant challenges for gesture recognition and retrieval. In addition to being of interest to linguists, gesture retrieval is a novel and challenging application for multimedia retrieval. In this paper, we propose a novel method for spatio-temporal gesture retrieval based on visual and pose information which can retrieve similar gestures in multi-person scenes through continuous shots. The attention-aware features, extracted from human pose key-points, together with a sophisticated pre-processing module, alleviate the susceptibility of gesture retrieval to background noise and occlusion. We have evaluated our method on a subset of the NewsScape Dataset. Our experimental results demonstrate the effectiveness of the proposed method in retrieving similar results in occluded scenes as measured by the quality of the top 5 results. |
|
Florian Spiess, Ralph Gasser, Silvan Heller, Luca Rossetto, Loris Sauter, Heiko Schuldt, Competitive interactive video retrieval in virtual reality with vitrivr-VR, In: International Conference on Multimedia Modeling, Springer, 2021-07-22. (Conference or Workshop Paper published in Proceedings)
Virtual Reality (VR) has emerged and developed as a new modality to interact with multimedia data. In this paper, we present vitrivr-VR, a prototype of an interactive multimedia retrieval system in VR based on the open source full-stack multimedia retrieval system vitrivr. We have implemented query formulation tailored to VR: Users can use speech-to-text to search collections via text for concepts, OCR and ASR data as well as entire scene descriptions through a video-text co-embedding feature that embeds sentences and video sequences into the same feature space. Result presentation and relevance feedback in vitrivr-VR leverage the capabilities of virtual spaces.
Keywords
Video Browser Showdown, Virtual Reality, Interactive video retrieval |
|
Luca Rossetto, Matthias Baumgartner, Narges Ashena, Florian Ruosch, Romana Pernisch, Lucien Heitz, Abraham Bernstein, VideoGraph – Towards Using Knowledge Graphs for Interactive Video Retrieval, In: International Conference on Multimedia Modeling, Springer, 2021-07-22. (Conference or Workshop Paper published in Proceedings)
Video is a very expressive medium, able to capture a wide variety of information in different ways. While there have been many advances in the recent past, which enable the annotation of semantic concepts as well as individual objects within video, their larger context has so far not extensively been used for the purpose of retrieval. In this paper, we introduce the first iteration of VideoGraph, a knowledge graph-based video retrieval system. VideoGraph combines information extracted from multiple video modalities with external knowledge bases to produce a semantically enriched representation of the content in a video collection, which can then be retrieved using graph traversal. For the 2021 Video Browser Showdown, we show the first proof-of-concept of such a graph-based video retrieval approach.
Keywords
Interactive video retrieval, Knowledge-graphs, Multi-modal graphs |
|
Luca Rossetto, Ralph Gasser, Loris Sauter, Abraham Bernstein, Heiko Schuldt, A System for Interactive Multimedia Retrieval Evaluations, In: International Conference on Multimedia Modeling, Springer, 2021-07-22. (Conference or Workshop Paper published in Proceedings)
The evaluation of the performance of interactive multimedia retrieval systems is a methodologically non-trivial endeavour and requires specialized infrastructure. Current evaluation campaigns have so far relied on a local setting, where all retrieval systems needed to be evaluated at the same physical location at the same time. This constraint does not only complicate the organization and coordination but also limits the number of systems which can reasonably be evaluated within a set time frame. Travel restrictions might further limit the possibility for such evaluations. To address these problems, evaluations need to be conducted in a (geographically) distributed setting, which was so far not possible due to the lack of supporting infrastructure. In this paper, we present the Distributed Retrieval Evaluation Server (DRES), an open-source evaluation system to facilitate evaluation campaigns for interactive multimedia retrieval systems in both traditional on-site as well as fully distributed settings which has already proven effective in a competitive evaluation. |
|
Alexander Theus, Scene Text Extraction for Retrieval of Visual Multimedia, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Bachelor's Thesis)
The expansion of multimedia collections has made the quest for accessing the knowledge contained within them ever more onerous, and has rendered prior annotation unfeasible. As a consequence, vitrivr was developed, which enables content-based retrieval via methods such as Query-by-Sketch, Query-by-Example, and many more. A yet unexplored piece of knowledge contained in visual multimedia is scene text. Textual information embedded in visual multimedia provides high-level semantic information about the content and context of the media, and can be leveraged for superior retrieval. For this purpose, this thesis explored and evaluated existing methods for scene text extraction in still images. Furthermore, a novel scene text extractor for videos called HyText was developed, which achieved state-of-the-art performance in my evaluation. The novelty of the proposed method lies in hybridizing tracking-by-detection and particle filtering to allow for enhanced inference time. The proposed method is implemented in vitrivr to enable the extraction and retrieval of scene text. |
|
Badrie Leonardas Persaud, Human Perception of Privacy: Visualizing Epsilon for Differential Privacy, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
Privacy is becoming increasingly important, especially when handling data for data analytics tasks. Differential Privacy is often provided as a solution to guarantee privacy and is used by many large companies today. The parameter epsilon (ε) is at the core of Differential Privacy and controls the trade-off between utility and privacy. The goal of this study is two-fold: first, to design representations of epsilon for the layman; second, to investigate which range of epsilon values is preferred in different scenarios. To achieve this goal, we ran an online survey with 29 participants and found that people are motivated by personal financial incentives more than by the financial gain of their community, and care more about their own privacy than about that of the people around them. |
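The role of epsilon as a trade-off between utility and privacy can be made concrete with the Laplace mechanism, a standard way of releasing a numeric query result under Differential Privacy. The sketch below is a generic textbook illustration and is not taken from the thesis; the function name, the counting-query sensitivity of 1, and the sample values are assumptions.

```python
import random


def laplace_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Release a count with Laplace noise of scale sensitivity / epsilon.

    A smaller epsilon means more noise (more privacy, less utility);
    a larger epsilon means less noise (less privacy, more utility).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential samples with mean `scale`
    # follows a Laplace distribution centred at zero with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


def mean_abs_error(epsilon, runs=2000, seed=0):
    """Average absolute noise over repeated releases of the same count."""
    rng = random.Random(seed)
    return sum(abs(laplace_count(100, epsilon, rng=rng) - 100)
               for _ in range(runs)) / runs
```

Comparing `mean_abs_error(0.1)` with `mean_abs_error(10.0)` shows the trade-off directly: the expected absolute error equals the noise scale, so it shrinks by the same factor by which epsilon grows.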
|
Romana Pernisch, Daniele Dell'Aglio, Abraham Bernstein, Beware of the Hierarchy - An Analysis of Ontology Evolution and the Materialisation Impact for Biomedical Ontologies, Journal of Web Semantics, Vol. 70C, 2021. (Journal Article)
Ontologies are becoming a key component of numerous applications and research fields. But knowledge captured within ontologies is not static. Some ontology updates potentially have a wide ranging impact; others only affect very localised parts of the ontology and their applications. Investigating the impact of the evolution gives us insight into the editing behaviour but also signals ontology engineers and users how the ontology evolution is affecting other applications. However, such research is in its infancy. Hence, we need to investigate the evolution itself and its impact on the simplest of applications: the materialisation. In this work, we define impact measures that capture the effect of changes on the materialisation. In the future, the impact measures introduced in this work can be used to investigate how aware the ontology editors are about consequences of changes. By introducing five different measures, which focus either on the change in the materialisation with respect to the size or on the number of changes applied, we are able to quantify the consequences of ontology changes. To see these measures in action, we investigate the evolution and its impact on materialisation for nine open biomedical ontologies, most of which adhere to the EL++ description logic. Our results show that these ontologies evolve at varying paces but no statistically significant difference between the ontologies with respect to their evolution could be identified. We identify three types of ontologies based on the types of complex changes which are applied to them throughout their evolution. The impact on the materialisation is the same for the investigated ontologies, bringing us to the conclusion that the effect of changes on the materialisation can be generalised to other similar ontologies. 
Further, we found that the materialised concept inclusion axioms experience most of the impact induced by changes to the class inheritance of the ontology and other changes only marginally touch the materialisation. |
|
Jakub Lokoč, Patrik Veselý, František Mejzlík, Gregor Kovalčík, Tomáš Souček, Luca Rossetto, Klaus Schoeffmann, Werner Bailer, Cathal Gurrin, Loris Sauter, Jaeyub Song, Stefanos Vrochidis, Jiaxin Wu, Björn Þór Jónsson, Is the Reign of Interactive Search Eternal? Findings from the Video Browser Showdown 2020, ACM Transactions on Multimedia Computing Communications and Applications, Vol. 17 (3), 2021. (Journal Article)
|
|
Jan Alexander Fischer, Andres Palechor, Daniele Dell’Aglio, Abraham Bernstein, Claudio Tessone, The Complex Community Structure of the Bitcoin Address Correspondence Network, Frontiers in Physics, Vol. 9, 2021. (Journal Article)
Bitcoin is built on a blockchain, an immutable decentralized ledger that allows entities (users) to exchange Bitcoins in a pseudonymous manner. Bitcoins are associated with alpha-numeric addresses and are transferred via transactions. Each transaction is composed of a set of input addresses (associated with unspent outputs received from previous transactions) and a set of output addresses (to which Bitcoins are transferred). Although Bitcoin was designed with anonymity in mind, different heuristic approaches exist to detect which addresses in a specific transaction belong to the same entity. By applying these heuristics, we build an Address Correspondence Network: in this representation, addresses are nodes, which are connected by edges if at least one heuristic detects them as belonging to the same entity. In this paper, we analyze for the first time the Address Correspondence Network and show it is characterized by a complex topology, signaled by a broad, skewed degree distribution and a power-law component size distribution. Using a large-scale dataset of addresses for which the controlling entities are known, we show that a combination of external data coupled with standard community detection algorithms can reliably identify entities. The complex nature of the Address Correspondence Network reveals that usage patterns of individual entities create statistical regularities, and that these regularities can be leveraged to more accurately identify entities and gain a deeper understanding of the Bitcoin economy as a whole. |
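How a heuristic turns transactions into linked addresses can be sketched in Python. The abstract does not name the heuristics used, so this example assumes the widely known common-input-ownership heuristic (all input addresses of one transaction are treated as belonging to one entity) and only computes connected components; the paper's community detection step is not reproduced, and the address labels are placeholders.

```python
from collections import defaultdict


def cluster_addresses(transactions):
    """Group addresses that co-occur as inputs of the same transaction.

    transactions: list of lists, each holding the input addresses of one
    transaction. Returns the connected components as sorted lists.
    """
    parent = {}

    def find(addr):
        # Union-find lookup with path halving.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for addr in inputs:
            find(addr)  # register every address, including lone inputs
        for addr in inputs[1:]:
            union(inputs[0], addr)  # assumed heuristic: shared inputs, same entity

    components = defaultdict(set)
    for addr in list(parent):
        components[find(addr)].add(addr)
    return sorted(sorted(group) for group in components.values())


# Placeholder addresses: "a" and "b" share a transaction, as do "b" and "c".
clusters = cluster_addresses([["a", "b"], ["b", "c"], ["d"]])
```

Here `clusters` groups "a", "b" and "c" into one component because the links are transitive, while "d" stays on its own.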
|
Matthias Baumgartner, Daniele Dell'Aglio, Abraham Bernstein, Entity Prediction in Knowledge Graphs with Joint Embeddings, In: Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15), ACL Anthology, Mexico City, Mexico, 2021. (Conference or Workshop Paper published in Proceedings)
Knowledge Graphs (KGs) have become increasingly popular in recent years. However, as knowledge constantly grows and changes, existing KGs inevitably need to be extended with entities that emerged or became relevant to the scope of the KG after its creation. Research on updating KGs typically relies on extracting named entities and relations from text. However, these approaches cannot infer entities or relations that were not explicitly stated. Alternatively, embedding models exploit implicit structural regularities to predict missing relations, but cannot predict missing entities. In this article, we introduce a novel method to enrich a KG with new entities given their textual description. Our method leverages joint embedding models and hence does not require entities or relations to be named explicitly. We show that our approach can identify new concepts in a document corpus and transfer them into the KG, and we find that the performance of our method improves substantially when extended with techniques from association rule mining, text mining, and active learning. |
|
Patrick Muntwyler, Continuous Semi-Supervised Binary Classification of Data Streams, University of Zurich, Faculty of Business, Economics and Informatics, 2021. (Master's Thesis)
The number of data streams is growing every day, and so is their importance in our daily lives. It is important to be able to analyze data streams automatically, for example to find suspicious activities in a system or to filter interesting data points. Many systems today rely on supervised approaches. However, these have the disadvantage that they cannot adapt to new trends in the data streams. Semi-supervised stream approaches are needed for this. However, this area is not yet well explored. We therefore develop SSDenStream. SSDenStream is based on DenStream, an unsupervised density-based stream clustering algorithm, and is able to perform online classification. We give an overview of density-based stream clustering and semi-supervised extensions of it. We perform several experiments on synthetic and real-world data sets to prove the functionality of SSDenStream. The experiments show that SSDenStream is able to handle overlapping clusters and performs well on real-world data. |
|
Bibek Paudel, Abraham Bernstein, Random Walks with Erasure: Diversifying Personalized Recommendations on Social and Information Networks, In: Proceedings of the Web Conference 2021 (WWW '2021), Association for Computing Machinery, New York, NY, USA, 2021. (Conference or Workshop Paper published in Proceedings)
Most existing personalization systems promote items that match a user’s previous choices or those that are popular among similar users. This results in recommendations that are highly similar to the ones users are already exposed to, resulting in their isolation inside familiar but insulated information silos. In this context, we develop a novel recommendation framework with a goal of improving information diversity using a modified random walk exploration of the user-item graph. We focus on the problem of political content recommendation, while addressing a general problem applicable to personalization tasks in other social and information networks.
For recommending political content on social networks, we first propose a new model to estimate the ideological positions for both users and the content they share, which is able to recover ideological positions with high accuracy. Based on these estimated positions, we generate diversified personalized recommendations using our new random-walk based recommendation algorithm. With experimental evaluations on large datasets of Twitter discussions, we show that our method based on random walks with erasure is able to generate more ideologically diverse recommendations. Our approach does not depend on the availability of labels regarding the bias of users or content producers. With experiments on open benchmark datasets from other social and information networks, we also demonstrate the effectiveness of our method in recommending diverse long-tail items. |
|