Contribution Details
Type | Book Chapter |
Scope | Discipline-based scholarship |
Title | Link-Rot in Web-Sourced Multimedia Datasets |
Organization Unit | |
Authors |
|
Editors |
|
Item Subtype | Original Work |
Refereed | Yes |
Status | Published in final form |
Language |
|
Booktitle | MultiMedia Modeling |
Series Name | Lecture Notes in Computer Science |
ISBN | 978-3-031-27076-5 (P); 978-3-031-27077-2 (E) |
ISSN | 0302-9743 |
Number | 13833 |
Place of Publication | Cham |
Publisher | Springer |
Page Range | 476 - 488 |
Date | 2023 |
Abstract Text | The Web is increasingly used as a source for content of datasets of various types, especially multimedia content. These datasets are then often distributed as a collection of URLs, pointing to the original sources of the elements. As these sources go offline over time, the datasets experience decay in the form of link-rot. In this paper, we analyze 24 Web-sourced datasets with a combined total of over 270 million URLs and find that over 20% of the content is no longer available. We discuss the adverse effects of this decay on the reproducibility of work based on such data and make some recommendations on how they could be mitigated in the future. |
Digital Object Identifier | 10.1007/978-3-031-27077-2_37 |
Other Identification Number | merlin-id:23568 |