Contribution Details

Type Journal Article
Scope Discipline-based scholarship
Title Cracking double-blind review: Authorship attribution with deep learning
Authors
  • Leonard Bauersfeld
  • Angel Romero
  • Manasi Muglikar
  • Davide Scaramuzza
Item Subtype Original Work
Refereed Yes
Status Published in final form
Language
  • English
Journal Title PLoS ONE
Publisher Public Library of Science (PLoS)
Geographical Reach international
ISSN 1932-6203
Volume 18
Number 6
Page Range e0287611
Date 2023
Abstract Text Double-blind peer review is considered a pillar of academic research because it is perceived to ensure a fair, unbiased, and fact-centered scientific discussion. Yet, experienced researchers can often correctly guess from which research group an anonymous submission originates, biasing the peer-review process. In this work, we present a transformer-based, neural-network architecture that only uses the text content and the author names in the bibliography to attribute an anonymous manuscript to an author. To train and evaluate our method, we created the largest authorship-identification dataset to date. It leverages all research papers publicly available on arXiv amounting to over 2 million manuscripts. In arXiv-subsets with up to 2,000 different authors, our method achieves an unprecedented authorship attribution accuracy, where up to 73% of papers are attributed correctly. We present a scaling analysis to highlight the applicability of the proposed method to even larger datasets when sufficient compute capabilities are more widely available to the academic community. Furthermore, we analyze the attribution accuracy in settings where the goal is to identify all authors of an anonymous manuscript. Thanks to our method, we are not only able to predict the author of an anonymous work but we also provide empirical evidence of the key aspects that make a paper attributable. We have open-sourced the necessary tools to reproduce our experiments.
Free access at DOI
Digital Object Identifier 10.1371/journal.pone.0287611
PubMed ID 37390072