Contribution Details
Type | Master's Thesis |
Scope | Discipline-based scholarship |
Title | The Impact of Pre-training on Automated Code Revision After Review |
Organization Unit | |
Authors | |
Supervisors | |
Language | |
Institution | University of Zurich |
Faculty | Faculty of Business, Economics and Informatics |
Date | 2023 |
Abstract Text | Code review is a process in which developers assess code changes submitted by their peers. Despite its numerous benefits, code review is a time-consuming and costly endeavor for both the reviewers and the code author. Reviewers are tasked with meticulously scrutinizing the author’s code and offering natural language comments to identify functional or non-functional issues. Meanwhile, the author must comprehend the review feedback and revise the submitted changes accordingly, a task referred to as ‘Code Revision After Review’ (CRA). Existing research has explored methods to automate the CRA task by pre-training large language models (LLMs), such as CodeBERT and CodeT5, on source code data and fine-tuning them to generate revised code. Although these models utilize distinct pre-training strategies, the impact of these strategies on the CRA task has yet to be investigated. In this paper, we present an empirical study aimed at investigating the effects and efficacy of various pre-training strategies on the CRA task. In this context, we also introduce and evaluate CodeRef—a novel ensemble of pre-training strategies that substantially surpasses baseline performance, achieving at least four times greater likelihood of producing perfectly revised code. Our findings underscore the significance of pre-training in achieving optimal performance and offer insights into various pre-training strategies that may be applicable to other code refinement tasks. |
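The abstract's headline result is stated in terms of "perfectly revised code"; in the CRA literature this is usually an exact-match ("perfect prediction") criterion over model outputs. The sketch below is a minimal illustration of that setup, not code from the thesis: the `<SEP>` input format and all function names are assumptions.

```python
# Hypothetical sketch of the CRA evaluation setup (not the thesis's code):
# format a (submitted code, review comment) pair as one source sequence for a
# seq2seq model such as CodeT5, and score generated revisions by exact match.

def format_cra_input(submitted_code: str, review_comment: str) -> str:
    """Concatenate submitted code and the reviewer's comment into a single
    source sequence; the <SEP> delimiter here is an assumed convention."""
    return f"{submitted_code} <SEP> {review_comment}"

def normalize(code: str) -> str:
    """Collapse whitespace runs so pure formatting differences do not
    count against an otherwise identical revision."""
    return " ".join(code.split())

def perfect_prediction_rate(predictions, references) -> float:
    """Fraction of generated revisions that exactly match the ground-truth
    revised code after whitespace normalization."""
    hits = sum(normalize(p) == normalize(r)
               for p, r in zip(predictions, references))
    return hits / len(references)

if __name__ == "__main__":
    src = format_cra_input("def add(a, b): return a - b",
                           "Bug: this should add, not subtract")
    preds = ["def add(a, b): return a + b",   # matches the reference
             "def add(a, b): return a - b"]   # unchanged, counts as a miss
    refs = ["def add(a, b): return a + b",
            "def add(a, b): return a + b"]
    print(src)
    print(perfect_prediction_rate(preds, refs))  # 0.5
```

A fine-tuned model's "at least four times greater likelihood of producing perfectly revised code" would then correspond to a fourfold higher `perfect_prediction_rate` than the baseline on the same test pairs.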