Contribution Details

Type Conference or Workshop Paper
Scope Discipline-based scholarship
Published in Proceedings Yes
Title Replicating Parser Behavior using Neural Machine Translation
Organization Unit
Authors
  • Carol Alexandru
  • Sebastiano Panichella
  • Harald Gall
Presentation Type paper
Item Subtype Original Work
Refereed Yes
Status Published electronically before print/final form (Epub ahead of print)
Event Title 25th IEEE International Conference on Program Comprehension (ICPC)
Event Type conference
Event Location Buenos Aires, Argentina
Event Start Date May 22, 2017
Event End Date May 23, 2017
Place of Publication Buenos Aires, Argentina
Abstract Text More than other machine learning techniques, neural networks have been shown to excel at tasks where humans traditionally outperform computers: recognizing objects in images, distinguishing spoken words from background noise, or playing "Go". These are hard problems, where hand-crafting solutions is rarely feasible due to their inherent complexity. Higher-level program comprehension is not dissimilar in nature: while a compiler or program analysis tool can quickly extract certain facts from (correctly written) code, it has no intrinsic 'understanding' of the data, and for the majority of real-world problems in program comprehension a human developer is needed, for example to find and fix a bug or to summarize the behavior of a method. We perform a pilot study to determine the suitability of neural networks for processing plain-text source code. We find that, on the one hand, neural machine translation is too fragile to accurately tokenize code, while on the other hand, it can precisely recognize different types of tokens and make accurate guesses regarding their relative position in the local syntax tree. Our results suggest that neural machine translation may be exploited for annotating and enriching out-of-context code snippets to support automated tooling for code comprehension problems. We also identify several challenges in applying neural networks to learning from source code and determine key differences between applying existing neural network models to source code and applying them to natural language.