Chanathip Pornprasit, Chakkrit Tantithamthavorn, Patanamon Thongtanunam, Chunyang Chen
International Conference on Software Analysis, Evolution and Reengineering (SANER)
Code review is a software quality assurance practice, yet remains time-consuming (e.g., due to slow feedback from reviewers). Recent Neural Machine Translation (NMT)-based code transformation approaches were proposed to automatically generate an approved version of changed methods for a given submitted patch. The existing approaches could change code tokens in any area in a changed method. However, not all code tokens need to be changed. Intuitively, the changed code tokens in the method should be paid more attention to than the others as they are more prone to be defective. In this paper, we present an NMT-based Diff-Aware Code Transformation approach (D-ACT) by leveraging token-level change information to enable the NMT models to better focus on the changed tokens in a changed method. We evaluate our D-ACT and the baseline approaches based on a time-wise evaluation (that is ignored by the existing work) with 5,758 changed methods. Under the time-wise evaluation scenario, our results show that (1) D-ACT can correctly transform 107 - 245 changed methods, which is at least 62% higher than the existing approaches; (2) the performance of the existing approaches drops by 57% to 94% when the time-wise evaluation is ignored; and (3) D-ACT is improved by 17%- 82% with an average of 29% when considering the token-level change information. Our results suggest that (1) NMT-based code transformation approaches for code review should be evaluated under the time-wise evaluation; and (2) the token-level change information can substantially improve the performance of NMT-based code transformation approaches for code review.