
Commit 6eaac27

Author: Miltos Allamanis
Commit message: Add two recent papers.
Parent: 3798b5f

File tree: 2 files changed, +24 −0 lines changed

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+---
+layout: publication
+title: "Learning to Reduce False Positives in Analytic Bug Detectors"
+authors: Anant Kharkar, Roshanak Zilouchian Moghaddam, Matthew Jin, Xiaoyu Liu, Xin Shi, Colin Clement, Neel Sundaresan
+conference: ICSE
+year: 2022
+bibkey: kharkar2022learning
+additional_links:
+   - {name: "ArXiV", url: "https://arxiv.org/abs/2203.09907"}
+tags: ["Transformer", "static analysis"]
+---
+Due to increasingly complex software design and rapid iterative development, code defects and security vulnerabilities are prevalent in modern software. In response, programmers rely on static analysis tools to regularly scan their codebases and find potential bugs. To maximize coverage, however, these tools tend to report a significant number of false positives, requiring developers to verify each warning manually. To address this problem, we propose a Transformer-based learning approach to identify false positive bug warnings. We demonstrate that our models can improve the precision of static analysis by 17.5%. In addition, we validate the generalizability of this approach across two major bug types: null dereference and resource leak.
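The filtering step this abstract describes can be viewed as a post-processor over static-analysis output: score each warning with a learned classifier and discard likely false positives. The sketch below is a minimal illustration, not the paper's implementation; `score_warning` stands in for the fine-tuned Transformer, and the warning format and threshold are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Warning:
    file: str
    line: int
    kind: str      # e.g. "null-dereference" or "resource-leak"
    context: str   # code surrounding the flagged line

def score_warning(w: Warning) -> float:
    """Stand-in for a learned classifier returning P(true positive | context).
    The paper fine-tunes a Transformer over the code context; this toy
    heuristic only exists so the sketch runs end to end."""
    guarded = "if" in w.context and "None" in w.context
    return 0.2 if guarded else 0.9

def filter_warnings(warnings, threshold=0.5):
    """Keep only warnings the model considers likely true positives."""
    return [w for w in warnings if score_warning(w) >= threshold]

warnings = [
    Warning("a.py", 10, "null-dereference", "x.close()"),
    Warning("b.py", 22, "null-dereference", "if x is not None: x.close()"),
]
kept = filter_warnings(warnings)
print([w.file for w in kept])  # the unguarded dereference survives filtering
```

The design point is that the analyzer stays unchanged; precision is recovered purely by re-ranking its warnings downstream.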

_publications/lu2022reacc.markdown

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+---
+layout: publication
+title: "ReACC: A Retrieval-Augmented Code Completion Framework"
+authors: Shuai Lu, Nan Duan, Hojae Han, Daya Guo, Seung-won Hwang, Alexey Svyatkovskiy
+conference:
+year: 2022
+bibkey: lu2022reacc
+additional_links:
+   - {name: "ArXiV", url: "https://arxiv.org/abs/2203.07722"}
+tags: ["Transformer", "autocomplete"]
+---
+Code completion, which aims to predict the following code token(s) from the code context, can improve the productivity of software development. Recent work has shown that statistical language modeling with Transformers can greatly improve performance on the code completion task by learning from large-scale source code datasets. However, current approaches focus only on code context within the file or project, i.e. the internal context. In contrast, we utilize "external" context, inspired by the human behavior of copying from related code snippets when writing code. Specifically, we propose a retrieval-augmented code completion framework that leverages both lexical copying and reference to semantically similar code via retrieval. We adopt a stage-wise training approach that combines a source code retriever and an autoregressive language model for programming language. We evaluate our approach on the code completion task for the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
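The retrieve-then-complete idea above can be sketched in a few lines: fetch the corpus snippet most similar to the unfinished code, then prepend it as "external context" to the prompt given to the language model. This is a minimal lexical-retrieval sketch under stated assumptions; ReACC's actual retriever combines lexical and dense semantic retrieval, and `build_prompt` is a hypothetical helper, not the paper's API.

```python
import re

def tokenize(code: str) -> set:
    """Crude lexical tokenizer; ReACC's retriever is far richer."""
    return set(re.findall(r"\w+", code))

def jaccard(a: set, b: set) -> float:
    """Token-set similarity between two code snippets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def retrieve(query: str, corpus: list) -> str:
    """Return the corpus snippet most lexically similar to the query."""
    q = tokenize(query)
    return max(corpus, key=lambda s: jaccard(q, tokenize(s)))

def build_prompt(unfinished_code: str, corpus: list) -> str:
    """Prepend retrieved 'external' context to the usual 'internal'
    context before handing the prompt to an autoregressive LM."""
    external = retrieve(unfinished_code, corpus)
    return f"# retrieved context\n{external}\n# current file\n{unfinished_code}"

corpus = [
    "def read_json(path):\n    with open(path) as f:\n        return json.load(f)",
    "def add(a, b):\n    return a + b",
]
prompt = build_prompt("def load_config(path):\n    with open(path", corpus)
print(prompt)  # the file-reading snippet is retrieved, not the arithmetic one
```

The stage-wise structure mirrors the abstract: the retriever and the completion model are separate components, so either can be swapped or trained independently.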
