
Commit 58d46a4

Author: Miltos Allamanis (committed)
Add publications, fix sorting.
1 parent 92b197b commit 58d46a4

File tree

3 files changed: +26 −1 lines changed

Lines changed: 13 additions & 0 deletions

@@ -0,0 +1,13 @@
+---
+layout: publication
+title: "Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions"
+authors: David Bieber, Rishab Goel, Daniel Zheng, Hugo Larochelle, Daniel Tarlow
+conference:
+year: 2022
+bibkey: bieber2022static
+additional_links:
+  - {name: "ArXiV", url: "https://arxiv.org/abs/2203.03771"}
+  - {name: "Dataset", url: "https://github.com/google-research/runtime-error-prediction"}
+tags: ["dataset", "defect"]
+---
+The execution behavior of a program often depends on external resources, such as program inputs or file contents, and so cannot be run in isolation. Nevertheless, software developers benefit from fast iteration loops where automated tools identify errors as early as possible, even before programs can be compiled and run. This presents an interesting machine learning challenge: can we predict runtime errors in a "static" setting, where program execution is not possible? Here, we introduce a real-world dataset and task for predicting runtime errors, which we show is difficult for generic models like Transformers. We approach this task by developing an interpreter-inspired architecture with an inductive bias towards mimicking program executions, which models exception handling and "learns to execute" descriptions of the contents of external resources. Surprisingly, we show that the model can also predict the location of the error, despite being trained only on labels indicating the presence/absence and kind of error. In total, we present a practical and difficult-yet-approachable challenge problem related to learning program execution and we demonstrate promising new capabilities of interpreter-inspired machine learning models for code.
Lines changed: 12 additions & 0 deletions

@@ -0,0 +1,12 @@
+---
+layout: publication
+title: "UniXcoder: Unified Cross-Modal Pre-training for Code Representation"
+authors: Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, Jian Yin
+conference:
+year: 2022
+bibkey: guo2022unixcoder
+additional_links:
+  - {name: "ArXiV", url: "https://arxiv.org/abs/2203.03850"}
+tags: ["Transformer"]
+---
+Pre-trained models for programming languages have recently demonstrated great success on code intelligence. To support both code-related understanding and generation tasks, recent works attempt to pre-train unified encoder-decoder models. However, such an encoder-decoder framework is sub-optimal for auto-regressive tasks, especially code completion, which requires a decoder-only model for efficient inference. In this paper, we present UniXcoder, a unified cross-modal pre-trained model for programming languages. The model utilizes mask attention matrices with prefix adapters to control its behavior and leverages cross-modal content such as the AST and code comments to enhance code representation. To encode the AST, which is represented as a tree, in parallel, we propose a one-to-one mapping method that transforms the AST into a sequence structure retaining all structural information from the tree. Furthermore, we propose to utilize multi-modal content to learn the representation of a code fragment with contrastive learning, and then align representations across programming languages using a cross-modal generation task. We evaluate UniXcoder on five code-related tasks over nine datasets. To further evaluate the performance of code-fragment representation, we also construct a dataset for a new task, called zero-shot code-to-code search. Results show that our model achieves state-of-the-art performance on most tasks, and analysis reveals that comments and ASTs both enhance UniXcoder.

index.md

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ and programming language communities.
 {% assign ttags = publication.tags %}
 {% assign rawtags = rawtags | concat: ttags %}
 {% endfor %}
-{% assign rawtags = rawtags | uniq | sort %}
+{% assign rawtags = rawtags | uniq | sort_natural %}
 {% for tag in rawtags %}<tag><a href="/tags.html#{{ tag }}">{{ tag }}</a></tag> {% endfor %}

 ### About This Site
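
The one-line change above is the "fix sorting" half of the commit message. Liquid's sort filter compares strings case-sensitively, so a capitalized tag such as "Transformer" would sort ahead of every lowercase tag like "dataset" or "defect"; sort_natural compares case-insensitively. A minimal illustrative sketch (not part of the commit, using tag values taken from the two publications above):

{% assign rawtags = "defect,Transformer,dataset" | split: "," %}

{% comment %} Case-sensitive sort: uppercase precedes lowercase in ASCII. {% endcomment %}
{{ rawtags | sort | join: ", " }}
<!-- renders: Transformer, dataset, defect -->

{% comment %} Case-insensitive ("natural") ordering. {% endcomment %}
{{ rawtags | sort_natural | join: ", " }}
<!-- renders: dataset, defect, Transformer -->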
