Commit 4827e99

A massive addition of tags to each paper and related functionality
to the website.
1 parent 9ee0287 commit 4827e99

203 files changed: +276 -20 lines changed

_config.yml

Lines changed: 1 addition & 1 deletion
@@ -10,6 +10,6 @@ collections:
 output: true
 permalink: /:collection/:path/

-plugins:
+plugins_dir:
 - jekyll-sitemap
 - jekyll-seo-tag

_includes/head.html

Lines changed: 0 additions & 2 deletions
@@ -16,8 +16,6 @@
 <!-- Enable responsiveness on mobile devices-->
 <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">

-{% seo %}
-
 <!-- CSS -->
 <link rel="stylesheet" href="{{ site.baseurl }}/public/css/poole.css">
 <link rel="stylesheet" href="{{ site.baseurl }}/public/css/syntax.css">

_includes/sidebar.html

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ <h1>
 <nav class="sidebar-nav">
 <div class="sidebar-item"><p style="font-size: 12px">Search related work <input type='text' id='searchTarget' size="16"/> <button onClick="search();">Go</button></p></div>
 <a class="sidebar-nav-item{% if page.url == "/papers.html" %} active{% endif %}" href="{% link papers.html %}">List of Papers</a>
+<a class="sidebar-nav-item{% if page.url == "/tags.html" %} active{% endif %}" href="{% link tags.html %}">Papers by Tag</a>

 <a class="sidebar-nav-item{% if page.url == "/base-taxonomy/" %} active{% endif %}" href="{% link base-taxonomy/index.md %}">Core Taxonomy</a>
 <a class="sidebar-nav-item-small" href="{% link base-taxonomy/generative.html %}">Code Generating Models</a>
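
Note: `tags.html` itself is among the files not shown in this excerpt. As a rough illustration only, the grouping that a "Papers by Tag" page needs can be sketched offline in Python (the site itself uses Liquid, so this is not the commit's implementation). The tag lists are copied from the front matter added in this commit; `PAPER_TAGS`, `papers_by_tag`, and `tag_href` are hypothetical names introduced for this sketch. The sketch also percent-encodes the link fragment, since tags such as "language model" contain a space; whether the anchors in `tags.html` expect that encoding is an assumption.

```python
from collections import defaultdict
from urllib.parse import quote

# Tags as added to the front matter in this commit (bibkey -> tags).
# The dict itself is a hypothetical stand-in for reading _publications/*.markdown.
PAPER_TAGS = {
    "aggarwal2015using": ["migration"],
    "allamanis2013mining": ["language model"],
    "allamanis2014learning": ["naming", "language model", "style"],
    "allamanis2014mining": ["pattern mining", "grammar", "AST"],
    "allamanis2015bimodal": ["search", "grammar", "AST", "bimodal"],
    "allamanis2015suggesting": ["naming"],
}


def papers_by_tag(paper_tags):
    """Invert bibkey -> tags into tag -> sorted list of bibkeys."""
    index = defaultdict(list)
    for bibkey, tags in paper_tags.items():
        for tag in tags:
            index[tag].append(bibkey)
    return {tag: sorted(keys) for tag, keys in sorted(index.items())}


def tag_href(tag):
    """Build a /tags.html#<tag> link, percent-encoding the fragment (assumed convention)."""
    return "/tags.html#" + quote(tag)


if __name__ == "__main__":
    for tag, keys in papers_by_tag(PAPER_TAGS).items():
        print(tag_href(tag), keys)
```

In Liquid the same inversion would typically be done by looping over the collection and filtering on each paper's `tags` list; the Python version is only meant to make the data flow explicit.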

_layouts/publication.html

Lines changed: 4 additions & 0 deletions
@@ -12,6 +12,10 @@ <h5>{{ page.authors }}. {{ page.conference }} {{ page.year }}</h5>
 &nbsp;<a href='http://scholar.google.com/scholar?q={{ page.title }}' target="_blank"><img style="display: inline; margin: 0;" src="/public/media/google-scholar.png"/></a>
 &nbsp;<a href='https://www.semanticscholar.org/search?q={{ page.title }}' target="_blank"><img style="display: inline; margin: 0;" src="/public/media/semscholar.png"/></a>
 &nbsp;<a href='http://academic.microsoft.com/#/search?iq={{ page.title | uri_escape }}' target="_blank"><img style="display: inline; margin: 0;" src="/public/media/ms-academic.png"/></a>
+<br/>
+{% for tag in page.tags %}
+<tag><a href="/tags.html#{{ tag }}">{{ tag }}</a></tag>
+{% endfor %}
 </p>


_publications/aggarwal2015using.markdown

Lines changed: 1 addition & 0 deletions
@@ -5,5 +5,6 @@ authors: K. Aggarwal, M. Salameh, and A. Hindle
 conference:
 year: 2015
 bibkey: aggarwal2015using
+tags: ["migration"]
 ---
 In this paper, we have tried to use Statistical machine translation in order to convert Python 2 code to Python 3 code. We use data from two projects and achieve a high BLEU score. We also investigate the cross-project training and testing to analyze the errors so as to ascertain differences with previous case. We have described a pilot study on modeling programming languages as natural language to build translation models on the lines of natural languages. This can be further worked on to translate between versions of a programming language or cross-programming-languages code translation.

_publications/allamanis2013mining.markdown

Lines changed: 1 addition & 0 deletions
@@ -9,5 +9,6 @@ additional_links:
 - {name: "PDF", url: "http://homepages.inf.ed.ac.uk/csutton/publications/msr2013.pdf"}
 - {name: "data", url: "http://groups.inf.ed.ac.uk/cup/javaGithub/"}
 - {name: "data@ Edinburgh DataShare", url: "http://datashare.is.ed.ac.uk/handle/10283/2334"}
+tags: ["language model"]
 ---
 The tens of thousands of high-quality open source software projects on the Internet raise the exciting possibility of studying software development by finding patterns across truly large source code repositories. This could enable new tools for developing code, encouraging reuse, and navigating large projects. In this paper, we build the first giga-token probabilistic language model of source code, based on 352 million lines of Java. This is 100 times the scale of the pioneering work by Hindle et al. The giga-token model is significantly better at the code suggestion task than previous models. More broadly, our approach provides a new “lens” for analyzing software projects, enabling new complexity metrics based on statistical analysis of large corpora. We call these metrics data-driven complexity metrics. We propose new metrics that measure the complexity of a code module and the topical centrality of a module to a software project. In particular, it is possible to distinguish reusable utility classes from classes that are part of a program’s core logic based solely on general information theoretic criteria.

_publications/allamanis2014learning.markdown

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ additional_links:
 - {name: "ArXiV", url: "http://arxiv.org/abs/1402.4182"}
 - {name: "website", url: "http://groups.inf.ed.ac.uk/naturalize/"}
 - {name: "code", url: "https://github.com/mast-group/naturalize"}
+tags: ["naming", "language model", "style"]
 ---
 Every programmer has a characteristic style, ranging from preferences
 about identifier naming to preferences about object relationships and

_publications/allamanis2014mining.markdown

Lines changed: 1 addition & 0 deletions
@@ -9,5 +9,6 @@ additional_links:
 - {name: "PDF", url: "http://homepages.inf.ed.ac.uk/csutton/publications/idioms.pdf"}
 - {name: "ArXiV", url: "http://arxiv.org/abs/1404.0417"}
 - {name: "data", url: "http://groups.inf.ed.ac.uk/cup/idioms/haggisClassUsersDataset.zip"}
+tags: ["pattern mining", "grammar", "AST"]
 ---
 We present the first method for automatically mining code idioms from a corpus of previously written, idiomatic software projects. We take the view that a code idiom is a syntactic fragment that recurs across projects and has a single semantic purpose. Idioms may have metavariables, such as the body of a for loop. Modern IDEs commonly provide facilities for manually defining idioms and inserting them on demand, but this does not help programmers to write idiomatic code in languages or using libraries with which they are unfamiliar. We present Haggis, a system for mining code idioms that builds on recent advanced techniques from statistical natural language processing, namely, nonparametric Bayesian probabilistic tree substitution grammars. We apply Haggis to several of the most popular open source projects from GitHub. We present a wide range of evidence that the resulting idioms are semantically meaningful, demonstrating that they do indeed recur across software projects and that they occur more frequently in illustrative code examples collected from a Q&A site. Manual examination of the most common idioms indicate that they describe important program concepts, including object creation, exception handling, and resource management.

_publications/allamanis2015bimodal.markdown

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ bibkey: allamanis2015bimodal
 additional_links:
 - {name: "Supplementary Material", url: "https://miltos.allamanis.com/publicationfiles/allamanis2015bimodal/supplementary.pdf"}
 - {name: "Presentation Video", url: "http://videolectures.net/icml2015_allamanis_natural_language/"}
+tags: ["search", "grammar", "AST", "bimodal"]
 ---
 We consider the problem of building probabilistic models that jointly
 model short natural language utterances and source code snippets. The

_publications/allamanis2015suggesting.markdown

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ bibkey: allamanis2015suggesting
 additional_links:
 - {name: "PDF", url: "http://homepages.inf.ed.ac.uk/csutton/publications/accurate-method-and-class.pdf"}
 - {name: "website", url: "http://groups.inf.ed.ac.uk/cup/naturalize"}
+tags: ["naming"]
 ---
 Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar by assigning them to locations, called embeddings, in a high-dimensional continuous space, in such a way that names with similar embeddings tend to be used in similar contexts. These embeddings seem to contain semantic information about tokens, even though they are learned only from statistical co-occurrences of tokens. Furthermore, we introduce a variant of our model
 that is, to our knowledge, the first that can propose neologisms, names that have not appeared in the training corpus. We obtain state of the art results on the method, class, and even the simpler variable naming tasks. More broadly, the continuous embeddings that are learned by our model have the potential for wide application within software engineering.
