Commit 2032912

Merge pull request learnbigcode#3 from meitalbensinai/master
Code similarity via natural language descriptions
2 parents c829241 + cf70b9a commit 2032912

2 files changed: 25 additions, 0 deletions

challenges/index.md

Lines changed: 15 additions & 0 deletions
@@ -31,3 +31,18 @@ More information: [coming]
 Dataset used: <a href="/datasets#estimatingTypesDataset">[Estimating Types in Stripped Binaries Dataset]</a><br>
 </p>
 </div>
+
+<div class="highlightitem">
+<h2>Establishing similarity of code fragments</h2>
+
+<p>Code similarity is a central challenge in many programming related applications, such as code search, automatic translation, and programming education.<p>
+
+<p>There are many approaches for establishing code similarity and clone detection.
+However, most of these cannot capture similarity across programs using different APIs or algorithms, let alone programming languages.
+Furthermore, in some cases, equivalence is not what we are looking for.</p>
+
+<p>The goal is to capture connections between code fragments, such as semantic similarity or relatedness, which are more relaxed notions than strict equivalence.<p>
+
+<p>Dataset used: <a href="/datasets#like2dropsData">[Like2DropsData]</a></p>
+<p>Crowd-sourcing system used to collect data: <a href="http://like2drops.com">[Like2Drops]</a><br></p>
+</div>

datasets/index.md

Lines changed: 10 additions & 0 deletions
@@ -51,3 +51,13 @@ The dataset is provided in the form of a VM containing a Mongo database holding
 
 <a href="http://1drv.ms/1J5h6Rl">[download dataset]</a></p>
 </div>
+
+<div class="highlightitem">
+<h1 id="like2dropsData">Similarity of code fragments Dataset</h1>
+
+<p>This dataset includes 3 different collections that provide pairs of code fragments with our tool's similarity score, the users' similarity score and the code's meta-data</p>
+<p>The dataset is provided as a MongoExport database holding the data (see README for further details).
+<br/>
+
+<a href="http://check.useast.appfog.ctl.io/download">[download dataset]</a></p>
+</div>
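
The dataset description added above says the data ships as MongoExport output with a tool similarity score and a user similarity score per pair of fragments. The sketch below shows one way such a file could be read without a running MongoDB instance. It assumes the export uses mongoexport's default layout of one JSON document per line. The field names `tool_score` and `user_score` and the sample records are hypothetical placeholders, not the dataset's real schema, which is defined in its README.

```python
import json
from typing import Iterable, Iterator, Optional


def read_mongoexport(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one document per non-empty line (mongoexport's default layout)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def mean_abs_gap(docs: Iterable[dict],
                 tool_key: str = "tool_score",    # hypothetical field name
                 user_key: str = "user_score"     # hypothetical field name
                 ) -> Optional[float]:
    """Mean absolute difference between tool and user scores, or None if no data."""
    gaps = [abs(d[tool_key] - d[user_key])
            for d in docs
            if tool_key in d and user_key in d]
    return sum(gaps) / len(gaps) if gaps else None


# Synthetic records for illustration only; they are not taken from the dataset.
sample = [
    '{"_id": {"$oid": "000000000000000000000001"}, "tool_score": 0.75, "user_score": 1.0}',
    '',
    '{"_id": {"$oid": "000000000000000000000002"}, "tool_score": 0.25, "user_score": 0.5}',
]
docs = list(read_mongoexport(sample))
print(len(docs), mean_abs_gap(docs))
```

With a real export, an open file handle could be passed in place of `sample`, and the two key arguments replaced with whatever field names the README documents.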
