+We cast contextualized code search as a learning problem, where the goal is to learn a distribution function computing the likelihood that each database code completes the program, and propose a neural model for predicting which database code is likely to be most useful. Because it will be prohibitively expensive to apply a neural model to each code in a database of millions or billions of codes at search time, one of our key technical concerns is ensuring a speedy search. We address this by learning a 'reverse encoder' that can be used to reduce the problem of evaluating each database code to computing a convolution of two normal distributions, making it possible to search a large database of codes in a reasonable time.
0 commit comments