Published in CodeX

A Replication of “DeepBugs: A Learning Approach to Name-based Bug Detection”

Photo by Alina Grubnyak on Unsplash

Original paper

DeepBugs converts source code into Abstract Syntax Trees (ASTs), then into semantically encoded vectors via Word2Vec. A neural network determines whether the meanings of the names match their usage contexts. Here, a developer has called the function setDims(width, height) as setDims(y, x). DeepBugs learns that x and width are semantically similar, as are y and height, so it predicts that the arguments are swapped.
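To make the setDims intuition concrete, here is a minimal Python sketch. It is not how DeepBugs itself decides: DeepBugs feeds learned vectors to a trained classifier, whereas this sketch compares cosine similarities directly. The embedding values, `TOY_EMBEDDINGS`, `order_score`, and `looks_swapped` are all hypothetical and hand-picked for illustration.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# A trained Word2Vec model would learn its own, higher-dimensional vectors.
TOY_EMBEDDINGS = {
    "width":  [0.9, 0.1, 0.0],
    "x":      [0.8, 0.2, 0.1],
    "height": [0.1, 0.9, 0.0],
    "y":      [0.2, 0.8, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def order_score(params, args, emb=TOY_EMBEDDINGS):
    """Sum the similarity of each parameter name to the argument in its position."""
    return sum(cosine(emb[p], emb[a]) for p, a in zip(params, args))


def looks_swapped(params, args, emb=TOY_EMBEDDINGS):
    """True if reversing the two arguments matches the parameter names better."""
    return order_score(params, args[::-1], emb) > order_score(params, args, emb)


# setDims(width, height) called as setDims(y, x)
print(looks_swapped(["width", "height"], ["y", "x"]))
# setDims(width, height) called as setDims(x, y)
print(looks_swapped(["width", "height"], ["x", "y"]))
```

With these toy vectors, the first call is flagged and the second is not, because x sits closer to width and y closer to height.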
  • Identify function invocations using the program’s AST.
  • Extract the names of the variables used for each function invocation.
  • Numerically model the concept of “similar variable names” by training a Word2Vec model [5], contextualized with the name of the function being invoked (e.g. the meaning of “someHeight” and “someRadius” in the context of the invocation of the function “calcCylinderVolume”). The Word2Vec model learns the variable names that are usually passed as the first parameter, the second parameter, and so on.
  • Use this Word2Vec model to identify usages where some variable name is “unusual” as defined by the vector calculated by the trained Word2Vec model.
After calculating the Word2Vec vectors, the DeepBugs algorithm uses a small neural network as a classifier for name-based bugs.
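The first two steps above (finding function invocations in the AST and collecting the argument names) can be sketched as follows. This is an illustration under stated assumptions, not the replication's code: DeepBugs analyzes JavaScript, while this sketch uses Python's standard `ast` module as a stand-in so that it runs without extra dependencies. The helper names `extract_call_sites` and `swapped_variant` are hypothetical. The swapped variant reflects the swapped-argument bug pattern discussed in this post, which a classifier could be shown as a "buggy" counterpart to the original call.

```python
import ast


def extract_call_sites(source):
    """Return (callee_name, [argument names]) for each call whose
    positional arguments are all plain identifiers."""
    sites = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Callee is either a bare name, f(...), or a method name, obj.f(...).
        if isinstance(node.func, ast.Name):
            callee = node.func.id
        elif isinstance(node.func, ast.Attribute):
            callee = node.func.attr
        else:
            continue
        if node.args and all(isinstance(a, ast.Name) for a in node.args):
            sites.append((callee, [a.id for a in node.args]))
    return sites


def swapped_variant(site):
    """Swap the first two arguments of a call site (requires at least two)."""
    callee, args = site
    if len(args) < 2:
        raise ValueError("need at least two arguments to swap")
    return (callee, [args[1], args[0]] + args[2:])


SAMPLE = """
setDims(width, height)
volume = calcCylinderVolume(someRadius, someHeight)
"""

for site in extract_call_sites(SAMPLE):
    print("original:", site, "swapped:", swapped_variant(site))
```

Each (callee, argument names) pair is the kind of input that would then be mapped to Word2Vec vectors and passed to the classifier.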

Replication

Shared dependencies

Same dataset, some shared components, no shared code.

Clerical error

However, stable results!

On the swapped-argument case from the 150k JavaScript Dataset, our DeepBugs replication achieved performance similar to that reported by the original authors.

Partial replication

Reflection

More information

  1. The artifact is available here, including the source code and paper.

References

James Davis

I am a professor in ECE@Purdue. My research assistants and I blog here about research findings and engineering tips.