Diagnosing Faults by Static Code Analysis and DNNs

Muhammed Esad Unal
Sep 3, 2018 · 5 min read

Abstract: This project focuses on bug prediction by static code analysis. We use Pylint, a static code analysis tool for Python, together with a multilayer perceptron (MLP) neural network classifier to predict whether a script or library written in Python contains a bug. We evaluated this approach on the static code analysis output of Python scripts. The results illustrate that, with this method, an MLP can successfully predict bugs within scripts or libraries given a large enough training set.

I. INTRODUCTION

In this report, we focus on the idea of predicting bugs in code using only the warnings and messages produced by static code analysis tools. To test the idea we chose the Python programming language and the Pylint static code analysis tool for two reasons: open source buggy code samples are easy to find, and Pylint warnings and messages are easy to analyze and use as a feature set. Our main hypothesis is that, given a large enough training dataset, neural network classifiers can predict buggy code pieces or libraries using only the warnings and messages of static code analysis tools as input. After dividing the dataset extracted from buggy code obtained from GitHub into training and test sets, we observed that neural network classifiers are able to predict whether there will be bugs even before the code is run or compiled. Our experiments show that the MLP classifier predicts with around 97% accuracy.

In the next section, we provide background information on the working principle of MLPs and on static code analysis with Pylint. In Section III, we explain our project, related challenges and our evaluation methods. In Section IV, we summarize and discuss the results. In Section V, we summarize related studies published in the literature. Finally, in Section VI, we conclude the report by listing the lessons learned.

II. BACKGROUND

A. Pylint:

Pylint is a Python package that serves as a static code analysis tool for the Python programming language. It checks for major errors, for code blocks that could be refactored to conform to coding standards, and for code complexity. It reports its findings as messages, warnings and errors, and it can also give an overall mark based on the number and severity of the warnings and errors. Some of these messages, warnings and errors are listed below:

C0102: Black listed name “%s”

C0103: Invalid %s name “%s”

C0111: Missing %s docstring

C0112: Empty %s docstring

C0121: Missing required attribute “%s”

C0202: Class method %s should have cls as first argument

C0301: Line too long (%s/%s)

C0302: Too many lines in module (%s)

C0303: Trailing whitespace

C0304: Final newline missing

For the whole list, please visit the Pylint messages website, from which the information above was taken (Pylint, 2017).

B. Neural Networks:

Abdi describes a neural network (NN) as a composition of basic units that are analogous to neurons. These interconnected units have weights that are modified by the learning process or algorithm used. Each unit sends and receives information through its synapses to evaluate its state of activation. After this autonomous learning process, whenever an input set is provided, the NN tries to predict the value of the output node using these dynamics (Abdi, 1994).

- MLP Classifier: The multilayer perceptron classifier (MLPC) is an implementation of a feedforward artificial neural network. It consists of multiple fully connected layers of nodes. The input layer nodes represent the input data. The nodes in the remaining layers map their inputs toward the output layer by applying an activation function to a weighted sum of those inputs plus a bias.
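
The layer-by-layer mapping described above can be sketched in a few lines of plain Python. This is only an illustration: the tiny 3-2-1 network, its weights, and the choice of ReLU and sigmoid activations are assumptions made for the example, not values taken from this project.

```python
import math


def relu(x):
    """Rectified linear unit, a common hidden-layer activation."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, squashes a value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for every node."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the inputs through each (weights, biases, activation) layer in order."""
    values = inputs
    for weights, biases, activation in layers:
        values = dense(values, weights, biases, activation)
    return values


# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.5, 0.0], [0.0, 1.0, 1.0]], [0.0, -1.0], relu),
    ([[1.0, -1.0]], [0.0], sigmoid),
]

output = forward([1.0, 0.0, 1.0], layers)
print(output)
```

Training adjusts the weights and biases so that the output node moves toward the known label; the sketch covers only the forward (prediction) direction.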

III. THE PROJECT WORK

We obtained buggy source code from GitHub by checking the issues opened against that code. Next, using the Pylint static code analysis tool, we extracted the message, warning and error codes for each source file we obtained. The command and parameters used for this process are given below:

sudo find . -iname "*.py" | xargs pylint --files-output y -r n --msg-template='{msg_id}'

Having the code representations of the messages, warnings and errors allows us to build proper training and test datasets. We converted these code representations into a vector, which gave us a standard input format for our NN.

Applying this standard vectorization to the codes generated for each source file keeps computation and memory usage low. With this method we obtain a binary vector in which each position indicates whether the source file produced the corresponding one of the 189 codes that Pylint can generate.
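
A minimal sketch of this vectorization is shown below. It assumes the Pylint output contains one message id per line, as the message template above would produce. The ten-entry KNOWN_CODES list is a shortened, hypothetical stand-in for the full set of 189 codes, and the function name vectorize is invented for the example.

```python
# Assumption: a short illustrative list; the study used all 189 Pylint codes.
KNOWN_CODES = [
    "C0102", "C0103", "C0111", "C0112", "C0121",
    "C0202", "C0301", "C0302", "C0303", "C0304",
]


def vectorize(pylint_output, known_codes=KNOWN_CODES):
    """Turn Pylint output (one message id per line) into a binary vector.

    Position i is 1 if known_codes[i] appears at least once, else 0.
    Lines that are not known codes (for example module headers) are ignored.
    """
    seen = {line.strip() for line in pylint_output.splitlines() if line.strip()}
    return [1 if code in seen else 0 for code in known_codes]


sample_output = "C0103\nC0301\nC0103\nC0304\n"
vector = vectorize(sample_output)
print(vector)  # [0, 1, 0, 0, 0, 0, 1, 0, 0, 1]
```

Because the vector records only presence or absence, a code that fires many times in one file contributes the same as a code that fires once, which keeps every input the same fixed length.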

After obtaining the vector representations, we shuffled all of the vectors to avoid ordering bias in our experiments and then divided the dataset into training and test sets. Next, our script creates tuples consisting of the vector representation and a BugOrNo (1/0) value for each Python source file.
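
The shuffle-and-split step could look roughly like the sketch below. The 80/20 split ratio, the fixed seed, and the function name shuffle_and_split are assumptions for illustration; the text does not state the ratio that was actually used.

```python
import random


def shuffle_and_split(samples, test_fraction=0.2, seed=42):
    """Shuffle (vector, label) tuples and split them into train and test lists.

    test_fraction and seed are illustrative assumptions, not values
    reported for this project.
    """
    shuffled = list(samples)  # copy, so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical dataset: each tuple is (binary vector, BugOrNo label).
samples = [([i % 2, 0, 1], i % 2) for i in range(10)]
train, test = shuffle_and_split(samples)
print(len(train), len(test))
```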

After preprocessing, we created our MLP classifier using the sklearn library with the following parameters: hidden_layer_sizes = (50, 50), max_iter = 100, solver = 'sgd', learning_rate_init = 0.1, learning_rate = 'adaptive'. We arrived at these values by tuning the parameters for the best results, and we used the resulting MLPClassifier to train and test our idea.
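
As a rough sketch, the configuration above maps onto scikit-learn's MLPClassifier as follows. The toy data generator, its labelling rule, and the fixed random_state are hypothetical and exist only so that the example runs on its own; they are not the project's dataset or settings.

```python
import random

from sklearn.neural_network import MLPClassifier

N_MESSAGE_CODES = 189  # vector length reported in the text


def build_classifier(seed=0):
    """Return an MLPClassifier with the hyperparameters listed in the text."""
    return MLPClassifier(
        hidden_layer_sizes=(50, 50),
        max_iter=100,
        solver="sgd",
        learning_rate_init=0.1,
        learning_rate="adaptive",
        random_state=seed,  # assumption: fixed seed, only for repeatability
    )


def make_toy_data(n_samples, seed=0):
    """Hypothetical stand-in data: sparse binary vectors with an invented label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        row = [1 if rng.random() < 0.1 else 0 for _ in range(N_MESSAGE_CODES)]
        X.append(row)
        y.append(1 if row[0] or row[1] else 0)  # made-up rule, not from the study
    return X, y


X_train, y_train = make_toy_data(200, seed=1)
X_test, y_test = make_toy_data(50, seed=2)

clf = build_classifier()
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
print("toy accuracy:", clf.score(X_test, y_test))
```

With max_iter = 100, scikit-learn may emit a ConvergenceWarning if the optimizer has not converged by the last iteration; the model is still usable for prediction in that case.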

IV. RESULTS & DISCUSSION
To measure the effectiveness of the NN for this application, we also tried several other linear and nonlinear classification techniques. The results obtained from these experiments are given below:

The results show that the NN MLP classifier is applicable to this method. Moreover, the neural network provided the best results, as we claimed. In further studies we plan to provide a benchmark obtained from the NN for messages, warnings and errors. We believe it is important for developers to have this knowledge in order to make their software development cycle easier.

V. RELATED WORK
A similar approach is taken in the conference paper by Heckman et al. (Heckman, 2008), which focuses on benchmarking alerts produced by static code analysis tools. It differs from our approach because we are not trying to prioritize alerts, messages, warnings or errors. Our approach focuses on using them to diagnose whether source code will cause bugs.

VI. CONCLUSION
We addressed the problem of predicting whether there will be a bug using only the warning, message and error outputs of static code analysis tools as input to neural networks. We evaluated the effectiveness of our approach on the Python programming language and one of its static code analysis tools, Pylint. The results show that the approach is effective for predicting possible bugs arising from code pieces or libraries, with around 97% accuracy. On the other hand, we found that obtaining a large enough set of buggy code samples for the dataset was a limitation. Further studies might be necessary to investigate the effect of a larger dataset on the idea and to experiment with other types of neural networks.

REFERENCES
Abdi, H. (1994). A neural network primer. Journal of Biological Systems, 247–281.
Heckman, S., et al. (2008). On establishing a benchmark for evaluating static analysis alert prioritization and classification techniques. Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement (pp. 41–50). ACM.
Pylint. (2017, 05 16). Pylint Wikidot WebPage. Retrieved from http://pylint-messages.wikidot.com/all-codes
