Pangolin: An SFL-based Toolset for Feature Localization

TQRG · Published in ASE Conference · Oct 8, 2019

In software development, it is relatively easy to develop a feature: from a requirement, you design a solution and implement it in your code.

The problems start when the features you carefully crafted have to change. “Oh no! I don’t remember, where did I write that? Do I really have to go back and learn all the code again?”

It is not uncommon for developers to spend 60% to 80% of their time looking for features and trying to comprehend the code¹. It is even worse for managers, who have to allocate their developers’ time (and money) to such unproductive tasks.

To make things better, we developed a novel tool: Pangolin. It is an interactive, easy-to-use toolset that allows you to accurately locate a feature in your source code.

Pangolin performs a Spectrum-based Feature Comprehension analysis², a state-of-the-art approach to feature localisation that is based on Spectrum-based Fault Localization. It applies dynamic fault-diagnosis analysis to identify which modules in the source code make up a particular feature.

What is under the hood?

We built Pangolin on the perspective that locating features is very similar to diagnosing software faults. From this idea, a new approach was developed to locate features — the Spectrum-based Feature Comprehension (SFC)² — that uses concepts and techniques from the fault localisation domain.

This new approach maps the task of locating features onto Spectrum-based Fault Localization (SFL)³, which uses the coverage information from each test case to calculate how likely a software component is to be faulty. SFL assumes that components involved mostly in failing tests are more likely to contain the fault than components involved mostly in passing tests.

This approach exploits run-time information from system executions (transactions) to identify dependencies between components, helping software engineers understand how a program is structured.

SFC measures the correlation between associated transactions (executions that exercise a feature) and the involvement of each component in those transactions. In other words, given a set of executions (for example, test cases) that exercise the targeted feature, it calculates a score for each component’s level of involvement in that set.
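To make the scoring idea concrete, here is a minimal sketch of how an association measure could be computed from coverage data. It assumes the Ochiai coefficient, a similarity measure commonly used in SFL; the article does not say which coefficient Pangolin actually uses, and the function name, transaction names, and component names below are hypothetical.

```python
from math import sqrt

def association_scores(coverage, associated):
    """Score each component by its involvement in associated transactions.

    coverage:   dict mapping a transaction name to the set of components it executed
    associated: set of transaction names that exercise the targeted feature
    Uses the Ochiai coefficient (an assumption, not confirmed by the article).
    """
    total_associated = sum(1 for t in coverage if t in associated)
    components = set().union(*coverage.values()) if coverage else set()
    scores = {}
    for component in components:
        # Transactions that executed this component, split by label.
        in_associated = sum(
            1 for t, comps in coverage.items() if component in comps and t in associated
        )
        in_dissociated = sum(
            1 for t, comps in coverage.items() if component in comps and t not in associated
        )
        denominator = sqrt(total_associated * (in_associated + in_dissociated))
        scores[component] = in_associated / denominator if denominator else 0.0
    return scores

# Hypothetical coverage data: three transactions over three components.
coverage = {
    "t1": {"Parser", "Continuation"},
    "t2": {"Parser", "Continuation", "Interpreter"},
    "t3": {"Parser", "Interpreter"},
}
scores = association_scores(coverage, associated={"t1", "t2"})
```

In this toy data, "Continuation" is executed only by associated transactions and therefore gets the highest score, while "Parser" and "Interpreter" also run in the dissociated transaction and score lower.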

Pangolin: Eclipse plugin providing the SFC approach

Pangolin is an Eclipse plugin that implements the SFC approach and measures the association between system executions (also called transactions) that exercise a targeted feature and all of the involved components. It is based on the GZoltar⁴ toolset which enables fault localization within the Eclipse IDE.

The results are then displayed in a hierarchical, interactive visualisation. Pangolin provides two new views in the Eclipse IDE: a diagnostic report view (left) and a transaction view (right).

Pangolin plugin for the Eclipse IDE

The Diagnostic Report

The view on the left presents the SFC report as a tree-based, navigable visualisation called a sunburst.

It summarises the current project’s structure hierarchically, from the root component in the inner circle, which represents the whole project, out to individual lines of code in the outer circle.

Sunburst Visualization Example

Each element in the visualisation is colour-coded according to its SFC score, which represents the degree of correlation (a.k.a. the association measure) between that component and the feature under consideration.

The colours range from bright green when the association measure is close to 0 (low correlation with the feature), through yellow when it is close to 0.5, to bright red when it is close to 1 (high correlation with the feature).
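As an illustration, the colour scale can be sketched as a function from an association measure to an RGB triple. The three anchor colours come from the description above; the linear interpolation between them and the function name are assumptions, not Pangolin's actual implementation.

```python
def score_to_rgb(score):
    """Map an association measure in [0, 1] to an (R, G, B) colour.

    0.0 -> bright green, 0.5 -> yellow, 1.0 -> bright red.
    Values in between are linearly interpolated (an assumption).
    """
    s = min(max(score, 0.0), 1.0)  # clamp out-of-range scores
    if s <= 0.5:
        # Green to yellow: red channel ramps up, green stays at full.
        return (round(255 * s / 0.5), 255, 0)
    # Yellow to red: red stays at full, green channel ramps down.
    return (255, round(255 * (1.0 - s) / 0.5), 0)

low = score_to_rgb(0.0)
mid = score_to_rgb(0.5)
high = score_to_rgb(1.0)
```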

Association Measure — colour spectrum

Because the visualisation is hierarchical, each parent component exhibits the maximum association measure of its child components. You can hover the mouse over a component to see its name and association measure.
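The "parent takes the maximum of its children" rule can be sketched as a recursive pass over the component tree. The dictionary-based tree layout, the function name, and the component names here are hypothetical and chosen only for illustration.

```python
def aggregate(node):
    """Set each parent's score to the maximum score among its children.

    node: dict with a 'name', an optional 'score' (for leaves such as lines
    of code), and an optional 'children' list. Returns the node's score.
    """
    children = node.get("children", [])
    if children:
        node["score"] = max(aggregate(child) for child in children)
    else:
        # Leaves without a recorded score are treated as not involved.
        node.setdefault("score", 0.0)
    return node["score"]

# Hypothetical hierarchy: project -> package -> class -> line.
project = {"name": "project", "children": [
    {"name": "pkg.a", "children": [
        {"name": "ClassA", "children": [
            {"name": "line 10", "score": 0.2},
            {"name": "line 11", "score": 0.9},
        ]},
    ]},
    {"name": "pkg.b", "children": [
        {"name": "ClassB", "children": [
            {"name": "line 3", "score": 0.4},
        ]},
    ]},
]}
root_score = aggregate(project)
```

With this rule, a single highly associated line is enough to make its class, package, and project stand out, which is what lets you drill down from the inner circle to the outer one.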

From there, clicking a component triggers a root change: the selected component becomes the root of the visualization, and all elements that are not its descendants are hidden. If you double-click a component, Eclipse’s code editor opens with the cursor positioned at the start of that component.

PS: To find the features, you should inspect the components that exhibit high association measures.

The transactions

Transactions are system executions associated with the specific feature we are targeting. Pangolin provides two modes of execution: test-based feature detection, which treats the project’s test cases as the transactions; and participatory feature detection, where users manually record their interactions with the system and label each one as either associated or dissociated (that is, related to the feature or not).

Test-based Feature Detection

This mode of execution (represented on the right) uses the project’s JUnit test cases as the system transactions.

After running all the test cases, Pangolin’s transactions view (see below) is populated with every executed test case. The tests are listed hierarchically: tests from the same enclosing test class are grouped together, and test classes from the same package are likewise clustered.
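The package, class, and test grouping can be sketched as follows. The `package.Class#method` identifier format and the test names are assumptions made for this example; the article does not describe how Pangolin identifies tests internally.

```python
from collections import defaultdict

def group_tests(test_ids):
    """Group test identifiers by package, then by enclosing test class.

    Each identifier is assumed to look like 'package.ClassName#methodName'.
    Returns a nested dict: package -> class -> list of test methods.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for test_id in test_ids:
        qualified_class, method = test_id.split("#", 1)
        package, _, class_name = qualified_class.rpartition(".")
        tree[package][class_name].append(method)
    # Convert to plain dicts so the result is easy to print or compare.
    return {package: dict(classes) for package, classes in tree.items()}

# Hypothetical test identifiers.
grouped = group_tests([
    "org.example.parser.ParserTest#testTokens",
    "org.example.parser.ParserTest#testErrors",
    "org.example.parser.LexerTest#testNumbers",
    "org.example.runtime.ContinuationTest#testResume",
])
```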

Test-based feature detection — transactions view

In this mode, to perform feature localisation, you identify which system transactions exercise the feature under consideration (i.e., which transactions are associated) by ticking the checkboxes of the corresponding tests. Every time a checkbox is selected or deselected, Pangolin’s SFC engine is re-run, and the updated diagnostic report is shown in the sunburst visualization.

By default, failing tests are automatically set as associated transactions. This means that Pangolin can also be used as a fault localization tool, helping you prioritise the inspection of components that are likely to explain the observed failures. You can override this default behavior by deselecting those transactions.
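The selection behaviour described here (failing tests start as associated, and every checkbox change triggers a new analysis) can be sketched like this. The class name, the `rerun` callback, and the test names are hypothetical; this is not Pangolin's actual code.

```python
class TransactionSelection:
    """Tracks which test transactions are marked as associated."""

    def __init__(self, outcomes, rerun):
        # outcomes: dict mapping a test name to True (passed) or False (failed)
        # rerun:    callback that receives the current associated set
        self.outcomes = dict(outcomes)
        # Failing tests are associated by default.
        self.associated = {t for t, passed in self.outcomes.items() if not passed}
        self.rerun = rerun
        self.rerun(frozenset(self.associated))

    def toggle(self, test):
        """Select or deselect a test's checkbox and re-run the analysis."""
        if test not in self.outcomes:
            raise KeyError(test)
        self.associated ^= {test}
        self.rerun(frozenset(self.associated))

# Record every analysis run instead of computing a real report.
runs = []
selection = TransactionSelection(
    {"testResume": False, "testTokens": True, "testErrors": True},
    rerun=runs.append,
)
selection.toggle("testTokens")  # mark a passing test as associated
selection.toggle("testResume")  # override the default for the failing test
```

Passing the analysis in as a callback keeps the selection logic independent of how the scores are computed or displayed.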

Participatory Feature Detection

Locating features using test cases is great, but what if your project doesn’t have a thorough test suite? Or what if you have no clue how the features map to the tests?

Although it is good practice to test the system and to maintain a test-to-feature mapping, in real projects these are often unavailable during software development, which limits the use of tests as transactions for SFC.

To solve this limitation, we added a new mode of execution to Pangolin — the Participatory Feature Detection (PFD)³. It allows you to record manual interactions with the system and label each of those interactions as associated or dissociated with the feature.

In addition, with minimal impact on performance, Pangolin was enhanced to support user participation in an online fashion. It makes you part of the analysis loop: after you label each interaction, it gives you immediate feedback through the sunburst visualisation (as you can see below).

Sunburst visualisation updated after each labelling

To capture each associated and dissociated transaction, we extended Pangolin to display an extra window at runtime. Recording 📹 starts when you press the Start Transaction button; you then stop and label the interaction with either the End Associated Transaction button (for feature-related interactions) or the End Dissociated Transaction button (for all other interactions).
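The recording workflow behind these three buttons can be sketched as a small recorder object. The class, its method names, and the `hit` hook (a stand-in for whatever instrumentation reports executed components) are all hypothetical, as are the component names.

```python
class TransactionRecorder:
    """Collects the components executed between a start and a labelled end."""

    def __init__(self):
        self.transactions = []  # list of (components, is_associated) pairs
        self._current = None    # components seen in the open transaction

    def start_transaction(self):
        self._current = set()

    def hit(self, component):
        # Called whenever a component executes; ignored outside a transaction.
        if self._current is not None:
            self._current.add(component)

    def _end(self, is_associated):
        if self._current is None:
            raise RuntimeError("no transaction in progress")
        self.transactions.append((frozenset(self._current), is_associated))
        self._current = None

    def end_associated(self):
        self._end(True)

    def end_dissociated(self):
        self._end(False)

recorder = TransactionRecorder()
recorder.start_transaction()
recorder.hit("Interpreter")
recorder.hit("Continuation")
recorder.end_associated()    # this interaction exercised the feature
recorder.start_transaction()
recorder.hit("Interpreter")
recorder.end_dissociated()   # this one did not
```

Each labelled transaction has the same shape as a test case with known coverage, so the same scoring can run after every label.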

Pangolin’s PFD control menu

Hands-on Pangolin: A practical example

As an example, we use Pangolin to detect a specific feature of the Rhino project, a JavaScript engine written entirely in Java. The goal is to pinpoint the components that implement the Continuation feature, which is responsible for storing and resuming a JavaScript runtime.

Give Pangolin a try :D→ https://tqrg.github.io/pangolin/ 🔗
