Interactive Program Synthesis by Augmented Examples

Tianyi Zhang
Published in ACM UIST · Oct 22, 2020

This blog summarizes the UIST 2020 paper “Interactive Program Synthesis by Augmented Examples” by Tianyi Zhang, London Lowmanstone, Xinyu Wang, and Elena Glassman. [Paper] [Code] [Video Preview]

Programming-by-example (PBE) has become an increasingly popular component in software development tools, human-robot interaction, and end-user programming. The user gives the system one or more examples of an input and an output for a program they have in mind, and the system returns one or more candidates that would return the same outputs for those inputs. If the user does not like any of the candidates, the user can add more examples that hopefully help the computer guess the desired program — or a sufficiently equivalent program — that was in the user’s head.

The Programming-by-Example Workflow
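To make this loop concrete, here is a minimal Python sketch in the regex domain (not taken from the paper): the candidate pool, example strings, and helper name `consistent` are made up for illustration, and a real synthesizer would enumerate candidates from a grammar rather than list them by hand.

```python
import re

def consistent(pattern, examples):
    """True if the regex matches every positive example and no negative one."""
    return all(bool(re.fullmatch(pattern, s)) == should_match
               for s, should_match in examples)

# User-provided (input, should-match?) examples for "a number followed by letters".
examples = [("42abc", True), ("7x", True), ("abc", False)]

# A toy candidate space; a real synthesizer enumerates this from a DSL.
candidates = [r"\d+", r"\d+[a-z]+", r"[a-z]+\d+"]

# Keep only candidates consistent with the examples; the user inspects the survivors
# and either accepts one or adds more examples to rule the rest out.
print([p for p in candidates if consistent(p, examples)])  # ['\\d+[a-z]+']
```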

It’s not an easy task for the human to pick examples that help their non-human collaborator, the computer, guess the right program! There are several reasons for this:

(1) The computer is not another human whose thought process we can imagine, with a shared context we can assume.

(2) It’s cognitively effortful for the human to analyze the input space, figure out how it maps to the outputs they want, and identify which small set of examples most fully specifies what they want.

(3) A handful of examples, which is all the human really wants to come up with, can be generalized into different programs in many different ways!

(4) Regardless of whether the user is an experienced programmer or an inexperienced end-user, programs can be difficult to read and are definitely effortful to mentally apply to the current and hypothetical future inputs.

In this work, we demonstrate how a new interface between the human and the synthesizer can address the latter three challenges. Our key insight is that it is more natural for the user to communicate their intent by annotating the examples they themselves came up with, rather than by annotating the programs synthesized by the PBE system. We call this new interaction model synthesis with augmented examples. Our model provides two types of example augmentation:

1) Semantic augmentation, where a user can specify, via lightweight annotations, which parts of an input example should be treated verbatim and which parts should be generalized to a specific character class.

2) Data augmentation, where the synthesizer generates additional examples, including corner cases, to help users understand how a synthesized program would behave on possible future inputs and decide whether it does what they want. A small sketch of both augmentations follows this list.
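The annotation format, field names, and example strings in this sketch are invented for explanation; they are not the tool’s actual interface.

```python
# Semantic augmentation: the user marks spans of an example they wrote themselves.
annotated_example = {
    "input": "UIST-2020",
    "annotations": [
        {"span": "UIST", "treat_as": "literal"},    # keep this text verbatim
        {"span": "2020", "generalize_to": "<num>"}, # any run of digits is acceptable
    ],
}

# Data augmentation: the synthesizer proposes extra inputs, including corner cases,
# and shows how the currently selected regex would classify them.
generated_inputs = {
    "accepted": ["UIST-7", "UIST-123456"],  # does the user really want these matched?
    "rejected": ["UIST-", "CHI-2020"],      # and these excluded?
}
```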

We implemented and demonstrated this interaction model in the domain of regular expressions, since they are a popular mechanism for text processing and data wrangling and are often considered hard to master even by experienced programmers. We developed a regex synthesizer that performs a top-down enumerative search over the space of possible regexes defined by the following DSL. For instance, to synthesize concat(<num>,<let>), the synthesizer first generates concat(?,?) with two placeholders and then concretizes the placeholders with <num> and <let> respectively.

The Domain-Specific Language for Regular Expressions
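As a minimal sketch of this top-down enumerative search, the snippet below enumerates over a tiny slice of the DSL (just <num>, <let>, and concat) and checks complete programs against the user’s examples. The real synthesizer covers the full grammar and uses the pruning and prioritization described in the paper; the helper names here are invented.

```python
import re

TERMINALS = {"<num>": r"[0-9]+", "<let>": r"[a-zA-Z]+"}  # a tiny slice of the DSL

def expand(sketch):
    """Replace the first '?' placeholder with each production of the DSL."""
    i = sketch.index("?")
    for production in list(TERMINALS) + ["concat(?,?)"]:
        yield sketch[:i] + production + sketch[i + 1:]

def to_regex(program):
    """Translate a complete DSL program into a Python regex string."""
    r = program.replace("concat(", "(?:").replace(",", "")  # concat(a,b) -> (?:ab)
    for name, char_class in TERMINALS.items():
        r = r.replace(name, char_class)
    return r

def synthesize(examples, max_steps=10_000):
    """Top-down breadth-first search: grow partial programs from '?' and return the
    first complete program consistent with every (string, should_match) example."""
    frontier = ["?"]
    for _ in range(max_steps):
        if not frontier:
            return None
        program = frontier.pop(0)
        if "?" in program:                      # still has placeholders: expand it
            frontier.extend(expand(program))
            continue
        regex = to_regex(program)               # complete: check against the examples
        if all(bool(re.fullmatch(regex, s)) == label for s, label in examples):
            return program
    return None

print(synthesize([("42abc", True), ("abc", False)]))  # concat(<num>,<let>)
```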

Semantic annotations on input examples are first parsed into literal values and general character classes and then added to an include or exclude set. Elements in the include set are given higher priority during synthesis, while elements in the exclude set are never considered.
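As a rough, hypothetical illustration of that bookkeeping (the production names and priority scheme below are invented for explanation, not the tool’s actual data structures), the annotations could reorder and prune the productions the enumerator tries:

```python
ALL_PRODUCTIONS = ["<num>", "<let>", "<any>", "const('UIST')", "concat(?,?)"]

def ordered_productions(include, exclude):
    """Drop excluded productions and move included ones to the front of the
    enumeration order, so annotated elements are tried first."""
    kept = [p for p in ALL_PRODUCTIONS if p not in exclude]
    return sorted(kept, key=lambda p: p not in include)  # included sort first (stable)

# e.g. the user marked "UIST" as a literal and "2020" as digits, and excluded letters:
include = {"const('UIST')", "<num>"}
exclude = {"<let>"}
print(ordered_productions(include, exclude))
# ["<num>", "const('UIST')", "<any>", "concat(?,?)"]
```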

To automatically generate additional inputs, our tool first converts the user-selected regex to a deterministic finite automaton (DFA), which recognizes exactly the set of strings matched by the regex. In the DFA, each node represents a state in the string-matching process, and each edge is labeled with the characters accepted at that state. By traversing this DFA, we can generate a variety of strings accepted or rejected by the regular expression. You can find the detailed algorithms in our full paper.
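Here is a simplified, self-contained sketch of that traversal. The DFA below is written out by hand for the pattern “digits followed by letters” rather than derived from a regex automatically, and it only collects accepted strings; the paper’s algorithm also derives rejected and corner-case inputs.

```python
from collections import deque

# Hand-written DFA for "one or more digits followed by one or more letters":
# state 0 = start, state 1 = seen digits, state 2 = digits then letters (accepting)
DFA = {
    "start": 0,
    "accepting": {2},
    "delta": {(0, "digit"): 1, (1, "digit"): 1, (1, "letter"): 2, (2, "letter"): 2},
}
SAMPLE = {"digit": "7", "letter": "x"}  # one representative character per edge label

def generate_accepted(dfa, max_len=4):
    """Breadth-first traversal of the DFA, collecting short accepted strings."""
    accepted = []
    queue = deque([(dfa["start"], "")])
    while queue:
        state, s = queue.popleft()
        if state in dfa["accepting"]:
            accepted.append(s)
        if len(s) == max_len:
            continue
        for (src, label), dst in dfa["delta"].items():
            if src == state:
                queue.append((dst, s + SAMPLE[label]))
    return accepted

print(generate_accepted(DFA))  # ['7x', '77x', '7xx', '777x', '77xx', '7xxx']
```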

We conducted a within-subjects user study with 12 CS students to evaluate the effectiveness and usability of this interaction model. In prior work, annotating synthesized programs (Peleg et al., ICSE 2018) was offered as a mechanism for helping users clarify what they want from the synthesizer, and in virtually all programming-by-example systems the user can always add more examples to clarify their intent. In the control condition of our user study, participants could use both of these existing modalities to clarify what they wanted. In the experimental condition, the interface was extended to also support semantic augmentation and data augmentation.

Compared with only inspecting and annotating synthesized programs, interacting through augmented examples significantly increased the success rate of finishing a programming task while taking less time. In the experimental condition, all 12 participants finished the tasks, with an average of 3.3 synthesis iterations; in the control condition, only 4 participants finished the tasks, and they needed significantly more synthesis iterations (7.7 on average).

The high success rate in the experimental condition is largely attributable to the availability of semantic and data augmentations. Participants in the experimental condition made an average of 3 annotations on their input examples to guide the synthesizer on how to generalize them, providing annotations in roughly one out of every two iterations. In roughly two out of three iterations, participants asked the interface to generate additional examples and corner cases to help them understand the behavior of the synthesized programs and recognize undesired behavior. Seven of the 12 participants directly reused these automatically generated inputs (3.4 inputs on average) as counterexamples in the next iteration.

Furthermore, when given access to semantic and data augmentation in the interface, participants felt twice as confident in the synthesis result and reported less mental demand and frustration (measured by NASA TLX) during the task. We also conducted a case study on 20 regex tasks asked on Stack Overflow and demonstrated that this interaction model can solve a variety of realistic tasks effectively. Please read our full paper for details.


This article was written together by Elena, Xinyu, and Tianyi.
