Describing Algorithms: Introduction

Symbolic AI for algorithm state-space exploration

Denis Ignatovich
Jun 14, 2019 · 9 min read

Image for post
Image for post

Intuition

The term Region Decomposition (RD) refers to a (geometrically inspired) “slicing” of an algorithm’s state-space into distinct regions where the behavior of the algorithm is invariant, i.e., where it behaves “the same.” Each “slice” or region of the state-space describes a family of inputs and the output of the algorithm upon them, both of which may be symbolic and represent many concrete instances. We’ll dive into examples below which will make everything clear, but for now remember this – each region of the state-space represents a “distinct” behavior of the system, and a decomposition is a collection of regions covering the entire space, including the so-called “edge-cases.”

is inspired by Cylindrical Algebraic Decomposition (CAD) but Imandra “lifts” these techniques to algorithms at large. There are many recent advances in formal verification, mathematics and computers science that are packed into Imandra to make this happen.

Describing algorithms’ behaviors this wayhas many exciting applications in automated software testing, reinforcement learning, autonomous systems and many other domains. In this and the subsequent posts we will go into more detail with examples and applications.

Let’s jump in and start to build the intuition!

Consider the following code fragment:

We have two functions g and f and the latter looks somewhat convoluted. ()

Function f takes x as input, which is an integer and there are quite a few integers, so how many distinct behaviors of f are there? Let’s ask Imandra!

When we run decompose command with the name of the function f, we get the following (we include a still screenshot below, but the full region decomposition can be interactively explored at try.imandra.ai):

Image for post
Image for post
A screenshot of an Imandra region decomposition — visit https://try.imandra.ai/ to explore!

Imandra decomposes the function f into 7 regions. We’ve created a Voronoi widget for Jupyter Notebooks to help us visualize them. The region selected in the image above is R[1] In this decomposition, R[1] represents subset of the state-space where input x satisfies -2 < x <= 20 (e.g. -1 or 15) — in this region, the function will always return the value103. If you click on R[2], the constraint is 20 < x <= 22 and invariant result is x + 120. Notice that the invariant in R[2] is symbolic, so if x is 21 then the function will return 141.

If you combine all of these regions together, you will get the full behavior of f, a fact which Imandra mathematically proves behind the scenes.

Why RD scales

An immediate question might be whether scales to “real-world” algorithms. The answer is “Yes.” In fact, it is already deployed in production at some of the world’s leading financial institutions. It took us a few years of R&D and industrial POCs to get it right.

A naive approach would be to simply follow ‘’ within code and identify those paths as ‘distinct’ behaviors, but this would quickly ‘’ with even remotely complex algorithms. Moreover, when recursive functions are involved (even typical operations over Lists are recursive!), things get really tricky — Imandra is packed with quite a few powerful techniques to manage this.

Here’s a small example to help illustrate how works under the hood. Consider the function myfunc below:

How many regions would you expect myfunc to have? The answer is 1 and if this wasn’t the case, then the approach we present here wouldn’t scale.

Imandra determined that myfunc may be described with a single region without any constraints on the input argument a and whose behavior is described by the invariant result of a > 5 ! For all possible values of a, the output of myfunc a is equal toa > 5, which of course could either be true or false.

Image for post
Image for post

Imandra’s support for Region Decomposition

As described, is one of the core features of Imandra and there are quite a few ways you can use it.

At the lowest level Imandra accepts OCaml (and ReasonML). For the pure subsets of OCaml/ReasonML, Imandra (automatically) converts code into an axiomatic representation, applying numerous magical tricks and generating a state-space decomposition with an accompanying completeness proof. Each region in the decomposition also includes a solver which can be used to automatically generate inputs satisfying the region constraints. This feature is very useful for test-suite generation, for example.

In this section we’ll briefly cover transforming/printing regions into other formats (e.g. English prose) and using side-conditions to focus on various subsets of the algorithm state-space you’re decomposing. As always, please see our documentation at https://docs.imandra.ai for full information.

Image for post
Image for post

Transforms and printers

What can we do with these results? Well, since Imandra reflects the results back into the OCaml/ReasonML runtime, quite a lot:

-Synthesize inputs and test-cases: Given a region, we can ask Imandra to synthesize concrete inputs (as valid OCaml values) satisfying the region constraints. These are inputs which “push” the system into the given region. This provides a powerful basis for test-case and documentation generation, for example.
- Compute with them: From each region, an executable classification predicate can be automatically extracted, allowing you to take actual input values and figure out which regions they correspond to.
- Print/convert them into a format like FIX messages (used in finance)

We’ll look at an example of a printer for the longer example below.

Image for post
Image for post

Assumptions

A key feature of is the ability to superimpose constraints on the inputs and/or result of the function you’re decomposing. This allows you to focus on a specific subset of the state-space of the algorithm.

To to illustrate this, let’s revisit our first example function f. What if we wanted to restrict the input?

Image for post
Image for post

Since the side condition is just a boolean function, we can constrain the state space based on the result of running the function f:

Image for post
Image for post

Example: stock exchange pricing logic

Let’s now look at something a bit more complex and The complete example is in our documentation page (with a live Jupyter Notebook), so please check it out — we will just provide a high-level overview here.

Modeling financial venues with Imandra

Algorithms underpinning financial venues (exchanges and “dark pools”) are notoriously complex and are definitely “safety-critical”, so when they crash or ‘misbehave’, bad things happen. Naturally, they should be designed, tested and audited with formal methods. At Imandra, we’re making this a reality.

For this next example, we’re going to take a description of order book pricing logic from an actual exchange (SIX Swiss). This specific component determines the price of a transaction between a buy and a sell order, if the trade is possible. This logic would be just a fraction of the entire model of the exchange, but this is outside the scope of this exercise (interested readers should take a look at our white papers for complete discussion of how we model exchanges and other venues).

The main function of this code is as follows:

When we ask Imandra to decompose this function, we get this:

Image for post
Image for post

The constraints as they’re written by default are basically OCaml – you’ll notice some of the constraints contain List.hd ob.buys or List.hd ob.sells. Those are references to the first buy and sells orders using the standard OCaml list functions. To make this more readable we can make use of OCaml’s pattern matching and Imandra’s extensibility to create a custom printer (the notebook has full source code):

After we install this new printer and view the regions, we see the constraints printed in a much more ‘human-readable’ format (if you spend more time on the printer, you can do much more!).

Image for post
Image for post

It’s a pretty powerful technique — we’ve gone from a pure executable OCaml algorithm describing the business logic (a simulator) to automated generation of edge case descriptions in English-like prose.

The last step we’ll perform is generating sample points for the regions. A natural use case for these samples are test cases. Imagine you create a model of the system you would like to test and ask Imandra to describe its regions (or ‘edge cases’) and construct inputs for each one of them. This will give you test cases and also a describing what it is that you’re actually testing!

Image for post
Image for post

What’s next

In this post, we have introduced — a powerful technique of describing algorithm state-spaces. The next posts in this series will cover more advanced features and applications. Here’s the current plan:

  • — a powerful Imandra toolkit for cloud-native parallel decomposition and reachability.
  • — application of region decomposition for improving training of reinforcement learning algorithms
  • – will discuss application algorithm state-space decomposition to certifying autonomous vehicles.
  • covers the available options for performing .
  • – demonstrates the pipeline you can use for region transformers and custom printers.

If you have any questions or requests, don’t hesitate to drop us a line (or two) in our Discord server.

Stay tuned!

Imandra

Reasoning as a Service @ www.imandra.ai

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch

Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore

Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade

Get the Medium app