Oblivious State Machines in Rust

Victor Ermolaev
Published in ING Blog · 10 min read · Oct 18, 2022

Intro

A lamp state machine.

Usually, a computer algorithm is represented by a sequence of states and possibly conditional transitions between them. Examples are

  • a data processing algorithm requiring multiple rounds of data cleaning, possibly anonymization, extraction of interesting pieces, their analysis, etc.,
  • processes that naturally happen in discrete stages, e.g., an order is first placed, then paid, and finally shipped, with a possible cancellation along the way,
  • cryptographic protocols, involving at least two independent communicating parties acting based on each other’s messages.

In this article, I would like to focus on the last case. Such protocols play a prominent role in cryptography and, particularly, in distributed/decentralized algorithms. Diffie-Hellman key negotiation, the Raft consensus algorithm, and the Schnorr protocol (one of the most straightforward zero-knowledge proof systems) are immediate examples of such protocols. In my work at ING, I need to have a working knowledge of such protocols and sometimes implement them. To aid that task, I find it prudent to have a framework enabling me to focus on the protocol design without reinventing the wheel every time.

Who does not like to invent a wheel? Credit: https://www.youtube.com/watch?v=a0Yn5bYclv4

Distributed cryptographic protocols have several distinguishing features:

  • They require network communication. This fact immediately pushes us to make assumptions about such communication; I shall assume a partially synchronous model: it may take an unknown but finite time for a message to be delivered from a sender to a receiver. Notably, only delivery is guaranteed; the order of messages is not. Any protocol implementation must be independent of the order in which the communicated messages arrive.
  • They are often asymmetric, i.e., one party performs computations different from the other party’s. This implies there will be two parts to the protocol. As one moves to the protocol implementation, it is desirable to extract a common core, use it for all protocol participants, and make only minor adjustments for specific ones.

The abstraction we need is the Finite State Machine (FSM). An FSM describes a system as being in a state, expecting events to perform a transition; optionally, entry and exit actions can be performed when entering or leaving a state. FSMs exhibit oblivious behaviour: an FSM only knows the current state and attempts to advance it to the next, unknown, state based on the input. Both entry and exit actions depend only on the current state itself. Theoretically, FSMs may be equivalently described either by encompassing the transition decision logic within the states or by listing all transitions between the states. On closer inspection, we may see that the latter option makes more knowledge accessible to the FSM. In applications, the second option materializes in a more specific description of each state machine. For example, consider the following Kotlin-inspired pseudo-code (not valid Kotlin code):

Option 1.

interface State {
    fun advance(): State?
    fun processInput(input: Any?)
}

class FSM {
    fun run(initial: State, input: Channel<*>): State {
        var curr = initial
        var next = curr.advance()
        while (next != null) {
            curr.processInput(input.next())
            curr = next
            next = curr.advance()
        }
        return curr
    }
}

Option 2.

interface State
object A : State
object B : State
object C : State

class FSM {
    val terminalStates = listOf(C)

    fun run(initial: State, input: Channel<*>): State {
        var curr = initial

        while (curr !in terminalStates) {
            curr = when (Pair(curr, input.next())) {
                Pair(A, <toB>) -> B
                Pair(B, <toC>) -> C
                else -> error("Unknown transition")
            }
        }
        return curr
    }
}

These examples show that the second variant is hardly oblivious: it has the complete list of possible states, their transitions, and the knowledge of which states are terminal. These conditions make the state machine hardly reusable across different applications, particularly across participants in an asymmetric protocol. Therefore, I set out to implement an oblivious state machine in Rust following the first paradigm.

What is out there?

Before jumping to the implementation, let’s see what is already at our disposal. I have likely missed something; I would greatly appreciate any extra references.

I found the information on state machines in this tutorial incredibly useful, and I do recommend going through it. The tutorial explains how to implement a state machine in a very Rust-y way by implementing the From trait for all transitions. Unfortunately, that approach fails to implement an oblivious state machine: every state transition must be explicitly type-cast, and the driving routine must select an appropriate state to transition to and invoke the correct transition version.

This code snippet presents a more straightforward path to state machine implementation. This approach comes closer to the ideal case: transitions depend on the current state and the input. Unfortunately, this strategy is hardly generalisable to an arbitrary state machine as it requires explicitly bundling all possible states into an algebraic data type and managing the transitions inside the state machine driving routine.

ING’s repository for digital signatures contains another take on state machine implementation. That lib implements a cryptographic multi-party computation (MPC) protocol and is an excellent instance of an asymmetric protocol. I found its state machine module to be an incredible source of inspiration. I believe my implementation improves over it; bear with me, and at the end, I will communicate the improvement points. Now let’s jump to the theory.

An oblivious state machine

Before showing the code, I’d like to reflect on the guiding principles for my design:

State

A state is defined by its behaviour:

  • Entry action. On entry, the state may produce messages that must be sent to another state maintained by the counter-party in the protocol. I shall call this behaviour initialize.
  • A state may accept messages and return the delivery status, e.g., accepted or unexpected; the latter status is a materialization of the previous assumption on a partially synchronous communication model.
  • A state can attempt to advance to another state. The possible results of such an attempt are same (the state requires more input to transition), next (a new state must be returned), and terminal (as we need a way to identify when the state machine completes).

Rust’s way of implementing this design is by introducing a trait. However, unlike in other high-level languages, where returning an interface is perfectly OK, we’ll have to deal with memory allocation explicitly and thus box the trait object describing a state.

In the following, I will present simplified versions of the code snippets and refer to the actual code on GitHub.

use std::fmt::Debug;

type BoxedState<M> = Box<dyn State<M>>;

pub enum DeliveryStatus<M, E: Debug> {
    Delivered,
    Unexpected(M),
    Error(E),
}

enum Transition<M> {
    Same,
    Next(BoxedState<M>),
    Terminal,
}

trait State<M> {
    fn initialize(&self) -> Vec<M>;
    fn deliver(&mut self, message: M) -> DeliveryStatus<M, String>;
    fn advance(&self) -> Result<Transition<M>, String>;
}
// source

One may notice that the advance method accepts &self and not self, although it “feels right” to consume self when progressing to the next state. There are two reasons for that. The first one is technical: accepting self is impossible because the size of the actual State implementation cannot be statically determined at compile time. On the other hand, this technicality is easy to overcome by altering the method declaration slightly to

fn advance(self: Box<Self>) -> Result<Transition<M>, String>;

The second reason is a design choice: I preferred to keep &self so as not to pollute the Transition variants with BoxedState definitions and to simplify the handling of erroneous advance attempts in the upcoming state machine code.

Another item worth mentioning is state-downcasting. When a state is Terminal, we shall only know that it is a state, not which one it is. The crate downcast-rs brings in such functionality. It produces a bunch of safe convenience methods defined on any implementation of State, such as is<T: State<M>>(..) -> bool, matching the State trait object against a specific type, and downcast<T: State<M>>(..) -> Result<Box<T>, Box<Self>>, attempting to downcast the trait object to a given concrete type.

State Machine

Let’s also highlight the guiding principles for the State Machine:

  • The state machine must take care of unexpected messages for the current state, saving them as future input for future states.
  • The state machine must be oblivious: it holds a single state and tries to advance it.
  • The state machine must terminate whenever a state is terminal.

To address the first bookkeeping item, I have implemented an abstraction called Feed (very imaginative, I know) over an asynchronous message receiver channel and a queue of unexpected messages. When the feed is polled, it first releases messages from the queue until exhaustion and only then awaits on the receiver channel. The unexpected messages are saved for later use and put back in the queue every time the state is advanced. The reader can find the implementation here.

I have also put up a small convenience structure, InnerState, bundling together a state, its initialization status and a channel to send out initialization messages. However, for convenience’s sake, I will still refer to it as just “state”.

With the state and feed abstractions in place, the main loop driving the state machine becomes quite succinct:

let mut is_initialized = false;

loop {
    if !is_initialized {
        state.initialize();
        // send out initialization messages
        is_initialized = true;
    }
    let advanced = state.advance()?;

    match advanced {
        Transition::Same => {
            let message = feed.next().await;
            match state.deliver(message) {
                DeliveryStatus::Delivered => {}
                DeliveryStatus::Unexpected(message) => feed.delay(message),
                DeliveryStatus::Error(_err) => todo!("Message processing error"),
            }
        }
        Transition::Next(next) => {
            *state = next;
            is_initialized = false;
            feed.refresh();
        }
        Transition::Terminal => {
            break;
        }
    }
}
// source

This loop concludes my case, but I have left a few details out of scope: message delivery to the state machine and the time budget for the machine’s main loop. To address these items, I have introduced TimeBoundStateMachineRunner, the primary interface for state machine interaction. It accepts an initial state and the overall execution time budget and exposes a synchronous method to communicate incoming messages to the state machine. To kick off the state machine runner, thus starting a state machine, one needs to provide a channel over which initialization messages and the execution result (a terminal state or an error) will be reported. These two options are wrapped in an enum Either, and the corresponding channel can be re-used across different state machines of the same kind.

Example

All right, how do I use it? The repository contains an extensive example; I shall only highlight the prominent snippets to avoid bloating the article.

Model

We shall model a lifecycle of an item on a showcase: the item can be on display, one can place an order to purchase it, provide purchase details to verify the order, and finally, the item gets shipped. It is possible to cancel the order before it is shipped. This state machine is schematically depicted below.

An online-order state machine.

We first define the messages the state machine can receive:

enum Message {
    PlaceOrder,
    Verify(Verify),
    FinalConfirmation,
    Cancel,
}

enum Verify {
    Address,
    PaymentDetails,
}

Next, we describe the states; for exemplary purposes, I select the Placed state because it requires some input to advance.

struct Placed {
    is_address_present: bool,
    is_payment_ok: bool,
    is_canceled: bool,
}

impl Placed {
    fn new() -> Self {
        Self {
            is_address_present: false,
            is_payment_ok: false,
            is_canceled: false,
        }
    }
}

impl State<Types> for Placed {
    fn deliver(
        &mut self,
        message: Message,
    ) -> DeliveryStatus<Message, <Types as StateTypes>::Err> {
        match message {
            Message::Cancel => self.is_canceled = true,
            Message::Verify(Verify::Address) => self.is_address_present = true,
            Message::Verify(Verify::PaymentDetails) => self.is_payment_ok = true,
            _ => return DeliveryStatus::Unexpected(message),
        }
        DeliveryStatus::Delivered
    }

    fn advance(&self) -> Result<Transition<Types>, <Types as StateTypes>::Err> {
        if self.is_canceled {
            return Ok(Transition::Next(Box::new(Canceled)));
        }
        Ok(if self.is_payment_ok && self.is_address_present {
            Transition::Next(Box::new(Verified::new()))
        } else {
            Transition::Same
        })
    }
}

Now it is clear that deliver updates the respective fields, while advance dynamically selects the next state based on its internal values. All other states are described analogously and can be looked up in the test code. To wire everything together, we need to create a time-bound runner, kick it off and feed it some messages to make the state machine progress through the states. Observe:


// Initial state.
let on_display = OnDisplay::new();

// Fake a feed of un-ordered messages.
let mut feed = VecDeque::from(vec![
    Message::Verify(Verify::Address),
    Message::PlaceOrder,
    Message::FinalConfirmation,
    Message::Verify(Verify::PaymentDetails),
]);
let mut feeding_interval = time::interval(Duration::from_millis(100));
feeding_interval.tick().await;

let mut state_machine_runner = TimeBoundStateMachineRunner::new(
    "Order".to_string(),
    Box::new(on_display),
    Duration::from_secs(5), // time budget
);

let (state_machine_tx, mut state_machine_rx) = mpsc::unbounded_channel();
state_machine_runner.run(state_machine_tx);

let res: TimeBoundStateMachineResult<Types> = loop {
    select! {
        Some(Either::Result { result, .. }) = state_machine_rx.recv() => {
            break result;
        }
        _ = feeding_interval.tick() => {
            // Feed a message if present.
            if let Some(msg) = feed.pop_front() {
                let _ = state_machine_runner.deliver(msg);
            }
        }
    }
};

let order = res.unwrap_or_else(|_| panic!("State machine did not complete in time"));
assert!(order.is::<Shipped>());

The code is self-explanatory: we start from the item being on_display, and given the sequence of messages, we expect the time-bound state machine runner to complete within 5 seconds with a terminal state Shipped.

This example completes the basic scenario of how the presented functionality must be used.

Conclusion

Comparing my implementation with the one found in the threshold-signatures lib, I can confidently distinguish two points:

  • Internal (not crucial for the end-user): the introduced feed abstraction is much easier to reason about should you need to debug your application.
  • External: states may require no input to transition; this is where the threshold-signatures state machine gets into trouble, while my design handles it without a problem.

I now have an oblivious state machine which I already use in another application using an asymmetric cryptographic protocol. The overall design renders it relatively simple to implement arbitrary state machines: one only needs to define states and their inner logic, and the rest is automatic.

Do have a look, and have a go. Don’t be too harsh; it is still a work in progress :)

Full source can be found here.
