Lightweight Formal Methods for BGP: panic-free deserialization

FUJITA Tomonori
nttlabs
Published in
4 min readNov 30, 2023

Modern BGP implementations are complicated. Their code is very hard to get right.

When we started to write GoBGP, an OSS BGP implementation in Go, decided to run some tests in the CI process; use wire data to test the serialization and deserialization code, run multiple BGP daemons as a container to test the interactions, etc. The similar tests are often used in the development of other OSS BGP implementations too. However, from my experience in GoBGP development, I can say that more than those old-fashioned tests are necessary. So I tried lightweight formal methods for RustyBGP, an OSS BGP implementation in Rust, with reference to Amazon S3 paper.

Hopefully I could write a series of articles about using lightweight formal methods to validate a BGP implementation. This is the first article about an experiment to safely parse wire data (BGP packets).

Note that I’ve just started to use lightweight formal methods. I’m not familiar with them at all yet. Any feedback would be greatly appreciated (that’s one of reasons why I wrote this article).

Serialize and deserialize for BGP packets

A BGP implementation has lots of serializers and deserializers that marshal data between in-memory and on-wire formats. To make things worse, we continue to add serializers and deserializers to a BGP implementation because IETF people endlessly write RFCs for BGP.

A BGP implementation must treat on-wire data as untrusted because a BGP peer might send malformed data intentionally or unintentionally. We typically find an out-of-bounds bug in these code. In C, out-of-bounds bug leads to accessing some random memory outside an array; aka a buffer overflow, could enable a cracker to exploit vulnerabilities. In Rust, bounds checks always are executed so no buffer overflow. Instead, a process immediately panics when it tries to access outside an array.

A panic is better than vulnerabilities but a BGP daemon is usually a critical component in production. So a panic must be avoided.

The Amazon S3 paper says:

We worked with the developers of Crux, a bounded symbolic evaluation engine, to prove panic-freedom of our deserialization code.

Let’s see what we can do with Crux. Consider the following deserialization code for the widely used format in BGP; the first byte indicates the length of the rest data:

Note that the real deserialization code does something useful with the rest data but this is just an example so it just returns the last byte in the rest data.

Clearly there are two problems that might lead to a panic. Firstly, if the length of the buffer is zero, the parser panics at the line 2. Secondly, the idx might be out-of-bounds at the line 3. It’s easily write a normal test code that calls the deserializer with an array to crash this code where the length is bogus:

However, with complicated deserialization code, a condition that crashes the code might be overlooked. With Crux, we don’t need to think about such conditions. Instead, Crux can do for us like:

x and y can be thought of as simultaneously representing all possible values of type u8. Crux doesn’t execute the test with all possible values. Instead, Crux analyzes the deserialization code with a technique called symbolic execution to determine values. On my environment, the above Crux test finished quickly as the normal test:

The verification failed

Great, the verification failed; Crux found an input that crashes the code. Let’s fix the deserialization code like:

The verification of the fixed code still fails! Why? A careful reader might already notice that the line 6 is wrong. Crux found an input that crashes the code again. The correct deserialization code is:

Now the verification was successful.

Conclusion

You might argue that iterators should be always used instead of accessing by an index. Then an out-of-bounds bug never happens. I plan to try to rewrite RustyBGP’s serialization and deserialization code in such way but I expect it would not be pretty. For example, we sometimes have to go forward and backward in a buffer.

I think that the result of the initial experiment is positive but more experiments with the actual code are necessary. Also I need to investigate a way to use Crux. I can’t find any examples of using Crux for serialization and deserialization code on the Internet (even though the developers of Crux states that these code is one of the typical use-cases).

--

--

FUJITA Tomonori
nttlabs

Janitor at the 34th floor of NTT Tamachi office, had worked on Linux kernel, founded GoBGP, TGT, Ryu, RustyBGP, etc. https://twitter.com/brewaddict