Can Rust-wrapped C++ offer stability and performance benefits?
Crashes are good!
If you’re in a security team, that is. If your program encounters an unexpected condition it should exit, intentionally and swiftly, instead of progressing into an area of undefined behavior (UB) which may be exploited by an attacker.
So Chromium, like many other security-critical open source projects, scatters many CHECKs throughout its source code to look for unexpected conditions. It similarly enables checks in the C++ libraries on which it depends, notably abseil.
Let’s see how some of these checks play out in practice. We’ll use base::Value, the Chromium type which represents trees of arbitrary values (integers, strings, Booleans, nested lists, nested dictionaries, etc.). These trees of values are often used to represent the results of some parsing (for example of a JSON document), and are often passed to a privileged process after some sandboxed parsing.
We’ll use simple examples here, but the code can of course get arbitrarily complex, and the bugs can get arbitrarily hard to avoid.
(You can test these examples by adding them to values_unittest.cc and, with a Chromium build environment on Linux, using the gn args symbol_level=2 and is_debug=false; then running ninja -C out/Release base_unittests && out/Release/base_unittests.)
Let’s first attempt type confusion between a list and a dictionary Value:
Here we’re creating a parent list base::Value, appending a child list, and then treating that child list as a dictionary instead, by trying to set a dictionary key. Oops. Here’s the crash:
[1021033:1021033:FATAL:values.cc(341)] Check failed: is_dict().
#0 0x5580847879b2 base::debug::CollectStackTrace()
#1 0x558084678a23 base::debug::StackTrace::StackTrace()
#2 0x55808469163f logging::LogMessage::~LogMessage()
#3 0x55808469240e logging::LogMessage::~LogMessage()
#4 0x55808477f129 base::Value::SetKey()
#5 0x5580842d51c7 base::ValuesTest_TypeConfusion_Test::TestBody()
Great! Nothing exploitable there. A nice crash.
Let’s try a data race. (Some would argue that this can’t be a race since it’s single-threaded, so call it ‘concurrent modification’ if you prefer.)
Here we’re creating a list, adding three items, and iterating through it. Part way through the iteration — bang! — we modify the list by reserving more space.
We get:
[1055932:1055932:FATAL:checked_iterators.h(206)] Check failed: start_ == other.start_ (0x1c70005cda40 vs. 0x1c7000623000)
#0 0x5635d868bd72 base::debug::CollectStackTrace()
#1 0x5635d857cde3 base::debug::StackTrace::StackTrace()
#2 0x5635d85959ff logging::LogMessage::~LogMessage()
#3 0x5635d85967ce logging::LogMessage::~LogMessage()
#4 0x5635d75636a4 base::CheckedContiguousIterator<>::CheckComparable()
#5 0x5635d81d93fe base::ValuesTest_DataRace_Test::TestBody()
Nice. Again, a controlled crash. Let’s try a good old buffer overflow.
[1000598:1000598:FATAL:values.cc(899)] Check failed: index < storage_.size() (3 vs. 2)
#0 0x564f5d372382 base::debug::CollectStackTrace()
#1 0x564f5d260dc3 base::debug::StackTrace::StackTrace()
#2 0x564f5d27c00f logging::LogMessage::~LogMessage()
#3 0x564f5d27cdde logging::LogMessage::~LogMessage()
#4 0x564f5d367e0f base::Value::List::operator[]()
#5 0x564f5cf59315 base::ValuesTest_Overflow_Test::TestBody()
Great. Finally let’s try a use-after-free.
(This would be a use-after-free because appending something to the parent may cause the parent list to be reallocated, invalidating the child.)
[1056287:1056287:FATAL:values.cc(351)] Check failed: is_list().
#0 0x55ba66b37d72 base::debug::CollectStackTrace()
#1 0x55ba66a28de3 base::debug::StackTrace::StackTrace()
#2 0x55ba66a419ff logging::LogMessage::~LogMessage()
#3 0x55ba66a427ce logging::LogMessage::~LogMessage()
#4 0x55ba66b2ecff base::Value::Append()
#5 0x55ba666854dd base::ValuesTest_UseAfterFree_Test::TestBody()
Again a nice controlled crash… err, wait. Check failed: is_list()? That’s a bit odd. Let’s flip is_asan=true in our gn arguments and try again:
==1069051==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001108a8 at pc 0x556351140274 bp 0x7fff66c8f1d0 sp 0x7fff66c8f1c8
READ of size 8 at 0x6030001108a8 thread T0
#0 0x556351140273 in absl::variant<absl::monostate, bool, int, base::Value::DoubleStorage, std::Cr::basic_string<char, std::Cr::char_traits<char>, std::Cr::allocator<char>>, std::Cr::vector<unsigned char, std::Cr::allocator<unsigned char>>, base::Value::Dict, base::Value::List>::index() const third_party/abseil-cpp/absl/types/variant.h:698:63
#1 0x556351140273 in base::Value::type() const base/values.h:311:54
#2 0x556351140273 in base::Value::is_list() const base/values.h:321:33
#3 0x556351140273 in base::Value::GetList() base/values.cc:351:3
#4 0x556351140273 in base::Value::Append(char const*) base/values.cc:1065:3
#5 0x556350000f21 in base::ValuesTest_UseAfterFree_Test::TestBody() base/values_unittest.cc:2626:9
Oh. We didn’t get a controlled crash this time; we got a potentially exploitable use-after-free.
Nevertheless, this is a pretty good score. Even on a release build, three of these four coding errors result in a deliberate, controlled crash rather than exploitable undefined behavior. This is marvelous, and base::Value is considered a very safe API.
But, at what cost?
First, although we security folk think controlled crashes are good, we obviously think they’re bad relative to not crashing at all. Stability suffers, and the pace of development is slowed by debugging.
Second, every check in place has a performance penalty. Mostly they’re cheap — just integer or pointer comparisons, with a “good path” such that CPU branch predictors can continue to keep pipelines full. But, all the different checks do gradually add up. Speed suffers.
In Chromium, we experimented with wrapping this API in Rust bindings.
One of the promises of Rust is that it does most of its safety checks at compile-time, imposing no runtime performance cost and no risk of runtime crashes.
Do those promises hold true even if you’re wrapping pre-existing C++ code? The answer is — yes, at least in some cases.
We didn’t wrap the entire base::Value API, just the subset that we needed in order to construct these structures using serde.
But here’s, roughly, how these four cases would look using the Rust wrappers for base::Value which we created. The first few lines of each example are a bit strange, but in summary: a ValueSlotRef is a space into which a new base::Value can be constructed. The rest of the examples should follow the above C++ closely.
(You can test these examples by adding them to values_unittest.rs and flipping enable_rust=true in your gn args.)
The first three literally do not compile. These runtime crashes simply can’t happen, because the compiler prevents them. We avoid both security and stability peril.
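To illustrate why code like this is rejected, here is a minimal sketch of the concurrent-modification case. It uses a plain Vec<i32> as a stand-in for a list value, and the function name sum_items is hypothetical; this is not the Chromium binding API, just the same borrow-checking rule at work.

```rust
/// Sums a list while iterating over it. A plain `Vec<i32>` stands in for a
/// list value; this is an illustrative assumption, not the real bindings.
fn sum_items(list: &mut Vec<i32>) -> i32 {
    let mut total = 0;
    for item in list.iter() {
        // `list.iter()` holds a shared borrow of `*list` for the whole loop.
        // Uncommenting the next line fails to compile with error E0502
        // (cannot borrow `*list` as mutable because it is also borrowed as
        // immutable), so a reallocation can never invalidate the iterator.
        // list.reserve(1000);
        total += *item;
    }
    // Once the loop, and with it the shared borrow, has ended, mutation is
    // allowed again.
    list.reserve(1000);
    total
}
```

The same rule covers the use-after-free case: a reference to a child element keeps the parent borrowed, so the parent cannot be appended to (and reallocated) while that reference is still in use.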
In one of these cases (the type confusion example) we’re using a “newtype wrapper” in Rust to make it impossible to mix up references to a dictionary and a list. Chromium C++ offers similarly type-safe variants of base::Value itself. In our Rust bindings we go a little further and return a type-safe reference when a nested list or dictionary is added. Rust’s high-level type system concepts (such as newtype wrappers, tagged unions and zero-sized types) mean that such compile-time bug prevention is expected in Rust. So, in designing our Rust wrappers for base::Value we chose more Rustic norms, and the result is that this class of error is ruled out at compile time. The benefit relative to our C++ APIs here is marginal, though: typical usage of our C++ APIs would create the child list then append it to the parent, bypassing this runtime check entirely.
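As a rough sketch of the newtype idea, assuming a simplified Value enum rather than the real wrapped C++ type: the names Value, ListRef, DictRef and their methods below are all hypothetical and chosen for illustration only. The point is that appending a nested list hands back a ListRef, which simply has no dictionary methods to call.

```rust
use std::collections::BTreeMap;

/// A dynamically typed tree of values, loosely modelled on `base::Value`.
/// This is an illustrative stand-in, not the wrapped C++ type.
#[derive(Debug, PartialEq)]
enum Value {
    Int(i32),
    List(Vec<Value>),
    Dict(BTreeMap<String, Value>),
}

/// Newtype wrappers: each one records, in the type, which variant it refers to.
struct ListRef<'a>(&'a mut Vec<Value>);
struct DictRef<'a>(&'a mut BTreeMap<String, Value>);

impl<'a> ListRef<'a> {
    fn append_int(&mut self, v: i32) {
        self.0.push(Value::Int(v));
    }

    /// Appends a nested list and returns a typed reference to it.
    fn append_list(&mut self) -> ListRef<'_> {
        self.0.push(Value::List(Vec::new()));
        match self.0.last_mut() {
            Some(Value::List(items)) => ListRef(items),
            _ => unreachable!("a list was pushed on the previous line"),
        }
    }

    /// Appends a nested dictionary and returns a typed reference to it.
    fn append_dict(&mut self) -> DictRef<'_> {
        self.0.push(Value::Dict(BTreeMap::new()));
        match self.0.last_mut() {
            Some(Value::Dict(entries)) => DictRef(entries),
            _ => unreachable!("a dict was pushed on the previous line"),
        }
    }
}

impl<'a> DictRef<'a> {
    fn set_int(&mut self, key: &str, v: i32) {
        self.0.insert(key.to_string(), Value::Int(v));
    }
}

/// Builds `[[42], {"answer": 7}]` using only typed references.
fn build() -> Value {
    let mut storage = Vec::new();
    let mut parent = ListRef(&mut storage);
    let mut child = parent.append_list();
    child.append_int(42);
    // child.set_int("key", 1); // does not compile: `ListRef` has no `set_int`
    let mut dict = parent.append_dict();
    dict.set_int("answer", 7);
    Value::List(storage)
}
```

In this sketch the only place a variant is inspected at runtime is inside the wrapper itself, immediately after pushing a value of known type; callers never get the chance to guess wrong.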
The fourth case is even more nuanced. A fn get(self: &ValueListRef, position: usize) -> &Value API would require a runtime check. Just as in our C++, it would be secure but risk a runtime crash. Rust doesn’t offer any advantage with such an API, but it’s much less common in Rust to index into containers; instead, pipelines of iterators are more commonly used.
There’s no possibility for runtime crashes with this style. As with the type confusion example, Rust’s stability advantage isn’t just from its fundamental language features, but from its usage norms.
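The contrast between the two styles can be sketched with ordinary slices. Both function names here are hypothetical, and a slice of i32 stands in for a list of values; this is not the binding API.

```rust
/// Indexing style: every `items[i]` carries a bounds check and panics if `i`
/// is out of range, much like the CHECK in the C++ `operator[]`.
fn sum_by_index(items: &[i32], count: usize) -> i32 {
    let mut total = 0;
    for i in 0..count {
        total += items[i];
    }
    total
}

/// Iterator pipeline: there is no index to get wrong, so an out-of-range
/// panic cannot occur here.
fn sum_even_doubled(items: &[i32]) -> i32 {
    items.iter().filter(|x| **x % 2 == 0).map(|x| x * 2).sum()
}
```

Passing a count larger than the slice length to sum_by_index panics safely at runtime, whereas the iterator version has no equivalent way to go wrong.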
What about performance? Right now, our base::Value code checks for every error condition possible, with a small performance penalty as mentioned previously. If Rust prevents some of those conditions at compile-time, runtime checks are unnecessary. In principle we could compile the underlying base::Value code (and its dependencies, such as checked iterators) in a different configuration with some of those runtime checks removed, for a small speed boost, but the binary size cost would almost certainly not be worthwhile (assuming we still needed the checked version too).
All in all, what’s our score card?
- C++: one of the problem classes caused an exploitable bug; three achieved safe runtime crashes. (One of the runtime crashes can be avoided by using safer variants of the C++ base::Value APIs.)
- Rust: none of the problem classes involved an exploitable bug; one resulted in a safe runtime crash (but normal Rust usage patterns might avoid that).
Overall, we were pleasantly surprised with this experiment. We did not set out to prevent classes of bug in our Rust bindings for base::Value; we didn’t think it was realistic to prevent bugs at compile-time in a thin Rust wrapper around underlying C++ code.
Nevertheless, as we iterated through the code review process it gradually became clear that we could prevent some classes of bugs in these Rust wrappers, and that Rust coding norms effectively required us to do so (or to feel guilty about providing non-Rustic APIs!).
Our Rust experiments are now proceeding in a different direction and these Rust base::Value APIs will be removed. Of course, the real test would be whether such safety, stability and performance wins can be applied to more complex situations such as the lifetime of content::RenderFrameHost objects. We don’t yet know.