A practice of rewriting foundation software from C to Rust (1)

Practice of C to Rust

Huailong Zhang
Rustaceans
5 min read · Jan 18, 2024


Preface

Inspired by Google's 2022 report that no memory safety vulnerabilities had been discovered in the Rust code of Android [1], and with a strong interest in following the Rust trend, I am trying to convert a piece of foundation software from C to Rust. The purpose of this article is to share how I use Rust by recording the common and interesting problems I ran into during the conversion and how I solved them.

Problem Description

I encountered a problem related to the implementation of the CAS (Compare and Swap) [2] operation during this conversion. In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization.
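To make the semantics concrete, here is a minimal sketch of a CAS retry loop written with Rust's standard atomics. The helper name increment_with_cas is made up for illustration and is not part of the software being converted.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: atomically adds 1 to `counter` with a CAS retry loop
/// and returns the value that was stored before the increment.
fn increment_with_cas(counter: &AtomicU32) -> u32 {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        // The store only happens if `counter` still holds `current`.
        match counter.compare_exchange(
            current,
            current.wrapping_add(1),
            Ordering::SeqCst,
            Ordering::Relaxed,
        ) {
            Ok(previous) => return previous,
            // Another thread won the race: retry with the value it wrote.
            Err(actual) => current = actual,
        }
    }
}
```

If several threads call this helper on the same counter, each successful CAS publishes exactly one increment, so no update is lost.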

The software targets several chip platforms. In C, it implements CAS by defining macros per platform and embedding the corresponding assembly code. The assembly for a CAS operation naturally differs from one chip platform to another [3], for example:

x86–64 (Intel/AMD) requires assembly code blocks similar to the following:

lock cmpxchg [destination], rdx

ARM requires assembly code blocks similar to the following:

ldrex  r1, [destination]
cmp r1, r2
strexeq r1, r2, [destination]

PowerPC requires assembly code blocks similar to the following:

lwarx  r0, 0, destination
cmpw r0, r1
bne retry ; branch if not equal
stwcx. r2, 0, destination
bne retry ; branch if store failed

However, as the code snippets below show, even on the same Intel x86 chip family the assembly instructions differ depending on the target, for example between 32-bit and 64-bit builds and across operating system platforms.

Part of the C header file cas_operation.h is as follows:

#if defined(__i386) || defined(__x86_64__) || defined(__sparcv9) || defined(__sparcv8plus)
typedef unsigned int slock_t;
#else
typedef unsigned char slock_t;
#endif
extern slock_t my_atomic_cas(volatile slock_t *lock, slock_t with, slock_t cmp);
#define TAS(a) (my_atomic_cas((a), 1, 0) != 0)
#endif

Part of the corresponding x86 assembly file cas_operation.s is as follows:

my_atomic_cas:
#if defined(__amd64)
movl %edx, %eax
lock
cmpxchgl %esi, (%rdi)
#else
movl 4(%esp), %edx
movl 8(%esp), %ecx
movl 12(%esp), %eax
lock
cmpxchgl %ecx, (%edx)
#endif
ret

Rust also has macros, but they work quite differently from the C preprocessor. Keeping the code compatible across chip platforms and operating systems was therefore the biggest challenge I encountered during the conversion.

Solution

I considered two solutions:

  1. Use asm! macro to wrap assembly code for different chip platforms
  2. Write Rust code to implement specific operations if possible

The first option is pretty simple. Import std::arch::asm and use the asm! macro (invoked like the println! macro) to wrap the assembly code for each chip platform. This is the most direct approach I can think of, and the existing assembly instructions can be reused without rethinking how the operation is implemented. However, this method mixes a lot of assembly code from different platforms, and I would have to write a lot of extra platform-selection logic in Rust. Maintaining that logic is tedious, for example when the software needs to support a new platform such as RISC-V.
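As a rough illustration of the first option, the sketch below wraps the x86-64 lock cmpxchg instruction in asm! and selects it with a cfg attribute. The function name cas_u32 is hypothetical, and the fallback branch for other architectures simply delegates to the standard library so the sketch compiles everywhere; a real port would need one asm! block per supported platform.

```rust
use std::sync::atomic::AtomicU32;

/// Hypothetical helper: returns the value stored in `target` before the CAS.
#[cfg(target_arch = "x86_64")]
fn cas_u32(target: &AtomicU32, expected: u32, new: u32) -> u32 {
    let previous: u32;
    // SAFETY: `target.as_ptr()` is valid and aligned for a 4-byte access,
    // and the `lock` prefix makes the read-modify-write atomic.
    unsafe {
        std::arch::asm!(
            "lock cmpxchg dword ptr [{ptr}], {new:e}",
            ptr = in(reg) target.as_ptr(),
            new = in(reg) new,
            // cmpxchg compares against eax and leaves the old value in eax.
            inout("eax") expected => previous,
            options(nostack),
        );
    }
    previous
}

/// Fallback for every other architecture (an assumption made for this sketch).
#[cfg(not(target_arch = "x86_64"))]
fn cas_u32(target: &AtomicU32, expected: u32, new: u32) -> u32 {
    use std::sync::atomic::Ordering;
    match target.compare_exchange(expected, new, Ordering::SeqCst, Ordering::SeqCst) {
        Ok(previous) | Err(previous) => previous,
    }
}
```

Every additional platform adds another cfg branch with its own assembly, which is the maintenance cost described above.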

The second option requires understanding the logic of the specific operation and then implementing the same logic as the assembly instructions in plain Rust. Although it takes more effort, this method avoids the problems caused by different chips and system platforms, especially the differences in their assembly implementations.

For the first option, readers can refer to the Inline assembly document [4]. For CAS operations, I recommend the second option. I implemented the u32 CAS operation in Rust; the code in my_compare_and_swap.rs is shown below:

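use std::sync::atomic::{AtomicU32, Ordering};

#[allow(non_camel_case_types)]
pub type uint32 = libc::c_uint;

#[allow(non_camel_case_types)]
#[repr(C)]
pub struct my_atomic_uint32 {
    pub value: uint32,
}

impl my_atomic_uint32 {
    #[inline]
    pub fn compare_and_swap(&self, expected: uint32, newval: uint32) -> bool {
        // Reinterpret the struct as an AtomicU32, which has the same size as u32.
        // This assumes `value` is never accessed non-atomically at the same time.
        let atomic_ptr = self as *const my_atomic_uint32 as *const AtomicU32;
        let atomic = unsafe { &*atomic_ptr };
        // compare_exchange replaces the deprecated AtomicU32::compare_and_swap.
        atomic
            .compare_exchange(expected, newval, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }
}

/// # Safety
/// `ptr` and `expected` must be non-null, properly aligned and valid for reads.
pub unsafe fn my_compare_and_swap_u32_impl(
    ptr: *mut my_atomic_uint32,
    expected: *mut uint32,
    newval: uint32,
) -> bool {
    let atomic = &*ptr;
    atomic.compare_and_swap(*expected, newval)
}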

Let me explain the code above. Since the software is being converted from C to Rust, I used the libc crate to define the uint32 type (do not forget to add libc as a dependency). The struct my_atomic_uint32 then wraps the CAS atomic operation on uint32, and an inline compare_and_swap method is implemented for the struct.

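The key step in this method is to reinterpret the reference to my_atomic_uint32 as a shared reference to AtomicU32 (note that std::sync::atomic::{AtomicU32, Ordering} [5] must be imported at the beginning of the Rust file) and then call the compare-and-swap method of AtomicU32 to perform the CAS operation on uint32. Note that AtomicU32::compare_and_swap has been deprecated since Rust 1.50 in favor of compare_exchange, which takes separate memory orderings for success and failure.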
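The pointer reinterpretation relies on the wrapper and AtomicU32 sharing the same layout. One possible alternative, sketched here under the assumption that the Rust side controls the struct definition, is to store the AtomicU32 directly so that no cast is needed. The name MyAtomicU32 is hypothetical.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical variant of the wrapper that stores the atomic directly.
/// `repr(transparent)` gives it the same layout as its single AtomicU32 field,
/// and AtomicU32 has the same in-memory representation as u32.
#[repr(transparent)]
pub struct MyAtomicU32 {
    value: AtomicU32,
}

impl MyAtomicU32 {
    pub const fn new(initial: u32) -> Self {
        Self { value: AtomicU32::new(initial) }
    }

    pub fn load(&self) -> u32 {
        self.value.load(Ordering::SeqCst)
    }

    /// Returns true if `expected` matched and `newval` was stored.
    #[inline]
    pub fn compare_and_swap(&self, expected: u32, newval: u32) -> bool {
        self.value
            .compare_exchange(expected, newval, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }
}
```

With this shape the method body contains no unsafe code, and unsafe is only needed at the boundary where raw pointers arrive from C.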

In addition, the choice of memory ordering [6] is a delicate topic. Ordering::SeqCst, which I use here, is the strongest ordering: it favors correctness over performance and leaves no room for efficiency tuning.
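For comparison, a lock built on CAS, in the spirit of the TAS macro from the C header, is commonly written with weaker orderings: Acquire when taking the lock and Release when giving it up. The sketch below is only an illustration of that pattern; the SpinLock type is hypothetical and is not the ordering choice made in the converted software.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical spin lock: 0 means unlocked, 1 means locked.
pub struct SpinLock {
    state: AtomicU32,
}

impl SpinLock {
    pub const fn new() -> Self {
        Self { state: AtomicU32::new(0) }
    }

    /// Tries once to take the lock; returns true on success.
    pub fn try_lock(&self) -> bool {
        // Acquire on success orders the critical section after the lock is taken.
        self.state
            .compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Spins until the lock is taken.
    pub fn lock(&self) {
        while !self.try_lock() {
            std::hint::spin_loop();
        }
    }

    /// Releases the lock; Release publishes the writes made while holding it.
    pub fn unlock(&self) {
        self.state.store(0, Ordering::Release);
    }
}
```

Whether the weaker orderings are worth it depends on the workload; starting with SeqCst and relaxing later, once the code is measured, is a reasonable default.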

In the last part of this code, my_compare_and_swap_u32_impl exposes the u32 CAS operation to external callers (in fact, the software mainly needs the CAS operation on uint32).

Conclusion

As described in the Solution section, both solutions have their own pros and cons, and the choice depends on the scenario. In my case, I prefer the second option because almost every CAS operation in this software works on uint32, which corresponds to Rust's AtomicU32.

Finally, I want to raise an open question: if I need to implement CAS operations on many data types (such as uint32, int32, uint64, int64, float, float32, float64…), which choice should I make? In my opinion, this may be a matter of trade-offs.
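One possible direction for the second option, offered only as a sketch and not as an answer to the question, is to generate the per-type code with macro_rules!. The trait CasCell, the macro impl_cas_cell and the type AtomicF32Bits are hypothetical names. Because the standard library has no atomic float types, the float case here compares bit patterns through AtomicU32.

```rust
use std::sync::atomic::{AtomicI32, AtomicI64, AtomicU32, AtomicU64, Ordering};

/// Hypothetical trait: a single CAS entry point shared by several value types.
pub trait CasCell {
    type Value: Copy;
    /// Returns true if `expected` matched and `new` was stored.
    fn cas(&self, expected: Self::Value, new: Self::Value) -> bool;
}

// Generates one impl per (atomic type => value type) pair.
macro_rules! impl_cas_cell {
    ($($atomic:ty => $value:ty),* $(,)?) => {
        $(
            impl CasCell for $atomic {
                type Value = $value;
                fn cas(&self, expected: $value, new: $value) -> bool {
                    self.compare_exchange(expected, new, Ordering::SeqCst, Ordering::SeqCst)
                        .is_ok()
                }
            }
        )*
    };
}

impl_cas_cell!(AtomicU32 => u32, AtomicI32 => i32, AtomicU64 => u64, AtomicI64 => i64);

/// Hypothetical f32 cell that stores the float's bit pattern in an AtomicU32.
pub struct AtomicF32Bits(AtomicU32);

impl AtomicF32Bits {
    pub fn new(value: f32) -> Self {
        Self(AtomicU32::new(value.to_bits()))
    }

    pub fn load(&self) -> f32 {
        f32::from_bits(self.0.load(Ordering::SeqCst))
    }
}

impl CasCell for AtomicF32Bits {
    type Value = f32;
    fn cas(&self, expected: f32, new: f32) -> bool {
        // Bitwise comparison: 0.0 and -0.0 differ, and a NaN matches only itself bit for bit.
        self.0
            .compare_exchange(expected.to_bits(), new.to_bits(), Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
    }
}
```

The 64-bit atomic types are not available on every target, so a real port would still need to check each supported platform.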

About Author

Huailong Zhang (Steve Zhang) has worked at Alcatel-Lucent, Baidu and IBM on cloud computing R&D, including PaaS and DevOps platform development. He now works at Intel SATG, focusing on the cloud native ecosystem, such as Kubernetes and service mesh. He is also an Istio maintainer and has been a speaker at KubeCon, ServiceMeshCon, IstioCon, InfoQ/QCon and GOTC.

References

[1] https://security.googleblog.com/2022/12/memory-safe-languages-in-android-13.html

[2] https://en.wikipedia.org/wiki/Compare-and-swap

[3] https://marabos.nl/atomics/hardware.html

[4] https://doc.rust-lang.org/reference/inline-assembly.html

[5] https://doc.rust-lang.org/std/sync/atomic

[6] https://marabos.nl/atomics/memory-ordering.html
