Article Roadmap: From JSON parsing to bindgen to Declarative and Procedural Macros and how to debug them

Safer, Simpler Embedded Rust with Apache Mynewt on STM32 Blue Pill

Lup Yuen Lee 李立源
Jul 6 · 18 min read

Declarative and Procedural Macros (plus bindgen and tips for Visual Studio Code) to protect Embedded Rust coders from stumbling into embedded traps

I’m named MyNewt not Mynewt because my handler is a crazy coder named Lup and this lizard doesn’t speak on behalf of the wonderful Apache Mynewt team

What’s great about Apache Mynewt? Today Mynewt runs on many microcontroller platforms with preemptive multitasking. (So it won’t choke when reading sensors and transmitting data simultaneously.)

Mynewt has drivers for many sensors (like BME280), networks (ESP8266, nRF24L01, …) and protocols (CoAP, CBOR, …).

But Mynewt was built with C, which has its problems…

The C code appeared in an earlier article about Mynewt on STM32 Blue Pill

Declarative Macros in Rust

Declarative Macros in Rust have the form ( pattern ) => { substitution }

Here’s a simple Declarative Macro add_88!() that returns its argument plus 88…

From https://github.com/lupyuen/test-rust-macros/blob/master/src/main.rs#L4-L14

The Rust macro accepts a single parameter $e, which should be a valid Rust Expression (denoted by expr). In C we would code the macro as…

#define  add_88(e)  ((e) + 88)

The Rust and C macro definitions are similar, except that Rust insists on knowing whether the parameter will be an expression (expr), identifier (ident), type (ty), statement (stmt), code block (block)… (Here’s the whole list)
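As a rough sketch of the macro described above (reconstructed from the description, not copied from the linked repository, so treat the details as assumptions):

```rust
// Declarative Macro: one rule of the form ( pattern ) => { substitution }.
macro_rules! add_88 {
    // $e must be a valid Rust Expression (expr).
    ($e:expr) => {
        $e + 88
    };
}

fn main() {
    // The argument is matched as one whole expression before 88 is added.
    println!("{}", add_88!(1));
    println!("{}", add_88!(2 * 3));
}
```

Because $e is captured as a complete expression, we don't need the defensive parentheses that the C macro uses around (e).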

We may provide multiple patterns like this to create an “overloaded function”…

From https://github.com/lupyuen/test-rust-macros/blob/master/src/main.rs#L4-L25
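A minimal sketch of such an “overloaded” macro might look like this. The second, two-parameter pattern is a hypothetical example chosen for illustration; the patterns in the linked source may differ.

```rust
macro_rules! add_88 {
    // Rule 1: a single expression.
    ($e:expr) => {
        $e + 88
    };
    // Rule 2 (hypothetical): two expressions separated by a comma.
    ($a:expr, $b:expr) => {
        $a + $b + 88
    };
}

fn main() {
    // The compiler tries the rules from top to bottom and uses the first match.
    println!("{}", add_88!(1));
    println!("{}", add_88!(1, 2));
}
```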

Complex Pattern Matching in Declarative Macros

But wait… Rust macros can do so much more because of pattern matching! Here’s a tiny snippet from a Rust macro that parses JSON code (adapted from the serde_json library)…

https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/src/mynewt/macros.rs#L105-L120

The pattern looks complicated…

(@$enc:ident @object $obj:ident [$($key:tt)+] ($value:expr) , $($rest:tt)*)

The parse!() macro is meant to be called with these parameters…

parse!( @json @object context ["device"] ("010203") , (omitted) )

Which is an intermediate step that’s called when encoding our CoAP message in JSON: coap!(@json { “device”: “010203”, … })

The macro parameters are matched against the pattern like this…

Matching the parse!() macro parameters with the macro pattern

What’s a Tag? Think of a Tag as an enum — an option that specifies how the macro should behave. For the first parameter of the macro we accept the tag “@json” to indicate that the macro should encode sensor data in JSON format, or “@cbor” for CBOR format. Tags may be used to implement internal rules.

What’s a Token Tree? It’s one or more Rust tokens that are logically grouped. Here are three examples of Token Trees: x, (x + y), { println!("hi"); }.

The + and * operators should be familiar if you have used Regular Expressions. So $($key:tt)+ means the $key placeholder will be matched to one or more Token Trees (denoted by tt).
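To get a feel for Token Trees and repetition, here is a small sketch. count_tts!() is a hypothetical helper (not part of the article's code) that counts the Token Trees it receives, using the * operator (zero or more) instead of + (one or more).

```rust
// Hypothetical helper: counts the Token Trees passed to it.
macro_rules! count_tts {
    // No Token Trees left.
    () => { 0usize };
    // Take one Token Tree ($head), then count the remaining ones ($tail).
    ($head:tt $($tail:tt)*) => {
        1usize + count_tts!($($tail)*)
    };
}

fn main() {
    // Three Token Trees: x, (x + y) and { println!("hi"); }
    // The tokens are only counted, never compiled as code.
    let n = count_tts!(x (x + y) { println!("hi"); });
    println!("{}", n);
}
```

Anything wrapped in ( ), [ ] or { } counts as a single Token Tree, no matter how many tokens are inside.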

In the source code of the macro there are a couple of interesting things…

1️⃣ d!(…) is a simple macro we created to dump the parameters of the macro. Useful for debugging.

2️⃣ coap_item_str!(…) is a macro that will generate the JSON or CBOR code for encoding the key: value entry. The encoding format depends on the first parameter.

3️⃣ The macro is recursive! Once the rule has encoded the key: value entry, it continues to encode the rest of the input JSON ($rest) by calling itself!

parse!( @$enc @object $obj () ($($rest)*) ($($rest)*) )

That’s how our coap!() macro (complete source code here) recursively parses a JSON document and emits the CoAP encoding. Which makes Rust Declarative Macros very powerful for parsing many types of code recursively. Even Domain Specific Languages! Check out the details in the Little Book of Rust Macros.
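The recursion is easier to see in a stripped-down sketch. encode!() below is a hypothetical toy, much simpler than coap!() and parse!(): it uses the same ideas (a tag for the encoding format, an @object parser state, a recursive call on the rest of the input) but only collects strings instead of emitting CoAP encoding calls.

```rust
macro_rules! encode {
    // Internal rule: nothing left to parse.
    (@$enc:ident @object $out:ident) => {};
    // Internal rule: encode one `key: value,` entry, then recurse on the rest.
    (@$enc:ident @object $out:ident $key:literal : $value:expr , $($rest:tt)*) => {
        $out.push(format!("{}:{}={}", stringify!($enc), $key, $value));
        encode!(@$enc @object $out $($rest)*);
    };
    // Internal rule: last entry without a trailing comma. Add the comma and retry.
    (@$enc:ident @object $out:ident $key:literal : $value:expr) => {
        encode!(@$enc @object $out $key : $value ,);
    };
    // Entry point: encode!(@json { "key": value, ... })
    (@$enc:ident { $($body:tt)* }) => {{
        let mut out: Vec<String> = Vec::new();
        encode!(@$enc @object out $($body)*);
        out
    }};
}

fn main() {
    let items = encode!(@json { "device": "010203", "temp": 25 });
    for item in &items {
        println!("{}", item);
    }
}
```

Note that out is passed down to every recursive call as a parameter, just like the context parameter in parse!(). This keeps the variable in the same context across all the expansions.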

🤔 You may think… What’s the cost of embedding Rust code inside a C platform like Mynewt? Do Rust macros inject a lot more code to make them run on a C platform?

Not at all! The macros are expanded at compile time, so no parsing is left to be done on the microcontroller. And the Rust compiler is incredibly clever at pattern matching and type inference. So these macros should take up about the same RAM, ROM and CPU resources as the older C code! While keeping the code clear and simple!

Two Legged Problems

Import C Functions into Rust with bindgen

We could import a C function into Rust by coding the Rust binding manually (as explained earlier). But writing the Rust bindings by hand for the entire Mynewt API is way too tedious.

Fortunately we have the bindgen tool: It reads a Mynewt C function declaration like this (os_task_init() is the Mynewt function for creating background tasks)…

And produces the Rust binding like this…

We have a script that generates Rust bindings for Mynewt Core API, Sensor API, JSON API and CBOR API.

There’s still manual tweaking required… In the script we see many whitelisted and blacklisted C functions and types. These were carefully chosen to avoid duplicates across the different APIs.

Rust is self-documenting, so the Rust bindings for Mynewt automatically have documentation. Check out the Mynewt API Documentation for Rust

💎 Normally bindgen reads an entire C header file to generate Rust bindings for all functions declared in the file. But Mynewt uses many include folders that will totally confuse bindgen.

That’s why the script passes the options -CC -E -dD to gcc to create a C file that has all the include files (for that specific API) concatenated into one long source file. Which works great with bindgen!

The Restless bindgen Horde

Creating the safe wrapper in Rust doesn’t need a lot of manual coding… Procedural Macros in Rust can do the job for us automatically! First let’s understand what a Procedural Macro can do…

Create a Procedural Macro with syn

Here’s a simple Procedural Macro named out!() (shown in the pic above) that expands…

out!( NETWORK_TASK )

into…

unsafe { &mut NETWORK_TASK }

…because to a C coder starting Rust coding, unsafe &mut looks doubly intimidating, so we use a macro to declare that NETWORK_TASK will be modified by the external function, like this: out!(NETWORK_TASK)

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L279-L292

Unlike Declarative Macros, Procedural Macros are coded like a real Rust function. A Procedural Macro simply transforms a stream of Rust source code tokens (TokenStream) into the expanded tokens, which are fed back into the Rust compiler once the macro has been expanded.

parse_macro_input!() is provided by the syn library for parsing streams of Rust source code tokens. We can parse any Rust code with it: Expressions, Statement Blocks, Struct Definitions, Extern Declarations, … Here we are parsing the input as a Rust Identifier, like a variable or constant name.

We use format!() to create the desired code. Then we parse the code into a TokenStream and pass it back to the compiler.
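The format!() step can be tried on its own, outside a Procedural Macro. expand_out() below is a hypothetical plain function standing in for the code generation step only: the real macro receives the identifier from parse_macro_input!() and converts the resulting string into a TokenStream, which this sketch leaves out so that it runs without the syn library.

```rust
// Hypothetical stand-in for the code generation step of out!().
// Takes the identifier as text and returns the expanded code as text.
fn expand_out(ident: &str) -> String {
    // Double braces {{ }} produce literal braces in format!().
    format!("unsafe {{ &mut {} }}", ident)
}

fn main() {
    println!("{}", expand_out("NETWORK_TASK"));
}
```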

https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L293-L306

This strn!() macro is similar… It expands

strn!( "network" )

into…

&Strn::new( b"network\0" )

The macro code looks similar, except that it parses the input as a Literal String (LitStr) instead of an Identifier. (And the expansion is different of course.) Why did we create strn!()?

C vs Rust Strings: Null vs Non-Null Terminated

In the world of Embedded C, strings are simple fixed arrays of ASCII bytes, terminated by a null (0x00) byte. Rust strings are more complicated: a Rust String is a Vector of UTF-8 bytes. A Rust Vector is resizable, so it keeps an internal counter to remember its length. Strings in Rust are not terminated by the null byte.

This causes problems when Embedded Rust coders call Mynewt APIs, drivers and other C functions…

1️⃣ They may use Rust strings as though they were C strings, omitting the terminating null byte (and causing the C code to crash)

2️⃣ Or they may create many temporary Rust strings while appending the null byte before calling the C functions.

The solution we have chosen is a simplistic one — We create a new type Strn that represents a null-terminated string that will never be modified. It contains a fixed slice (array) of bytes that always ends with null. Strn verifies this when setting and getting the string.

The wrappers for Mynewt APIs only accept Strn for incoming strings, instead of the default *const c_char.

How do we create an Strn? By calling the strn!() macro we have just seen…

strn!( "network" )

Which expands into…

&Strn::new( b"network\0" )

This creates an Strn from the Rust Byte String b"network\0". C strings behave more like Rust Byte Strings, but unfortunately Byte Strings are harder to manipulate without a helper like Strn.

No more missing terminating nulls… No more temporary copies of strings just for adding nulls… We have made Mynewt APIs safer, simpler and more efficient for Rust!
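Here is a minimal sketch of the idea, assuming a much simpler Strn than the real one. The method names are hypothetical, the null check happens only at construction, and strn!() is written as a Declarative Macro (using concat!()) so that the example runs without a separate Procedural Macro library.

```rust
/// Sketch of a read-only string that always ends with a null byte.
pub struct Strn {
    bytes: &'static [u8],
}

impl Strn {
    /// Create the string. Panics if the last byte is not null.
    pub fn new(bytes: &'static [u8]) -> Strn {
        assert_eq!(bytes.last(), Some(&0u8), "Strn must end with a null byte");
        Strn { bytes }
    }

    /// Pointer to the first byte, suitable for passing to a C function.
    pub fn as_ptr(&self) -> *const u8 {
        self.bytes.as_ptr()
    }

    /// Length of the string, excluding the terminating null.
    pub fn len(&self) -> usize {
        self.bytes.len() - 1
    }

    /// The raw bytes, including the terminating null.
    pub fn as_bytes_with_nul(&self) -> &'static [u8] {
        self.bytes
    }
}

/// Declarative stand-in for strn!(): appends the null byte at compile time.
macro_rules! strn {
    ($s:literal) => {
        &Strn::new(concat!($s, "\0").as_bytes())
    };
}

fn main() {
    let name = strn!("network");
    println!("{} bytes before the null", name.len());
}
```

The null byte is appended at compile time, so no temporary copy of the string is created at run time.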

Compose a Procedural Macro with quote!{}

Checking result codes returned by C functions can be very tedious and we forget to check them sometimes. Let’s look at this macro run!() that was used in our sample code to transform a bunch of C function calls (for encoding CoAP CBOR messages)…

…into this code with proper error checking, via check_result()
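The general shape of that error checking can be sketched as follows. Everything here is an assumption made for illustration: fake_cbor_encode() is a hypothetical stand-in for a C encoding function that returns 0 on success, and this check_result() simply converts the result code into a Rust Result. The real check_result() may behave differently.

```rust
// Hypothetical stand-in for a C function: returns 0 on success, non-zero on error.
fn fake_cbor_encode(value: i32) -> i32 {
    if value >= 0 { 0 } else { 1 }
}

// Assumed shape of the checker: convert a C result code into a Rust Result.
fn check_result(res: i32) -> Result<(), i32> {
    if res == 0 { Ok(()) } else { Err(res) }
}

// The pattern that would otherwise be written by hand after every call.
fn encode_all(values: &[i32]) -> Result<(), i32> {
    for value in values {
        let res = fake_cbor_encode(*value);
        check_result(res)?; // Stop at the first error
    }
    Ok(())
}

fn main() {
    println!("{:?}", encode_all(&[1, 2, 3]));
    println!("{:?}", encode_all(&[1, -1, 3]));
}
```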

How should we generate the expanded code in the run!() macro? We could have used the format!() method shown earlier…

But for generating a block of code, calling the quote!{} macro (from the quote library) looks cleaner…

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L395-L402

Note that we use #stmts as a placeholder to tell the macro to substitute the value of stmts into actual expanded tokens.

We also used quote!{} in our macro like this…

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L371-L377

quote!{} is used here like a template, injecting stmt_tokens into a chunk of Rust code. Super useful for generating Rust code… in a Rust program!

Match Code Patterns with syn

Procedural Macros are more powerful than Declarative Macros because they can analyse the source code tokens (with the syn library) and expand them differently depending on the context.

Our run!() macro is picky — it only watches out for calls to functions named cbor_encode_.... And it wraps the call with an error handler (and inserts the namespace tinycbor):
let res = tinycbor::cbor_encode_...

It doesn’t disturb other statements like let encoder = ... because wrapping this with an error handler would be so wrong. So our macro needs to…

1️⃣ Match a Statement…

2️⃣ That contains a Function Call…

3️⃣ That looks like cbor_encode_...

That’s easy to do with the syn parser and Rust pattern matching…

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L358-L371

Rust has incredibly powerful enums that will let us match against patterns like Semi(expr, ...) and Call(expr). This works really well with the syn parser for creating powerful source code transformations.
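The three matching steps can be illustrated without the syn library by using a toy syntax tree. The Stmt and Expr enums below are hypothetical, heavily simplified stand-ins for the syn types, and the generated text is only an approximation of what run!() produces.

```rust
// Toy stand-ins for the syn types (the real ones carry far more detail).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Call { func: String, args: Vec<String> },
    Other(String),
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    Semi(Expr),    // An expression followed by a semicolon
    Local(String), // A `let` statement, kept as plain text here
}

// Wrap calls to cbor_encode_... with error checking; leave the rest alone.
fn transform(stmt: &Stmt) -> String {
    match stmt {
        // 1. A Statement 2. that contains a Function Call 3. named cbor_encode_...
        Stmt::Semi(Expr::Call { func, args }) if func.starts_with("cbor_encode_") => {
            format!("let res = tinycbor::{}({}); check_result(res);", func, args.join(", "))
        }
        Stmt::Semi(Expr::Call { func, args }) => format!("{}({});", func, args.join(", ")),
        Stmt::Semi(Expr::Other(code)) => format!("{};", code),
        Stmt::Local(code) => format!("{};", code),
    }
}

fn main() {
    let call = Stmt::Semi(Expr::Call {
        func: "cbor_encode_int".to_string(),
        args: vec!["encoder".to_string(), "1234".to_string()],
    });
    println!("{}", transform(&call));
    println!("{}", transform(&Stmt::Local("let encoder = new_encoder()".to_string())));
}
```

The nested pattern plus the if guard does all three checks in a single match arm.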

The complete run macro with calls to syn and quote is shown below. Although the run macro is meant for calling Embedded C functions, the macro applies a layer of error checking on top of the function calls, in a way that feels natural to the Rust coder. Safer coding, made possible with Rust’s Procedural Macros!

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L326-L404
Evolving the #[safe_wrap] macro

Put Everything Together: #[safe_wrap] macro

Remember the Safe Wrapper for Mynewt Functions? Now we can explain how the wrapper was constructed automatically with Procedural Macros.

To create the wrapper, we apply the #[safe_wrap] attribute to the Rust binding created by bindgen

#[safe_wrap] is a Procedural Macro defined here

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L24-L107

The macro calls parse_macro_input!() from the syn library. The input to the macro is parsed as an extern function declaration (denoted by ItemForeignMod).

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L24-L107

Next, we inspect each parameter of the extern function and call transform_arg_list() to transform each parameter into three forms…

1️⃣ Wrapper Declaration: How the parameter type is exposed via the wrapped function.

For example, *mut for output pointers looks odd to C coders, so we rename it as Out<…>, which clearly states the intent (output pointer).

*const c_char for input strings is renamed to &Strn, since the Strn type validates that the string is null-terminated. (Again, Strn signifies the intent.)

2️⃣ Validation Statement: To validate each parameter if needed, like verifying that all input strings are null-terminated. (We use the Strn type to perform the string validation.)

3️⃣ Call Expression: Inside the wrapper, use this type cast expression to call the C function. Rust is stricter than C about types, so we use type casting to handle tiny discrepancies like i32 vs u32 (signed vs unsigned integers).
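To make the three forms concrete, here is a hand-written sketch of the kind of wrapper that could come out of this process. All names are hypothetical: fake_name_len() stands in for a C function imported by bindgen (it is written in Rust so the example runs on its own), and Strn and Out are simplified versions of the types described in this article.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Simplified null-terminated string type.
pub struct Strn {
    bytes: &'static [u8],
}

impl Strn {
    pub fn new(bytes: &'static [u8]) -> Strn {
        Strn { bytes }
    }
    /// Panics if the string is not null-terminated.
    pub fn validate(&self) {
        assert_eq!(self.bytes.last(), Some(&0u8), "string is not null-terminated");
    }
    pub fn as_ptr(&self) -> *const u8 {
        self.bytes.as_ptr()
    }
}

/// Hypothetical alias: an output parameter that the C function writes to.
pub type Out<'a, T> = &'a mut T;

/// Hypothetical stand-in for a C function: stores the length of `name` in `len`.
unsafe extern "C" fn fake_name_len(name: *const c_char, len: *mut u32) -> c_int {
    unsafe {
        *len = CStr::from_ptr(name).to_bytes().len() as u32;
    }
    0
}

/// Wrapper Declaration: &Strn replaces *const c_char, Out<u32> replaces *mut u32.
pub fn name_len(name: &Strn, len: Out<u32>) -> Result<(), i32> {
    // Validation Statement: input strings must be null-terminated.
    name.validate();
    // Call Expression: cast each parameter to the type that the C function expects.
    let res = unsafe { fake_name_len(name.as_ptr() as *const c_char, len as *mut u32) };
    if res == 0 { Ok(()) } else { Err(res as i32) }
}

fn main() {
    let mut len = 0u32;
    let result = name_len(&Strn::new(b"network\0"), &mut len);
    println!("{:?}, len = {}", result, len);
}
```

The unsafe block and the pointer casts are hidden inside the wrapper, so the caller only sees safe Rust types.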

From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L24-L107

Finally we use the quote! macro to combine the three forms into the expanded output below. The quote! macro clearly shows the structure of the output, thus I highly recommend quote! for composing Procedural Macros.

We use quote_spanned! instead of quote! in some spots…

What’s a Span? When the Rust Compiler sends a stream of tokens to our macro, it also transmits the byte location of each token in the source file. So when it hits a compilation error, the compiler can display the exact source code that caused the error.

Our macro is synthesising source code — creating new source code based on the original source code. So it’s possible that our synthesised source code will have errors. When that happens, we should tell the Rust Compiler the precise location of the original code.

That’s why it’s important for our macros to preserve the Span information. We create three forms of each parameter, but they are placed into different sections. By using quote_spanned! we preserve the original Span of each parameter. And the Rust Coder gets meaningful, relevant error messages.

Overview of the #[safe_wrap] attribute
Result, Ok and Err in Embedded Rust

What are Result<>, Ok() and Err() in Rust?

In Rust, it’s customary for functions to return Result<…> instead of an error code like C. That’s why our Safe Wrapper is declared as…

MynewtResult<…> (derived from Result<…>) is a Generic Type that contains either…

1️⃣ Ok(result) in which result is an optional result value

2️⃣ Err(error_code) in which error_code is the Mynewt system error code

How do we indicate the type of the result value? Through the function definition…

fn my_function() -> MynewtResult<i32>

…means that my_function() returns an integer result value or an error code.

fn my_function() -> MynewtResult<()>

…means that my_function() returns no result value (a.k.a. void) but it may return an error code.

To return a result value, we return Ok(value) or if there’s nothing to be returned, Ok(())

To return an error, we return a standard Mynewt error code wrapped with Err() like this: Err(MynewtError::SYS_EAGAIN)

What happens when we call a function that returns an error? For example…

See the strange ? dangling at the end of the task_init() function call?

The ? operator returns any error to the caller as soon as it occurs: the function exits early without executing the rest of its body.

So Result<…>, Ok(…), Err(…) and ? really help to make Mynewt error handling so much easier.
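Here is a small sketch of how these pieces fit together. It assumes that MynewtResult is a type alias for Result with MynewtError as the error type; the functions are hypothetical stand-ins, not the real Mynewt wrappers.

```rust
// Assumed definitions, for illustration only.
#[allow(non_camel_case_types, dead_code)]
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum MynewtError {
    SYS_EAGAIN,
    SYS_EINVAL,
}

pub type MynewtResult<T> = Result<T, MynewtError>;

// Hypothetical stand-in for a wrapped call like task_init(): no result value.
fn task_init(ok: bool) -> MynewtResult<()> {
    if ok { Ok(()) } else { Err(MynewtError::SYS_EAGAIN) }
}

// Hypothetical function that returns an integer result value.
fn read_value() -> MynewtResult<i32> {
    Ok(42)
}

fn start(ok: bool) -> MynewtResult<i32> {
    task_init(ok)?; // On error, return the error to our caller right away
    let value = read_value()?; // On success, ? unwraps the result value
    Ok(value)
}

fn main() {
    println!("{:?}", start(true));
    println!("{:?}", start(false));
}
```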

Debug Rust Macros with Visual Studio Code

Visual Studio Code is a great way to debug Rust Declarative and Procedural Macros. Just follow these steps…

1️⃣ Install Visual Studio Code. Install rustup according to the instructions at rustup.rs

2️⃣ Select the Nightly Build of the Rust Compiler…

rustup default nightly
rustup update

3️⃣ For Windows: Install the Remote WSL Extension for Visual Studio Code so that the Rust build runs in the Linux (Ubuntu) environment, which has fewer problems. Otherwise you’ll have to install Rust twice according to these instructions.

4️⃣ Install the Rust Language Support Extension for Visual Studio Code. If it won’t install properly, check these instructions.

5️⃣ Install the Task Runner Extension for Visual Studio Code. This lets you click on build tasks easily in the Task Runner pane at lower left.

6️⃣ In Visual Studio Code, click View → Command Palette → Git Clone. Enter

https://github.com/lupyuen/test-rust-macros

and select a local folder. This Rust project contains demo macros used in the next section.

7️⃣ When prompted, open the cloned repository and open the workspace

8️⃣ In the Workspace pane, open the file src/main.rs to view the demo macros. We’ll be using the demo macros next…

Here’s a video of the installation steps. Click CC to view the instructions…

Watch Macro Expansion in Visual Studio Code

To see how simple macros are expanded, use trace_macros!() like this…

From https://github.com/lupyuen/test-rust-macros/blob/master/src/main.rs#L170-L176

Mouse over the macro name, like add_88 (located in the main() function)

The macro expansion appears in a pop-up

Click Peek Problem if the expansion is long

For complex macros, use this method to view expanded macros in the Rust build log…

1️⃣ In Visual Studio Code, browse the Workspace and open the file .cargo/config

2️⃣ Uncomment the first option…

"-Z", "unstable-options", "--pretty", "expanded",

3️⃣ Click Terminal → Run Task → cargo build

The expanded macro appears in the cargo build log…

Expanded macro in the cargo build log

4️⃣ With the Task Runner Extension, we may also click cargo build in the Task Runner pane at lower left

Here’s a video of the macro expansion. Click CC to view the instructions…

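If you’re not using Visual Studio Code, run this in the command line to see the expanded macros (on newer nightly toolchains that no longer accept --pretty, the equivalent option is -Zunpretty=expanded)…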

cargo rustc -- -Z unstable-options --pretty expanded

Macro Hygiene in Rust

More about Macro Hygiene
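The salt problem can be reproduced in a few lines. This is a sketch with hypothetical macros (make_good_soup!() here just adds 1 to the salt; the demo macros in the repository may look different).

```rust
// Bad: a macro that mentions `salt` directly refers to a `salt` in the macro's
// own context, not the caller's. This version fails to compile:
//
//   macro_rules! make_bad_soup { () => { salt + 1 }; }
//   fn cook() { let salt = 1; make_bad_soup!(); }
//   // error: cannot find value `salt` in this scope

// Good: the caller passes the salt as a parameter, so both sides
// agree on which salt is meant.
macro_rules! make_good_soup {
    ($salt:ident) => {
        $salt + 1
    };
}

// A variable declared inside a macro stays in the macro's context.
macro_rules! hide_salt {
    () => {
        #[allow(unused_variables)]
        let salt = 100;
    };
}

fn main() {
    let salt = 1;
    hide_salt!(); // Does not replace our salt
    println!("{}", salt);
    println!("{}", make_good_soup!(salt));
}
```

Even though both variables are named salt, the one declared inside hide_salt!() never leaks into main().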

Debug Macro Hygiene in Visual Studio Code

The Rust Compiler can show us information about the salt variables and which context they belong to…

1️⃣ In Visual Studio Code, browse the Workspace and open the file .cargo/config

2️⃣ Uncomment the second option…

"-Z", "unstable-options", "--pretty", "expanded,hygiene",

Check that the first option is commented.

3️⃣ Click Terminal → Run Task → cargo build

The expanded macro with context information appears in the cargo build log. Here’s what it means…

Rust Compiler displaying the context of every variable

Here’s a video of macro hygiene in Visual Studio Code. Click CC to view the instructions…

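If you’re not using Visual Studio Code, run this in the command line to see the hygiene information (on newer nightly toolchains that no longer accept --pretty, the equivalent option is -Zunpretty=expanded,hygiene)…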

cargo rustc -- -Z unstable-options --pretty expanded,hygiene

Macro Context: Embedded vs Desktop

🤔 Macro programming is also known as Metaprogramming… Like “Inception”, it plays weird tricks with your mind, as you code a program within a program… And you ask yourself: “Which level am I at right now? What am I really coding for?”

We are actually stacked on TWO “Inception” Layers right now…

1️⃣ We’re coding an Embedded Rust program for STM32 Blue Pill. Cargo.toml is located here.

2️⃣ We’re coding a Procedural Rust Macro, which transforms the source code of the Embedded Rust program. Cargo.toml is located here.

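They run in different contexts: the Embedded Rust program is built for the Blue Pill without the Rust Standard Library (no_std), while the Procedural Macro runs inside the Rust Compiler on our computer, where the Standard Library is available.

So we may make the mistake of adding Rust Standard Library features to our Embedded Rust program. Which will produce Rust Compiler errors.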

As we code, always think carefully… Which level are we coding on?

Looking for more Embedded Rust on STM32 Blue Pill? Check this out…

Appendix: Index Of Images

This section is not meant for humans; it’s for web crawlers to index the text content of the images

“Hello I’m MyNewt. I’m an Embedded OS that runs on Bare Metal”

“Yep I’m open source. Fully fluffy inside.”

“Many types of microcontrollers. I run on Super Blue Pill too.”

“Why run Mynewt instead of Embedded Rust on bare metal?”

(Thanks to RedMart for delivering the bare metal cans on Sunday)

“What’s wrong with C? The code gets messy when we do something simple… Like sending a JSON message over CoAP, using macros…”

“It’s almost 2020. Why do coders suffer like this?”

“Here’s the solution in Rust… Simple Clean Rust Code”

“Just that the code inside the curly brackets {…} isn’t really Rust, it’s JSON”

“How does Rust support alien languages?”

“Answer: Rust supports them thru Declarative Macros”

Encoding Format: @json or @cbor

Parser State: @object means that we are parsing an object

Encoding Context: Required for macro hygiene. Key of JSON entry / Value of JSON entry. Remaining JSON to be parsed

“My two-legged friends are now thinking…”

“What if I need to call some code in Embedded C?”

“What if my sensor or network drivers run only on C?”

“The entire Mynewt API is in C… How do we call that from Rust?”

“Rust will let you call any C function, even C functions defined in Mynewt!”

“All you need to provide: the Rust bindings for the functions”

“Which the bindgen tool can generate automatically”

“This C declaration… is transformed into this Rust declaration by bindgen”

“With bindgen we can import all the C functions from Mynewt into Rust… In a single click!”

“But is the imported horde all set to run on BARE METAL?”

“See how we are rolling on top of Bare Metal? This is highly UNSAFE!”

“We could tumble down anytime and damage the Bare Metal. Or ourselves”

“Calling an imported C function from Rust is UNSAFE. Looks ugly too!”

“In my fantasy, calling a Mynewt function from Rust would be so safe and easy. Like this…”

“We should call the Mynewt function thru a SAFE wrapper…”

“Wrapper checks incoming strings for null termination”

“Wrapper checks that output pointers are valid”

“The SAFE wrapper protects Bare Metal from any damage. SAFER and SIMPLER coding… the power of Rust!”

“Sometimes working with Rust feels like meeting a Food Safety Inspector…”

“HEY YOU!!!”

“Yes… Inspector?”

“You are violating the rules of Safety and Hygiene… NO LIZARDS ALLOWED!!!”

“I can explain…”

“NO LIZARDS ALLOWED!!!”

“What’s Hygiene? Hygiene is keeping your bathroom clean. Hygiene is keeping your pet lizard’s cage clean”

“Hygiene means Never do sneaky things to fool the Rust Compiler… Like pretending to be something else with the same name”

“This is very odd but 100% true… (Ask your Food Safety Inspector): Declarative Macros respect Hygiene, Procedural Macros DO NOT respect Hygiene”

“Pretend you’re running a Restaurant Business (for humans, not lizards). A Food Safety Inspector walks into your restaurant and orders the soup. The Inspector makes a complaint…”

“There’s no salt in the soup”

“Sorry Inspector, I’m really sure there’s salt in your Tomato Soup”

“There’s no salt in the soup”

“I wish I could show it to you… But there’s really SALT IN YOUR SOUP!”

“There’s no salt in the soup”

“I am putting MY FINGER IN THE SOUP… There’s REALLY SALT IN THE SOUP!!! Why do you KEEP SAYING THAT???”

“There’s no salt in the soup”

“As you bash your head against the menu, you realise one thing… The menu reads TOMATO SOUP with SEA SALT… But you served the Inspector TOMATO SOUP with TABLE SALT”

“The Inspector was right… There was no SEA SALT in the soup. And SEA SALT is not the same as TABLE SALT, even though we refer to them by the same name: SALT”

“Unfortunately for Rust macro coders, this Hygiene problem is very real… The salt that’s inside the macro here… Is NOT in the same context as the salt here…”

“Because the contexts are different, the Rust Compiler stops us from using the same salt from inside and outside the macro…”

“The solution: Pass the salt as a parameter. This forces the Rust Compiler to treat the two salts as the same”

“In make_bad_soup(): Contexts don’t match. Rust Compiler fails with Hygiene Error”

“In make_good_soup(): Contexts match. Rust Compiler is happy!”

“Rust has so many goodies for Embedded Coders…”

“Do we really want to keep on coding Embedded C forever?”

“Some of the Arduino and C code I’ve seen… Made me fall off the tree!”

“As you sleep tonight, dream of the safe, clear, simple code that you’ll be writing… In Embedded Rust”

“I’m not really green. FOOLED YA!”
