Abusing Fire for Light

Jeff Hiner
Dwelo Research and Development
8 min read · Aug 2, 2019

Why we didn’t rewrite our IoT app in C++

Photo by Jamie Street on Unsplash

This is part 2 of a multi-part series about how we at Dwelo rewrote our IoT gateway software in Rust. The series starts here, and the next chapter is here.

I was originally going to write about why we at Dwelo didn’t choose C++ for our IoT rewrite, but then I realized there’s a broader issue to discuss. I want to put this decision in the context of the sins of all of our historically bad code, collectively, as a species. In the 80s and 90s we used our Promethean gift of C to write software for every embedded system everywhere. There was no limit to what we could accomplish. But a lot of things we thought were clever back then are in retrospect clearly terrible habits. These familiar mistakes continue to cause pain and destruction decades later. I’m going to bring up some of the indisputably awful things I’ve seen in production code in both C and C++. I’m going to talk about the mistakes I’ve made and seen, and in the next chapter we’ll discuss how a simple choice of language categorically eliminates a great many of them, while still giving us enough flexibility to write our low-level software.

We could have rewritten our IoT platform app in C++. It checked all the boxes. It would have been familiar to me, even easy. But it would also have made it too easy for myself or others to make mistakes. Using C is like using a candle for light. Its basic properties are well-known, it’s been around since the beginning of civilization, and it will set your house ablaze around you if you misuse it. (In this metaphor C++ would be “the set of all things that can be lit on fire to produce light”.)

I’ve read a lot of bad code over the last 30 years. I’ve written my fair share as well. None of us are perfect coders, and we should choose tools that reflect that humility. Let’s see if any of these pitfalls are familiar to you.

The Mystery Pointer Argument

void myfunc(char* c);
char c;
myfunc(&c);

Is the argument to myfunc an input? An output? Both? Is it actually intended to be a null-terminated array? Do all code paths initialize the passed value? Will myfunc segfault if I pass it NULL? Will the function retain the pointer I pass in and use it later? It's impossible to look at the function definition and determine exactly what it will do, because it’s allowed to do quite a lot. For small programs this is slightly annoying. For programs with more than about a dozen functions it rapidly becomes impossible to reason about program correctness, and we have to rely on API documentation. And as we all know, documentation is always correct and up to date.
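To make the ambiguity concrete, here is a minimal sketch of three functions that share the exact signature of myfunc yet expect three different things from the caller. All three names are hypothetical and made up for this illustration; none come from a real API.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical functions for illustration only.
// Each one has the signature void f(char*), but a different contract.

// Treats c as an in/out parameter pointing at exactly one char.
void to_upper_in_place(char* c) {
    *c = static_cast<char>(std::toupper(static_cast<unsigned char>(*c)));
}

// Treats c as an output only: the caller's previous value is ignored.
void read_status(char* c) {
    *c = 'K';  // placeholder status value for this sketch
}

// Treats c as a writable, null-terminated string and reverses it in place.
void reverse_string(char* c) {
    std::size_t n = std::strlen(c);
    for (std::size_t i = 0; i < n / 2; ++i) {
        char tmp = c[i];
        c[i] = c[n - 1 - i];
        c[n - 1 - i] = tmp;
    }
}
```

The compiler accepts a call with &c for all three. For reverse_string that call is only safe if the single char happens to be '\0'; otherwise strlen walks off the end of the variable, and nothing in the signature warned you.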

I’m sure the pedants among you are screaming at the monitor that if the pointer argument is intended solely as an input then the signature should take a const pointer. While this is correct, I have never worked on a legacy codebase that correctly and consistently declares arguments const. (Note: I’m not sure if this is because ancient compilers didn’t support the const keyword or because of some collective arthritis that made those extra characters painful to type in decades past. Regardless, it’s apparently commonplace in older code.)

Const correctness still doesn’t address the other elephant in the room…

The Null Pointer

const char* foo = 0; /* I remembered to initialize my variable! */
printf(foo);

I’m going to assume you’ve read about the billion dollar mistake and that you’ve seen “Segmentation Fault” or “NullPointerException” at some point in your programming experience. If you haven’t, go ahead and click on the link above. I’ll still be here.

Back yet? Awesome.

I’m going to come out here and say this: Sir Tony Hoare is not only brilliant, but also an amazingly humble person. There are very few computer scientists or software engineers that will flat-out admit a design decision was a mistake. And it’s unfortunate that this particular mistake from the 1960s is so pervasive that I’m sure many of you will look at the code above and think, “The compiler will warn you, what’s the big deal?”

The big deal comes when you’re passing pointer arguments around from A to B to C through ten different functions in eight different files and it’s not immediately apparent to the compiler (or a reviewer) that you’ve just passed null or uninitialized pointers around between functional contexts.

You’re not checking every function entry for nonzero pointers at run time, I’m not, and the standard library sure isn’t. Let’s not kid ourselves here.
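As a rough sketch of that hand-off problem, the hypothetical call chain below forwards a pointer through three layers without looking at it. Only the innermost function checks for null, and even then the best it can do is report the problem at run time. The function names and the -1 error code are assumptions for this example.

```cpp
#include <cstring>

// Hypothetical call chain for illustration only.
// The innermost function is the only one that checks its argument.
int count_chars(const char* text) {
    if (text == nullptr) {
        return -1;  // error code chosen for this sketch
    }
    return static_cast<int>(std::strlen(text));
}

// Each layer just forwards the pointer. In real code these would live in
// different files, far away from whoever created the pointer.
int layer_c(const char* text) { return count_chars(text); }
int layer_b(const char* text) { return layer_c(text); }
int layer_a(const char* text) { return layer_b(text); }
```

Delete the check in count_chars and layer_a(nullptr) still compiles; it just becomes undefined behavior three calls away from the mistake.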

Fine Then, Let’s Use C++ References

#include <string>

bool isEqualToLast(const std::string& s) {
    static const char * last = "";
    bool foo = s.compare(last) == 0;
    last = s.c_str();
    return foo;
}

This code compiles without warnings on GCC 8 with -Wextra, we’ve used const correctly, and we know the parameter points to real data. It will work fine… as long as each parameter is statically allocated, or heap allocated and not prematurely freed. Bug free, right until the call after you invoke it once with a stack variable.
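One way to remove the lifetime hazard, assuming a copy per call is affordable, is to own the previous value instead of pointing at it. The sketch below is a hypothetical variant (the name isEqualToLastCopy is mine), not a fix the compiler would ever suggest on its own.

```cpp
#include <string>

// Hypothetical variant of isEqualToLast that keeps its own copy of the
// previous argument instead of a pointer into the caller's string.
bool isEqualToLastCopy(const std::string& s) {
    static std::string last;  // starts empty, like the "" in the original
    bool same = (s == last);
    last = s;                 // deep copy: stays valid after the caller's string is gone
    return same;
}
```

This version behaves the same whether the argument lives on the stack, on the heap, or in static storage. The catch is that the language was equally happy with the dangling-pointer version, and the static state is still not safe to share between threads.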

Implicit Cast Bugs

char data[ENORMOUS_BUF_SZ];
for (int i = 0; i < sizeof(data); ++i) {
    /* do stuff */
}

One of the most innovative concepts from C is the type system. Every expression has a type, and if you try to use a character array where an integer is required, for example, you get a compile error. Often this prevents you from making gross mistakes like mixing up the order of function arguments. However, the compiler sometimes “helps” by silently converting 8-bit to 32-bit, signed to unsigned, fudging the types a bit until they line up. It is required to follow certain implicit conversion rules, even though these rules may not match your intuition. Because this is expected behavior, the compiler is not required to warn you when it does so.

We’ve had size_t for decades now. Unfortunately, modern code is still littered with loops that use int or unsigned int when they should use size_t, and incorrectly specified function argument types abound. It works fine, up until it doesn't. And don’t get me started with incorrect uses of time_t.
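Here is a small sketch of two of those silent conversions. The helper names are hypothetical, and neither function contains a cast. GCC can flag both with -Wconversion and -Wsign-conversion, but those are not part of -Wall.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration only.

// The int argument is implicitly converted to size_t. This is the same
// conversion applied to i in the comparison i < sizeof(data).
std::size_t as_size(int i) {
    return i;
}

// a + b is computed as int (integer promotion), then implicitly narrowed
// back to 8 bits on return.
std::uint8_t add_bytes(std::uint8_t a, std::uint8_t b) {
    return a + b;
}
```

as_size(-1) returns the largest value a size_t can hold, so a negative int index compares as larger than any buffer. add_bytes(200, 100) returns 44, because 300 does not fit in eight bits.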

Explicit Cast Bugs

const int data[512] = {0};
volatile uint32_t* WDT_REG = (volatile uint32_t *)0xFFFFFFE0;
/* ... */
byte_sending_function((char *)data, sizeof(data));
handle_watchdog((uint32_t *)WDT_REG);

The cast in the byte_sending_function call silently drops const and should be (const char *). The cast in the handle_watchdog call silently drops volatile, so the signature for handle_watchdog needs to take a volatile pointer.

Yes, I know about static_cast and reinterpret_cast in C++. But C-style casts are still in books and taught in classes, and they are still making it into new C++ code. And in every compiler I’ve seen, it's perfectly legal to cast away const or volatile.
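For comparison, here is a sketch of the same two calls with the qualifiers kept intact. The signatures of byte_sending_function and handle_watchdog are my assumptions, since the snippet above never shows them, and the "register" is ordinary memory so the example can run on a desktop.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed signature: reads the buffer only, so it takes a pointer to const.
// Returns a simple byte sum as a stand-in for actually sending anything.
unsigned byte_sending_function(const char* bytes, std::size_t len) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += static_cast<unsigned char>(bytes[i]);
    }
    return sum;
}

// Assumed signature: takes a pointer to volatile, so no cast is needed at the call site.
void handle_watchdog(volatile std::uint32_t* reg) {
    *reg = 0xA5A5A5A5u;  // made-up "kick" value for this sketch
}

const int data[512] = {0};
std::uint32_t fake_wdt_storage = 0;  // ordinary memory standing in for the hardware register
volatile std::uint32_t* const WDT_REG = &fake_wdt_storage;

unsigned send_and_kick() {
    // reinterpret_cast keeps const; writing <char*> here would not compile.
    unsigned sum = byte_sending_function(reinterpret_cast<const char*>(data), sizeof(data));
    handle_watchdog(WDT_REG);
    return sum;
}
```

The named cast refuses to drop const, which is the whole point. A C-style cast in the same spot would go through without complaint.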

Who needs error handling anyway?

#include <stdio.h>
#include <stdlib.h>
int main() {
    FILE * fp = fopen("file.txt", "w+");
    fprintf(fp, "This cannot possibly go wrong.\n");
    fclose(fp);

    return 0;
}

This lovely example from Google’s #1 hit for “fopen example” (reformatted slightly) doesn’t bother to check to see if we could actually open the file, and the compiler doesn’t require or even remind us to. Works for me, not a bug, hurry up and compile because I have more potential segfaults I need to add to this codebase. Chop chop.

(I realized while reviewing this post that fprintf and fclose can also return negative values to indicate failure. I forgot, because it’s common even for code that correctly verifies the fopen handle to not check even a single fprintf return. Omitting this check won’t segfault, but the code also won’t know if it couldn’t correctly write to the file.)
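For contrast, a minimal sketch of what checking all three calls might look like. The helper name write_message and its 0 / -1 return convention are assumptions for this example.

```cpp
#include <cstdio>

// Hypothetical helper: writes one line to path.
// Returns 0 on success and -1 on any failure.
int write_message(const char* path) {
    std::FILE* fp = std::fopen(path, "w+");
    if (fp == nullptr) {
        return -1;  // could not open or create the file
    }
    int status = 0;
    if (std::fprintf(fp, "This can go wrong, and now we notice.\n") < 0) {
        status = -1;  // the write failed
    }
    if (std::fclose(fp) != 0) {
        status = -1;  // fclose flushes buffered data, so it can fail too
    }
    return status;
}
```

That is three checks for three calls, and every one of them is optional as far as the compiler is concerned.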

Buffer overfl$%^&#\b0x9328A7F0Segmentation fault

This paragraph has performed an illegal operation and must be terminated.

If you believe you are seeing this message in error, please contact support.

Union types

union {
    int id;
    void * widget_ptr;
} widget;
#ifdef LINUX
widget.id = 42;
#else
widget.widget_ptr = malloc(64);
#endif
/* many lines later... */
/* I'm quite certain I stored a pointer in here */
free(widget.widget_ptr);

Guys, I’m going to put this button right here with a small sign that says “do not touch.” As long as nobody misuses it, we’ll be fine.
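The usual workaround is to carry a tag next to the union and check it before touching a member. The sketch below shows that convention. The type and function names are hypothetical, and nothing in the language forces anyone to call release_widget instead of free. (C++17's std::variant packages the same idea.)

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical tagged version of the widget union: the tag records which member is live.
enum class WidgetKind { Id, Pointer };

struct TaggedWidget {
    WidgetKind kind;
    union {
        int id;
        void* widget_ptr;
    };
};

TaggedWidget make_id_widget(int id) {
    TaggedWidget w;
    w.kind = WidgetKind::Id;
    w.id = id;
    return w;
}

TaggedWidget make_pointer_widget(std::size_t size) {
    TaggedWidget w;
    w.kind = WidgetKind::Pointer;
    w.widget_ptr = std::malloc(size);
    return w;
}

// Frees the allocation only when the tag says a pointer is stored.
// Returns true if it freed something.
bool release_widget(TaggedWidget& w) {
    if (w.kind != WidgetKind::Pointer) {
        return false;
    }
    std::free(w.widget_ptr);
    w.widget_ptr = nullptr;
    return true;
}
```

The tag is still just a sign next to the button: it works exactly as long as every piece of code remembers to read it.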

Thread safety

I’m not even going to get into the gory details of reentrant functions, side effects, atomicity, mutexes and semaphores, memory fences, or any of that. C and C++ are structured for single-threaded imperative programming: you give the computer a list of calculations to perform, in order. If you’re trying to be clever and use them for multithreaded applications then the onus is on you to be skilled enough with your pointers and aliasing and shared state to never make a single mistake anywhere in your code. If it works, you will have the fastest business app the world has ever seen. If it doesn’t, will anyone find the problems before you’re long gone? It’s said that genius and insanity are two sides of the same coin; let’s explore that boundary a bit. Here is your best practices guide. Valhalla awaits.

Okay, I get it. But I like C. Can’t we just fix C/C++?

A ton of work has been poured into making warnings smarter, improving linters, documenting best practices, and so on. There are shiny new helpful bits in C++11/14/17: unique_ptr and range-based for loops, alongside the older RAII idiom, can all help prevent bugs (if you use them). There are standards organizations like MISRA and security organizations like CERT to help you find and fix critical safety and security issues, if everyone on your team follows those recommendations to the letter without fail. But the sharp detritus of K&R C is still littering the floor, and nothing stops you or the guy next to you from ignoring the caution tape and tripping onto it.
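To be fair to those newer features, here is a small sketch of what they buy you when used consistently. The Sensor type and the function name are hypothetical.

```cpp
#include <memory>
#include <vector>

// Hypothetical type for illustration only.
struct Sensor {
    int id;
};

int sum_sensor_ids() {
    // Each Sensor is owned by a unique_ptr, so there is no delete to forget.
    std::vector<std::unique_ptr<Sensor>> sensors;
    for (int id = 1; id <= 3; ++id) {
        sensors.push_back(std::unique_ptr<Sensor>(new Sensor{id}));
    }

    int total = 0;
    // Range-based for: no index variable, so no off-by-one and no int vs. size_t mismatch.
    for (const auto& sensor : sensors) {
        total += sensor->id;
    }
    return total;  // RAII: the vector and every Sensor are destroyed here
}
```

None of this is mandatory. The same file can still contain a raw new, a C-style cast, and an index loop with an int counter, and it will compile just the same.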

Despite substantial effort dedicated to tooling and process, the C standard still has tons of undefined behavior. When all is said and done, the pair of programming languages that drive most of the world’s software are subtly yet fundamentally broken because trained, intelligent professionals consistently make costly mistakes in production. We’ve collectively tried to paper over the problem, and it’s a testament to the incredible skill of just two Bell Labs engineers that their language is flexible enough to let us try all these fixes! But we really need to address the underlying structural issues. And, unfortunately, we can’t fix C without breaking backward compatibility.

C and C++ code manages your car’s throttle control, airbags, and anti-lock brake system. It underlies avionics software for passenger jets, both critical and non-critical. It quietly runs the embedded operating system on a bunch of stuff you use without a second thought, things you expect are so simple they should just be secure and stable by default. Credit card terminals. Electrical power grid systems. Elevators. Military hardware. Wifi routers. ATMs. Voting machines. As a fallible human being who writes embedded software for a living and has seen what’s in the wild, that terrifies me.

Collectively, we wield these tools because they are familiar, but experience shows we cannot use them safely. I like C, and we owe a lot to its legacy. But we also have a responsibility to write in languages that help us work around our own shortcomings. The problem isn’t that we’re installing road flares to light our houses when gas lamps would be more appropriate, though there’s certainly enough of that going around. The problem is that we shouldn’t be using fire at all. We have LED flashlights now. It’s time to stop lighting candles.

In the next chapter, we talk about the ways the design of the Rust language mitigates some of these concerns.

I’m an IoT software engineer at Dwelo, a company that is working to make smart apartments a reality.