Why is low-power software so difficult?

If you are an embedded software developer, I expect that you’ll enjoy this article, and perhaps find it helpful. If you are the manager of an embedded development team, then you should also read this — although you may not like what I have to say.

There are plenty of articles and app-notes concerning low-power software and hardware design, but here I want to focus on the human (and political) aspect.

A quick history:

Today, the overwhelmingly dominant method of fabricating digital ICs is the CMOS process, but let’s rewind for a moment back to the early ‘70s, when CMOS chips were just becoming commercially available… If you weren’t alive back then, it’s hard to describe what a fundamental revolution this technology was; in fact, as important as flash memory, JTAG, or the microchip or transistor itself.

Power consumption was suddenly and drastically reduced, by several orders of magnitude, compared to the incumbent TTL and NMOS processes, with far-reaching consequences for total-system design. An honest-to-God computer system could now run for weeks off a small battery pack. Today’s chips would in fact glow red-hot and melt themselves into puddles of slag, were it not for CMOS.

And now, today:

Transistor sizes shrank, integration levels increased, and we now have cheap SoCs/MCUs with staggering capabilities; incredible, even when compared to the state of the art of ten years ago. Yet we take this technology for granted. The chips are very inexpensive, but we pay another price: complexity.

Immense, almost overwhelming, complexity.

Low-power requirements amplify that complexity by at least a whole order of magnitude. With the available suites of SoC subsystems comes a plethora of power-gating and clocking options, each of which must be carefully orchestrated within the whole to maintain coherent operation — or there will be utter hell to pay.

Welcome to the Twilight Zone, folks…

This is not the same tamed-down, who-cares-about-power-consumption profligate regime as desktop IT; there are no disk drives, no cooling fans, no 350 W power supplies. Memory requires power, so bloated desktop-style code cannot be tolerated; both the tasks and the OS itself must be lean and mean. Frugality with every microwatt is crucial. Wasting power on a smartphone may be merely an annoyance for the customer, but on a spacecraft or in a medical implant, wasting power can result in disaster (speaking of which, I’ll add that these systems must not crash, ever).

Everyone on the product team (and I mean everyone, including the marketing people) must have a good understanding of what the power budget needs to be, right from the initial concept phase, and throughout the entire development process. There is an upside to this: you’ll know when you are able to say “good enough” — and when to piss some people off and say that it’s not nearly good enough. For critical applications, “good enough” of course means “the best possible”.

Good software people already use their best practices for well-factored, modular code that produces no unintended (or worse yet, unknown) side-effects, but now there is a Joker in the deck: no matter how well-planned and robust the software architecture is, there is another complex network of interdependencies in the hardware that must be deeply understood and dealt with; changing merely a single bit in a hardware register can have far-reaching consequences. For a trivial example, go ahead and change the main PLL: what just happened to the UART baud rate? And if you are doing DSP, what just happened to your intended spectrum?

Side-effects cannot be avoided; we can only plan ahead so that the system-as-a-whole is “aware” of the effects, and adapts its state suitably, because we already foresaw those consequences during the detailed design.

The run-of-the-mill IT programmer’s mentality will not help us here; we must intimately understand the underlying hardware that we are working with, and that includes (gasp!) learning our CPU’s assembly language.


Most programmers are sold on pre-supplied libraries and frameworks; these do a tremendous job of speeding development, and they take care of the heavy lifting so that the programmer does not need to become intimately familiar with the underlying hardware. But what is the real price of this convenient level of abstraction? (If you’re unfamiliar with the word “tanstaafl”, it’s an acronym for “there ain’t no such thing as a free lunch”.)

For instance, there are library functions to set a Pulse-Width Modulation (PWM) value in a hardware timer that look something like:

SetTimerOperation(WhichTimer, WhichTimerChannel, TimerMode, PWMvalue);

This takes dozens of instructions and more than 40 clocks to do exactly the same job as:

ThisTimer->ThisChannelPWMregister = PWMvalue;

which takes only two instructions and three clocks. Setting the PWM is a frequently performed operation, so the cost of using the library function is very high. Tanstaafl.

It gets much worse when we use a typical bloated commodity OS with a convenient high-level language like Python or Java. Take Android as an example. Say we want to simply put a single pixel on the screen: our program chugs through the Java Virtual Machine, which chugs through the Android API framework, which chugs through Android’s Linux kernel, which chugs through the Hardware Abstraction Layer, which finally chugs through the device driver. Hundreds of instructions and thousands of clock cycles, merely to put a single pixel on the screen.

Why does your mobile device get hot? Why doesn’t the battery run as long as you wish? Now you know. Tanstaafl.

Think again.

We can’t meet our performance goals by cranking up the clock speed or adding more cores; we must instead make the programs as efficient as possible, and that restricts the choice of what programming languages we can use. Every bit (damn near literally every binary bit) of our object code needs to be crafted by an artisan, not cranked out ASAP by sweatshop factory workers.

This requires time. Lots of time, and managers need to understand this.

Seemingly endless hours of time, buried in thousands of pages of documentation. Time to solve problems that were caused by the docs being incorrect. Time for experiments on the test-bench to find the truth, when the docs are unclear or incomplete. Time for coffee breaks that suddenly turn into marathon brainstorming sessions at the whiteboard (impromptu “design review meetings”, which often turn out to be more productive than the formal ones). Time to nail down the specs and requirements — I mean really nail them down, because a seemingly trivial change downstream can result in a lot of scrapped work. Time to sort out the caveats and “gotcha” traps.

With luck, we are managers who have been down in the trenches, and already know all about this. If not, then we will need to place trust in the crew, and give them every resource possible to get the job done. We need to watch the team at work solving all the problems, and learn for ourselves what’s really involved. The next time we are asked to sign off on a proposed schedule, we’ll know much better than to give a knee-jerk “can do!”.

We all hate wasting our time in meetings and writing memos, right? Low-power gigs have some news for us: The bad news is that we will need to budget a lot of time for those sorts of activities (in fact we’ll probably need to memo ourselves, frequently, just to keep track of all the myriad details, details, details).

The good news is that the time won’t be wasted; it will be a very worthwhile up-front investment that will pay off handsomely in the home stretch.

We just can’t expect miracles with respect to the schedule.

cheers, — vic

Originally published at www.linkedin.com on March 13, 2015.