The Debugger’s Guide to the RTOS

Prasanna Rawke
Published in Tech@Carnot
10 min read · Aug 9, 2020

This article explains the Carnot way of designing an RTOS-enabled IoT application.

A Real-Time Operating System, commonly known as an RTOS and pronounced “Are-Toss”, lets an embedded system run multiple tasks concurrently on a simple microcontroller, unlike the bare-metal approach.

A bare-metal program performs the function of multiple tasks sequentially as a single task.

But why do we need RTOS in the first place?

Over time, after a lot of brainstorming, debugging, and optimization on different IoT applications for cars, bikes, and tractors at Carnot, we have come up with a simple rule of thumb:

Any IoT application can be divided into three stages:

Input, Processing and Output

  1. The input stage reads the different sensors in your system.
  2. The processing stage applies algorithms to the raw data to generate processed, meaningful data.
  3. The output stage shows this processed data on a display, sends it to a server over the internet, or streams it over Bluetooth, etc.

Bare-metal IoT Application

Take any embedded application; let’s consider a simple calculator.

  1. Your calculator has a keypad to receive numbers and operations like addition and multiplication. Whenever you press a key, it reads your input.
  2. The result is computed from the keypad inputs. If the user enters wrong values, it generates an error or a warning.
  3. The generated output or warning is then displayed on the screen.

Now that you know what a typical IoT application looks like, you know how to implement it with the bare-metal approach.

The Problem with the Bare-Metal Approach

The bare-metal approach is a polling- and interrupt-driven approach.

  1. Here, the input module reads the data from the sensors periodically and sets a dataAvailable flag for the processing module.
  2. The processing module keeps polling this flag. Once the flag is set, it processes the data, generates the output, and sets the outputAvailable flag.
  3. Similarly, the output module waits for the outputAvailable flag and sends or displays the output.
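The three steps above can be sketched in portable C as one pass of a polling loop. This is only an illustration under assumptions: the flag names come from the text, while `readSensor`, the doubling "algorithm", and `lastDisplayed` are hypothetical stand-ins for a real sensor driver, real processing, and a real display.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hand-off flags between the stages (names follow the text). */
static volatile bool dataAvailable = false;
static volatile bool outputAvailable = false;

static int32_t rawSample;       /* written by the input module      */
static int32_t processedValue;  /* written by the processing module */
static int32_t lastDisplayed;   /* stand-in for a display or server */

/* Hypothetical sensor driver: returns 10, 20, 30, ... on successive calls. */
static int32_t readSensor(void)
{
    static int32_t value = 0;
    value += 10;
    return value;
}

static void inputModule(void)
{
    if (!dataAvailable) {
        rawSample = readSensor();
        dataAvailable = true;
    }
}

static void processingModule(void)
{
    /* Poll the flag; only process when the previous output was consumed. */
    if (dataAvailable && !outputAvailable) {
        processedValue = rawSample * 2;  /* placeholder algorithm */
        dataAvailable = false;
        outputAvailable = true;
    }
}

static void outputModule(void)
{
    if (outputAvailable) {
        lastDisplayed = processedValue;
        outputAvailable = false;
    }
}

/* One pass of the loop: the modules always run one after another, so a
 * slow module delays every module that comes after it. */
void superLoopIteration(void)
{
    inputModule();
    processingModule();
    outputModule();
}
```

On a target, `main` would simply call `superLoopIteration()` inside a `for (;;)` loop.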

By now you might have noticed that the input, processing, and output modules run sequentially, one after another.

If one module takes extra time, the other modules are blocked for that duration, which means you cannot precisely schedule a particular module in a straightforward way. And whenever you add a new sensor to your system, the process becomes more and more complicated.

Here, the RTOS comes to the rescue. Using an RTOS, you can run the input, processing, and output stages as three different threads running simultaneously.

RTOS-Based Approach

The input thread forwards its data to the processing thread, and the processing thread forwards its results to the output thread.

Generic RTOS based implementation of any IoT application

Each of these threads has a dedicated task to perform. In the case of a calculator, for example:

  1. The input thread reads the pressed keys periodically or on interrupt events. Whenever a key is pressed, it forwards the data to the processing thread and starts waiting for the next input event.
  2. The processing thread receives the data from the input thread and generates the result (or the warnings) to be displayed to the user. After forwarding the generated result to the output thread, it starts waiting for new data from the input thread.
  3. The output thread continuously waits for new data from the processing thread. It updates the display whenever new data is available.

You can run these three threads simultaneously using an RTOS to obtain precise timing and better accuracy in your application.

The rest of the article will explain different tools and mechanisms provided by RTOS, and how to use FreeRTOS APIs to utilize them in your IoT application.

The Building Blocks of RTOS

These are the most important tools from RTOS, needed to implement a full-fledged embedded application:

If your embedded application is a human body…

Sensors and the drivers are the five senses
Threads are different functions you perform like reading, watching a movie
Scheduler is the heart and brain which decides what happens next based on emotions (interrupts) and situation (present state of the program)
Queues, Streambuffers are the veins, circulating the blood through the body
Mutex, Semaphores are like hands and legs which perform the balancing (task synchronization and resource management)
And timers and delays provide the sleep in such a way that you still keep breathing

We will go through these different components one by one.

Scheduler

The scheduler, the brain of the RTOS, plays the critical role of running the different threads one at a time and determining which thread runs next. It takes interrupts, tasks, and RTOS timers into account to decide the state of each thread (running / ready / waiting).

While a scheduler can use different scheduling algorithms, such as round-robin, first-in-first-out, or shortest-job-first, to decide the next task to execute, most state-of-the-art RTOSes use priority-based schedulers.

You don’t have to worry much about this part of the RTOS, as its developers have already put a lot of thought into making it efficient and reliable.

Threads (aka Tasks)

A task is a piece of code that can be separated from the remaining program and can be executed independently. An overall embedded application can be divided into multiple such tasks, which can run independently of each other.

Now, why would I want to divide my application into multiple tasks?

  • It simplifies the implementation by separating the source code of one task from another
  • It increases readability and reduces the number of possible bugs seen during the development cycle
  • And thus, different developers can work on different tasks (yeah, both RTOS tasks and scrum tasks) at the same time without worrying about the other tasks

Every task has a few parameters allocated to it. These parameters can be specified in its initialization:

  • Stack size: All local variables used by the thread are stored in the stack allocated to it.
  • Priority: The priority allocated to the task is used by the scheduler to determine which task runs next. A higher-priority task or ISR can interrupt an ongoing lower-priority task; this is called preemption.

“Preemption is the act of temporarily interrupting a task being carried out by the operating system, with the intention of resuming the task at a later time.”

Thread states

RTOS threads can be in one of three states:

1. Running thread: In an RTOS program on a single-core microcontroller, there can be only one running thread at any given instant. A running thread goes into the waiting state if it starts waiting for an event. When a running thread is interrupted by a thread or ISR of higher priority, it goes into the ready state.

2. Ready thread: Any thread that is ready to be executed is in the ready state. The ready thread with the highest priority goes into the running state when the scheduler runs.

3. Waiting thread: Any thread waiting for some event to occur is in the waiting state. A waiting thread goes into the ready state when the event it is waiting for occurs. Typical events are a semaphore or mutex becoming available, data arriving in a queue, or a simple delay expiring.

Note that threads in the ready and waiting states do not consume any CPU time; the only resource they hold is their allocated stack.

Only the running thread can access the peripherals at any given moment.
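The state transitions described above can be modelled in a few lines of portable C. This is a toy model for illustration, not how FreeRTOS implements its scheduler: the `Task` struct, `pickNextTask`, and `schedule` are hypothetical names, a higher number means a higher priority, and equal priorities (which a real RTOS typically time-slices) are not modelled.

```c
#include <stddef.h>

typedef enum { TASK_RUNNING, TASK_READY, TASK_WAITING } TaskState;

typedef struct {
    const char *name;
    unsigned    priority;  /* higher number = higher priority */
    TaskState   state;
} Task;

/* Returns the index of the highest-priority task that is not waiting,
 * or -1 if every task is waiting (a real RTOS would run its idle task). */
int pickNextTask(const Task *tasks, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].state == TASK_WAITING) {
            continue;
        }
        if (best < 0 || tasks[i].priority > tasks[best].priority) {
            best = (int)i;
        }
    }
    return best;
}

/* Applies the decision: the chosen task becomes RUNNING, and a task that
 * was running but lost the CPU is preempted back to READY. */
int schedule(Task *tasks, size_t count)
{
    int next = pickNextTask(tasks, count);
    for (size_t i = 0; i < count; i++) {
        if ((int)i == next) {
            tasks[i].state = TASK_RUNNING;
        } else if (tasks[i].state == TASK_RUNNING) {
            tasks[i].state = TASK_READY;
        }
    }
    return next;
}
```

For example, if a low-priority processing task is running and the event a high-priority input task was waiting for occurs, the next call to `schedule` moves the input task to running and the processing task back to ready.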

Initializing an RTOS Task:

The following snippet initializes a thread using FreeRTOS APIs.
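A minimal sketch of such an initialization, assuming the FreeRTOS kernel and a port for your microcontroller are already part of the build. The task function `mainTask` and the macro names are illustrative and not taken from the original project. One detail worth noting: `xTaskCreate` takes the stack depth in words (`StackType_t`), not bytes, so the 512 bytes are converted before the call.

```c
#include "FreeRTOS.h"
#include "task.h"

#define MAIN_TASK_STACK_BYTES 512u
#define MAIN_TASK_PRIORITY    1u

/* Task body: a FreeRTOS task function must never return. */
static void mainTask(void *params)
{
    (void)params;
    for (;;) {
        /* Application work goes here (placeholder). */
        vTaskDelay(pdMS_TO_TICKS(100));
    }
}

int main(void)
{
    /* Stack depth is given in words, so convert from bytes. */
    BaseType_t created = xTaskCreate(mainTask,
                                     "main",
                                     MAIN_TASK_STACK_BYTES / sizeof(StackType_t),
                                     NULL,               /* no task parameter   */
                                     MAIN_TASK_PRIORITY,
                                     NULL);              /* handle not needed   */
    if (created != pdPASS) {
        for (;;) { }  /* not enough heap to create the task */
    }

    vTaskStartScheduler();  /* only returns if the scheduler could not start */
    for (;;) { }
}
```

This needs a target with a FreeRTOS port and a `FreeRTOSConfig.h`, so it is not something you can build on a desktop as-is.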

Main Thread with Priority 1 is initialized with 512 bytes of Stack Memory

Semaphores and Mutexes

Semaphores and mutexes are used for thread synchronization, and sharing a resource across multiple threads in the RTOS.

Semaphores

A semaphore is simply a non-negative counter shared between threads, used to signal from one thread to another.

A semaphore is generally used for two purposes:

  1. Signaling events between threads:
    An event handler (e.g. the UART receive interrupt) increments the semaphore, and the thread decrements it when it processes the event (e.g. the UART data is handled).
  2. Resource management:
    A task decrements the semaphore whenever it starts using a resource and increments it when the resource is free again. If the semaphore hits 0, all resources are busy.
    A mutex can be used when only one resource is available.
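The first use case, signaling an event from an interrupt to a thread, might look like the following with the FreeRTOS semaphore API. Treat it as a sketch under assumptions: the ISR name `UART_RX_IRQHandler`, the helper `processUartData`, the stack depth, the priority, and the limit of 16 pending events are all hypothetical, and `configUSE_COUNTING_SEMAPHORES` must be set to 1.

```c
#include "FreeRTOS.h"
#include "task.h"
#include "semphr.h"

#define MAX_PENDING_RX_EVENTS 16u  /* assumption: bound on unprocessed events */

static SemaphoreHandle_t uartRxSemaphore;

/* Hypothetical helper that reads and handles the received UART data. */
extern void processUartData(void);

/* Hypothetical ISR name; the real one comes from your vendor's vector table.
 * A real handler would also clear the peripheral's interrupt flag. */
void UART_RX_IRQHandler(void)
{
    BaseType_t higherPriorityTaskWoken = pdFALSE;

    /* Increment the semaphore: one more event is pending. */
    xSemaphoreGiveFromISR(uartRxSemaphore, &higherPriorityTaskWoken);

    /* Switch context on exit if a higher-priority task was unblocked. */
    portYIELD_FROM_ISR(higherPriorityTaskWoken);
}

static void uartTask(void *params)
{
    (void)params;
    for (;;) {
        /* Decrement the semaphore; wait up to 1 s for the next event. */
        if (xSemaphoreTake(uartRxSemaphore, pdMS_TO_TICKS(1000)) == pdTRUE) {
            processUartData();
        }
    }
}

/* Call once before enabling the UART interrupt and starting the scheduler. */
BaseType_t uartSignalingInit(void)
{
    uartRxSemaphore = xSemaphoreCreateCounting(MAX_PENDING_RX_EVENTS, 0);
    if (uartRxSemaphore == NULL) {
        return pdFAIL;
    }
    return xTaskCreate(uartTask, "uart", 256, NULL, 2, NULL);
}
```

Note the `FromISR` variant inside the interrupt handler: the regular `xSemaphoreGive` is meant for task context only.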

Mutexes

A mutex (‘MUT’ual ‘EX’clusion) is nothing but an RTOS-aware binary flag.

A mutex can be used as a key to a resource shared between two threads, so that only one thread has access to the resource at any instant.

Use of a mutex to share an RTC peripheral between two threads

When the rtcMutex is held by a thread, the RTC module is assumed to be busy. (In FreeRTOS, mutexes are meant for tasks only and cannot be taken from an interrupt service routine.)

Whenever a task wants to access RTC, it has to acquire the rtcMutex, access the RTC module, and release the mutex.

While the rtcMutex is acquired by taskA, taskB will wait for the mutex before accessing RTC.

Once the rtcMutex is released by taskA, taskB can acquire the mutex, access the RTC, and release the mutex.
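The acquire, access, release sequence could be written as follows with the FreeRTOS mutex API. This is a sketch, not the original code: `rtcReadSeconds` is a hypothetical RTC driver call, the timeouts and periods are arbitrary, and `configUSE_MUTEXES` must be set to 1. The names `rtcMutex`, `taskA`, and `taskB` follow the text.

```c
#include <stdint.h>
#include "FreeRTOS.h"
#include "task.h"
#include "semphr.h"

static SemaphoreHandle_t rtcMutex;

/* Hypothetical RTC driver call; not safe to use from two tasks at once. */
extern uint32_t rtcReadSeconds(void);

/* Acquire the mutex, access the RTC, release the mutex.
 * Returns pdFAIL if the RTC stayed busy for more than 100 ms. */
static BaseType_t readRtcLocked(uint32_t *seconds)
{
    if (xSemaphoreTake(rtcMutex, pdMS_TO_TICKS(100)) != pdTRUE) {
        return pdFAIL;
    }
    *seconds = rtcReadSeconds();
    xSemaphoreGive(rtcMutex);
    return pdPASS;
}

static void taskA(void *params)
{
    uint32_t now;
    (void)params;
    for (;;) {
        if (readRtcLocked(&now) == pdPASS) {
            /* use the timestamp, e.g. to tag a sensor sample */
        }
        vTaskDelay(pdMS_TO_TICKS(1000));
    }
}

static void taskB(void *params)
{
    uint32_t now;
    (void)params;
    for (;;) {
        /* Waits inside readRtcLocked while taskA holds the mutex. */
        if (readRtcLocked(&now) == pdPASS) {
            /* use the timestamp, e.g. for a log entry */
        }
        vTaskDelay(pdMS_TO_TICKS(250));
    }
}

/* Call once before starting the scheduler. */
BaseType_t rtcSharingInit(void)
{
    rtcMutex = xSemaphoreCreateMutex();
    if (rtcMutex == NULL) {
        return pdFAIL;
    }
    if (xTaskCreate(taskA, "taskA", 256, NULL, 1, NULL) != pdPASS) {
        return pdFAIL;
    }
    return xTaskCreate(taskB, "taskB", 256, NULL, 1, NULL);
}
```

Keeping the take and give inside one small helper makes it harder to forget the release on some code path.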

Timers

A software timer can be defined using the RTOS APIs, with a resolution down to the RTOS tick period (usually 1 ms).
Using RTOS timers removes the dependency on hardware-specific timer details such as the clock prescaler, up/down counter mode, etc.

For instance, if you implement an RTOS timer of 1 s, the same source code can be used on an Arduino as well as on ST’s Discovery boards, as long as both use the same RTOS codebase.

A scene at Carnot Headquarters

Using an RTOS Timer for a Blinky
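A possible version of this blinky with the FreeRTOS software timer API, assuming `configUSE_TIMERS` is set to 1. `ledToggle` is a hypothetical board-support function; everything hardware-specific is hidden behind it.

```c
#include "FreeRTOS.h"
#include "task.h"
#include "timers.h"

/* Hypothetical board-support function that flips the LED pin. */
extern void ledToggle(void);

/* Runs in the context of the RTOS timer service task, so it must not block. */
static void blinkTimerCallback(TimerHandle_t timer)
{
    (void)timer;
    ledToggle();
}

int main(void)
{
    /* Auto-reload timer (pdTRUE) that expires every 1000 ms. */
    TimerHandle_t blinkTimer = xTimerCreate("blink",
                                            pdMS_TO_TICKS(1000),
                                            pdTRUE,
                                            NULL,
                                            blinkTimerCallback);

    /* Starting a timer before the scheduler runs is allowed; it begins
     * counting once the scheduler starts. */
    if (blinkTimer == NULL || xTimerStart(blinkTimer, 0) != pdPASS) {
        for (;;) { }  /* timer could not be created or started */
    }

    vTaskStartScheduler();
    for (;;) { }
}
```

Nothing here refers to a prescaler or a counter mode, which is what makes the same source portable across boards running the same RTOS.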

Use of a software timer to toggle an LED every 1 s

Delays

An RTOS-based delay puts the thread into the waiting state for the specified duration.
A bare-metal delay, on the other hand, keeps the CPU busy executing NOP instructions or counting in a loop until the delay is over.

Why keep looking at the watch and wait,
when you can do other tasks or just set alarm and sleep!

Whenever a thread calls a delay, it goes into the waiting state, and the scheduler is free to execute other tasks that are in the ready state.

RTOS Delays running Two Blinky’s at the same time
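One way to write the two blinkies with `vTaskDelay`, assuming `INCLUDE_vTaskDelay` is set to 1. The toggle functions are hypothetical board-support calls, and the 500 ms and 1000 ms periods are arbitrary choices for illustration.

```c
#include "FreeRTOS.h"
#include "task.h"

/* Hypothetical board-support functions. */
extern void redLedToggle(void);
extern void blueLedToggle(void);

static void redBlinkTask(void *params)
{
    (void)params;
    for (;;) {
        redLedToggle();
        /* Waiting state for 500 ms: the CPU is free for other tasks. */
        vTaskDelay(pdMS_TO_TICKS(500));
    }
}

static void blueBlinkTask(void *params)
{
    (void)params;
    for (;;) {
        blueLedToggle();
        vTaskDelay(pdMS_TO_TICKS(1000));
    }
}

int main(void)
{
    if (xTaskCreate(redBlinkTask, "red", 128, NULL, 1, NULL) != pdPASS ||
        xTaskCreate(blueBlinkTask, "blue", 128, NULL, 1, NULL) != pdPASS) {
        for (;;) { }  /* not enough heap for the tasks */
    }

    vTaskStartScheduler();
    for (;;) { }
}
```

Each task reads like a standalone bare-metal blinky, yet neither blocks the other, because a delayed task gives up the CPU instead of spinning.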

Use of RTOS delays to blink a red and a blue LED simultaneously.

Queues and Streambuffers

These RTOS data types provide the tools for inter-thread and interrupt-to-thread communication, simplifying the implementation of shared variables that are accessed from different threads and ISRs.

A brute-force implementation of shared variables often leads to multiple threads accessing or updating the same variable at the same time, which makes the behavior of the program ambiguous and leads to bugs later on.

Why implement a Linked List and Circular Queue,
when you have a Task Aware RTOS Queue and a Streambuffer

Queues

A typical use of a first-in-first-out RTOS queue in an RTOS-enabled application:

  1. A sender thread or an ISR enqueues an object in the queue.
  2. The receiver thread dequeues it from the queue.
  3. The receiver thread can wait in the waiting state until there is an element in the queue or the user-specified timeout expires, whichever happens first.

As the receiver thread stays in the waiting state, it does not use any CPU time, unlike the polling-based bare-metal implementation.

Timer callback sending the data to a thread by RTOS Queue
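A sketch of this pattern with the FreeRTOS queue and timer APIs, assuming `configUSE_TIMERS` is 1. `rtcReadSeconds` and `handleTimestamp` are hypothetical helpers, and the queue depth, period, and timeout are arbitrary values chosen for the example.

```c
#include <stdint.h>
#include "FreeRTOS.h"
#include "task.h"
#include "queue.h"
#include "timers.h"

#define TIMESTAMP_QUEUE_LENGTH 8u  /* assumption: room for 8 timestamps */

static QueueHandle_t timestampQueue;

/* Hypothetical helpers: RTC driver read and the actual processing step. */
extern uint32_t rtcReadSeconds(void);
extern void handleTimestamp(uint32_t timestamp);

/* Sender: runs in the timer service task, so it must not block.
 * A block time of 0 drops the sample if the queue is full. */
static void timestampTimerCallback(TimerHandle_t timer)
{
    uint32_t timestamp = rtcReadSeconds();
    (void)timer;
    (void)xQueueSend(timestampQueue, &timestamp, 0);
}

/* Receiver: waits in the waiting state for an element or a 5 s timeout. */
static void processingTask(void *params)
{
    uint32_t timestamp;
    (void)params;
    for (;;) {
        if (xQueueReceive(timestampQueue, &timestamp,
                          pdMS_TO_TICKS(5000)) == pdTRUE) {
            handleTimestamp(timestamp);
        } else {
            /* Timeout: nothing arrived for 5 s; react here if needed. */
        }
    }
}

/* Call once before starting the scheduler. */
BaseType_t timestampPipelineInit(void)
{
    TimerHandle_t timer;

    /* The queue copies items by value, each sizeof(uint32_t) bytes. */
    timestampQueue = xQueueCreate(TIMESTAMP_QUEUE_LENGTH, sizeof(uint32_t));
    if (timestampQueue == NULL) {
        return pdFAIL;
    }
    if (xTaskCreate(processingTask, "proc", 256, NULL, 2, NULL) != pdPASS) {
        return pdFAIL;
    }
    timer = xTimerCreate("ts", pdMS_TO_TICKS(1000), pdTRUE, NULL,
                         timestampTimerCallback);
    if (timer == NULL) {
        return pdFAIL;
    }
    return xTimerStart(timer, 0);
}
```

If the sender were a hardware ISR instead of a timer callback, `xQueueSendFromISR` would be the call to use.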

An RTOS timer callback reading the timestamp and sending it to the Processing Task via queue

Streambuffer

A streambuffer is essentially a shared buffer that can be filled from an ISR or a thread and read from another thread.

You might be wondering, what is the difference between a queue and a streambuffer?

A queue holds elements of a fixed data type, like an integer or a weather_t object. A streambuffer, on the other hand, holds bytes, so the data can be of any size.

You can use streambuffers whenever you want to transmit and receive data of unknown size across different threads.

Streambuffer to transfer UART data between two threads
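A minimal sketch with the FreeRTOS stream buffer API (available from FreeRTOS v10). `uartRead` and `parseBytes` are hypothetical helpers, and the buffer sizes and timeouts are arbitrary. Note that a FreeRTOS stream buffer assumes a single writer and a single reader.

```c
#include <stddef.h>
#include <stdint.h>
#include "FreeRTOS.h"
#include "task.h"
#include "stream_buffer.h"

#define UART_STREAM_SIZE_BYTES 256u
#define UART_TRIGGER_LEVEL     1u   /* wake the reader as soon as 1 byte arrives */

static StreamBufferHandle_t uartStream;

/* Hypothetical helpers: a UART read that returns however many bytes it
 * got (0..maxLen), and a parser for the received bytes. */
extern size_t uartRead(uint8_t *buffer, size_t maxLen);
extern void parseBytes(const uint8_t *data, size_t length);

/* Writer task: pushes variable-sized chunks into the stream buffer. */
static void uartRxTask(void *params)
{
    uint8_t chunk[32];
    (void)params;
    for (;;) {
        size_t received = uartRead(chunk, sizeof(chunk));
        if (received > 0) {
            /* Waits up to 100 ms for free space; may write fewer bytes. */
            (void)xStreamBufferSend(uartStream, chunk, received,
                                    pdMS_TO_TICKS(100));
        } else {
            vTaskDelay(pdMS_TO_TICKS(10));  /* nothing to read yet */
        }
    }
}

/* Reader task: takes whatever number of bytes is available. */
static void parserTask(void *params)
{
    uint8_t data[64];
    (void)params;
    for (;;) {
        size_t length = xStreamBufferReceive(uartStream, data, sizeof(data),
                                             pdMS_TO_TICKS(1000));
        if (length > 0) {
            parseBytes(data, length);
        }
    }
}

/* Call once before starting the scheduler. */
BaseType_t uartStreamInit(void)
{
    uartStream = xStreamBufferCreate(UART_STREAM_SIZE_BYTES, UART_TRIGGER_LEVEL);
    if (uartStream == NULL) {
        return pdFAIL;
    }
    if (xTaskCreate(uartRxTask, "uartRx", 256, NULL, 2, NULL) != pdPASS) {
        return pdFAIL;
    }
    return xTaskCreate(parserTask, "parser", 256, NULL, 1, NULL);
}
```

The reader never needs to know how large each chunk was when it was written; it simply receives the bytes that are currently in the buffer.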

Use of a streambuffer to send and receive data of variable size from one task to another

IN CONCLUSION

You can improve the application performance as well as the development process by dividing your application into threads and by using proper data structures from RTOS.

Let your RTOS work for you!

The more it works for you, the fewer bugs you have in the system.

We are trying to fix some broken benches in the Indian agriculture ecosystem through technology, to improve farmers’ income. If you share the same passion, join us in the pursuit, or simply drop us a line at report@carnot.co.in

Prasanna Rawke
Tech@Carnot

Tech Enthusiast | Firmware Lead @ Carnot Technologies