How to build a Real-Time Operating System(RTOS)

Sudeep Chandrasekaran
8 min readMay 2, 2020

--

This note is an attempt to demystify an RTOS scheduler by building one. To start with, we will touch base on what a pre-emptive scheduler is and finally implement one for ARM Cortex-m cores. Most of the discussion is architecture agnostic. However, the scheduler we implement relies heavily upon the processor architecture. That isn’t a problem, we would discuss in detail the ARM cortex-m internals necessary to implement the scheduler.

If you are someone who already understands how pre-emptive schedulers work, you could skip a few sections and start from Intuition for Building a pre-emptive scheduler.

What is a pre-emptive scheduler?

Assume that we intend to toggle two LEDs at 1 Second and 2 Second intervals respectively. Below is a bare-metal approach(without timers) of doing it:

In this approach, a decision about LED states needs to be taken at an interval of the highest common factor of the delays(in the above example it is 1 second). It is cumbersome to design with this approach if the number of LEDs is large. Also, adding a newer LED(with a different blink rate) needs considerable re-work of the older code. Hence, this approach is not scalable.

This solution could have been much simpler if there was a possibility of having multiple “main()” functions with each main function dedicated to a LED. A pre-emptive scheduler is a software abstraction that provides access to an illusion of multiple “main()” functions. Each such “main()” function is called a task. With the help of a pre-emptive scheduler the above code can be written as below:

To add a new LED we add a new task. This opens up the possibility for multi-tasking. The use of a pre-emptive scheduler can simplify the design of what would otherwise be a complex software application:

  • Multitasking allows the complex application to be partitioned into a set of smaller and more manageable tasks.
  • The partitioning can result in easier software testing, work breakdown within teams, and code reuse.
  • Complex timing and sequencing details can be removed from the application code and become the responsibility of the operating system.

A pre-emptive scheduler forms the heart of an RTOS.

Illusion of Multitasking

A conventional processor can only execute a single task at a time — but by rapidly switching between tasks an RTOS can make it appear as if each task is executing concurrently. This is depicted by the diagram below which shows the execution pattern of three tasks with respect to time. The task names are color-coded and written down the left hand. Time moves from left to right, with the colored lines showing which task is executing at any particular time. The upper diagram demonstrates the perceived concurrent execution pattern, and the lower the actual multitasking execution pattern.

Intuition for Building a pre-emptive scheduler

As a task executes it utilizes the processor/microcontroller registers and accesses RAM and ROM just as any other program. These resources together (the processor registers, stack, etc.) comprise the task execution context.

A task is a sequential piece of code — it does not know when it is going to get suspended (swapped out or switched out) or resumed (swapped in or switched in) by the scheduler and does not even know when this has happened. Consider the example of a task being suspended immediately before executing an instruction that sums the values contained within two processor registers. While the task is suspended other tasks will execute and may modify the processor register values. Upon resumption the task will not know that the processor registers have been altered — if it used the modified values the summation would result in an incorrect value.

To prevent this type of error it is essential that upon resumption a task has a context identical to that immediately prior to its suspension. The scheduler is responsible for ensuring this is the case — and does so by saving the context of a task before it is taken out of execution. When the task is taken for execution, its saved context is restored by the scheduler prior to its execution. The process of saving the context of a task being suspended and restoring the context of a task being resumed is called context switching.

A mechanism to save/restore the task context needs to be built to implement a pre-emptive scheduler. A task context comprises of the data it has stored in the stack and the CPU registers(including the program counter) values. The strategy is to maintain a separate stack for each task and save all CPU registers onto the stack before taking the next task for execution. This way the only parameter that needs to be remembered to restore a task back to execution is its stack pointer. We will use a periodic interrupt to save the current task context and restore a suspended task context. From the developer’s perspective, it would look like all tasks are running concurrently. Below is the summary of the high-level design:

  1. Create a stack for each task(equivalent to reserving some memory).
  2. Initialize the CPU to execute the first task. Set the CPU to use the first task's stack. Though obvious, the only information that CPU has about stack is the stack pointer(SP). Hence, this operation can be accomplished by copying the stack address of the first task to SP.
  3. Initialize a periodic interrupt.
  4. Periodic ISR:

a. Save/Push the CPU registers onto the stack. Note that the SP is currently pointing to the task stack that is currently in execution. Let us call this task X.

b. Change the SP value to point to the stack of the new task(call this Y) to be executed.

c. Restore/Pop the CPU register values stored in the Y’s stack to the actual CPU registers.

d. Exit out of interrupt. Upon exit, the control will go to the Y task because all CPU registers have been loaded with the Y’s context(also remember the task context contains the program counter).

Implementing a Pre-Emptive Scheduler on ARM Cortex M0

We use the NUCLEO-F030R8 development board and the STM32CubeIDE for development. Most of the code is microcontroller agnostic and should work seamlessly with any cortex-m target.

A task control block(TCB) in an RTOS is used to store parameters related to a task(stack pointer, stack size, inter-process parameters, etc). In this design, we keep it simple by only storing the stack pointer of the task in the TCB. Also, the TCB is defined as a linked list to enable easy hopping to the next task. We also define a pointer to the TCB that points to the currently active task.

Reserve stack for each task. For simplicity, we assume only two tasks.

The task context comprises of CPU registers R0-R12, SP, LR, PC, and PSR(see more). They need to be saved and restored when the task is taken into or out of execution.

We intend to use the ARM Cortex M’s systick interrupt as the periodic interrupt for swapping tasks. The processor is designed to save and restore some CPU registers as part of the interrupt entry and exit mechanism. It is important to understand the ARM exception(aka interrupt) handling mechanism before we move further.

The processor pushes 8 registers PSR, PC, LR, R12, R3, R2 R1, and R0 onto the stack on an exception. Then the exception routine is executed. On exit from the exception, the processor pops 8 words(the same ones those where pushed) from the stack and loads them onto the respective CPU registers(in the same order as they where saved). Check this to know more.

We have to only save and restore the rest of the registers(R4, R5, R6, R7, R8, R9, R10 & R11) within the interrupt. Since SP is directly stored in the TCB, we don’t have to push it to the stack. Below is the interrupt implementation.

Adding __attribute__((naked)) suppresses the generation of assembly code that allocates/deallocates space for auto variables in the program stack during the entry/exit of the function. We add this attribute to SysTick_Handler to ensure that the stack is not altered between the occurrence of the interrupt and execution of the interrupt routine instructions.

SysTick_Handler assumes that there is always a task context stored in the suspended task’s stack. Hence, we need to initialize the task stacks such that they contain a task context even during the first task context switch. We need to do the following to initialize the task stack:

  1. Set the tasks stack pointer(in tcbs) such that it points to the top of the task context(equivalent to have 16 words stored in the task stack).
  2. Copy 0x01000000 to xPSR in the stacked context. This makes sure that the processor knows that it needs to run on thumb mode when the value is copied from to xPSR register(more here).
  3. Copy the address of the function that implements the task to the stacked PC.

osKernelAddThreads implements this.

We have one last piece of code to add before we can have the scheduler working. After executing osKernelAddThreads, pCurntTcb will be pointing to TCB of task0. LaunchScheduler starts the scheduler by restoring the context saved in TCB pointed by pCurntTcb to the CPU registers.

We implement two tasks that toggle 2 GPIO’s(PC2 & PC3) at 100 ms and 50 ms respectively.

Have a complete look at code here. You can find the complete STM32CubeIDE project git repo here. A peek into the results:

Conclusion

Observant readers would have figured out that we did not actually implement the “Real Time” part of the OS. Adding realtimeness can be thought of as adding another layer above the pre-emptive scheduler. It includes assigning priority to tasks, capability for tasks to wait on interprocess communication primitives(i.e mutex, semaphores, queues, etc). This was conveniently omitted for the sake of simplicity.

References

The major content for this article is derived from this course:

Others:

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0497a/Babefdjc.html

📝 Save this story in Journal.

👩‍💻 Wake up every Sunday morning to the week’s most noteworthy stories in Tech waiting in your inbox. Read the Noteworthy in Tech newsletter.

--

--

Sudeep Chandrasekaran

An IoT specialist with a focus on developing secure scalable software. My interests wildly swing between embedded systems, cryptography, and physics.