Let’s write a Real-Time Operating System (RTOS) (Part 4: Concurrency and interrupt management)
Introduction
In the previous article, I showed you how to use the PendSV interrupt to implement multithreading, and we finally got through the most difficult part of the ARM architecture. Now I can show you a corner of the operating-system world you are already familiar with: concurrency between threads, and interrupt management.
Concurrency and race conditions in operating systems
Imagine what would happen if two threads in an RTOS modified a global resource at the same time.
For example, thread A writes data to an address, and thread B happens to write data to the same address. What happens in this case? Simply put, whichever thread completes its write last wins.
To give a more detailed example, what happens if two threads increment a variable at the same time?
Consider the operation counter++ in the following situation: assume counter is equal to 0, and two threads read the value of counter and increment it at the same time. We would expect counter to end up equal to 2, but in fact counter will be equal to 1. This problem, caused by threads competing for a shared variable, is called a race condition.
counter++:
- Thread A reads counter = 0
- Thread A increments the value in a register
- Thread A writes back counter = 1
- Thread B reads counter = 0 (assume this happens at the same time)
- Thread B increments the value in a register
- Thread B writes back counter = 1
- Race condition: expected counter value 2, actual counter value 1
A race condition usually arises from shared access to a resource. Whenever two threads access the same resource, the possibility of interleaved, conflicting updates exists.
Therefore, I give two suggestions:
- Whenever possible, avoid sharing resources. The most direct way to achieve this is to reduce the use of global variables. If we do use a global variable, we must consider whether multiple threads will access it concurrently. In actual development, however, this is a luxury: hardware resources are inherently shared, and sometimes global variables have to be used.
- Access to shared resources must be explicitly managed. We must ensure that only one thread accesses a shared resource at a time.
So, how do we do this?
Critical Sections and Atomic Operations
In multithreading and concurrent computing, shared resources — often called critical resources — must be carefully managed to ensure only one thread can access them at a time. To achieve this, we rely on critical sections and atomic operations, both of which prevent interruptions during execution.
An atomic operation is a fundamental operation that executes as a single, indivisible unit, meaning it cannot be interrupted or observed in an incomplete state. This ensures consistency and prevents race conditions.
In a Real-Time Operating System (RTOS), critical sections are typically enforced by disabling (masking) interrupts, which grants exclusive access to shared resources. Before early operating systems like Unix and Linux supported symmetric multiprocessing (SMP), concurrent execution stemmed primarily from hardware interrupt handlers, making interrupt masking a common technique for ensuring exclusivity.
However, atomic operations do not necessarily depend on masking interrupts. They can instead be implemented using assembly instructions (such as ldrex and strex) or locking mechanisms, ensuring shared resources are accessed safely without unintended interference.
Atomic operations
ldrex and strex are ARM processor instructions used to implement atomic operations efficiently, ensuring safe concurrent access to shared memory without locks. These instructions help prevent race conditions in multi-threaded environments.
code
#define ATOMIC_OP(op) \
static inline void atomic_##op(uint32_t i, uint32_t *v) \
{ \
    uint32_t result, tmp; \
    __asm volatile ( \
    "1:  ldrex  %0, [%3]      \n" /* Load value from memory             */ \
    "    "#op"  %0, %0, %2    \n" /* Perform operation (e.g., add/sub)  */ \
    "    strex  %1, %0, [%3]  \n" /* Try to store result; %1 = status   */ \
    "    teq    %1, #0        \n" /* Check if the store succeeded       */ \
    "    bne    1b            \n" /* Retry if it failed                 */ \
    : "=&r"(result), "=&r"(tmp) \
    : "r"(i), "r"(v) \
    : "cc", "memory" \
    ); \
}

/* ATOMIC_OP must be defined before it is expanded. */
ATOMIC_OP(add)
ATOMIC_OP(sub)

#define atomic_inc(v) (atomic_add(1, (v)))
#define atomic_dec(v) (atomic_sub(1, (v)))
- ldrex: loads the value at memory address v into the output register %0.
- Operation: the specified operation (e.g., add or sub) is performed on %0 and i, storing the result back in %0.
- strex: attempts to store the result back to address v; it writes a status flag of 0 to a scratch register on success, non-zero on failure.
- Retry mechanism: if the store failed, execution branches back to label 1 and tries again.
Critical Sections
How do we prevent a critical section from being interrupted?
In our RTOS, we use the PendSV interrupt to perform context switches and achieve multithreading, which can cause race conditions. Another possibility is that an external interrupt occurs, interrupting the current task’s access to a shared resource. In other words, concurrency in our RTOS comes from interrupts.
So, as long as interrupts are disabled around the code that accesses critical resources, that code will only be executed by the current thread. Of course, we must re-enable interrupts after execution. Let’s first look at the hardware support. Taking the Cortex-M3 as an example, its interrupt mask registers are as follows:
PRIMASK, FAULTMASK, and BASEPRI Registers: Exception and Interrupt Masking
In ARM Cortex-M processors, PRIMASK, FAULTMASK, and BASEPRI registers are used for exception and interrupt masking, allowing the processor to control priority levels and manage critical execution states efficiently.
Each exception, including interrupts, has a priority level, where a smaller number represents higher priority and a larger number represents lower priority. These registers mask exceptions based on priority level, and can only be accessed in privileged mode: writes from unprivileged state are ignored, and reads return zero. By default, all three registers are set to 0, meaning no masking is active.
PRIMASK Register: Global Interrupt Disabling
The PRIMASK register is a 1-bit interrupt mask that, when set (1), blocks all exceptions (including interrupts) except the Non-Maskable Interrupt (NMI) and the HardFault exception.
Effectively, PRIMASK raises the current priority level to 0, the highest programmable priority. It is most commonly used to disable all interrupts during time-sensitive processing. Once the processing is complete, PRIMASK must be cleared (0) to re-enable interrupts.
FAULTMASK Register: Suppressing Faults
The FAULTMASK register operates similarly to PRIMASK, but with one key difference: it also blocks HardFault exceptions. This effectively raises the priority level to -1, providing an additional layer of control.
FAULTMASK is particularly useful in fault-handling routines, where suppressing certain faults can help prevent cascading failures. For example, it can be used to bypass the Memory Protection Unit (MPU) or suppress bus faults, depending on system configuration. Unlike PRIMASK, FAULTMASK is automatically cleared upon exception return.
BASEPRI Register: Flexible Interrupt Masking
Unlike PRIMASK and FAULTMASK, BASEPRI allows fine-grained control over which interrupts are masked. Rather than blocking all exceptions, BASEPRI masks exceptions based on priority levels, allowing higher-priority exceptions to execute while blocking lower-priority ones.
The width of BASEPRI depends on the number of priority levels implemented by the microcontroller:
- Most Cortex-M3/M4 chips support either 8 (3-bit width) or 16 (4-bit width) priority levels.
When BASEPRI = 0, it is disabled. When set to a non-zero value, it blocks all exceptions with the same or lower priority, while higher-priority exceptions remain unaffected.
Now we will use the BASEPRI register, but the other two will work just as well!
CODE
__attribute__((always_inline)) inline uint32_t EnterCritical( void )
{
    uint32_t xReturn;
    uint32_t temp;

    __asm volatile(
        "   cpsid i            \n" /* disable interrupts while swapping masks */
        "   mrs %0, basepri    \n" /* save the current BASEPRI                */
        "   mov %1, %2         \n" /* load the new mask level                 */
        "   msr basepri, %1    \n" /* raise BASEPRI to shield lower priorities*/
        "   dsb                \n" /* data synchronization barrier            */
        "   isb                \n" /* instruction synchronization barrier     */
        "   cpsie i            \n" /* re-enable interrupts                    */
        : "=&r" (xReturn), "=&r" (temp)
        : "r" (configShieldInterPriority)
        : "memory"
    );
    return xReturn;
}
How It Works
- Disables interrupts (cpsid i): temporarily blocks all maskable interrupts so the mask swap itself is atomic.
- Saves the previous BASEPRI value (mrs %0, basepri): keeps the current priority mask for restoration later.
- Sets the new BASEPRI value (msr basepri, %1): updates BASEPRI to configShieldInterPriority, blocking lower-priority interrupts.
- Synchronization barriers (dsb, isb): ensure memory accesses and the instruction stream are consistent before proceeding.
- Re-enables interrupts (cpsie i): allows higher-priority interrupts to execute again.
- Returns the previous BASEPRI value: this enables proper nesting, allowing the mask to be restored when exiting the critical section.
Similarly, the exit function is:
__attribute__((always_inline)) inline void ExitCritical( uint32_t xReturn )
{
    __asm volatile(
        "   cpsid i            \n" /* disable interrupts while restoring */
        "   msr basepri, %0    \n" /* restore the caller's BASEPRI value */
        "   dsb                \n"
        "   isb                \n"
        "   cpsie i            \n"
        :: "r" (xReturn)
        : "memory"
    );
}
Handling Nested Interrupts
During execution, an interrupt may occur inside a critical section. If another critical section is entered within that interrupt handler, the BASEPRI register is modified, shielding lower-priority interrupts again. However, if exiting the inner critical section simply reset BASEPRI to 0, the outer critical section would be effectively invalidated.
To maintain proper nesting, the returned BASEPRI value (xReturn) must be used to restore the previous priority mask when exiting each critical section.
If we don’t preserve the value of xReturn, like this:
void B(void);

void A(void)
{
    EnterCritical(); //return value discarded, just disable interrupts
    B();
    ExitCritical();  //just enable interrupts
}

void B(void)
{
    EnterCritical();
    /* ...your code... */
    ExitCritical();  //enable interrupts
}
Expand the code. It actually behaves like this:
void A(void)
{
    EnterCritical(); //disable interrupts
    EnterCritical();
    /* ...your code... */
    ExitCritical();  //enable interrupts -- too early!
    ExitCritical();  //enable interrupts again, which is useless
}
So, if we don’t preserve the value of xReturn, critical sections cannot be nested.
Conclusion
We have now implemented concurrency and interrupt management. We can use them like this:
Atomic operations
int a = 0;
void funA()
{
atomic_add(1, &a);
}
void funB()
{
atomic_add(1, &a);
}
Critical Sections
int a = 0;
void funA()
{
uint32_t xReturn = EnterCritical();
a++;
ExitCritical(xReturn);
}
void funB()
{
uint32_t xReturn = EnterCritical();
a++;
ExitCritical(xReturn);
}
That’s all!