Compiling and Linking programs for Embedded projects
Writing Linker Script for STM32 (Arm Cortex M3)🛠️
A step-by-step guide on how to write the custom linker script for Arm Cortex M3 micro-controller e.g. STM32F103 from scratch📝
ld combines a number of object and archive files, relocates their data, and ties up symbol references. Usually, the last step in compiling a program is to run ld.
When compiling programs for desktops🖥️, the linking process is taken care of by the toolchain. However, linking needs more attention when developing for the embedded domain📟, as specific sections need to be placed at specific memory locations.
To have better command over the linking process, a set of instructions is provided to the linker which tells it to put the different sections in the output binary in a specific way.
In this article, we will be writing the linker script required for building a blinky program on an Arm Cortex M3 STM32 MCU from scratch...
Linker🔗
The linker combines input files into a single output file. The output file and each input file are in a special data format known as an object file format.
It accepts Linker Command Language files written in a superset of AT&T’s Link Editor Command Language syntax, to provide explicit and total control over the linking process.
Linker Script📝
Every link is controlled by a linker script. This script is written in the linker command language.
The main purpose of the linker script is to describe how the sections in the input files should be mapped into the output file and to control the memory layout of the output file. It directs the linker to perform many other operations, using the commands described in the subsequent sections.
The simplest possible linker script consists of only one ‘SECTIONS’ command to describe the memory layout of the output file. The following linker script loads the ‘text’ section at address 0x10000 and the data section at 0x8000000.
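The script this paragraph refers to is not reproduced here. A minimal sketch of it, assuming it follows the classic example from the GNU ld documentation (the .bss line is part of that example, not of the description above), could look like this:

```ld
SECTIONS
{
  . = 0x10000;            /* move the location counter to 0x10000 */
  .text : { *(.text) }    /* collect every input .text section here */
  . = 0x8000000;          /* move the location counter to 0x8000000 */
  .data : { *(.data) }    /* initialized data starts at that address */
  .bss  : { *(.bss) }     /* zero-initialized data follows .data */
}
```

Each output section is placed at the current value of the location counter, which is why the two assignments to . are enough to fix the addresses.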
ENTRY command
The first instruction to execute in a program is called the entry point. You can use this command to set the entry point. This address is then put into the ELF file header.
The argument is a symbol name: ENTRY(<symbol_name>)
SECTIONS command
This command tells the linker how to combine input sections into output sections, and how to place the output sections in memory. The format of the SECTIONS command is:
SECTIONS
{
sections-command
sections-command
…
}
Each sections-command may be one of the following:
- an ENTRY command
- a symbol assignment
- an output section description
- an overlay description
Don’t worry about this jargon, I will be explaining all these in upcoming articles.
MEMORY command
This command describes the location and size of blocks of memory in the target. You can use it to describe which memory regions may be used by the linker, and which memory regions it must avoid. The format of the MEMORY command is:
MEMORY
{
name [(attr)] : ORIGIN = origin, LENGTH = len
…
}
The name is the name used in the linker script to refer to the region. The attr string is an optional list of attributes, built from one or more of the following characters:
- ‘R’ for Read-only sections
- ‘W’ for Read/write section
- ‘X’ for the Executable section
- ‘A’ for the Allocatable section
- ‘I’ for the Initialized section
- ‘!’ to invert the sense of any attribute that follows
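As a minimal sketch, a MEMORY command with the two regions described next might be written as follows. The region names rom and ram and the values come from that description; the exact layout of the original listing is an assumption.

```ld
MEMORY
{
  /* read-only or executable sections default to this region */
  rom (rx)  : ORIGIN = 0,          LENGTH = 256K
  /* '!' inverts the attributes: everything that is not r/x lands here */
  ram (!rx) : ORIGIN = 0x40000000, LENGTH = 4M
}
```

The attributes only decide where sections go when the script does not assign them to a region explicitly.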
In the above script, we specify that there are two memory regions available for allocation.
- rom: a read and execute region starting at address 0x00 with a size of 256 kilobytes.
- ram: a read, write, and non-executable region starting at address 0x40000000 with a size of 4 megabytes.
Every loadable or allocatable output section has two addresses. The first is the VMA or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA or load memory address. This is the address at which the section will be loaded.
Symbols and expressions
Symbols are the identifiers defined in the linker script that are placed in the symbol table. Unlike a variable declaration in C, a linker script symbol declaration creates an entry in the symbol table but does not allocate any memory for it. A linker script symbol is thus an address without a value.
The special symbol name . (dot) indicates the location counter. You may only use this within a SECTIONS command. The semicolon after the expression is required.
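The listing discussed next is missing from this page. A sketch that matches its description, assuming it mirrors the example in the GNU ld documentation, would be:

```ld
floating_point = 0;          /* plain assignment outside SECTIONS */
SECTIONS
{
  .text :
  {
    *(.text)
    _etext = .;              /* address just past the last .text input section */
  }
  _bdata = (. + 3) & ~ 3;    /* location counter rounded up to a 4-byte boundary */
  .data : { *(.data) }
}
```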
In this example, the symbol floating_point will be defined as zero. The symbol _etext will be defined as the address following the last .text input section. The symbol _bdata will be defined as the address following the .text output section aligned upward to a 4-byte boundary.
Location Counter📍
The special linker variable . (dot) represents the current output location counter, hence it can only appear in an expression within the SECTIONS command.
Assigning a value to . moves the location counter. Inside an output section it can only be moved forward, not backward. Outside an output section it may not be moved backward if doing so would create areas of overlapping LMAs.
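A short sketch of moving the location counter forward may help here. The addresses and the gap size are made up for illustration and are not taken from the project:

```ld
SECTIONS
{
  . = 0x08000000;        /* start placing output at this address */
  .text : { *(.text) }
  . = ALIGN(8);          /* move forward to the next 8-byte boundary */
  . += 0x100;            /* leave a 256-byte gap before the next section */
  .data : { *(.data) }
}
```

Every statement only ever increases the counter, which is the safe way to use it.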
I will be explaining the location counter in detail in another article.
Linker script in action
Let’s write a linker script from scratch for our bare-metal blink project on STM32F103C8T6 MCU.
Declare entry section
We have defined a function Reset_handler() in the startup_stm32f103.c file. It should be called by the MCU after powering on, so we specify the symbol as the entry point for the application using this statement:
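With the handler name used in the text, the statement would presumably be a single line:

```ld
/* Reset_handler is the function defined in the startup file */
ENTRY(Reset_handler)
```

Note that ENTRY only records the address in the ELF header; the value is mainly useful to debuggers and loaders.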
Declare MEMORY regions
Let’s define two memory regions for the SRAM and FLASH memories. The FLASH is a read/execute region, while the SRAM is a read/write but non-executable region.
From the datasheet, we have set the start addresses of FLASH and SRAM as 0x8000000 and 0x20000000 respectively. The MCU has 64KB of FLASH and 20KB of embedded SRAM.
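A sketch of the corresponding MEMORY command, using the addresses and sizes quoted above, could look like this. The region names FLASH and SRAM follow the text, while the attribute strings are an assumption based on the description.

```ld
MEMORY
{
  FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 64K   /* code and constants */
  SRAM  (rw) : ORIGIN = 0x20000000, LENGTH = 20K   /* data, bss and stack */
}
```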
Declare section for vector table
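For Arm Cortex M3 micro-controllers, the word at address 0x00 must contain the initial value of the stack pointer, and the next word (address 0x04) must contain the address of the Reset handler, followed by the rest of the vector table. When the STM32F103 boots from main FLASH, the FLASH at 0x8000000 is aliased to address 0x00, so the table has to sit at the very start of FLASH.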
In the startup_stm32f103.c file, we have defined an array vectors having the MSP value as well as the entire vector table. The array is placed in a separate section .isr_vector using the section compiler attribute.
To ensure proper booting of the MCU, we have to relocate this section to address 0x8000000. This is the first section declared in FLASH so it will be present at the first available memory location in the FLASH i.e. 0x8000000.
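As a sketch, the output section for the vector table could be declared like this. It assumes a FLASH region defined in the MEMORY command and reuses the input section name .isr_vector as the output section name, which is a guess at the original listing.

```ld
SECTIONS
{
  /* first output section in FLASH, so it starts at the region's ORIGIN */
  .isr_vector :
  {
    KEEP(*(.isr_vector))   /* vector table from the startup file */
  } > FLASH
}
```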
You might be wondering about the KEEP directive surrounding the input section name🧐. It instructs the linker to keep this section in the final executable even if nothing references it, for example when unused sections are garbage-collected with --gc-sections.
Sections that are not referenced by any other section but still need to be included in the executable are known as Magic Sections.
.text section
The exact position of other sections is not a concern as long as they are present in the correct memory region. The .text label is the name of the output section, and it is assigned to the FLASH memory region.
The wildcard in the *(.text) statement instructs the linker to include the .text sections from all the object files. The wildcard at the end of the *(.text.*) statement instructs the linker to include all the subsections of the .text section from all the object files, e.g. .text.ms_delay from the delay.o file or .text.startup.main from the main.o file.
The . = ALIGN(4) statement rounds the location counter up to the next 4-byte boundary, so that the section ends on a word boundary and whatever is placed after it in FLASH starts word-aligned.
The _etext = . assignment stores the current value of the location counter in the symbol _etext.
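Putting the statements above together, the .text output section might look like the sketch below. The .rodata lines are an assumption (constants are commonly kept next to the code) and may not match the original listing.

```ld
SECTIONS
{
  .text :
  {
    *(.text)        /* .text from every object file */
    *(.text.*)      /* per-function subsections such as .text.ms_delay */
    *(.rodata)      /* assumption: read-only data placed with the code */
    *(.rodata.*)
    . = ALIGN(4);   /* end the section on a word boundary */
    _etext = .;     /* first address after the code */
  } > FLASH
}
```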
.data section
The .data section contains the pre-initialized global variables. Its contents are stored in the FLASH region when linking but have to be copied to SRAM during boot. The copy is done by the startup code in startup_stm32f103.c, so the linker script needs to provide the boundaries of the data section.
The Load Memory Address (LMA) is obtained using the LOADADDR function and the Virtual Memory Address (VMA) is captured using the _sdata = . statement. The location counter . always holds the VMA. In the .text section the VMA and the LMA are the same, whereas in the .data section they differ: . is an SRAM address, because that is where the section lives at run time, while its load image sits in FLASH.
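A sketch of such a .data section is shown below. Only _sdata is named in the text; the symbols _edata and _la_data are hypothetical names chosen for illustration, and the regions SRAM and FLASH are assumed to be defined in the MEMORY command.

```ld
SECTIONS
{
  .data :
  {
    _sdata = .;       /* VMA: start of .data in SRAM */
    *(.data)
    *(.data.*)
    . = ALIGN(4);
    _edata = .;       /* hypothetical name: end of .data in SRAM */
  } > SRAM AT> FLASH  /* runs from SRAM, load image stored in FLASH */

  /* hypothetical name: LMA, i.e. where the initial values sit in FLASH */
  _la_data = LOADADDR(.data);
}
```

The startup code can then copy the bytes from _la_data to the range between _sdata and _edata before calling main().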
.bss section
The .bss section holds the global variables without an explicit initial value. Unlike the .data section, it is not stored in FLASH but is directly initialized to 0’s in SRAM at startup.
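A matching sketch for .bss follows. The boundary symbols _sbss and _ebss are hypothetical names, and the COMMON line is an assumption that covers tentative definitions emitted by some compilers.

```ld
SECTIONS
{
  .bss :
  {
    _sbss = .;      /* hypothetical name: start of the area to zero */
    *(.bss)
    *(.bss.*)
    *(COMMON)       /* assumption: common symbols are zeroed as well */
    . = ALIGN(4);
    _ebss = .;      /* hypothetical name: end of the area to zero */
  } > SRAM          /* no AT> clause: nothing is stored in FLASH */
}
```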
Build, Run and test
The complete source code for this article is available in this Git Repository. Just clone, build and flash the binary onto an STM32F1 board to see it in action.
Output
We can verify the contents of the final executable file are as per the instructions given in the linker script.
Further Reading
Visit this article to understand the working of the startup file written in embedded C/C++.