Compiling and Linking programs for Embedded projects

Writing Linker Script for STM32 (Arm Cortex M3)🛠️

Rohit Nimkar
7 min read · Dec 2, 2022

A step-by-step guide on how to write a custom linker script from scratch for an Arm Cortex M3 micro-controller, e.g. the STM32F103📝

ld combines a number of object and archive files, relocates their data, and ties up symbol references. Usually, the last step in compiling a program is to run ld.

When compiling programs for desktops🖥️, the linking process is taken care of by the toolchain. However, the linking process matters more when developing for the embedded domain📟, as specific sections need to be placed in specific memory locations.

To have better command over the linking process, a set of instructions is provided to the linker which tells it to put the different sections in the output binary in a specific way.

In this article, we will be writing the linker script required for building a blinky program on an Arm Cortex M3 STM32 MCU from scratch...

Linking & Relocation flow

Linker🔗

The linker combines input files into a single output file. The output file and each input file are in a special data format known as an object file format.

It accepts Linker Command Language files written in a superset of AT&T’s Link Editor Command Language syntax, to provide explicit and total control over the linking process.

linking process

Linker Script📝

Every link is controlled by a linker script. This script is written in the linker command language.

The main purpose of the linker script is to describe how the sections in the input files should be mapped into the output file and to control the memory layout of the output file. It directs the linker to perform many other operations, using the commands described in the subsequent sections.

The simplest possible linker script consists of only one ‘SECTIONS’ command to describe the memory layout of the output file. The following linker script loads the ‘text’ section at address 0x10000 and the data section at 0x8000000.
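
Following the example in the GNU ld manual, such a script can be sketched as below. The .bss line comes from the manual's version of the example and places uninitialized data right after .data:

```ld
SECTIONS
{
    . = 0x10000;            /* set the location counter */
    .text : { *(.text) }    /* all input .text sections go here */
    . = 0x8000000;          /* move the location counter forward */
    .data : { *(.data) }    /* all input .data sections go here */
    .bss  : { *(.bss) }     /* placed immediately after .data */
}
```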

ENTRY command

The first instruction to execute in a program is called the entry point. You can use this command to set the entry point. This address is then put into the ELF file header.
The argument is a symbol name: ENTRY(<symbol_name>)

SECTIONS command

This command tells the linker how to combine input sections into output sections, and how to place the output sections in memory. The format of the SECTIONS command is:

SECTIONS
{
sections-command
sections-command

}

Each sections-command may be one of the following:

  • an ENTRY command
  • a symbol assignment
  • an output section description
  • an overlay description

Don’t worry about this jargon, I will be explaining all these in upcoming articles.

The syntax for the SECTIONS command

MEMORY command

This command describes the location and size of blocks of memory in the target. You can use it to describe which memory regions may be used by the linker, and which memory regions it must avoid. The format of the MEMORY command is:

MEMORY
{
name [(attr)] : ORIGIN = origin, LENGTH = len

}

The name is the name used in the linker script to refer to the region. The attr string is an optional list of attributes which can be one of the following:

  • ‘R’ for read-only sections
  • ‘W’ for read/write sections
  • ‘X’ for executable sections
  • ‘A’ for allocatable sections
  • ‘I’ for initialized sections
  • ‘!’ to invert the sense of any attribute that follows

Syntax for the MEMORY command
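
A minimal sketch of such a script, mirroring the example in the GNU ld manual and matching the two regions described next:

```ld
MEMORY
{
    rom (rx)  : ORIGIN = 0,          LENGTH = 256K   /* read-only, executable */
    ram (!rx) : ORIGIN = 0x40000000, LENGTH = 4M     /* everything that is not rx */
}
```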

In the above script, we specify that there are two memory regions available for allocation.

  1. rom
    A read-only, executable region starting at address 0x00 with a size of 256 kilobytes.
  2. ram
    A read/write, non-executable region starting at address 0x40000000 with a size of 4 megabytes.

Every loadable or allocatable output section has two addresses. The first is the VMA or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA or load memory address. This is the address at which the section will be loaded.

Symbols and expressions

Symbols are identifiers defined in the linker script and placed in the symbol table. Unlike a variable declaration in C, a symbol definition in a linker script creates an entry in the symbol table but does not allocate any memory for it. A linker symbol is therefore an address without a value.

A symbol is assigned using symbol = expression; and the semicolon after the expression is required. The special symbol name . indicates the location counter. You may only use this within a SECTIONS command.

Syntax for the SECTION command
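
A sketch of the kind of script the next paragraph describes, following the symbol-assignment example in the GNU ld manual:

```ld
floating_point = 0;              /* plain symbol assignment outside SECTIONS */
SECTIONS
{
    .text :
    {
        *(.text)
        _etext = .;              /* address right after the last .text input section */
    }
    _bdata = (. + 3) & ~ 3;      /* end of .text, rounded up to a 4-byte boundary */
    .data : { *(.data) }
}
```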

In this example, the symbol floating_point will be defined as zero. The symbol _etext will be defined as the address following the last .text input section. The symbol _bdata will be defined as the address following the .text output section aligned upward to a 4-byte boundary.

Location Counter📍

The special linker variable . (dot) represents the current output location counter, hence it can only appear in an expression within the SECTIONS command.
Assigning a value to . moves the location counter. It cannot be moved backward inside an output section, and moving it backward outside of one may create areas of overlapping LMAs.
I will be explaining the location counter in detail in another article.
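
Until then, here is a small sketch of the location counter being read and moved forward. The start address, the 16-byte gap, and the symbol names _stext and _etext are illustrative assumptions, not values taken from the project:

```ld
SECTIONS
{
    . = 0x08000000;          /* assumed start address for the output */
    .text :
    {
        _stext = .;          /* hypothetical symbol: start of .text */
        *(.text)
        . = ALIGN(4);        /* move forward to the next 4-byte boundary */
        . += 16;             /* leave a 16-byte gap (forward move only) */
        _etext = .;          /* hypothetical symbol: end of .text */
    }
}
```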

Linker script in action

Let’s write a linker script from scratch for our bare-metal blink project on STM32F103C8T6 MCU.

Declare entry section

We have defined a function Reset_handler() in the startup_stm32f103.c file. It should be called by the MCU after powering on, so we specify this symbol as the entry point for the application using this statement:

Program entry point
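
Assuming the handler is named Reset_handler, as in the startup file, the statement would look like this:

```ld
ENTRY(Reset_handler)    /* symbol name taken from the startup file described above */
```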

Declare MEMORY regions

Let’s define two memory regions for the SRAM and FLASH memories. The FLASH is a read/execute region, while the SRAM is a read/write, non-executable region.

From the datasheet, we have set the start addresses of FLASH and SRAM as 0x8000000 and 0x20000000 respectively. The MCU has 64KB of FLASH and 20KB of embedded SRAM.

Definition of the memory regions
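
A minimal sketch of this definition, using the addresses and sizes quoted above. The region names FLASH and SRAM follow the text, and the attribute strings are an assumption derived from the read/execute and read/write description:

```ld
MEMORY
{
    FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 64K   /* read/execute */
    SRAM  (rw) : ORIGIN = 0x20000000, LENGTH = 20K   /* read/write, not executable */
}
```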

Declare section for vector table

For Arm Cortex M3 micro-controllers, the word at address 0x00 should contain the initial value of the stack pointer, and the next word (address 0x04) should contain the address of the Reset handler, followed by the rest of the vector table.

In the startup_stm32f103.c file, we have defined an array vectors having the MSP value as well as the entire vector table. The array is placed in a separate section .isr_vector using the section compiler attribute.

To ensure proper booting of the MCU, we have to relocate this section to address 0x8000000. This is the first section declared in FLASH so it will be present at the first available memory location in the FLASH i.e. 0x8000000.

Definition of the vector table
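
A minimal sketch of this output section, assuming the FLASH region name used earlier. Because it is the first section assigned to FLASH, it lands at the region's origin:

```ld
SECTIONS
{
    .isr_vector :
    {
        KEEP(*(.isr_vector))    /* vector table array from the startup file */
    } >FLASH                    /* FLASH is the assumed region name */
}
```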

You might be wondering about the KEEP directive surrounding the input section name🧐. It instructs the linker to keep this section in the final executable even if nothing else references it, so it is not discarded when unused sections are garbage-collected (for example with --gc-sections).

Sections that are not referenced by any other section but still need to be included in the executable are known as Magic Sections.

.text section

The exact position of other sections is not a concern as long as they are present in the correct memory region. The .text label on line 3 is the name of the output section and it is assigned to the FLASH memory region.

The wildcard in the *(.text) statement instructs the linker to include the .text sections from all the object files. The wildcard at the end of the *(.text.*) statement instructs the linker to include all the subsections of the .text section from all the object files, e.g. .text.ms_delay from the delay.o file or .text.startup.main from the main.o file.

Definition of the .text section
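
A minimal sketch of such an output section. The FLASH region name and the _etext symbol follow the text; the .rodata line is an assumption added here, since constant data normally lives in FLASH alongside the code:

```ld
SECTIONS
{
    .text :
    {
        . = ALIGN(4);
        *(.text)        /* .text from every object file */
        *(.text.*)      /* subsections such as .text.startup.main */
        *(.rodata)      /* assumption: read-only data kept in FLASH too */
        . = ALIGN(4);   /* round the end up to a 4-byte boundary */
        _etext = .;     /* end of the code in FLASH */
    } >FLASH
}
```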

The . = ALIGN(4) statement advances the location counter to the next 4-byte boundary, so that the end of the section, and any symbol defined there, is word-aligned.

The _etext = . expression at line 10 assigns the value of the location counter to the symbol _etext.

.data section

The .data section contains the pre-initialized global variables. Its contents are stored in the FLASH region when linking but should be loaded into SRAM during boot. The relocation is done by the code in startup_stm32f103.c, so we need to provide the boundaries of the data section.

The Load Memory Address (LMA) is obtained using the LOADADDR directive and the Virtual Memory Address (VMA) is captured using the _sdata = . statement. The location counter . always holds the VMA. In the .text section the VMA and the LMA are the same, but in the .data section . is an SRAM address, because that is where the section lives at run time, while its LMA is in FLASH.

Definition of the .data section
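
A minimal sketch of this section, assuming the FLASH and SRAM region names used earlier. Only _sdata appears in the text; _edata and _la_data are hypothetical names chosen for illustration:

```ld
SECTIONS
{
    .data :
    {
        _sdata = .;             /* VMA: start of .data in SRAM */
        *(.data)
        *(.data.*)
        . = ALIGN(4);
        _edata = .;             /* hypothetical: end of .data in SRAM */
    } >SRAM AT>FLASH            /* runs from SRAM, stored in FLASH */

    _la_data = LOADADDR(.data); /* hypothetical: LMA of .data in FLASH */
}
```

With symbols like these, the startup code can copy the bytes starting at _la_data into the range from _sdata to _edata before calling main.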

.bss section

Unlike the .data section, the .bss section is not stored in FLASH; it only occupies SRAM, where it is initialized to zeros.

Definition of the .bss section
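
A minimal sketch of this section, assuming the SRAM region name used earlier. The boundary symbols _sbss and _ebss are hypothetical names that startup code could use to zero the range:

```ld
SECTIONS
{
    .bss :
    {
        _sbss = .;      /* hypothetical: start of the zero-filled area */
        *(.bss)
        *(.bss.*)
        *(COMMON)       /* common (tentatively defined) symbols */
        . = ALIGN(4);
        _ebss = .;      /* hypothetical: end of the zero-filled area */
    } >SRAM             /* no AT> here: nothing is stored in FLASH */
}
```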

Build, Run and test

The complete source code for this article is available in this Git Repository. Just clone, build and flash the binary onto an STM32F1 board to see it in action.

Output

We can verify that the contents of the final executable file match the instructions given in the linker script.

Section table of the final executable file

Further Reading

Visit this article to understand the working of the startup file written in embedded C/C++.
