An Introduction to Linker Files

Phan Cuong
9 min read · Oct 15, 2023

Introduction:

As an embedded software engineer, understanding the linker file is crucial. The linker file serves a significant role by consolidating various object sections into a single executable file. This article provides insights into the structure and syntax of the linker file, as well as guidance on how to utilize it effectively.

The initial segment of the post delves into the explanation of program sections in the C programming language. Once you’ve grasped the concept of program sections, the subsequent section introduces you to the linker script. It covers its structure and guides you on how to craft your own custom linker files. The final segment of the post provides instructions on how to install an ARM compiler, which you can utilize for practice.

Program sections:

To fully grasp the function of a linker file, it’s essential to first understand the program sections in the C programming language. For illustrative purposes, we’ll use two files: main.c and calculus.c. These examples utilize the GNU Compiler (GCC) for ARM, which is part of the GNU Arm Embedded Toolchain. If you’re interested in a hands-on experience, follow the instructions in the installation section to set up the GNU Arm Embedded Toolchain. Please note that these examples were executed in a Windows environment.

The code block below is main.c:

#include "calculus.h"

/* Initialized variable.
   If it were initialized to 0, it would be placed in .bss like an uninitialized variable. */
int var_1 = 1;

/* Constant variable */
double const var_2 = 0;

/* Uninitialized variable */
int var_3;

void main(void) {
    /* static int v_4 = 0;
       If uncommented, v_4 is zero-initialized and so maps to the .bss section;
       a non-zero initializer would map it to the .data section instead. */
    signed int a = 0, b = 1, c = 2, d = 3;
    addition(a, b);
    subtraction(c, d);
}

The code block below is calculus.c:

#include "calculus.h"

signed int addition(signed int a, signed int b) {
    signed int c = 0;
    c = a + b;
    return c;
}

signed int subtraction(signed int a, signed int b) {
    signed int c = 0;
    c = a - b;
    return c;
}
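
Both source files include calculus.h, which is not shown in this post. A minimal sketch of what it might contain, assuming it only declares the two functions defined in calculus.c (the include-guard name CALCULUS_H is an arbitrary choice, not taken from the original project):

```c
/* calculus.h: hypothetical reconstruction, not the author's original file. */
#ifndef CALCULUS_H
#define CALCULUS_H

/* Prototypes matching the definitions in calculus.c. */
signed int addition(signed int a, signed int b);
signed int subtraction(signed int a, signed int b);

#endif /* CALCULUS_H */
```

With these prototypes visible, main.c can call both functions without implicit-declaration warnings, and the linker resolves the calls to the code in calculus.c.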

Run the following commands to build the files with arm-none-eabi-gcc, then list the program sections of the outputs with arm-none-eabi-objdump:

# Compile and link both C files and generate a map file, without a custom linker file.
# main.c cannot be linked alone because it depends on calculus.c.
# Note: despite its .o name, main.o is a linked ELF image here, not a relocatable object file.
arm-none-eabi-gcc -nostdlib main.c calculus.c -o main.o -Wl,-Map=main_wo.map

# Compile calculus.c alone into an object file (-c stops before linking)
arm-none-eabi-gcc -nostdlib -c calculus.c -o calculus.o

# List the section headers of both outputs
arm-none-eabi-objdump -h main.o > main.o.obj
arm-none-eabi-objdump -h calculus.o > calculus.o.obj

The resulting main.o.obj lists every program section with its size and default start address:

main.o:     file format elf32-littlearm

Sections:
Idx Name            Size      VMA       LMA       File off  Algn
  0 .text           000000d4  00008000  00008000  00008000  2**2
                    CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata         00000008  000080d8  000080d8  000080d8  2**3
                    CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .data           00000004  000180e0  000180e0  000080e0  2**2
                    CONTENTS, ALLOC, LOAD, DATA
  3 .persistent     00000000  000180e4  000180e4  000080e4  2**0
                    CONTENTS, ALLOC, LOAD, DATA
  4 .bss            00000004  000180e4  000180e4  000080e4  2**2
                    ALLOC
  5 .noinit         00000000  000180e8  000180e8  00000000  2**0
                    ALLOC
  6 .comment        00000049  00000000  00000000  000080e4  2**0
                    CONTENTS, READONLY
  7 .ARM.attributes 0000002a  00000000  00000000  0000812d  2**0
                    CONTENTS, READONLY

The .comment section doesn’t influence our software’s behavior, so we can disregard it. Here are the definitions of the key sections:

.text: Code and Data

This section holds the machine instructions, which are located in Flash. It also stores constant values that the compiler encodes as raw bytes at the end of a function (literal pools).

.data: Initialized variable

These variables have a non-zero initial value and can change at run time, so they live in RAM. Their initial values are stored in Flash and copied to RAM by the startup code. In this example, main.o has 4 bytes for int var_1 = 1;.

.bss: Uninitialized variables

These variables can also change at run time, so they live in RAM. Because they have no initial value, nothing is stored in Flash for them, unlike .data; the linker only reserves space in RAM, which the startup code normally fills with zeros. In this example, main.o has 4 bytes for int var_3;.

.rodata: Read-only data

Constant variables are stored in Flash. In this example, in main.o, there are 8 bytes for double const var_2 = 0;.

The remaining two sections are less common:

.noinit: No-initialization variables

Applies only to uninitialized variables. It prevents such variables from being set to 0 during a reset, so the software must initialize them explicitly.

.persistent: No re-initialization variables

Applies only to statically initialized variables. It prevents such variables from being re-initialized during a reset: they are given an initial value when the code is loaded, but are never initialized again by the startup code.

The .noinit and .persistent sections are specific to ELF (Executable and Linkable Format) targets. I have not had a chance to work with these sections.
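
For readers who want to try these two sections, one common way to place a variable in a named section with GCC is the section attribute. The sketch below is an assumption about how such variables could be declared: the names test_1 and test_2 mirror symbols that appear in the map file later in this article, but their real definitions are not shown, and the initial value 5 and the helper bump_reset_counter are purely illustrative.

```c
/* Sketch: placing variables in .noinit and .persistent with GCC's
 * section attribute. Names and the initial value are assumptions. */

/* Left uninitialized on purpose: startup code that honours .noinit
 * keeps whatever value was in RAM before the reset. */
int test_1 __attribute__((section(".noinit")));

/* Receives its initial value when the image is loaded and is meant to
 * be skipped by startup initialization on later resets. */
int test_2 __attribute__((section(".persistent"))) = 5;

/* Example use: count resets, assuming RAM contents survive a reset
 * and the startup code does not re-initialize .persistent. */
int bump_reset_counter(void)
{
    test_2 += 1;
    return test_2;
}
```

Whether the values really survive a reset depends on the startup code and on how the linker script places these sections, so treat this as a starting point for experiments.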

Linker script:

After compilation, each object file contains several sections such as .text, .bss, .data, and .rodata. The linker file tells the linker how to combine these sections from the different object files into the final executable file.

main.c --> main.o {
    .text,
    .data,
    .bss,
    .rodata
}

calculus.c --> calculus.o {
    .text,
    .data,
    .bss,
    .rodata
}

main.elf = main.o + calculus.o = {
    .text   = .text(main)   + .text(calculus)
    .data   = .data(main)   + .data(calculus)
    .bss    = .bss(main)    + .bss(calculus)
    .rodata = .rodata(main) + .rodata(calculus)
}

In addition, the linker maps these sections to the hardware memory locations we choose.

[Figure: final mapping of sections to hardware memory locations]

A linker script describes the memory layout with two main commands:

MEMORY command

Describes the different memory regions in the system. The linker uses this information to calculate addresses.

MEMORY
{
    name (attributes) : ORIGIN = <address>, LENGTH = <size>
}
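
As a worked example of how ORIGIN and LENGTH translate into an address range, the C sketch below models a region and computes where it ends. In GNU ld the K suffix multiplies by 1024, so LENGTH = 512K means 0x80000 bytes. The type mem_region_t and the helpers region_end and region_contains are hypothetical names introduced only for illustration; the two example regions use the FLASH and RAM values from the linker file shown later in this article.

```c
#include <stdint.h>
#include <stdbool.h>

/* A memory region as described by one line of the MEMORY command. */
typedef struct {
    uint32_t origin;  /* ORIGIN: first address of the region */
    uint32_t length;  /* LENGTH: size of the region in bytes */
} mem_region_t;

/* In GNU ld the K suffix multiplies by 1024. */
#define KIB(n) ((uint32_t)(n) * 1024u)

/* Example values: the FLASH and RAM regions used later in this article. */
static const mem_region_t flash_region = { 0x08000000u, KIB(512) };
static const mem_region_t ram_region   = { 0x20000000u, KIB(128) };

/* First address past the end of the region. */
static uint32_t region_end(mem_region_t r)
{
    return r.origin + r.length;
}

/* True when [addr, addr + size) lies entirely inside the region. */
static bool region_contains(mem_region_t r, uint32_t addr, uint32_t size)
{
    return addr >= r.origin && size <= r.length &&
           addr - r.origin <= r.length - size;
}
```

Under these assumptions, FLASH covers 0x08000000 up to (but not including) 0x08080000, and RAM covers 0x20000000 up to 0x20020000. The linker performs the same kind of check when it reports that a region has overflowed.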

SECTIONS command

Creates the memory layout by defining the output sections and their order. For each output section, you choose which input sections go into it, where it runs from, and where it is stored and loaded from.

The location counter is a special symbol denoted by a dot (.). The linker automatically updates it with the current output address. You can assign it to a symbol to mark a boundary, and you can also set it explicitly.

SECTIONS
{
    <symbol> = LOADADDR(.<section>); /* Define a global symbol that can be used in the linker file or source code */
    .<section> :
    {
        <symbol> = .; /* Define a global symbol that can be used in the linker file or source code */
        *(.<sub_section>)
        . = ALIGN(n);
    } > <run location> [AT> <storage location>]
}
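
To make the effect of . = ALIGN(n); concrete, the C sketch below reproduces the arithmetic for a power-of-two n: the location counter is rounded up to the next multiple of n, and the gap is filled with padding. The function names align_up and align_padding are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Round addr up to the next multiple of n, where n is a power of two.
 * This mirrors what `. = ALIGN(n);` does to the location counter. */
static uint32_t align_up(uint32_t addr, uint32_t n)
{
    return (addr + n - 1u) & ~(n - 1u);
}

/* Number of padding bytes needed to reach that boundary. */
static uint32_t align_padding(uint32_t addr, uint32_t n)
{
    return align_up(addr, n) - addr;
}
```

For instance, a location counter of 0x2000000c aligned to 32 bytes moves to 0x20000020 and leaves 0x14 bytes of padding, while an address that is already aligned, such as 0x080000d4 with n = 4, stays where it is.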

Here is the linker file:

/* Specify the memory areas */
MEMORY
{
    FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 512K
    RAM (xrw)  : ORIGIN = 0x20000000, LENGTH = 128K
}

/* Define output sections */
SECTIONS
{
    /* The program code and other data go into FLASH */
    .text :
    {
        *(.text)            /* .text sections (code) */
        *(.text*)           /* .text* sections (code) */
        . = ALIGN(4);
        _etext = .;         /* define a global symbol at end of code */
    } >FLASH

    /* Constant data goes into FLASH */
    .rodata :
    {
        *(.rodata)          /* .rodata sections (constants, strings, etc.) */
        *(.rodata*)         /* .rodata* sections (constants, strings, etc.) */
        . = ALIGN(4);
    } >FLASH

    /* Used by the startup code to initialize data */
    _sidata = LOADADDR(.data);

    /* Initialized data sections go into RAM; the load (LMA) copy is placed after the code */
    .data :
    {
        _sdata = .;         /* create a global symbol at data start */
        *(.data)            /* .data sections */
        *(.data*)           /* .data* sections */
        . = ALIGN(4);
        _edata = .;         /* define a global symbol at data end */
    } >RAM AT> FLASH

    /* Uninitialized data section */
    . = ALIGN(4);
    .bss :
    {
        /* This is used by the startup code to initialize the .bss section */
        _sbss = .;          /* define a global symbol at bss start */
        *(.bss)
        *(.bss*)
        *(COMMON)
        . = ALIGN(32);
        _ebss = .;          /* define a global symbol at bss end */
    } >RAM

    .ARM.attributes 0 : { *(.ARM.attributes) }
}

In the Linker Script, we define some symbols:

  • _etext: End address of .text section
  • _sidata : Load address (from Flash) of .data section
  • _sdata: Start address of .data section
  • _edata: End address of .data section
  • _sbss: Start address of .bss section
  • _ebss: End address of .bss section
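
These symbols are typically consumed by startup code that runs before main(): it copies the initial values of .data from Flash to RAM and clears .bss. The following is a minimal sketch under stated assumptions, not the startup file of any particular vendor. It assumes the section boundaries are 4-byte aligned (as the ALIGN(4) statements in the script suggest); copy_data, zero_bss, Reset_Handler and the BARE_METAL_TARGET macro are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Copy the initial values of .data from their load address (Flash)
 * to their run address (RAM), one 32-bit word at a time. */
static void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = *load++;
    }
}

/* Clear .bss so that uninitialized globals start at zero. */
static void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

#ifdef BARE_METAL_TARGET  /* hypothetical macro, defined only for target builds */
/* Symbols defined in the linker script; only their addresses are meaningful. */
extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
extern void main(void);

/* Hypothetical reset handler; a real one must also be referenced
 * from the vector table, which this example does not define. */
void Reset_Handler(void)
{
    copy_data(&_sidata, &_sdata, &_edata);
    zero_bss(&_sbss, &_ebss);
    main();
    for (;;) { }  /* main() should not return on a bare-metal target */
}
#endif
```

Keeping the two loops as plain functions that take pointers makes them easy to exercise on a PC with ordinary arrays, while the guarded Reset_Handler shows how the linker symbols would be passed in on the target.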

To build with the linker script, pass -T <linkerfile>. Adding the option -Wl,-Map=<output> writes a map file that shows the full memory mapping.

arm-none-eabi-gcc -nostdlib main.c calculus.c -o main.o -T linker.ld -Wl,-Map=main.map

Open the file main.map to see the addresses assigned to the sections and to the symbols defined in the linker script:

Memory Configuration
Name Origin Length Attributes
FLASH 0x08000000 0x00080000 xr
RAM 0x20000000 0x00020000 xrw
*default* 0x00000000 0xffffffff
Linker script and memory map
LOAD C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
LOAD C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccqcW3xU.o
.text 0x08000000 0xd4
*(.text)
.text 0x08000000 0x54 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x08000000 main
.text 0x08000054 0x80 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccqcW3xU.o
0x08000054 addition
0x08000094 subtraction
*(.text*)
0x080000d4 . = ALIGN (0x4)
0x080000d4 _etext = .
.glue_7 0x080000d4 0x0
.glue_7 0x080000d4 0x0 linker stubs
.glue_7t 0x080000d4 0x0
.glue_7t 0x080000d4 0x0 linker stubs
.vfp11_veneer 0x080000d4 0x0
.vfp11_veneer 0x080000d4 0x0 linker stubs
.v4_bx 0x080000d4 0x0
.v4_bx 0x080000d4 0x0 linker stubs
.iplt 0x080000d4 0x0
.iplt 0x080000d4 0x0 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
.rodata 0x080000d8 0x8
*(.rodata)
.rodata 0x080000d8 0x8 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x080000d8 var_2
*(.rodata*)
0x080000e0 . = ALIGN (0x4)
0x080000e0 _sidata = LOADADDR (.data)
.rel.dyn 0x080000e0 0x0
.rel.iplt 0x080000e0 0x0 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
.data 0x20000000 0x4 load address 0x080000e0
0x20000000 _sdata = .
*(.data)
.data 0x20000000 0x4 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x20000000 var_1
.data 0x20000004 0x0 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccqcW3xU.o
*(.data*)
0x20000004 . = ALIGN (0x4)
0x20000004 _edata = .
.persistent 0x20000004 0x4 load address 0x080000e4
.persistent 0x20000004 0x4 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x20000004 test_2
.igot.plt 0x20000008 0x0 load address 0x080000e8
.igot.plt 0x20000008 0x0 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x20000008 . = ALIGN (0x4)
.bss 0x20000008 0x18 load address 0x080000e8
0x20000008 _sbss = .
*(.bss)
.bss 0x20000008 0x4 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x20000008 var_3
.bss 0x2000000c 0x0 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccqcW3xU.o
*(.bss*)
*(COMMON)
0x20000020 . = ALIGN (0x20)
*fill* 0x2000000c 0x14
0x20000020 _ebss = .
.noinit 0x20000020 0x4 load address 0x080000e8
.noinit 0x20000020 0x4 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x20000020 test_1
.ARM.attributes
0x00000000 0x2a
*(.ARM.attributes)
.ARM.attributes
0x00000000 0x2a C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
.ARM.attributes
0x0000002a 0x2a C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccqcW3xU.o
OUTPUT(main.o elf32-littlearm)
LOAD linker stubs
.comment 0x00000000 0x49
.comment 0x00000000 0x49 C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccNkHklJ.o
0x4a (size before relaxing)
.comment 0x00000049 0x4a C:\\Users\\Cuong\\AppData\\Local\\Temp\\ccqcW3xU.o

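The .text section contains 3 functions: main, addition, and subtraction.

The .data section starts at 0x20000000 and holds 4 bytes for int var_1 = 1;

The .bss section starts at 0x20000008 and holds 4 bytes for int var_3; the rest of its 0x18 bytes is alignment padding up to _ebss.

The .rodata section starts at 0x080000d8 and holds 8 bytes for double const var_2 = 0;

The symbols test_1 (in .noinit) and test_2 (in .persistent) do not appear in the main.c listing above; they presumably come from extra test variables in the build that produced this map.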

To find the symbols and their addresses:

arm-none-eabi-nm main.o
20000020 B _ebss
20000004 D _edata
080000d4 T _etext
20000008 B _sbss
20000000 D _sdata
080000e0 A _sidata
08000054 T addition
08000000 T main
08000094 T subtraction
20000020 B test_1
20000004 D test_2
20000000 D var_1
080000d8 R var_2
20000008 B var_3

How to use linker script symbols

A linker script symbol is a name for an address. In a linker script you can define symbols and assign them addresses. For example, the linker script definition:

foo = 1000;

creates an entry in the symbol table called foo which holds the address of memory location 1000, but nothing special is stored at address 1000. This means that you cannot access the value of a linker script defined symbol - it has no value - all you can do is access the address of a linker script defined symbol.

Hence, when you are using a linker script defined symbol in source code you should always take the address of the symbol, and never attempt to use its value. For example suppose you want to copy the contents of a section of memory called .ROM into a section called .FLASH and the linker script contains these declarations:

start_of_ROM   = .ROM;
end_of_ROM = .ROM + sizeof (.ROM);
start_of_FLASH = .FLASH;

Then the C source code to perform the copy would be as below. Note the use of the & operators. These are correct.

extern char start_of_ROM, end_of_ROM, start_of_FLASH;
memcpy (& start_of_FLASH, & start_of_ROM, & end_of_ROM - & start_of_ROM);

Alternatively the symbols can be treated as the names of vectors or arrays and then the code will again work as expected:

extern char start_of_ROM[], end_of_ROM[], start_of_FLASH[];
memcpy (start_of_FLASH, start_of_ROM, end_of_ROM - start_of_ROM);

Note how using this method does not require the use of & operators.

Installing the gcc-arm toolchain on Windows

  • For Windows, download the installer gcc-arm-none-eabi-5_3-2016q1-20160330-win32.exe.
  • Run the installer. Select Next > and I Agree for the terms and conditions of the license.
  • You can use the default install location, or select another. The default location may vary, but on 64-bit Windows is typically: C:\Program Files (x86)\GNU Tools ARM Embedded\5.3 2016q1
  • On the final page, be sure to select Add path to environment variable before clicking Finish. This is not the default and it is required for the builds to work properly.

Conclusion:

I hope this article has given you some working knowledge of linker scripts and that you will experiment with the example yourself.

Reference:

https://www.codeinsideout.com/blog/stm32/compilation/#linker-symbols

https://downloads.ti.com/docs/esd/SLAU132/the-noinit-and-persistent-pragmas-stdz0558942.html

https://www.nongnu.org/avr-libc/user-manual/mem_sections.html

https://docs.particle.io/archives/local-build-using-gcc-arm/#download-the-gcc-arm-toolchain-windows
