A Brief Introduction to Linkers, Loaders, Symbols & Symbol Tables

0x00: Why blog on this topic?

RIXED LABS


In the previous blog, a brief introduction to the ELF file, I took some time to describe a few necessary topics such as the headers, segments, and sections. From a reader's perspective, though, some information is still missing: how are the files linked, and how are they loaded into memory? That blog also skipped some quite important topics such as symbols and symbol tables. This blog therefore focuses on how these pieces work and what role they play in the journey from source code to an executable, and finally into memory.

0x01 : Introduction to Linkers

A small prerequisite on how compilers work will make us quite familiar with the term Linker, but in my opinion there is a pretty small probability that we are presented with much information on the topic itself. A decent blog or tutorial on how a High Level Language gets converted to Absolute Machine Code would usually sum it up like this:

High Level Language
|
(Pre-Processor)
|
| Pure HLL
|
(Compiler)
|
| Assembly Language
|
(Assembler)
|
| Relocatable Machine Code
|
(Loader/Linker)
|
Absolute Machine Code

Well, the Linker isn't described in much depth in the above map, which has been referenced from another site. A linker can be described as a program that links object modules into a single output file, typically an executable or a library. Now, what are object modules, and how do they get linked into a single file?

The assembler creates object modules (relocatable object files) from the assembly code, and the linker then combines these object modules into a single executable file.

Linking comes in two types: static and dynamic. Static linking is performed before execution: the linker takes a bunch of relocatable object files and command-line arguments and generates a fully linked object file that can be run. Dynamic linking is performed at run time: only the name of the shareable library is placed in the executable image, and the library itself is located and linked when the program is loaded or running. Compared to static linking, this type of linking has more chances of causing errors, since the library must be present and compatible at run time. On the other hand, dynamically linked code can be relocated for smooth running, with its addresses fixed only at run time.
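To make the linker's job a bit more concrete, here is a minimal sketch of static linking in Python. Everything in it is a hypothetical simplification and not the real ELF format: the module names (main_o, util_o), the symbol names (main, helper), the base address 0x1000, and the one-word-per-address "machine code" are all assumptions made for illustration. It shows the two passes a linker roughly performs: lay the modules out and record where each symbol ends up, then patch every reference with the resolved address.

```python
# Toy static linker. All data structures here are hypothetical simplifications.
# Each "object module" has:
#   code   - a list of words
#   defs   - symbols it defines (name -> offset inside its own code)
#   relocs - words to patch (offset -> name of the symbol whose address goes there)

def link(modules, base=0x1000):
    """Concatenate modules, build a global symbol table, patch relocations."""
    image = []      # the combined output "executable"
    symtab = {}     # symbol name -> absolute address
    pending = []    # (position in image, symbol name) still to be patched

    # Pass 1: place modules one after another and record symbol addresses.
    for mod in modules:
        start = len(image)
        for name, offset in mod["defs"].items():
            if name in symtab:
                raise ValueError(f"duplicate symbol: {name}")
            symtab[name] = base + start + offset
        for offset, name in mod["relocs"].items():
            pending.append((start + offset, name))
        image.extend(mod["code"])

    # Pass 2: resolve every reference against the global symbol table.
    for pos, name in pending:
        if name not in symtab:
            raise ValueError(f"undefined symbol: {name}")
        image[pos] = symtab[name]

    return image, symtab


# main_o calls helper, which is only defined in util_o.
main_o = {"code": ["call", 0, "ret"], "defs": {"main": 0}, "relocs": {1: "helper"}}
util_o = {"code": ["nop", "ret"], "defs": {"helper": 0}, "relocs": {}}

image, symtab = link([main_o, util_o])
print(symtab)   # {'main': 4096, 'helper': 4099}
print(image)    # ['call', 4099, 'ret', 'nop', 'ret']
```

Linking main_o on its own raises "undefined symbol: helper", which is roughly the error a real linker reports when a referenced symbol is not defined in any of its inputs.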

0x02: Introduction to Loaders

Although the loader is not described well in the above map either, it can be defined as the program that takes the executable produced by the linker and loads it into memory for execution; allocating the proper memory space is also done by the loader. Loaders have three types of approach: absolute, relocatable, and dynamic run-time loading. In absolute loading, in simple terms, the loader takes the output of the assembler, whose addresses are already fixed, and loads it at that fixed location in memory. In relocatable loading, the module uses addresses relative to its start, so it can be placed at any free position in memory, and the loader adjusts the addresses to match. Last but not least, in dynamic run-time loading the load module can likewise be located anywhere in main memory, but its addresses are only translated at run time.
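The three approaches can be sketched in a few lines of Python. This is only an illustration, following the usual operating-systems textbook reading of the terms; the load-module format (a list of (opcode, operand, is_address) tuples), the dictionary used as "memory", and the function names are all hypothetical.

```python
# Toy loaders. The "load module" format is a hypothetical simplification:
# a list of (opcode, operand, is_address) tuples. When is_address is True,
# the operand is an address relative to the start of the module.

def absolute_load(memory, module, fixed_base):
    """Absolute loading: addresses were bound earlier, so copy as-is."""
    for i, (op, operand, _) in enumerate(module):
        memory[fixed_base + i] = (op, operand)

def relocatable_load(memory, module, base):
    """Relocatable loading: add the chosen base to every relative address."""
    for i, (op, operand, is_address) in enumerate(module):
        if is_address:
            operand += base
        memory[base + i] = (op, operand)

def dynamic_address(base_register, relative_address):
    """Dynamic run-time loading: translate on each access, not at load time."""
    return base_register + relative_address


module = [("jmp", 2, True), ("nop", 0, False), ("halt", 0, False)]
memory = {}
relocatable_load(memory, module, base=500)
print(memory[500])               # ('jmp', 502)
print(dynamic_address(900, 2))   # 902
```

The design difference is when the base gets added: never (absolute), once at load time (relocatable), or on every access (dynamic run-time), which is what lets a running module be moved later.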

That was a small introduction to loaders and linkers; a more in-depth comparison of the two can be found here.

0x03: Introduction to Symbols & Symbol Tables

A symbol can be defined as a name for an object such as a variable or a function, which gets converted to an offset or an address during the compilation of the program. One can view the dynamic symbols of a file using

nm -D filename 

If you are using a Windows machine, a simple ELF parser will do the job for you. These symbols, or symbolic references, are exported to add context to the generated machine code.

0x04: Why do we even need Symbolic references?

An image from Wikipedia regarding Relocation

Symbols, or symbolic references, can also be stripped, and I will attach a small snap of how that is done. But first, a bit of info on why we even need symbolic references. Linkers, as described above, depend solely on symbols and symbol tables when referencing a symbol inside an object file at link time, since symbols carry much of the information needed for relocations and for matching a reference with its corresponding value in the symbol table. If you are curious what relocation even means: according to Wikipedia, it is the process of assigning load addresses to the position-dependent code and data of a program and adjusting that code and data to reflect the assigned addresses, which is done using the relocation table.

Along with linkers, debuggers are also dependent on symbols. Without them it is extremely tough to debug an executable, and analysis becomes a tough job for the person studying the functionality of the ELF executable, because the names of functions, variables, and other valuable components are no longer available.

Stripping a section: as we can see from the screenshot, we have 28 section headers. If we intend to strip a section, for instance the .text section, we can use the following command, where bloat is the name of the executable:

strip -R .text bloat

We can then check that the section has been stripped successfully.

0x05: Understanding attributes of the ELF Symbol Structure

pub struct = public structure (Rust FTW)

The ELF symbol structure consists of several attributes; let's take a moment to go over each of them briefly:

  • st_name: This attribute identifies the name of the symbol; it holds an offset into the string table where the name is stored.
  • st_info: This attribute describes the symbol's binding, which determines how the symbol can be referenced by external objects. Common bindings are STB_GLOBAL and STB_WEAK (weak symbols resemble global symbols, but their definitions can be overridden). This attribute also encodes the symbol type; for example, STT_NOTYPE means that the symbol type is not specified.
  • st_other: This attribute contains information about the symbol's visibility, which defines how a given symbol may be accessed once it is part of an executable.
  • st_shndx: This attribute specifies the index, within the Section Header Table, of the section the symbol belongs to.
  • st_value: This attribute contains the symbol value for the symbol table entry, typically an address or an offset.
  • st_size: This attribute contains the symbol's size.
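As a rough illustration of the attributes above, the sketch below unpacks one 64-bit symbol table entry with Python's struct module. It assumes the little-endian Elf64_Sym layout (24 bytes); the helper name parse_sym, the dictionary keys, and the sample entry are made up for this example, and only a handful of binding and type values are listed.

```python
import struct

# Little-endian Elf64_Sym layout: st_name (4 bytes), st_info (1), st_other (1),
# st_shndx (2), st_value (8), st_size (8) -> 24 bytes in total.
ELF64_SYM = struct.Struct("<IBBHQQ")

BINDINGS = {0: "STB_LOCAL", 1: "STB_GLOBAL", 2: "STB_WEAK"}
TYPES = {0: "STT_NOTYPE", 1: "STT_OBJECT", 2: "STT_FUNC",
         3: "STT_SECTION", 4: "STT_FILE"}
VISIBILITY = {0: "STV_DEFAULT", 1: "STV_INTERNAL",
              2: "STV_HIDDEN", 3: "STV_PROTECTED"}

def parse_sym(entry: bytes) -> dict:
    """Decode one symbol table entry (hypothetical helper for illustration)."""
    st_name, st_info, st_other, st_shndx, st_value, st_size = ELF64_SYM.unpack(entry)
    return {
        "name_offset": st_name,                          # offset into the string table
        "bind": BINDINGS.get(st_info >> 4, "other"),     # high 4 bits of st_info
        "type": TYPES.get(st_info & 0xF, "other"),       # low 4 bits of st_info
        "visibility": VISIBILITY[st_other & 0x3],        # low 2 bits of st_other
        "shndx": st_shndx,
        "value": st_value,
        "size": st_size,
    }

# Made-up entry: a global function, 42 bytes long, at 0x401000 in section 14.
raw = ELF64_SYM.pack(1, (1 << 4) | 2, 0, 14, 0x401000, 42)
sym = parse_sym(raw)
print(sym["bind"], sym["type"], hex(sym["value"]))   # STB_GLOBAL STT_FUNC 0x401000
```

Note how st_info packs two fields into one byte, which is why the binding and the type have to be separated with a shift and a mask.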

0x06: Symbol Tables

An executable can contain two distinct symbol tables, one named .symtab and the other named .dynsym. We will look at what they are, but first we need a small briefing on allocable and non-allocable sections in an ELF executable.

Allocable Sections: The sections needed by the process at runtime, and therefore mapped into memory, are described as allocable sections.

Non-Allocable Sections: The sections needed by linkers, debuggers, and other tools, but not needed by the process at runtime and not mapped into memory, are known as non-allocable sections.

Now that we understand what they are, it is quite easy to understand what .symtab actually is: a single, non-allocable symbol table. It dates back to the days when shareable libraries and dynamic linking were not needed at run time. .dynsym came into play once dynamic linking had to be performed. Since making the whole .symtab allocable was not an option, .dynsym is allocable and contains only the symbols needed for runtime operation, which also saves virtual memory in the running process. .symtab, on the other hand, is non-allocable and can be stripped.
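The allocable versus non-allocable distinction is visible in the section headers themselves. Below is a minimal sketch, assuming a little-endian ELF64 image, that walks the section header table and reports whether each symbol table section carries the SHF_ALLOC flag. The function name symbol_tables is made up for this example, and it identifies the tables by section type (SHT_SYMTAB, SHT_DYNSYM) rather than by looking up the names .symtab and .dynsym.

```python
import struct

SHT_SYMTAB, SHT_DYNSYM = 2, 11   # section types used for symbol tables
SHF_ALLOC = 0x2                  # flag: section occupies memory at run time

def symbol_tables(data: bytes):
    """Return (section type, is_allocable) for each symbol table section
    found in a little-endian ELF64 image held in `data`."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    # ELF64 header: e_shoff at 0x28, e_shentsize at 0x3A, e_shnum at 0x3C.
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 0x3A)
    found = []
    for i in range(e_shnum):
        # A section header starts with sh_name (4), sh_type (4), sh_flags (8).
        _, sh_type, sh_flags = struct.unpack_from(
            "<IIQ", data, e_shoff + i * e_shentsize)
        if sh_type == SHT_SYMTAB:
            found.append(("SHT_SYMTAB", bool(sh_flags & SHF_ALLOC)))
        elif sh_type == SHT_DYNSYM:
            found.append(("SHT_DYNSYM", bool(sh_flags & SHF_ALLOC)))
    return found
```

It can be tried on any ELF64 binary, for example symbol_tables(open(path, "rb").read()) with path pointing to an executable of your choice. On a stripped, dynamically linked binary one would expect only the SHT_DYNSYM entry to remain.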

0x07: Summary

With that, the blog has covered the contents as promised and provided a little information on all the prerequisites; I hope that after reading it we are less confused than before. If you find any wrong or incomplete information, please let me know. Thank you in advance.

Resources:

https://www.intezer.com/blog/malware-analysis/executable-linkable-format-101-part-2-symbols/

Till then happy ELF exploration!

Blog by Nerd of AX1AL. Join us at the discord server.
