Why won’t a program run on different platforms?

Maxwell Nguyen
Published in SotaTek
12 min read, Nov 13, 2018

A computer program is normally defined as a collection of instructions that performs a specific task when executed by a computer.

Programs distributed as source code, in languages such as PHP, Python, JavaScript, and so on, can run on various platforms as long as a compatible engine is available. Java source code is not directly executable; however, compiled Java bytecode can run across platforms with almost negligible differences (sometimes the differences are stark, which is why Sun’s slogan "Write Once, Run Anywhere" is often turned into "Write Once, Debug Everywhere").

Programs written in machine code (and likewise in assembly) are the exception. Technically, machine language is the only language the computer can understand and execute directly, so it is fair to say that a machine code program cannot run on many different platforms. If you have experience in C++ (a.k.a. "running Hello World" in the language of Sales and HR), you know well that to run the code on another platform you must recompile it, typically with a cross-compiler that designates the target platform, in order to produce a completely new machine code file from the same source code.

By definition, a platform is a combination of a hardware architecture and a certain version of an operating system. From this definition, how does a platform influence our machine code? This is the answer based on my working experience at https://sotatek.com.
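To make the definition concrete, here is a minimal sketch that asks the Python standard library which platform the interpreter itself is running on. The function name describe_platform is my own, and the values in the comments are only examples of what you may see.

```python
import platform
import sys


def describe_platform():
    """Return the hardware/OS pair that the running interpreter was built for."""
    return {
        "machine": platform.machine(),  # hardware architecture, e.g. 'x86_64' or 'arm64'
        "system": platform.system(),    # operating system, e.g. 'Linux', 'Windows', 'Darwin'
        "byteorder": sys.byteorder,     # 'little' or 'big'
    }


if __name__ == "__main__":
    print(describe_platform())
```

The Python script itself is portable, but the interpreter binary that runs it is machine code built for exactly one such pair.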

CPU

First of all, machine code is written in machine language, which is not one single language. Machine language is the language a CPU can read, understand and execute right away; however, different processors speak different languages. Each CPU family has a specific Instruction Set Architecture (ISA); therefore, the first fundamental element of a platform upon which machine code depends is the CPU architecture.

https://en.wikipedia.org/wiki/X86_instruction_listings

Let’s take a look at the x86 instruction listings. There are a number of differences not only between CPU manufacturers such as Intel and AMD, but also between generations in the same family from the same manufacturer. This collection of instructions is similar to the keywords of a high-level programming language, and it is also the most intuitive illustration of why machine code cannot run on many ISAs. It’s worth pointing out that these are only the differences within the instruction list of the x86 family. The difference between ARM and x86 processors is much bigger.

That is not all. The ISA of each CPU also dictates the number, names, and purposes of the registers, the supported data types, byte order, word size, the mechanisms for managing memory, the I/O model, the mechanism for propagating and handling exceptions and interrupts, and so on. All these factors prevent machine code from running correctly on two CPUs with different ISAs, even if they happened to share the same list of instructions.

Luckily, AMD and Intel processors are largely compatible: a program which runs smoothly on one manufacturer’s CPU usually runs just as smoothly on the other’s. The main differences between the two lie at the lower architectural levels, where individual instructions are executed. x86 CPUs also provide the CPUID instruction, which lets a program identify the processor and its supported features at runtime so that it can handle the behaviors specific to each one. Thanks to these fundamentals, we rarely have to compile the source code separately for each different x86 CPU.
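Byte order is the easiest of these factors to see for yourself. The sketch below uses Python’s struct module to lay out the same 32-bit value the way a little-endian CPU (such as x86) and a big-endian CPU would store it; the helper name read_u32 is mine.

```python
import struct

value = 0x12345678

# The same integer, laid out in memory for two different byte orders.
little = struct.pack("<I", value)  # little-endian, as used by x86
big = struct.pack(">I", value)     # big-endian


def read_u32(raw, byteorder):
    """Interpret four raw bytes as an unsigned integer in the given byte order."""
    return int.from_bytes(raw, byteorder)


if __name__ == "__main__":
    print(little.hex(), big.hex())
    # Reading little-endian data with the wrong byte order yields a different number.
    print(hex(read_u32(little, "big")))
```

A machine code file full of embedded constants and addresses has exactly this problem if it is handed to a CPU that expects the other byte order.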
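Python cannot issue CPUID directly, but on Linux the kernel publishes the feature flags it found in /proc/cpuinfo. The sketch below parses that text and picks a code path, which is roughly what runtime dispatch on CPU features looks like. The function names, the implementation labels, and the SAMPLE text are hypothetical and only serve the illustration.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text (x86 Linux)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_implementation(flags):
    """Choose the most capable (hypothetical) code path the CPU supports."""
    if "avx2" in flags:
        return "avx2"
    if "sse2" in flags:
        return "sse2"
    return "generic"


# Made-up sample in the shape of a real /proc/cpuinfo entry.
SAMPLE = "processor\t: 0\nvendor_id\t: GenuineIntel\nflags\t\t: fpu sse sse2 avx avx2\n"

if __name__ == "__main__":
    print(pick_implementation(parse_cpu_flags(SAMPLE)))
```

On a real Linux machine you could pass the contents of /proc/cpuinfo instead of SAMPLE.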

https://en.wikipedia.org/wiki/CPUID

So, can machine code run smoothly on every computer sharing the same ISA?

The answer is YES if the program is loaded directly from the boot sector, and it may well be NO if the program is loaded by an operating system.

The boot sector is a sector of secondary memory which the machine’s built-in firmware loads into RAM right after the computer is successfully powered on. Normally, this is where the first part of the boot loader lives, which loads and runs the first program of almost every modern system: the operating system.

A program which is booted directly from the boot sector is called a bare-metal program. It must switch the CPU mode by itself (for instance, on x86-64 processors it must switch from the default Real Mode into Protected Mode, which supports resource protection and can address more main memory, and then from Protected Mode into Long Mode to run 64-bit code). It must also contain its own code for managing all system resources, such as building page tables for main memory, allocating and managing main memory, installing handlers for I/O, interrupts and exceptions, and so on. These tasks are complex and error-prone; the OS takes care of them so that we can minimize peripheral work and focus on the actual business logic.
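As a rough sketch of what such a sector looks like, the code below builds a boot sector image following the legacy BIOS convention (an assumption on my part, the article does not name a firmware type): 512 bytes that end with the signature bytes 0x55 0xAA. The function name make_boot_sector is mine, and the only machine code in it is the two-byte x86 instruction "jmp $", an endless loop.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # legacy BIOS marks a bootable sector with these last two bytes


def make_boot_sector(code):
    """Pad raw machine code with zeros to one sector and append the boot signature."""
    room = SECTOR_SIZE - len(BOOT_SIGNATURE)
    if len(code) > room:
        raise ValueError("code does not fit in one boot sector")
    return code + bytes(room - len(code)) + BOOT_SIGNATURE


# x86 machine code for "jmp $": opcode EB with a relative offset of -2 (jump to itself).
HANG = b"\xeb\xfe"

image = make_boot_sector(HANG)

if __name__ == "__main__":
    print(len(image), image[-2:].hex())
```

Note that there is no header here at all: the file is nothing but raw machine code plus padding and a signature.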

Operating System

Delegating the authority and responsibility for managing resources to the OS means that our programs must go through the OS’s interface unless they use none of these resources. A program that uses none of them does not read or write files, print output to the console, or use hardware devices or the network. In other words, that program is useless.

In order to write a useful program (one that can at least print the line Hello World to the console), we must familiarize ourselves with another element besides the computer hardware, which is also the last constituent in the definition of a platform: the OS.

Platform ABI

The most visible sign of the OS’s influence is the executable file format. Windows executables usually have the extension .exe, while Linux and macOS executables usually have no extension at all. None of these files is raw machine code; each contains various other components depending on the OS’s requirements.

Besides the raw machine code, the executable file also includes some information such as:

  • Static variables which the program defines
  • The program’s entry point (e.g. the address of the main function)
  • Target architecture
  • Target ABI version
  • Relocation information

Not every executable file format supports all of the items above. Follow the link below for a comparison of some popular executable file formats:

https://en.wikipedia.org/wiki/Comparison_of_executable_file_formats
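Because the format is what matters rather than the file extension, a loader recognizes an executable by the "magic" bytes at the start of the file. Here is a small sketch that tells the three major formats apart; the function name detect_format is mine, and the check is deliberately shallow (it only looks at the first four bytes).

```python
import sys

# Mach-O magic numbers: 32-bit and 64-bit, in both byte orders, plus the
# "fat" (universal) binary magic, which Java class files happen to share.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",
    b"\xca\xfe\xba\xbe",
}


def detect_format(header):
    """Guess the executable format from the first bytes of a file."""
    if header[:4] == b"\x7fELF":
        return "ELF"      # Linux and most Unix-like systems
    if header[:2] == b"MZ":
        return "PE"       # Windows: a DOS "MZ" stub precedes the PE header
    if header[:4] in MACHO_MAGICS:
        return "Mach-O"   # macOS
    return "unknown"


if __name__ == "__main__":
    # Inspect the interpreter binary that is running this script.
    with open(sys.executable, "rb") as f:
        print(detect_format(f.read(4)))
```

Running it on Linux, Windows, and macOS should give three different answers for what is, from the user’s point of view, the same program.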

An example of an executable file format is ELF, the format used on Linux:

https://en.wikipedia.org/wiki/Executable_and_Linkable_Format

Developers who have little experience in system programming may be unfamiliar with the ABI but will know its closely related concept, the API.

An API (Application Programming Interface) is a communication interface which defines the protocol between components at source code level (i.e. it deals with the functions of a program and their arguments). At higher levels, an API can define the communication interface between processes on the same computer through Inter-process Communication (IPC), or even between processes on different computers connected via an intranet or the internet.

While an API defines the interface at source code level, an ABI (Application Binary Interface) defines the interface at a lower level: the binary level. Put simply, an API specifies which functions can be called and which arguments they require. An ABI, on the other hand, specifies how those functions are called and how parameters are passed in machine language. The part of the ABI that specifies how function calls are carried out is known as the calling convention. It is considered one of the most important specifications of an ABI.

The CPU itself hardly knows anything about functions. Normally, to make a function call, the CPU simply moves the instruction pointer (IP, or EIP on 32-bit x86 systems), which points to the next instruction to be executed, to the address in main memory where the function’s code is located. Tasks such as creating the stack frame that stores the function’s local variables and passed parameters, or removing these values to free up memory after the function has executed, are defined and specified by the ABI. In particular, a calling convention must, at the least, specify the following:

  • How parameters are passed to the function (on the stack, in registers, or in both). If parameters are passed in registers, the current values of these registers may need to be saved first and restored after the call returns.
  • How parameters and local variables are cleaned up (who does the cleaning, the caller or the callee? How is the data restored to its original state?)
  • Where is the return value stored? How does the caller read it?
  • How are exceptions propagated and handled?
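The first three points can be sketched with a toy model. The code below is not real machine code: it imitates, on a Python list, a cdecl-style convention in which the caller pushes arguments from right to left, the callee leaves its result in a register named eax, and the caller cleans the stack afterwards. The class and function names are hypothetical, and the return address that a real call would also push is left out.

```python
class ToyMachine:
    """A drastically simplified CPU: one stack and one return-value register."""

    def __init__(self):
        self.stack = []
        self.registers = {"eax": 0}


def callee_sub(machine):
    # The callee finds its arguments on the stack: first argument nearest the top.
    a = machine.stack[-1]
    b = machine.stack[-2]
    machine.registers["eax"] = a - b  # the return value is left in eax


def caller(machine, a, b):
    depth_before = len(machine.stack)
    # Arguments are pushed right to left, so the first argument ends up on top.
    machine.stack.append(b)
    machine.stack.append(a)
    callee_sub(machine)
    # Caller clean-up: the caller removes the arguments it pushed.
    del machine.stack[depth_before:]
    return machine.registers["eax"]


if __name__ == "__main__":
    print(caller(ToyMachine(), 10, 3))
```

If the caller pushed left to right while the callee still expected right to left, the result would silently become 3 - 10. That is the kind of mismatch an agreed calling convention prevents.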

Some examples of calling conventions:

https://en.wikibooks.org/wiki/X86_Disassembly/Calling_Convention_Examples#Example:_C_Calling_Conventions

Because it defines the communication interface at source code level, an API does not have to depend on the platform. By contrast, because it works at binary level, an ABI depends heavily on the platform on which it runs. Therefore, if a program needs to run on different platforms, in principle it only has to be re-compiled for the target platform, so that the compiler can apply the new ABI without the source code having to change.

In the header of an ELF file, the e_ident field stores, among other things, the target ABI of the executable file (the EI_OSABI byte). The default value 0x00, which is what most Linux distributions use, stands for System V.

The System V Application Binary Interface defines the function calling convention in machine language, the executable file format, and the library linking mechanism. ELF itself is a part of the System V Application Binary Interface.

We all know that, to be executed, a program’s code has to be loaded into main memory. If a program is loaded directly from the boot sector, the copy in memory is exactly the same as the machine code file stored in secondary memory. In this simplified picture, its first instruction is stored at address 0x00000000 of main memory and the rest follow at increasing addresses, and the IP register (or EIP) starts with the value 0x00000000 (real firmware picks its own load address; a legacy x86 BIOS, for instance, places the boot sector at 0x7C00). This is called the program’s entry point: the first instruction to be executed whenever the program runs.

When the OS enters the story, the program is loaded by a component called the loader. It reads information from the executable file, validates it and checks the target architecture, initializes static values, passes the command line arguments to the program, and sets EIP to the value of the e_entry field in the file header. This allows the program’s main function to be placed at any position in the machine code instead of at the very first line.
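To show where these values live, here is a minimal sketch of an ELF header reader based on the field layout in the ELF specification. It decodes only a handful of fields (class, byte order, OS ABI, machine, and the entry point address e_entry) and knows only a few of the many defined constants. The function name and the SAMPLE header are mine; SAMPLE is a synthetic 64-bit little-endian header, not taken from a real binary.

```python
import struct

OSABI_NAMES = {0x00: "System V", 0x03: "Linux", 0x09: "FreeBSD"}
MACHINE_NAMES = {0x03: "x86", 0x28: "ARM", 0x3E: "x86-64", 0xB7: "AArch64"}


def parse_elf_header(data):
    """Decode a few fields from the start of an ELF file."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data, ei_osabi = data[4], data[5], data[7]
    endian = "<" if ei_data == 1 else ">"  # EI_DATA: 1 = little-endian, 2 = big-endian
    # e_type, e_machine and e_version follow the 16-byte e_ident array.
    e_type, e_machine, e_version = struct.unpack_from(endian + "HHI", data, 16)
    # e_entry is 4 bytes wide in ELF32 (EI_CLASS 1) and 8 bytes in ELF64 (EI_CLASS 2).
    (e_entry,) = struct.unpack_from(endian + ("I" if ei_class == 1 else "Q"), data, 24)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "byteorder": "little" if ei_data == 1 else "big",
        "osabi": OSABI_NAMES.get(ei_osabi, hex(ei_osabi)),
        "machine": MACHINE_NAMES.get(e_machine, hex(e_machine)),
        "entry": e_entry,
    }


# Synthetic header: ELF64, little-endian, OS ABI 0x00, machine 0x3E, entry 0x401000.
SAMPLE = (b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
          + struct.pack("<HHIQ", 2, 0x3E, 1, 0x401000))

if __name__ == "__main__":
    print(parse_elf_header(SAMPLE))
```

On a Linux machine you could feed it the first 64 bytes of any binary, for example one under /bin, instead of SAMPLE.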

At the same time, the OS can change the program’s base address from 0x00000000; the relocation information in the file helps the OS update the addresses referenced in the machine code according to the new base address. Thanks to address relocation, the program can be loaded at any position in memory instead of only at address 0x00000000. This reduces the dependence on the current state of main memory and makes it possible to run multiple instances of a program in memory.

In fact, thanks to the popularity of memory paging, a technique that maps the memory addresses used by a program (called logical addresses) onto physical addresses in the computer’s main memory, modern programs no longer need address relocation in order to be loaded at any position in main memory. Two programs can use the same logical address, which is mapped to two different physical addresses that the programs do not even know about. This allows each instance of a program to be loaded and executed in its own memory sandbox, where each instance believes it is the only loaded process and that its base address is 0x00000000.

That does not make address relocation useless, however. Since we are using a loader, the in-memory version of the program no longer needs to be an exact copy of the raw machine code in the executable file. Modern OSs provide a feature called shared libraries, or dynamically linked libraries. Your program can use external modules of the OS without copying their machine code into your executable file at compile time, as is done with statically linked libraries. Each shared library is an object file: a file in an executable file format which cannot be executed directly. The shared library’s machine code is loaded from the object file into memory either at load time of another executable file or at runtime of a process. The in-memory version of our program is now a combination of the machine code in the executable file and the machine code of the required libraries.

If multiple processes use the same library, the library code is loaded only once and is shared between the memory sandboxes by mapping it into the address space of each process. Since the logical address at which the library ends up in a process can differ from the address the library expected, we need to relocate the library’s addresses in each process using its relocation information.

As for the calling convention, if the program only calls its own functions, it can define its own calling conventions. However, if it intends to use shared libraries, it has to follow one agreed calling convention so that all binary components can communicate with each other.

To summarize, the program depends closely on each platform’s ABI. If it does not follow the ABI, the OS loader will not be able to load the program into memory, or the program will not be able to interact with outside libraries.
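Address relocation can be sketched in a few lines. In the toy model below, the "relocation information" is simply a list of offsets at which the image contains a 32-bit absolute address; real formats use richer relocation records. The function name relocate and the sample image are mine. The image holds the 32-bit x86 instruction "mov eax, [0x10]" (opcode A1) followed by "ret" (C3).

```python
import struct


def relocate(image, reloc_offsets, old_base, new_base):
    """Patch every absolute address listed in reloc_offsets for a new base address."""
    patched = bytearray(image)
    delta = new_base - old_base
    for offset in reloc_offsets:
        (address,) = struct.unpack_from("<I", patched, offset)
        struct.pack_into("<I", patched, offset, (address + delta) & 0xFFFFFFFF)
    return bytes(patched)


# mov eax, [0x00000010] ; ret   (assembled for base address 0)
IMAGE = b"\xa1" + struct.pack("<I", 0x10) + b"\xc3"
RELOCS = [1]  # the absolute address starts one byte into the image

if __name__ == "__main__":
    moved = relocate(IMAGE, RELOCS, old_base=0x0, new_base=0x400000)
    print(moved.hex())
```

Only the address bytes change; the opcodes stay untouched, which is why the loader needs the list of offsets and cannot simply guess.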
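Here is a toy model of that mapping, assuming 4 KiB pages and a page table reduced to a plain dictionary from page number to physical frame number (real page tables are multi-level hardware structures). The names translate, process_a, and process_b are hypothetical.

```python
PAGE_SIZE = 4096  # assuming 4 KiB pages


def translate(page_table, logical_address):
    """Map a logical address to a physical one through a per-process page table."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError("page fault: page %d is not mapped" % page)
    return page_table[page] * PAGE_SIZE + offset


# Two processes both use logical page 0, backed by different physical frames.
process_a = {0: 5}
process_b = {0: 9}

if __name__ == "__main__":
    print(hex(translate(process_a, 0x10)), hex(translate(process_b, 0x10)))
```

Both processes read "their" address 0x10, yet they touch different physical memory and never see each other.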

Operating System API

Unlike a bare-metal program, programs loaded by the OS do not contain code that manages system resources. In fact, even if we inject machine code that tries to access system resources into such a program, that code still cannot be executed. Later Intel CPUs provide an execution mode called Protected Mode which, together with paging supported by the Memory Management Unit (MMU), restricts access to I/O ports, interrupts, main memory, and so on. An x86 CPU provides four privilege levels for each code segment in main memory, corresponding to privilege rings 0 to 3, where ring 0 has the most power and ring 3 the least. Each request to execute an instruction or access a system resource can require a minimum privilege level.

As soon as it is loaded by the boot loader, the OS creates a memory section holding the kernel code and data, called kernel space, running in ring 0. It then sets the required privilege levels on the other resources and lets kernel space manage them. The OS loads all subsequent code into the least powerful memory section (ring 3), called user space. Every operation on system resources requires kernel space’s permission, obtained through the API that the OS provides.

A common misconception about kernel space is that the kernel runs as an independent process in the system. As we know, each CPU core can only execute one program at a time. If a single-core CPU is executing a user program, and the kernel were a separate process which that program never voluntarily yielded the CPU to, the kernel code could never stop the user process. In reality, similar to shared libraries, kernel space is mapped into memory when the user program is loaded and becomes a part of the program’s address space. At any time, there is always a section of kernel code ready to handle interrupts and exceptions and to take the CPU away from user space.

Since kernel space becomes a part of the program together with user space, just like a shared library, the way user space calls APIs in kernel space is similar to the way it calls a function in a shared library: using a calling convention.
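The privilege check can be pictured with a toy model like the one below. It is only a sketch: the function execute, the exception class, and the set of instructions are my own simplification, although hlt, lgdt, and lidt really are x86 instructions that may only run in ring 0.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the fault a CPU raises on a privilege violation."""


# A few x86 instructions that are only allowed in ring 0.
PRIVILEGED = {"hlt", "lgdt", "lidt"}


def execute(instruction, current_ring):
    """Pretend to run one instruction at the given privilege ring."""
    if instruction in PRIVILEGED and current_ring != 0:
        raise GeneralProtectionFault(
            "%s is not allowed in ring %d" % (instruction, current_ring))
    return "%s executed in ring %d" % (instruction, current_ring)


if __name__ == "__main__":
    print(execute("hlt", 0))      # kernel code may halt the CPU
    try:
        execute("hlt", 3)         # user code may not
    except GeneralProtectionFault as fault:
        print("fault:", fault)
```

The same bytes are valid machine code in both cases; what differs is the ring in which the CPU encounters them.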

The API that the OS provides for the user space to request the kernel space to perform a task is called a system call. The rules of making a system call are stated in system call calling convention. This convention differs from other calling conventions because it completely isolates the code, stack, and data of the two spaces. Moreover, it also changes the current privilege level from ring 3 to ring 0 and from ring 0 back to ring 3.

This convention varies depending on the platform. In addition, the system calls that each OS provides are different, in terms of number, name, purpose, number of parameters, and order of parameters.

The links below show the system call tables of Linux and of the Windows NT kernels. The difference is enormous, and even with an OS loader and an ABI in place, it is impossible for a Linux program to run normally on Windows and vice versa.

https://syscalls.kernelgrok.com/

https://github.com/j00ru/windows-syscalls
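Even within Linux, the system call numbers differ between ABIs. The sketch below hard-codes a few well-known numbers from the Linux system call tables for 32-bit x86 and for x86-64. The dictionary keys and the function name are my own labels, and only five calls are listed.

```python
# A handful of Linux system call numbers for two x86 ABIs.
SYSCALL_NUMBERS = {
    "linux-i386": {"exit": 1, "read": 3, "write": 4, "open": 5, "close": 6},
    "linux-x86_64": {"read": 0, "write": 1, "open": 2, "close": 3, "exit": 60},
}


def syscall_number(platform_key, name):
    """Look up the number a program must put in a register to request a system call."""
    return SYSCALL_NUMBERS[platform_key][name]


if __name__ == "__main__":
    for key in SYSCALL_NUMBERS:
        print(key, "write =", syscall_number(key, "write"))
```

Machine code that loads 1 into the system call register means "write" on x86-64 Linux but "exit" on 32-bit x86 Linux, so a binary built for one cannot simply be relabeled for the other.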

As a matter of fact, there is a kind of program called a compatibility layer, which tries to provide an interface that allows executable files of another platform to run on the host platform. The most famous compatibility layer is the WINE project, which makes it possible to run Windows PE files on Linux. WINE originally stood for Windows Emulator, but it was later changed to stand for "Wine Is Not an Emulator", because WINE actually runs the native PE file directly rather than acting as an emulator or virtualization layer.

To execute a native PE file, WINE translates Windows system calls into the corresponding Linux ones, provides the system libraries, creates a new virtual file system, and so on. Obviously, the difference between platforms is too great for a compatibility layer to guarantee that a program works equally well on both the foreign and the host platform.

Conclusion

Having considered how a platform affects machine code, we can summarize that a machine code program, which is a collection of instructions in machine language, depends on a particular platform through the following features, and that is why a program compiled into machine code cannot be executed on another platform:

  • Instruction Set Architecture (ISA)
  • Platform’s Application Binary Interface (ABI)
  • Operating System’s Application Program Interface (API)

This analysis still needs more constructive criticism. We welcome all of your feedback! Thank you!
