Bare-Metal MCU #6: Compilers, Assemblers, and Friends

bhuvaneshwari kanagaraj
5 min read · Jul 14, 2024

In this blog, we’ll delve into the compilation process specific to Arduino: how C becomes assembly, how assembly becomes machine code, and how that machine code ends up on the chip.

Let’s illustrate how a compiler transforms human-readable code into machine code. We’ll take a sample piece of code and examine its conversion through each stage of the compilation process.

main.c

i = i + 1;

main.asm

lds r24, 22
adiw r24, 1
sts 22, r24

Assembly language can be translated directly into machine code, a process typically performed by a program called an Assembler. Alternatively, if you’re daring, you can also translate assembly language into machine code manually.

main.hex

If you’re curious about how this hex code suddenly appeared, let me explain. We’ll refer to the datasheet for clarification.

Let’s take the instruction adiw r24, 1 and convert it into hexadecimal. I found this instruction set information in the datasheet.

From Datasheet

Okay, what the datasheet says here is that an immediate (constant) number is added to a word. For AVR, a word is a 16-bit number (stored in two 8-bit registers).

From Datasheet

We should note that operations can only be performed with registers 24, 26, 28, and 30.

We should add 1 to register 24.
opcode = 1001 0110
d = [00]

Explanation of why d = [00]: the two d bits select which register pair to use.
0 = 00 (use r24)
1 = 01 (use r26)
2 = 10 (use r28)
3 = 11 (use r30)

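The constant is 6 bits wide. In the low byte of the opcode, laid out as KKdd KKKK, its upper two bits sit before the d bits and its lower four bits come after them. With the constant 1 and d = 00, the low byte is:

1 = 0000 0001

Finally, if we want to add 1 to r24, the full 16-bit opcode is: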

1001 0110 0000 0001
1001 0110 0000 0001 in hex is 0x9601
(On a Mac, the Calculator app's View > Programmer mode can do this binary-to-hex conversion.)
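The bit-packing above can be sketched as a small C function. This is only an illustration of the 1001 0110 KKdd KKKK layout described in this post; the function name encode_adiw and the "return 0 on invalid operands" convention are assumptions made for this sketch, not part of any AVR tool.

```c
#include <stdint.h>

/* Encode "adiw Rd, K" using the layout 1001 0110 KKdd KKKK.
 * rd must be 24, 26, 28 or 30; k must be in the range 0..63.
 * Returns 0 for invalid operands (a convention chosen for this sketch). */
uint16_t encode_adiw(unsigned rd, unsigned k)
{
    if (rd < 24 || rd > 30 || (rd & 1u) || k > 63)
        return 0;

    unsigned dd = (rd - 24) / 2;            /* r24 -> 00, r26 -> 01, r28 -> 10, r30 -> 11 */

    return (uint16_t)(0x9600u               /* fixed opcode bits 1001 0110 */
                      | ((k & 0x30u) << 2)  /* upper two bits of K -> bits 7:6 */
                      | (dd << 4)           /* register pair      -> bits 5:4 */
                      | (k & 0x0Fu));       /* lower four bits of K -> bits 3:0 */
}
```

Calling encode_adiw(24, 1) gives 0x9601, the same value we worked out by hand.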

We got 96 01, so why are we writing it as 01 96? That’s a good question. 0x9601 is entered as [01][96] because AVR expects the least significant byte first.

This convention is called “byte ordering” or “endianness.” When storing or transmitting multi-byte data, endianness determines the order in which the bytes are arranged in memory or in a data stream. AVR microcontrollers, like many other architectures, use “little-endian” byte ordering. This means that the least significant byte (LSB) is stored at the lowest memory address or transmitted first, followed by the most significant byte (MSB). Therefore, the hexadecimal value 0x9601 is stored as [01][96] in little-endian format on AVR microcontrollers.
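As a minimal sketch of little-endian ordering, the helper below splits a 16-bit value into two bytes, low byte first. The name store_le16 is made up for this example.

```c
#include <stdint.h>

/* Store a 16-bit value in little-endian order: low byte first, high byte second. */
void store_le16(uint16_t value, uint8_t out[2])
{
    out[0] = (uint8_t)(value & 0xFFu);  /* least significant byte at the lowest address */
    out[1] = (uint8_t)(value >> 8);     /* most significant byte follows */
}
```

Passing 0x9601 fills the array with 0x01 followed by 0x96, matching the [01][96] order above.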

80 91 16 00
01 96 80 93
16 00
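The ten bytes above can be reproduced with a short sketch that "assembles" the three instructions by hand. It assumes the standard AVR encodings for lds (1001 000d dddd 0000 followed by a 16-bit address) and sts (1001 001d dddd 0000 followed by a 16-bit address), and it reuses the 0x9601 value derived earlier for adiw. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Append one 16-bit word to buf in little-endian order; returns the new length. */
static size_t emit_word(uint8_t *buf, size_t len, uint16_t word)
{
    buf[len++] = (uint8_t)(word & 0xFFu);
    buf[len++] = (uint8_t)(word >> 8);
    return len;
}

/* Hand-assemble:  lds r24, 22 / adiw r24, 1 / sts 22, r24
 * buf must have room for at least 10 bytes. Returns the number of bytes written. */
size_t assemble_increment(uint8_t *buf)
{
    const uint16_t addr = 22;   /* 0x0016, the address used in main.asm */
    const unsigned reg  = 24;   /* r24; the 5-bit register number sits in bits 8:4 */
    size_t len = 0;

    len = emit_word(buf, len, (uint16_t)(0x9000u | (reg << 4)));  /* lds r24, ...  -> 0x9180 */
    len = emit_word(buf, len, addr);                              /* 16-bit address operand */
    len = emit_word(buf, len, 0x9601u);                           /* adiw r24, 1 */
    len = emit_word(buf, len, (uint16_t)(0x9200u | (reg << 4)));  /* sts ..., r24  -> 0x9380 */
    len = emit_word(buf, len, addr);                              /* 16-bit address operand */
    return len;
}
```

The buffer ends up holding 80 91 16 00 01 96 80 93 16 00, the same sequence shown in main.hex.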

Next, let’s compile C code into AVR machine code ourselves, using avr-gcc, the same compiler Arduino relies on.

If you’re using a Mac, you can install avr-gcc by executing the following commands:

brew tap osx-cross/avr
brew install avr-gcc
The installation took me 5 to 6 mins!

If you are on Linux (Debian or Ubuntu):

sudo apt-get install gcc-avr

In the ATmega328 family, all members follow the AVR5 instruction set. If the instruction set is not explicitly specified, the compiler defaults to using the AVR2 instruction set.

With this understanding, let’s proceed to run a basic compilation command for a bare-bones setup.

avr-gcc -mmcu=<microcontroller> <sourceFile>
avr-gcc -mmcu=atmega328 dummy.c

When compiling C code with avr-gcc, the default output filename is a.out unless specified otherwise using the -o <outputName> flag.

Here are some adjustments needed for AVR development:

  • Replace byte with unsigned char, since byte is an Arduino-specific type name that plain avr-gcc does not know.
  • Use main() instead of void loop() and void setup(), as AVR-GCC expects a standard main() function for execution.
  • Use macros to give names to the memory-mapped I/O registers (such as PORTB and DDRB) so they can be read and written like ordinary variables.

These adjustments ensure compatibility and efficiency when developing AVR microcontrollers using avr-gcc.

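#define PORTB (*((volatile unsigned char*) 0x25)) // Port B data register
#define DDRB (*((volatile unsigned char*) 0x24)) // Port B data direction register

int main(void) {

    DDRB = 32; // Configure PB5 (bit 5) as an output

    while (1) // No semicolon here, otherwise the block below never runs
    {
        PORTB = 32; // Set PB5 HIGH (turn on LED)
        // Crude delay; the volatile write keeps the loop from being optimized away
        for (long i = 0; i < 1000000; i++) { PORTB = 32; }
        PORTB = 0; // Set PB5 LOW (turn off LED)
        // Same delay, this time holding the LED off
        for (long i = 0; i < 1000000; i++) { PORTB = 0; }
    }
}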
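The value 32 in the code above is simply bit 5 set (1 << 5), which is why it controls PB5. Here is a small sketch of that idea with helper functions that set or clear a single bit without disturbing the others. BIT, set_bit, and clear_bit are hypothetical names for this example, and the functions take a pointer so they can be tried on an ordinary variable on a PC as well as on a real register.

```c
/* Mask with only bit n set, e.g. BIT(5) == 32 (hypothetical helper macro). */
#define BIT(n) ((unsigned char)(1u << (n)))

/* Set bit n of the register, leaving all other bits unchanged. */
void set_bit(volatile unsigned char *reg, unsigned n)
{
    *reg |= BIT(n);
}

/* Clear bit n of the register, leaving all other bits unchanged. */
void clear_bit(volatile unsigned char *reg, unsigned n)
{
    *reg &= (unsigned char)~BIT(n);
}
```

With the macros from this post, set_bit(&PORTB, 5) would turn the LED on and clear_bit(&PORTB, 5) would turn it off, without overwriting the other pins of port B.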
Can you see a.out?

a.out technically contains machine code, wrapped in an ELF file together with extra information such as section headers and symbols.

a.out file

A tool called avr-objcopy handles the conversion from this binary file to hex:

avr-objcopy -O ihex a.out a.hex

-O ihex sets the output format to Intel HEX, which avrdude supports. a.out is the input file, generated by compiling dummy.c with avr-gcc.

a.hex is the file that we want as the output.

We only need the .text and .data sections to be converted, so we select them with -j:

avr-objcopy -O ihex -j.text -j.data a.out a.hex
inside a.hex
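Each line inside an Intel HEX file is a record: a colon, a byte count, a 16-bit address, a record type, the data bytes, and a checksum (the two's complement of the sum of all the preceding bytes). The sketch below formats one such record. The function name ihex_record is an assumption for this example, and the real a.hex produced by avr-objcopy contains more than this, since the compiler also emits startup code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one Intel HEX record into out, which needs 2*len + 12 bytes including
 * the terminating NUL. type 0x00 = data record, 0x01 = end-of-file record.
 * Returns the number of characters written. */
int ihex_record(char *out, uint16_t addr, uint8_t type,
                const uint8_t *data, uint8_t len)
{
    /* The checksum covers the count, both address bytes, the type and the data. */
    uint8_t sum = (uint8_t)(len + (addr >> 8) + (addr & 0xFFu) + type);
    int n = sprintf(out, ":%02X%04X%02X",
                    (unsigned)len, (unsigned)addr, (unsigned)type);

    for (size_t i = 0; i < len; i++) {
        n += sprintf(out + n, "%02X", (unsigned)data[i]);
        sum = (uint8_t)(sum + data[i]);
    }

    /* Two's complement of the low byte of the sum. */
    n += sprintf(out + n, "%02X", (unsigned)(uint8_t)(0u - sum));
    return n;
}
```

For the ten hand-assembled bytes from earlier, placed at address 0 for the sake of the example, this produces :0A000000809116000196809316000F. An end-of-file record (length 0, type 0x01) comes out as :00000001FF.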

Please note that if you have any questions about using shell and bash commands beyond what’s covered in this blog, feel free to ask. I’ll be happy to assist you with any related queries you may have in another context or post.

./avrdude -C ../etc/avrdude.conf -p atmega328 -c stk500v1 \
  -P /dev/cu.usbmodem1301 -B5 \
  -U flash:w:a.hex

If you are encountering issues like this,

avrdude: device signature = 0x000000 (retrying)
avrdude: device signature = 0x000000 (retrying)
avrdude: device signature = 0x000000
avrdude error: Yikes! Invalid device signature.
avrdude error: expected signature for ATmega328 is 1E 95 14
Double check connections and try again, or use -F to override
this check.

Then try this command, which adds -F to override the signature check, as the error message suggests:

./avrdude -C ../etc/avrdude.conf -p atmega328 \
  -c stk500v1 -P /dev/cu.usbmodem1301 -B5 \
  -U flash:w:a.hex -F
Uploaded!

That’s all for today. See you in the next blog!
