Learning Assembly Language

Bilal Khan
8 min read · Mar 22, 2016


In my previous blog post I talked about the basics of assembly language. Let's level up a little. In this project we will again try to crack a password-protected file: we will work out how it checks its password, write a bash script that generates a random valid password for it, and also patch the file using a hex editor.

All the files discussed in this tutorial, including the main file, can be found here. Let's run file <filename> to find some basic information about our main file.

$ file crackme
crackme: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=1638b10a748d91e4812d0ef54f2aa089b7201308, not stripped

The "not stripped" at the end of the output means the binary still contains its symbol table, which is why function names such as main and checksum will show up in the disassembly. Separately, compiling with gcc's -g flag adds debugging information, which records which line of the source code generated each instruction. The strip utility, on the other hand, removes symbols and debugging information that the binary does not need in order to run.

Let's try a random password on the file and see its behavior.

$ ./crackme
Usage: ./crackme password
$ ./crackme heydontdothis
Wrong password

Our first bet would be to print all the strings in this file (for example with strings crackme) and look for a password, but that won't work here. So let's use objdump to get the assembly code for this file and have a look at it.

$ objdump -d crackme
000000000040057d <checksum>:
40057d: 55 push %rbp
40057e: 48 89 e5 mov %rsp,%rbp
400581: 48 89 7d e8 mov %rdi,-0x18(%rbp)
400585: 48 c7 45 f8 00 00 00 movq $0x0,-0x8(%rbp)
40058c: 00
40058d: eb 14 jmp 4005a3 <checksum+0x26>
40058f: 48 8b 45 e8 mov -0x18(%rbp),%rax
400593: 0f b6 00 movzbl (%rax),%eax
400596: 48 0f be c0 movsbq %al,%rax
40059a: 48 01 45 f8 add %rax,-0x8(%rbp)
40059e: 48 83 45 e8 01 addq $0x1,-0x18(%rbp)
4005a3: 48 8b 45 e8 mov -0x18(%rbp),%rax
4005a7: 0f b6 00 movzbl (%rax),%eax
4005aa: 84 c0 test %al,%al
4005ac: 75 e1 jne 40058f <checksum+0x12>
4005ae: 48 8b 45 f8 mov -0x8(%rbp),%rax
4005b2: 5d pop %rbp
4005b3: c3 retq
00000000004005b4 <main>:
4005b4: 55 push %rbp
4005b5: 48 89 e5 mov %rsp,%rbp
4005b8: 48 83 ec 20 sub $0x20,%rsp
4005bc: 89 7d ec mov %edi,-0x14(%rbp)
4005bf: 48 89 75 e0 mov %rsi,-0x20(%rbp)
4005c3: 83 7d ec 02 cmpl $0x2,-0x14(%rbp)
4005c7: 74 20 je 4005e9 <main+0x35>
4005c9: 48 8b 45 e0 mov -0x20(%rbp),%rax
4005cd: 48 8b 00 mov (%rax),%rax
4005d0: 48 89 c6 mov %rax,%rsi
4005d3: bf b4 06 40 00 mov $0x4006b4,%edi
4005d8: b8 00 00 00 00 mov $0x0,%eax
4005dd: e8 7e fe ff ff callq 400460 <printf@plt>
4005e2: b8 01 00 00 00 mov $0x1,%eax
4005e7: eb 41 jmp 40062a <main+0x76>
4005e9: 48 8b 45 e0 mov -0x20(%rbp),%rax
4005ed: 48 83 c0 08 add $0x8,%rax
4005f1: 48 8b 00 mov (%rax),%rax
4005f4: 48 89 c7 mov %rax,%rdi
4005f7: e8 81 ff ff ff callq 40057d <checksum>
4005fc: 48 89 45 f8 mov %rax,-0x8(%rbp)
400600: 48 81 7d f8 ee 0d 00 cmpq $0xdee,-0x8(%rbp)
400607: 00
400608: 74 11 je 40061b <main+0x67>
40060a: bf c8 06 40 00 mov $0x4006c8,%edi
40060f: e8 3c fe ff ff callq 400450 <puts@plt>
400614: b8 01 00 00 00 mov $0x1,%eax
400619: eb 0f jmp 40062a <main+0x76>
40061b: bf d7 06 40 00 mov $0x4006d7,%edi
400620: e8 2b fe ff ff callq 400450 <puts@plt>
400625: b8 00 00 00 00 mov $0x0,%eax
40062a: c9 leaveq
40062b: c3 retq
40062c: 0f 1f 40 00 nopl 0x0(%rax)
NOTE: ***ONLY TWO SECTIONS ARE SHOWN HERE***

The -d option outputs instructions in AT&T syntax by default. The format of this syntax is <mnemonic> <source>,<destination>. The mnemonic is the human-readable name of the machine instruction. The source and destination operands can be registers (prefixed by %), immediate values (constants prefixed by $), memory addresses such as -0x14(%rbp), etc.

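Let's start with the main function. The current value in %rbp is pushed to the stack for later use, and %rbp is then set to the current value of %rsp. After that, 0x20 (32) bytes are reserved for main's local variables by subtracting 0x20 from the %rsp register; the stack grows towards lower addresses.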

%rbp and %rsp belong to the category of special purpose registers. %rbp is the base pointer, which points to the base of the current stack frame, and %rsp is the stack pointer, which points to the top of the current stack frame. Read more about Call Stack.

This is how memory is being reserved in the first few lines
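As a rough illustration of those first three instructions, here is a toy Python model of the prologue. It is only a sketch: the starting register values are made up, the `prologue` helper is hypothetical, and memory is reduced to a dictionary.

```python
# Toy model of main's prologue. The starting %rsp/%rbp values are made up.
def prologue(rsp, rbp, frame_size=0x20):
    stack = {}            # address -> saved 8-byte value (stand-in for memory)
    rsp -= 8              # push %rbp: move the stack pointer down by 8 bytes ...
    stack[rsp] = rbp      # ... and store the caller's base pointer there
    rbp = rsp             # mov %rsp,%rbp: the new frame starts here
    rsp -= frame_size     # sub $0x20,%rsp: reserve 32 bytes for locals
    return rsp, rbp, stack

rsp, rbp, stack = prologue(rsp=0x7FFFFFFFE508, rbp=0x7FFFFFFFE540)
print("frame base (%rbp):", hex(rbp))
print("stack top  (%rsp):", hex(rsp))
# Slots that main uses a few instructions later:
print("argc slot -0x14(%rbp):", hex(rbp - 0x14))
print("argv slot -0x20(%rbp):", hex(rbp - 0x20))
```

Both slots fall inside the 32 bytes between %rsp and %rbp, which is why the later mov instructions can address them relative to %rbp.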

For the next instructions, let's use gdb to make full sense of them. Just for the sake of clarity, let's disassemble main again in gdb.

$ gdb crackme
(gdb) disas main
Dump of assembler code for function main:
0x00000000004005b4 <+0>: push %rbp
0x00000000004005b5 <+1>: mov %rsp,%rbp
0x00000000004005b8 <+4>: sub $0x20,%rsp
0x00000000004005bc <+8>: mov %edi,-0x14(%rbp)
0x00000000004005bf <+11>: mov %rsi,-0x20(%rbp)
0x00000000004005c3 <+15>: cmpl $0x2,-0x14(%rbp)
0x00000000004005c7 <+19>: je 0x4005e9 <main+53>
0x00000000004005c9 <+21>: mov -0x20(%rbp),%rax
0x00000000004005cd <+25>: mov (%rax),%rax
0x00000000004005d0 <+28>: mov %rax,%rsi
0x00000000004005d3 <+31>: mov $0x4006b4,%edi
0x00000000004005d8 <+36>: mov $0x0,%eax
0x00000000004005dd <+41>: callq 0x400460 <printf@plt>
0x00000000004005e2 <+46>: mov $0x1,%eax
0x00000000004005e7 <+51>: jmp 0x40062a <main+118>
0x00000000004005e9 <+53>: mov -0x20(%rbp),%rax
0x00000000004005ed <+57>: add $0x8,%rax
0x00000000004005f1 <+61>: mov (%rax),%rax
0x00000000004005f4 <+64>: mov %rax,%rdi
0x00000000004005f7 <+67>: callq 0x40057d <checksum>
0x00000000004005fc <+72>: mov %rax,-0x8(%rbp)
0x0000000000400600 <+76>: cmpq $0xdee,-0x8(%rbp)
0x0000000000400608 <+84>: je 0x40061b <main+103>
0x000000000040060a <+86>: mov $0x4006c8,%edi
0x000000000040060f <+91>: callq 0x400450 <puts@plt>
0x0000000000400614 <+96>: mov $0x1,%eax
0x0000000000400619 <+101>: jmp 0x40062a <main+118>
0x000000000040061b <+103>: mov $0x4006d7,%edi
0x0000000000400620 <+108>: callq 0x400450 <puts@plt>
0x0000000000400625 <+113>: mov $0x0,%eax
0x000000000040062a <+118>: leaveq
0x000000000040062b <+119>: retq
End of assembler dump.
(gdb)

Let's put a breakpoint on the address 0x00000000004005bf (the mov %rsi,-0x20(%rbp) instruction at <+11> above) and go step by step to make sense of things. For now, let's run it without any password.

(gdb) b *0x00000000004005bf
Breakpoint 1 at 0x4005bf
(gdb) run
Starting program: ~/crackme
Breakpoint 1, 0x00000000004005bf in main ()
(gdb) info reg edi
edi 0x1 1
(gdb) info reg rsi
rsi 0x7fffffffe5e8 140737488348648

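mov %edi,-0x14(%rbp): We checked what is being moved from %edi to -0x14(%rbp). The instruction reads as: take the address stored in %rbp, offset it by -0x14, and copy the value in %edi to that address. On x86-64 Linux the first argument of a function arrives in %rdi/%edi and the second in %rsi, so for main %edi is argc (1 here, because only the program name was given) and %rsi is argv.

mov %rsi,-0x20(%rbp): This saves argv on the stack in the same way. Let's step over it: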

(gdb) nexti
0x00000000004005c3 in main ()
(gdb) x/1d $rsi
0x7fffffffe5e8: -6137
(gdb)
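The -6137 looks odd at first. One plausible reading, assuming gdb's default unit size of 4 bytes for the x command, is that %rsi holds argv, so x/1d $rsi prints the low 32 bits of the argv[0] pointer as a signed decimal. The session never shows the full pointer, so this sketch only redoes the arithmetic for the low half.

```python
import struct

value = -6137                      # what x/1d $rsi printed
low32 = value & 0xFFFFFFFF         # the same 32 bits, read as unsigned
print(hex(low32))                  # 0xffffe807

# Round trip: pack as an unsigned little-endian word, read back as signed.
raw = struct.pack("<I", low32)
assert struct.unpack("<i", raw)[0] == value
```

The result ends in e807, which sits close to the other stack addresses in the session (such as 0x7fffffffe5e8). In gdb, x/1gx $rsi would print the full 8-byte pointer in hex instead.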

nexti here means move to the next instruction. The next two instructions compare the saved argc at -0x14(%rbp) with 2 and, if they are equal, jump (je) to 0x4005e9. Let's run this program to the end and see what happens.

(gdb) continue
Continuing.
Usage: /home/vagrant/password/dont_hate_the_hacker_hate_the_code/crackme password
[Inferior 1 (process 12131) exited with code 01]
(gdb)

So basically, it checked whether the user supplied a password. As we didn't, argc was 1 rather than 2, so it didn't make the jump and printed the usage message saying that a password is needed.

Let's move to the instruction targeted by the je we just discussed and try to find out what happens in the next instructions when the user has supplied a password.

0x00000000004005e9 <+53>: mov -0x20(%rbp),%rax **WHERE je jumps if a password has been entered**

In the next instructions we see it calculating a checksum, which should be of concern to us. Let's rerun gdb from the start of our program and put a breakpoint at the address 0x00000000004005f7, where it calls the checksum function.

**rerun from beginning**
(gdb) b *0x00000000004005f7
Breakpoint 1 at 0x4005f7
(gdb) run pass
Starting program: /home/crackme pass
Breakpoint 1, 0x00000000004005f7 in main ()
(gdb) x/1s $rax
0x7fffffffe844: "pass"
(gdb)

Right before the call to checksum we see that %rax points to the input password string (probably for calculating some kind of checksum); the preceding instruction has also copied that pointer into %rdi, the first argument of checksum. Let's find out.

(gdb) nexti
0x00000000004005fc in main ()
(gdb) nexti
0x0000000000400600 in main ()
(gdb) info reg $rax
rax 0x1b7 439

Just by looking at the checksum function we get the idea that it adds up the ASCII values of the individual characters of the input password. Let's try it. For the input password “pass”, the ASCII sum is 112+97+115+115 = 439, exactly equal to the value in the %rax register (0x1b7).
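To double-check that reading of the disassembly, here is a small Python sketch that mirrors the loop in checksum instruction by instruction. It is a reconstruction from the listing above, not the original source, and the function name is simply borrowed from the symbol.

```python
def checksum(password: bytes) -> int:
    total = 0                                   # movq $0x0,-0x8(%rbp)
    for byte in password:                       # movzbl (%rax),%eax
        if byte == 0:                           # test %al,%al: stop at the NUL byte
            break
        # movsbq %al,%rax: the char is sign-extended, so bytes >= 0x80 count as negative
        signed = byte - 256 if byte >= 0x80 else byte
        total += signed                         # add %rax,-0x8(%rbp)
    return total                                # mov -0x8(%rbp),%rax

print(checksum(b"pass"), hex(checksum(b"pass")))   # 439 0x1b7
```

For plain ASCII input the sign extension never matters, so the result is just the sum of the character codes.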

In the next instruction we see that it is comparing this value with a constant.

0x0000000000400600 <+76>: cmpq $0xdee,-0x8(%rbp)

0xdee in decimal equals 3566. So at first sight we can assume that any password whose ASCII sum equals 3566 is accepted. Let's write a bash script that produces random passwords whose ASCII values add up to 3566.
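Putting the pieces together, the control flow of main can be sketched in Python as follows. This is a reconstruction from the disassembly and from the program's visible output; I am assuming the strings at 0x4006b4, 0x4006c8 and 0x4006d7 are the three messages we have seen printed, and the test password is one I constructed by hand.

```python
SECRET_SUM = 0xDEE                      # constant from cmpq $0xdee,-0x8(%rbp)

def checksum(password: str) -> int:
    return sum(password.encode("ascii"))

def main(argv):
    """Return (exit_code, message) the way the binary appears to behave."""
    if len(argv) != 2:                  # cmpl $0x2,-0x14(%rbp); je not taken
        return 1, f"Usage: {argv[0]} password"
    if checksum(argv[1]) != SECRET_SUM:
        return 1, "Wrong password"
    return 0, "Tada! Congrats"

print(SECRET_SUM)                               # 3566
print(main(["./crackme"]))
print(main(["./crackme", "heydontdothis"]))
print(main(["./crackme", "a" * 36 + "J"]))      # 36 * 97 + 74 = 3566
```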

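#!/bin/bash
# Print the ASCII code of a character (helper, not used below).
ascii() {
    printf "%d" "'$1"
}
# Print the character for an ASCII code, via an octal escape.
chr() {
    [ "$1" -lt 256 ] || return 1
    printf "\\$(printf '%03o' "$1")"
}
# Random ASCII code for A-Z
a() {
    shuf -i 65-90 -n 1
}
# Random ASCII code for 0-9
b() {
    shuf -i 48-57 -n 1
}
# Random ASCII code for a-z
c() {
    shuf -i 97-122 -n 1
}
number=0
password=""
# Keep adding random characters while more than 122 (the code for "z")
# is still missing from the sum (3444 = 3566 - 122).
while [ $((3444 - number)) -gt 0 ]
do
    # chr turns 97-99 into "a", "b" or "c", which is then called as a function
    random=$($(chr "$(shuf -i 97-99 -n 1)"))
    letter=$(chr "$random")
    password=$password$letter
    number=$((number + random))
done
# The last character makes up the difference. It lies between 1 and 122,
# so it is not always a printable character; rerun the script if it is not.
last=$((3566 - number))
letter=$(chr "$last")
password=$password$letter
echo "$password"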

On every launch this script produces a random password whose ASCII sum equals 3566.
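For comparison, the same idea can be sketched in Python. This is not a port of the script above but a variation under my own assumptions: it draws from letters and digits only, and it retries whenever the leftover value for the last character would not be a letter or digit. The names `generate`, `ALPHABET` and `HIGH` are made up for this example.

```python
import random
import string

TARGET = 0xDEE                                    # 3566, from the cmpq instruction
ALPHABET = string.ascii_letters + string.digits   # character codes 48..122
HIGH = ord("z")                                   # 122, the largest code we use

def generate(target=TARGET, rng=random):
    while True:
        chars, remaining = [], target
        # Add random characters while more than one character's worth is missing.
        while remaining > HIGH:
            ch = rng.choice(ALPHABET)
            chars.append(ch)
            remaining -= ord(ch)
        last = chr(remaining)          # the final character makes up the difference
        if last in ALPHABET:           # otherwise throw the attempt away and retry
            return "".join(chars) + last

password = generate()
print(password, sum(map(ord, password)))
```

Retrying is the simplest way to keep the password printable; a smarter generator could instead adjust the last two characters together.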

Let's generate a password and try it on the file.

$ sh script2.sh
Mt9cCtnsykKx1Ep4bug2tL14yOeCXB1yvG1V57A03QJ
$ ./crackme Mt9cCtnsykKx1Ep4bug2tL14yOeCXB1yvG1V57A03QJ
Tada! Congrats

It works! How about we patch this program so that it accepts any password? This definitely seems interesting. For this purpose we will use Emacs's built-in hex editor. We noticed that when a password is given, the je at 0x4005c7 jumps to 0x4005e9, where the checksum is computed and checked. How about we change the target of that jump so that it completely bypasses the password check, yet still keeps the usage message if no password is entered?

Start Emacs and use M-x hexl-find-file to open the file in hex mode, giving it the filename/path to your file. Using gdb and keeping track of the jumps, we notice that 0x40061b is the address it jumps to when the right password is entered, which then prints “Tada! Congrats”.

1) M-x hexl-find-file
2) Filename: ~/crackme

Seeing our assembly instructions, these are the three instructions we need to work with to come up with a patch.

4005c7: 74 20 je 4005e9 <main+0x35>
4005c9: 48 8b 45 e0 mov -0x20(%rbp),%rax
-------
40061b: bf d7 06 40 00 mov $0x4006d7,%edi

The offset of this jump is calculated by subtracting THE ADDRESS OF THE NEXT MACHINE INSTRUCTION from THE ADDRESS WHERE YOU HAVE TO JUMP.

Let's calculate the current offset of this jump by subtracting the address of the next instruction from the jump target (0x4005e9 - 0x4005c9 = 0x20). Do you notice the 20 above in the hex? That's what we need to change! The 74 before it is the opcode for the je instruction. We have to jump to 0x40061b, and the next address after the je instruction is 0x4005c9. Hence 0x40061b - 0x4005c9 = 0x52, so we need to put 52 instead of 20 there. Use Emacs to search for this hex value, replace it with 52 and save it. Additional information for using emacs as a hex editor.
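The offset arithmetic is easy to get wrong by hand, so here is a minimal Python sketch of it. The helper name `rel8` is my own; it assumes the two-byte short form of je (opcode 0x74 plus a one-byte signed offset), which is what the listing shows.

```python
JE_SHORT = 0x74    # opcode of the short je, as seen in the listing

def rel8(jump_addr, target, insn_len=2):
    """One-byte offset that makes the jump at jump_addr land on target."""
    next_insn = jump_addr + insn_len      # offsets are relative to the next instruction
    offset = target - next_insn
    if not -128 <= offset <= 127:
        raise ValueError("target is out of range for a short jump")
    return offset & 0xFF                  # two's complement, as stored in the file

print(bytes([JE_SHORT, rel8(0x4005C7, 0x4005E9)]).hex(" "))   # 74 20
print(bytes([JE_SHORT, rel8(0x4005C7, 0x40061B)]).hex(" "))   # 74 52
```

The range check matters: a one-byte offset can only reach about 127 bytes forward, and 0x52 (82) fits comfortably.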

Patching the file

Search for the hex bytes 74 20 (the match we want is followed by 48 8b 45 e0) and replace 20 with 52. Finally, run the file with any random password whose ASCII sum does not equal 3566.
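If you would rather not do the replacement by hand, the same patch can be sketched in Python. This is an alternative to the Emacs steps rather than what was used above. The `patch` and `patch_file` helpers are hypothetical names, and the longer search pattern is an extra precaution so that an unrelated 74 20 elsewhere in the file is not touched.

```python
# je +0x20 followed by the first bytes of the next instruction (mov -0x20(%rbp),%rax)
OLD = bytes.fromhex("74 20 48 8b 45 e0")
NEW = bytes.fromhex("74 52 48 8b 45 e0")   # same bytes, jump offset changed to 0x52

def patch(data: bytes) -> bytes:
    """Return a copy of data with the je offset changed from 0x20 to 0x52."""
    if data.count(OLD) != 1:
        raise ValueError("expected exactly one occurrence of the je pattern")
    return data.replace(OLD, NEW)

def patch_file(src: str, dst: str) -> None:
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(patch(data))

# Example (not run here): patch_file("crackme", "1-cracked")
```

The new file is written without the execute bit, so it needs a chmod +x before it can be run.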

$ ./1-cracked anypassword
Tada! Congrats

It works!

This tutorial is for educational purposes only. Follow me on GitHub
