TISC 2021 Challenge 8 Walkthrough — Get Schwifty

Published in

CSIT tech blog

13 min readMar 31, 2022

TISC 2021 challenge 8 is named Get-Schwifty and it consists of two parts, Web and Pwn. Upon solving the Web portion, the player will be able to obtain a binary for the Pwn portion.

In this writeup, I will not go through the Web portion because I feel that it is quite straightforward and has been well described in the writeups by our top 2 TISC challenge winners: Eugene and Yi Kai. Check out their fantastic writeups on how they completed the challenge in vastly different ways:

● https://spaceraccoon.dev/the-infosecurity-challenge-2021-full-writeup-battle-royale-for-30k#level-8-get-shwifty

● https://chooyikai.github.io/ctf/writeup/tisc-2021/2021/12/11/tisc-2021.html#8-get-schwifty

Since both of them cracked the Pwn portion in a manner that ( painfully :-) ) deviated from how I had intended it to be done, I thought it might be useful to share the intended solution (with the help of IDA Pro’s decompiler).

A little heads up on the following walkthrough: the following format will be used to refer to its counterpart in screenshots:

sub_<address> will be the counterpart of <name>_<address> in the screenshot
For example, sub_2810 will be the counterpart of stage1_sanitycheck_2810

Upon running the binary, you will be presented with a Cromulon asking the player to “SHOW ME WHAT YOU GOT!!!” and a menu in Figure 1.

Figure 1: Credits for the Cromulon ascii art to here

Here’s a rough outline for the intended walkthrough.

Stage 1 (Sanity Test): We can override a flag via a simple stack overflow to reach Stage 2.
Stage 2 (Get Recruited): (i) We can first leak the memory address of a global variable and use that to calculate the address of the secret flag function. (ii) Next, we can use a stack overflow triggered by an integer truncation bug to overwrite a function pointer with the address of the secret flag function.

Note: For a summary of the binary’s functionality, Yi Kai has impressively drawn an extensive flow chart which makes it easier to understand this binary as well as what I will be going through next.

Stage 1

In the challenge, a player will be asked to prove their sanity first before he/she is allowed into stage 2 (Get Recruited).

Reversing the binary, you will find that there is a check that resides in sub_3606 that determines if the player is allowed to continue to Stage 2 or not.

The check is to see if gSpecialFlagToGoToStage2’s last bit is zero or not. Cross referencing gSpecialFlagToGoToStage2, there is only 1 other location that allows the player to write to it, so let’s go and check it out.

After some reversing, you will find that sub_2810 contains the logic for the sanity test.

gSpecialFlagToGoToStage2 is only controllable by special’s last character and sub_2810 normally will not allow the player to modify to the last character of special. But, you can see from my comments in the screenshot above that there is actually a buffer overflow via the std::cin function which sends the user input to src which resides on an allocated stack space. Using this bug, the player will be able to override to the last character of special and the reason why it would work is because of the stack allocations created by alloca.

And since stack allocations are last-in-first-out (LIFO), the final order of the stack allocations by alloca from the top-down will be: dest, src, special. Length of src and special are each 0x20 so you would need 0x40 bytes to overflow to the last character of special but because alloca does not allocate that tightly, there is actually some space in between src and special. So you would actually need 0x50 bytes to overflow from the start of src to the last character of special with a byte that has the last bit being zero and that gSpecialFlagToGoToStage2 is not null.

One other important thing to note is gSpecialString. It will later be used for XOR-ing whatever the player enters as a string to be stored in a linked list node’s buffer for Stage 2.

Stage 2

I expected a player to first notice that there is some kind of function pointer implementation which would then tempt him/her to try and override the function pointer caller to call the secret flag function. When the player is hooked, they will look deeper into the implementation and might then realize that there is an integer truncation bug that can allow them to override the function pointer in the stack. The player should then realize that you will need to defeat ASLR first by somehow leaking some address. Hence stage 2 can be broken down into two parts — Stage 2A involves leaking an address, and Stage 2B is then about using the leaked address to override the function pointer caller.

Note: It might sound confusing because I categorized the binary into Stage 1, Stage 2A and Stage 2B but actually this stage categorization is based on the exploitation flow of the binary, not the logic flow of the binary nor the expected exploration flow of the binary.

What was unexpected from the solvers’ solutions is that the XOR part of Stage 1 enabled an arbitrary read primitive. (you can read their writeups linked at the top for details) What I had intended was for the player to use Stage 2 to creatively exploit and defeat ASLR in order to call the secret flag function by crafting primitives in heap. However, they exceeded expectations and creatively exploited it a different way.

Before covering Stage 2A, I will go through Stage 2B first in more detail on how we can get control over the function pointer call.

Stage 2B

Let’s revisit sub_3606.

To gain control over the function pointer caller, you first need to understand what table is. After some reversing and understanding the data structure (CallTable), you will see that the size of table is 0x40 and there is a function pointer array from offset 0x18 onwards.

The data structure below is roughly what I would have understood.

*Ignore* struct CallTableMetadata because I was able to make use of the privilege of being the challenge writer to recover that part of the data structure :P, for players it doesn’t really matter. Only the part after offset 0x18 in struct CallTable matters.

So where is table allocated at? In the screenshot earlier, from my comments, you will see that it is actually allocated on the stack via alloca very similar to Stage 1! So we want to see what could be allocated at a lower address from it. Going down the code, we can see that buffer is allocated by alloca as well with the size of 0x800 (max_length__int16).

Going back to the decompiled code, you can see that there are some writes to buffer at line 73 in the screenshot above that looks like a memcpy function and there is some sort of a length check first against the max_length__int16 to make sure it doesn’t go out of bounds of the buffer’s size. However, upon closer inspection, you would realized from my naming of the variables and comments that the integer passed from std::size to len__unsigned_int16 is truncated because std::size returns std::size_t. Actually, because of the truncation and passing from an unsigned value, IDA also mistakenly interprets len as unsigned to the benefit of the player (it is actually a signed short in the source code). len__unsigned_int16 is then later used in a signed comparison (JLE) against max_length__int16 where if integer passed from std::size is 0x1ffff (-1 after truncation for a 16 bit length signed integer) or 0x8000 (-32768, a misrepresentation of an unsigned value to a signed 16 bit value) then the check will pass even though len__unsigned_int16 is larger than max_length__int16. This will subsequently allow for out of bounds write on the stack from buffer.

With this knowledge in mind, the payload for Stage 2B will be something like this below.

This is so that Addr to call can override table->list.arrayIndex3. For Addr to call, we actually only needed the last 4 bytes of the secret flag function address because the program is kind enough to replace the higher 4 bytes of the function pointer. We will come back here again after we have figured out what the secret flag function address is.

So the next challenge is how to defeat ASLR and obtain the secret flag function address.

Stage 2A

So one great way to figure out the secret flag function’s address is through other functions’ addresses or global variables’ addresses then figure out if any of them can be leaked. This is because the offset to each other is fixed. As presented in the other solutions, some of them can be found on the stack, while in my solution, I will be focusing on the global variable found in the heap and this can actually be found in the first part of the main function that I named gRootNode when it is initializing the node list and the root node.

So why gRootNode? This is because after entering Get Recruited you will be given multiple commands to manipulate strings that are managed by a doubly linked list (see screenshots below). These commands looked like it has provided enough functionality to allow me to leak the address of gRootNode (n0) through the next node. This is because if I create a new node, then the new node (n1) will contain a prev pointer to n0. This way, I will be able to add a known global variable address to a heap location that I might potentially have a controlled out of bounds read on.

Option 1 provides a way to create a new node. Option 2 and 3 provide a way to manipulate the string buffer in the node. Option 4 provides a way to read all the string buffers in all the nodes. Together, they potentially provide a way to achieve arbitrary read/write primitives. But first, we will need to find out if it is buggy. Let’s list the function address for each options:

Let’s reverse and figure out the node’s data structure first. You should be able to get the following data structure.

It is actually somewhat straightforward to figure out some of these fields. These can actually be mostly obtained from reversing the “Append String” function (sub_3269) alone with more supporting evidence in a couple other functions. Since the append string function would add a new node, this would mean that it would need to initialize and fill most of the fields in the node.

From the screenshot above, my comments highlight the lines that contain clues needed to reconstruct the Node data structure:

int__len

Since the respective decompiled code above in line 21 moves DWORD to the field, we could infer that this field is 4 bytes long. IDA decompiler also helpfully suggests that it is an “int” (signed).

line 8 of sub_2b9d (screenshot above) also suggests that Node’s int__len is actually a signed integer.

buffer

Line 34 suggests that the character from input_data is being pushed to buffer. And if you look deeper into the function, you will be able to see that it is also doing the same thing for field_10. But field_10 might just be the internal direct way of accessing the string buffer, so we can just put our focus on buffer.

sub_269e, where it is used in Option 4 (screenshot above), shows the usage of Node’s buffer. This would suggest that this should be the actual string buffer.

NodePtr__next

Line 41 seems to contain the offset to the next pointer because the new_node is assigned to it. This would only be logical for it to be the next pointer for the older node as you have just added a new node with this “Append String” function.

NodePtr__prev

Line 42 seems to contain the offset to the previous pointer because the older node is assigned to the new_node’s offset, forming a doubly linked list. It would be logical for this offset to store the previous pointer.

Skipping ahead in the reverse engineering effort, let’s go directly to sub_30a5 (Replace String Option) in order to figure out whether there is a bug here.

Since the size of our input can be std::size_t (input_size) and there is no check on the input size against the max length or anything else before it enters sub_2b9d (at line 50) to write to the node’s int__len and buffer, we will be able to cause out of bounds read when we call Option 4 (sub_269e) to read all the string buffers in all the nodes.

This would work because in addition to the reasons in the paragraph above, the input_size2__unsigned_int is truncated and treated as a signed integer argument. In sub_2b9d shown in the screenshot below, input_size__int is first written to the node’s int__len before any bytes are copied over. Next, since input_size__int is signed compared (JGE) with i, we can avoid copying while maintaining int__len a large number. This will be possible if we send an input length of at least 0x80000000 (2,147,483,648) bytes which would be treated as a signed integer (-2,147,483,648). And since, 0 would always be larger than a negative number, we can enter skip the copying operation.

After using this bug to write a big number to int__len of the node, simply calling Option 4 (sub_269e) will allow us to read out-of-bounds from the node’s buffer.

Armed with this knowledge, let’s try to heap massage a little via calling “Append String” multiple times to create more nodes to fill the heap so we can achieve the following heap layout whereby a buffer is allocated right above (or below) a node.

After a short trial and error, I find that I am always able to achieve this layout for n3. Even though the other actual buffer location may be at a lower address and I have a big number in int__len to read, using n1’s or n2’s buffer would not work. This is because n1’s and n2’s buffer are not close its own node address and in the “Show what you have for the Cromulon currently” function, I am only limited to reading a maximum of 2048 bytes from the buffer. However, it would work with n3’s because n3’s actual buffer loc is closer to its own node address.

My exploit strategy after achieving this layout:

Use the bug described for the “Replace Appended String” function to override the int__len for n3, which gives us out-of-bounds write via the “Modify Appended String” (Option 3) function and out-of-bounds read via the “Show what you have for the Cromulon currently” (Option 4) function.
The purpose of the out-of-bounds read from n3’s buffer would be to leak n3’s NodePtr__prev address.
With the previous node’s address (n2), we can use this to replace n3’s buffer pointer via the out-of-bounds write that essentially gives us an arbitrary read/write primitive.
Now we have n3’s buffer pointing at n2, we can again use the Option 4 function to leak n2’s NodePtr__prev (n1).
Then, we use Option 3 function on n3, whose buffer pointer is currently pointing at n2, to replace n2’s buffer pointer with n1’s address.
Now we have n2’s buffer pointing at n1, we can finally read with the Option 3 function to leak n1’s NodePtr__prev (n0 or gRootNode).

Note: Due to the nature of my solution, it will mess up the linked list nodes’ buffer. So before I can execute Stage 2B with my payload containing the secret flag function address, I will need to do some cleaning up on the nodes. My solution is simple and that is to fix n2’s buffer from a saved copy from earlier and then null n2’s NodePtr__next to break the link to n3 and I just ignore the fact that n3 is still alive :). Without this, executing Stage 2B again would crash because in Stage 2B, it will attempt to reset the linked list, free all nodes and clear all buffers.

Ret 2 Stage 2B

After leaking gRootNode’s address, the rest is quite straightforward. The offset from gRootNode to the secret flag function is 22244. So gRootNode address minus 22244 will give you the secret flag function address. Then, we plug in the address bytes in big endian order to our payload as Addr to call and voilà, you win! :)

Afterthoughts

Looking back at the challenge, I realized that during the designing phase I should have expected more of the unexpected when it comes to the exploration flow of the binary. Additionally, if I am creating a single binary that have multiple stages with different levels of difficulty, I should have thought harder on the impact of the lower levels. And if they could have a much larger impact than originally intended :P. For example, I was talking to one of the solvers and from my impression, the challenge appeared as one continuous chunk to him even though I viewed it as separate chunks. I realized it is because I was too focused from the creator’s point of view. When he first explored, he was looking into memory copy operations and after locating Stage 1 with that, he proceeded to look for ways to exploit in Stage 1 instead of going to Stage 2 as it would appear that Stage 1 is a lower-hanging fruit as compared to Stage 2 for exploitation.