Checkpoint challenge 2018 `Steganography`- writeup on the fly

csa.checkpoint.com

Sounds like a cool challenge!
We have a code which encrypts plain inside a gif format. Let’s see how it does it.

Three Files:

  1. secret.gif- containing our flag
  2. lzwlib.py
LZW compression is the compression of a file into a smaller file using a table-based lookup algorithm.
Two commonly-used file formats in which LZV compression is used are the GIF image format served from Web sites and the TIFF image format

3. enc.py- get gif file,flag and returns steganography of the gif.

Trivial words about winning

The slow and less interesting way to solve this challenge is to read all the code and understand it deeply. The right skill for the good researcher is the ability to read just minimum necessary code to get the flag. There are many quick experiments that could follow us to the relevant code.

Let’s get started

For the beginning, I guess the most efficient way to win is comparing between the plain gif and the secret gif that I will create. Binary diff is a very strong tool for reversing without wasting time on reading code.

First Diff

Interesting first diff! Our flag split into two and has been written after the gif header:

mp, ks = M(s)
o.write(f.read(hdr_end))
o.write('\x21\xFE')
o.write(WB('RDBNB'+mp))

I need to understand what WB and M do…

WB

Debugging ≥ Reading :

<Length><Data><\x00>

If the data > 255 it would be separated to blocks.

M

M returns shuffled(upper(flag)) and a list of the key to recover the origin flag (origin indexes and is_upper).

Not bruteforce

Now we can know the shuffled flag! what is it?

secret.gif

Length: 0x1A (161 bytes)

In [60]: secretf = open("secret.gif", "rb")
In [61]: secretf.seek(0xcf)
In [62]: secretf.read(1)
Out[62]: '\x1a'
In [63]: flag = secretf.read(0x1a+1)
In [64]: flag
Out[64]: 'RDBNBWFT}YMRGO{_SL@UEK3AHI\x00'

So the flag is: WFT}YMRGO{_SL@UEK3AHI → FLAG{<16 bytes>}
Such a shame it’s too long for bruteforce! It is more than 60 bit options (in the good case) :(

Find ks

The only place that ks has been mentioned is here:

if ks:
mpindx, isup = ks.pop(0)
obuf += h(mpindx, isup, ww, hh, len(global_colors)-1)

Variables:

+----------------------+------+---------+
| mpindx | 0-22 | from ks |
| isup | 0-1 | from ks |
| ww | ? | from F |
| hh | ? | from F |
| len(global_colors)-1 | ? | from F |
+----------------------+------+---------+

F seems like a kind of a gif parser. Perhaps I can run F on the secret.gif and get the parametrs?

In [141]: F(open("tomcat-power.gif", "rb")) == F(open("tomcat-power.gif.out.gif", "rb"))
Out[141]: True

Now let’s understand h function: It involves randints, shuffeling and compressing. Every time I run it the length of the output changes.

:/

Although there’s always the same magic to all the strings! Can I find the magic in the code?

magic in code

So now, according to this pattern I can find all the interesting information. Let’s write a python code which finds the magics and reverses the data to it’s original parameters:

The asserts are very important because , from my experience, there could be a lot of bugs in parsing which without asserts we wouldn’t know about.
Let’s have a look in the h function:

I know all the parameters which can get into Q and h arguments as I showed before. The mission is to get b6 and b1.

isuuper

Let’s look at b1:

idx = unknown*2 + b1
  1. I know idx
  2. I know b1 is 1 or 0
  3. And I know a product of 2 is always even!

which means:

isup = b1 = idx % 2

mpindx

For b6 let’s have a quick look at m

m reutns always 4 numbers. According to these numbers it will be pretty easy to predict the flow and get a / b6 / mpindx:

Finally

We got all what we need! See the final code (it is not the most efficient, but it is very easy to understand) :

In this challenge I shared my thoughts and all the methods I tried until the flag.

Feel free to review and comment!