Database Reverse Engineering, Part 3: Code Reuse, Conclusion

Preface

Read in English:
Database Reverse Engineering, Part 1: Introduction
Database Reverse Engineering, Part 2: Main Approaches
Database Reverse Engineering, Part 3: Code Reuse, Conclusion
Read in Russian:
Database Reverse Engineering, Part 1: Introduction
Database Reverse Engineering, Part 2: Main Approaches
Database Reverse Engineering, Part 3: Code Reuse, Conclusion

In the second part, we studied the internals of the Microcat Ford USA database and examined the general data structures that represent vehicles and vehicle parts. The last component left to investigate is the part diagrams. Recall the data structure dependency axis and the database architecture.

The dependency axis
The database architecture

Analyzing MCImage.dat

Last time we figured out that MCData.idx, which represents the part tree, is linked to MCData.dat, which contains vehicle parts, and to MCImage[2].dat, which consists of vehicle part diagrams. The latter is linked via the image_offset field (shown in the diagram above) and the image_size field. Let’s use approaches [2.8] and [2.9] to see how images are stored in the file.

Determining the image offset and the image size
The beginning of the image

What is this? It does not look like a widely used format, and it is unlikely to be a compressed image, since there are many zeroes and repeated bytes. Let’s go on.

The middle of the image

No, it is compressed after all, so the beginning of the image must be a header. Checking other images in the file, we see that every image has a completely different header with no common byte patterns. The absence of magic numbers complicates the situation: there are no keywords to search for on the internet, apart from the fact that an image has a header.
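The visual check above ("many zeroes and repeated bytes" versus compressed-looking data) can be made a bit more objective. The following is a minimal sketch, not part of the original tooling: the WindowStats structure and the window_stats function are hypothetical names, and the thresholds for calling a window "compressed" are left to the reader.

```c
#include <assert.h>
#include <stddef.h>

/* Simple statistics for one window of bytes (hypothetical helper). */
typedef struct {
    size_t zero_bytes;      /* bytes equal to 0x00 */
    size_t repeated_bytes;  /* bytes equal to the byte right before them */
    size_t distinct_values; /* how many of the 256 byte values occur */
} WindowStats;

WindowStats window_stats(const unsigned char *buf, size_t len)
{
    WindowStats s = {0, 0, 0};
    unsigned char seen[256] = {0};
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == 0)
            s.zero_bytes++;
        if (i > 0 && buf[i] == buf[i - 1])
            s.repeated_bytes++;
        if (!seen[buf[i]]) {
            seen[buf[i]] = 1;
            s.distinct_values++;
        }
    }
    return s;
}
```

A window with many zero and repeated bytes and few distinct values looks like a header or plain data, while compressed data tends to use most byte values with few repeats. Comparing the statistics of the first bytes of an image with those of a window from its middle gives the same picture as the hex dumps.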

Finding and Debugging the Image Displaying Code

We search for the string “image” in the program libraries and get the following list.

C:\MCFNA\
18.12.02│186432│A │CSIMGL16.DLL
28.05.07│ 26048│A │FNASTART.DLL
19.08.12│215024│A │FNAUTIL2.DLL
31.10.97│ 6672│A │IMUTIL.DLL
23.05.06│2701 K│A │MCLANG02.DLL
06.09.06│2665 K│A │MCLANG16.DLL
14.04.97│146976│A │MFCOLEUI.DLL
06.09.06│2395 K│A │NAlang16.dll
14.04.97│ 57984│A │QPRO200.DLL
14.04.97│398416│A │VBRUN300.DLL

Of these, CSIMGL16, FNAUTIL2, and IMUTIL export procedures that could interest us.

We need to find a procedure that takes a compressed image as input and produces an uncompressed one as output. There is no guarantee that such a procedure exists at all, since the MCImage.dat bytes may be compressed or encrypted with some generic algorithm implemented in the depths of MCFNA.exe. Therefore, instead of disassembling the libraries right away, we will go another way. There must be functions, perhaps WinAPI ones, that display images on the screen. We need to guess those functions, set breakpoints on them, and trace their callers.

WinAPI allows us to work with images of different formats, but the simplest is BMP, which requires nothing more than a few calls to USER.exe / GDI.exe (the 16-bit analogues of user32.dll and gdi32.dll) to display an image. The hypothesis that the compressed part diagrams are bitmaps is supported by the BMP, RLE (compressed BMP), JPG, and GIF pictures residing in the Res directory.

Let’s open the WinAPI reference and note the bitmap creation and loading routines: CreateBitmap, CreateBitmapIndirect, CreateCompatibleBitmap, CreateDIBitmap, CreateDIBSection, LoadBitmap. Now we are ready to debug.

But one more remark first. Some NE files use a feature called self-loading, which allows instructions to be executed before control is transferred to the OEP (original entry point), much as PE TLS callbacks do. In our case it is used to unpack the original code packed by Shrinker.

I have tested several 16- and 32-bit debuggers, and the best equipped for NE debugging is WinDbg. The 16-bit Open Watcom and Insight debuggers were not able to launch MCFNA.exe, I think because of self-loading. OllyDbg 1 / 2 can debug NE indirectly via NTVDM but throws exceptions on 16-bit code breakpoints. x64dbg does not support NE. WinDbg, as usual, does the job. Among other things, it distinguishes the loading of NE modules from the loading of NTVDM PE libraries, stops on hardware and software breakpoints, and recognizes and disassembles 16-bit code, including displaying and understanding addresses in segment:offset form. There are problems with listings in the Disassembly window, but this is not fatal since the Command window works fine.

Now let’s talk a little about how 16-bit code is stored in NTVDM memory. Thanks to other researchers (see references) and my own experiments, it was found that all modules are loaded in the address range from 0x10000 to 0xA0000, which resembles the real mode memory layout. To find a desired 16-bit function, an ordinary byte search is enough. In particular, we take the first bytes of the bitmap creation routines and look for them in the 0x10000 to 0xA0000 range.
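The byte search itself is a plain linear scan. The sketch below shows the idea on an in-memory buffer, assuming the range has already been dumped or mapped into that buffer; find_bytes is a hypothetical helper, not something the debugger or the program provides, and the example bytes in the usage note are illustrative rather than taken from GDI.exe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of pat in hay at or after
 * start, or (size_t)-1 if there is none. Calling it again with
 * start = previous result + 1 enumerates all matches. */
size_t find_bytes(const unsigned char *hay, size_t hay_len,
                  const unsigned char *pat, size_t pat_len, size_t start)
{
    size_t i;

    assert(hay != NULL && pat != NULL);
    if (pat_len == 0 || pat_len > hay_len)
        return (size_t)-1;
    for (i = start; i + pat_len <= hay_len; i++) {
        if (memcmp(hay + i, pat, pat_len) == 0)
            return i;
    }
    return (size_t)-1;
}
```

For example, searching a dump for the bytes 55 8b ec returns the offset of every place where that sequence starts. If a sequence matches in several places, taking a longer prefix of the routine narrows the result down to one address.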

The search for routine by the first bytes example

Let’s launch the program under WinDbg. Searching for all the WinAPI functions noted above, we find that LoadBitmap of USER.exe is not present, so the only module left is GDI.exe. Then we set breakpoints on each routine.

We continue execution and break on CreateCompatibleBitmap right after switching from WinDbg to the Microcat window. This happens every time, apparently because the interface is being redrawn, so we disable that breakpoint and continue again. Then we select a vehicle, walk the part tree, and break on CreateDIBitmap twice at the moment when a part diagram should appear. No other breakpoint is hit.

The break on CreateDIBitmap

Let’s determine the callers in both cases. To do that, we take two words from the stack, [ss:sp + 2], which is the segment, and [ss:sp + 0], which is the offset, combine them, and disassemble at the resulting address.
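Combining the two stack words can be sketched as follows, assuming the four bytes at ss:sp have been copied out of the debugger into a buffer. FarAddress and far_return_address are hypothetical names used only for this illustration. On entry to a far-called routine, the return offset lies at the top of the stack and the return segment right above it, both little-endian.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned short segment; /* word at ss:sp + 2 */
    unsigned short offset;  /* word at ss:sp + 0 */
} FarAddress;

/* Reads a little-endian 16-bit word. */
static unsigned short read_word_le(const unsigned char *p)
{
    return (unsigned short)(p[0] | (p[1] << 8));
}

/* stack_top points to a copy of the bytes at ss:sp taken at the
 * breakpoint on the first instruction of the far-called routine. */
FarAddress far_return_address(const unsigned char *stack_top, size_t len)
{
    FarAddress a;

    assert(stack_top != NULL && len >= 4);
    a.offset = read_word_le(stack_top);
    a.segment = read_word_le(stack_top + 2);
    return a;
}
```

The resulting pair is written as segment:offset and passed to the disassembler as is; the bytes just before that address are the far call to the routine we broke on.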

The stack and the caller on the first break
The stack and the caller on the second break

The two callers are located in different segments, so they most likely belong to two different libraries on disk. We search the program files for the byte sequences “8b f8 83 3e 08 1f 00 74 2c 83 7e fa 00 74 15 ff” and “8b f8 83 7e f4 00 74 0d ff 76 fe ff 76 f4 6a 00”. The first sequence is found in the Visual Basic Runtime Library VBRUN300.dll and the second in FNAUTIL2.dll. That is one of the libraries in which we found exported functions related to image processing!
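Any hex editor or file manager can run such a search, but it is also easy to script. As a small sketch, the helper below turns a sequence written the way it appears in the text into raw bytes that can then be fed to any byte search over the file contents. parse_hex_pattern is a hypothetical name, and the parser assumes two hex digits per byte separated by spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Converts one hex digit to its value, or returns -1 if it is not one. */
static int hex_digit(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Parses text such as "8b f8 83" into out. Returns the number of bytes
 * written, or 0 if the text is malformed or does not fit into out_cap. */
size_t parse_hex_pattern(const char *text, unsigned char *out, size_t out_cap)
{
    size_t n = 0;

    assert(text != NULL && out != NULL);
    while (*text != '\0') {
        int hi, lo;

        if (*text == ' ') {
            text++;
            continue;
        }
        hi = hex_digit((unsigned char)text[0]);
        if (hi < 0)
            return 0;
        lo = hex_digit((unsigned char)text[1]);
        if (lo < 0)
            return 0;
        if (n == out_cap)
            return 0;
        out[n++] = (unsigned char)((hi << 4) | lo);
        text += 2;
    }
    return n;
}
```

Each of the two sequences above parses into 16 bytes, which is long enough to make an accidental match in an unrelated file unlikely.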

Analyzing the Image Displaying Code

Skipping the reversing process, where there is nothing new, I will go straight to the commented disassembly.

The exported procedure that calls CreateDIBitmap turned out to be GETCOMPRESSEDIMAGE. It receives the following arguments: ret_val_far_pointer, where an error code will be stored; screen_height and screen_width, the image resolution; image_size, the size of the compressed image; the offset of the image in MCImage.dat (remember image_offset in the DB architecture); and unk_structure_ptr, a pointer to a structure that must hold some magic values and the MCImage.dat file handle. The Visual Basic code calling this routine passes arguments using the Pascal calling convention, i.e. from left to right, so they appear oddly arranged in the disassembler.

After that, the code seeks to the specified offset in MCImage.dat and calls read_and_unpack_image, the procedure we were looking for in the previous section. It will remain a black box for us.

Once the image is unpacked, its size is adjusted to the one specified in screen_height and screen_width. Then get_palette_handle is called, which creates a palette using the CreatePalette WinAPI function, and finally create_bitmap, the routine we found under WinDbg, is called to create the bitmap from the unpacked bytes using CreateDIBitmap.

Finally, the memory holding the unpacked bytes, which is no longer needed, is freed, and the exported function returns an HBITMAP.

So we have found the function that unpacks part diagrams, along with its interface. The last thing to do is to write a tool that calls the function to unpack the required images.

Reusing the Image Unpacking Code

We have to write a 16-bit program, since FNAUTIL2.dll is 16-bit too. For that reason I have chosen the Open Watcom C compiler. Below is the code that calls GETCOMPRESSEDIMAGE from FNAUTIL2.

#include <stdio.h>    /* printf */
#include <fcntl.h>    /* O_RDONLY, O_BINARY */
#include <io.h>       /* open, close */
#include <windows.h>  /* HBITMAP */

typedef struct {
    long unk_1;
    long unk_2;
    int unk_3;
    int mcimage;    /* MCImage.dat file handle */
} ImageFileData;

/* GETCOMPRESSEDIMAGE_proc is a pointer to the FNAUTIL2.dll export;
   how it is obtained is not shown in this excerpt. */
HBITMAP decrypt_image(char* mcimage_path, unsigned long enc_image_offset,
                      unsigned long enc_image_size) {
    int mcimage = open(mcimage_path, O_RDONLY | O_BINARY);
    if (mcimage == -1) {
        printf("ERROR: cannot open mcimage '%s'\n", mcimage_path);
        return NULL;
    }

    ImageFileData data = {0};
    data.mcimage = mcimage;
    long screen_width = 0;
    long screen_height = 0;
    int ret_val = 0;
    HBITMAP bitmap = GETCOMPRESSEDIMAGE_proc(&ret_val,
        &screen_height, &screen_width, enc_image_size,
        enc_image_offset, &data);
    if (!bitmap) {
        /* The offset and size are unsigned long, so print them with %lX */
        printf("ERROR: GETCOMPRESSEDIMAGE failed (MCImage = '%s', "
               "Offset = 0x%lX, Size = 0x%lX)\n", mcimage_path,
               enc_image_offset, enc_image_size);
        close(mcimage);
        return NULL;
    }

    close(mcimage);
    return bitmap;
}

decrypt_image receives the path to the MCImage.dat file, an image offset, and an image size as input. The file is opened; the ImageFileData structure (named unk_structure_ptr in the disassembler) and the other arguments are initialized and then passed to the exported procedure. decrypt_image returns a bitmap handle. The code calling decrypt_image then saves the bitmap to disk using save_bitmap.

#include <stdio.h>    /* printf */
#include <stdlib.h>   /* calloc, free */
#include <malloc.h>   /* halloc, hfree */
#include <dos.h>      /* _dos_creat, _dos_write, _dos_close */
#include <windows.h>  /* GetDC, ReleaseDC, GetDIBits, BITMAPINFO */

int save_bitmap(HBITMAP bitmap, char* dec_image_path) {
    int ret_val = 0;
    unsigned bytes_written = 0;

    HDC dc = GetDC(NULL);

    // 256 palette entries (1 << biBitCount for biBitCount = 8),
    // 4 bytes each, plus 0x28 bytes of BITMAPINFOHEADER
    unsigned lpbi_size = 256 * 4 + sizeof(BITMAPINFOHEADER);
    BITMAPINFO* lpbi = (BITMAPINFO*)calloc(1, lpbi_size);
    if (!lpbi) {
        printf("ERROR: memory allocation for BITMAPINFO failed\n");
        ReleaseDC(NULL, dc);
        return 0;
    }

    // BITMAPINFOHEADER:
    // 0x00: biSize
    // 0x04: biWidth
    // 0x08: biHeight
    // 0x0C: biPlanes
    // 0x0E: biBitCount
    // 0x10: biCompression
    // 0x14: biSizeImage
    // 0x18: biXPelsPerMeter
    // 0x1C: biYPelsPerMeter
    // 0x20: biClrUsed
    // 0x24: biClrImportant
    lpbi->bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    lpbi->bmiHeader.biPlanes = 1;
    // First call, without a buffer: expected to fill in the header
    // (biHeight and biSizeImage are used below)
    ret_val = GetDIBits(dc, bitmap, 0, 0, NULL, lpbi, DIB_RGB_COLORS);
    if (!ret_val) {
        printf("ERROR: first GetDIBits failed\n");
        ReleaseDC(NULL, dc);
        free(lpbi);
        return 0;
    }

    // Allocate memory for image
    void __huge* bits = halloc(lpbi->bmiHeader.biSizeImage, 1);
    if (!bits) {
        printf("ERROR: huge allocation for bits failed\n");
        ReleaseDC(NULL, dc);
        free(lpbi);
        return 0;
    }

    // Second call: copy the pixels as an uncompressed 8-bit DIB
    lpbi->bmiHeader.biBitCount = 8;
    lpbi->bmiHeader.biCompression = 0;
    ret_val = GetDIBits(dc, bitmap, 0,
        (WORD)lpbi->bmiHeader.biHeight, bits, lpbi, DIB_RGB_COLORS);
    // The device context is not needed anymore
    ReleaseDC(NULL, dc);
    if (!ret_val) {
        printf("ERROR: second GetDIBits failed\n");
        hfree(bits);
        free(lpbi);
        return 0;
    }

    // Open file for writing
    int dec_image;
    if (_dos_creat(dec_image_path, _A_NORMAL, &dec_image) != 0) {
        printf("ERROR: cannot create decrypted image file '%s'\n",
               dec_image_path);
        hfree(bits);
        free(lpbi);
        return 0;
    }

    // Write file header
    BITMAPFILEHEADER file_header = {0};
    file_header.bfType = 0x4D42; // "BM"
    file_header.bfSize = sizeof(BITMAPFILEHEADER) + lpbi_size +
        lpbi->bmiHeader.biSizeImage;
    file_header.bfOffBits = sizeof(BITMAPFILEHEADER) + lpbi_size;
    _dos_write(dec_image, &file_header, sizeof(BITMAPFILEHEADER),
        &bytes_written);

    // Write info header + RGBQUAD array
    _dos_write(dec_image, lpbi, lpbi_size, &bytes_written);

    // Write image in blocks of at most 0x8000 bytes
    DWORD i = 0;
    while (i < lpbi->bmiHeader.biSizeImage) {
        WORD block_size = 0x8000;
        if (lpbi->bmiHeader.biSizeImage - i < 0x8000) {
            // Explicit casting because the difference
            // will always be < 0x8000
            block_size = (WORD)(lpbi->bmiHeader.biSizeImage - i);
        }

        _dos_write(dec_image, (BYTE __huge*)bits + i, block_size,
            &bytes_written);
        i += block_size;
    }

    _dos_close(dec_image);
    hfree(bits);
    free(lpbi);
    return 1;
}

The input arguments of save_bitmap are an HBITMAP and the file path where the bitmap should be stored. First, memory is allocated for BITMAPINFO, which consists of BITMAPINFOHEADER and an RGBQUAD array and describes the image resolution and colors. Then memory is allocated for the bitmap bytes converted from the HBITMAP. This allocation is made with halloc, which returns a pointer with the __huge attribute, indicating that the memory block can be larger than 64 KB. The call to GetDIBits copies the bitmap from the handle to the allocated memory. Finally, BITMAPFILEHEADER, BITMAPINFO, and the bitmap bytes are written to the file. I had to put the file writing code in a loop because _dos_write cannot write more than 64 KB at once.
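To make the resulting file layout concrete, here is a small worked sketch of the size arithmetic for an uncompressed 8-bit BMP. Bmp8Layout and bmp8_layout are hypothetical names. The sketch computes the image size from the dimensions using the usual rule that each scan line is padded to a multiple of 4 bytes, whereas save_bitmap takes biSizeImage from GetDIBits, so treat the computed image size as an assumption about what that call reports.

```c
#include <assert.h>

enum {
    BMP_FILE_HEADER_SIZE = 14,      /* sizeof(BITMAPFILEHEADER) */
    BMP_INFO_HEADER_SIZE = 40,      /* sizeof(BITMAPINFOHEADER), 0x28 */
    BMP_PALETTE_SIZE     = 256 * 4, /* 256 RGBQUAD entries */
    WRITE_BLOCK_SIZE     = 0x8000   /* block size of the write loop */
};

typedef struct {
    unsigned long row_stride;  /* bytes per scan line, padded to 4 */
    unsigned long image_size;  /* row_stride * height */
    unsigned long off_bits;    /* where the pixel data starts (bfOffBits) */
    unsigned long file_size;   /* total file size (bfSize) */
    unsigned long write_calls; /* blocks needed to write the pixel data */
} Bmp8Layout;

Bmp8Layout bmp8_layout(unsigned long width, unsigned long height)
{
    Bmp8Layout l;

    assert(width > 0 && height > 0);
    l.row_stride = (width + 3UL) & ~3UL;
    l.image_size = l.row_stride * height;
    l.off_bits = BMP_FILE_HEADER_SIZE + BMP_INFO_HEADER_SIZE +
                 BMP_PALETTE_SIZE;
    l.file_size = l.off_bits + l.image_size;
    l.write_calls = (l.image_size + WRITE_BLOCK_SIZE - 1) / WRITE_BLOCK_SIZE;
    return l;
}
```

For a hypothetical 1021 x 768 diagram, the scan line is padded to 1024 bytes, the pixel data takes 786432 bytes and starts at offset 1078, the file is 787510 bytes long, and the write loop needs 24 blocks of 0x8000 bytes.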

As a result, we have a utility that solves the part diagram unpacking problem.

The final dependency axis

Conclusion

This concludes the database reverse engineering article series. Initially I planned to write more articles, but it became clear that the basic, key information fits into three parts. There is no need to point a telescope at what can be seen through binoculars; it only narrows the view for those who are watching.

Future work in the DBRE field could cover the following subjects.

  • Automate file format reverse engineering by creating analytical software that uses heuristic algorithms to reconstruct tables, records, and fields. It should be interactive, allowing a user to fix and complement the data structures guessed by the program. It should be a positive feedback system that takes user-defined data structures into account when attempting to reconstruct other structures. Think of it as “IDA Pro for reverse data engineering”.
  • Automate the cross-reference research process by creating software based on the previous tool. It could implement heuristics to determine which bytes, words, and dwords are offsets pointing into the file(s) of a database. This can be done using knowledge of the database file formats, and it can in turn expand that knowledge.
  • Develop other DBRE approaches. Those described in the previous article are not the only ones, and I believe there are more ways to research databases.
  • Contribute reverse engineered file formats to public resources, even if you only do file format RE and not database RE. For example, there is a format library maintained by the Kaitai Struct developers (see references).

The end of the series also has a symbolic meaning for me. The end of the year is the time to sum up and move on to the next turn of the spiral. I felt it was my duty to share knowledge with other researchers before switching to a different reverse engineering direction. It is up to you to judge how well I succeeded.

References

Disclaimer

The text is for informational purposes only. The author is not liable for any misuse of knowledge obtained from this article. Copying and distribution of information without the permission of the rightholders is illegal.