PDF Forensics Workshop (Sample n1)

Peter Matkovski
Apr 17, 2019

Obfuscation by character substitution

The goal of this workshop is to manually find exploits and shellcode within obfuscated JavaScript code delivered in a PDF file. For effective defense, we need to know which vulnerability the attacker is trying to abuse and how the malware will call home.

PDF readers are popular targets because they need to support a wide range of object types (graphics, fonts, ActionScript, …). That implies a huge codebase of core functions and third-party libraries, which in turn leads to programming errors and vulnerabilities.

JavaScript, which Adobe Reader and many other readers support, can be used to obfuscate code in a practically unlimited number of ways. Our first sample is an easy one, containing old, well-known vulnerabilities. The hash of Sample1 is ff42379bcf89646613c334f0d6089578f9f230cc; you can download the sample from your favorite malware repository.
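Before starting, it is worth confirming that the downloaded file really is the sample described here. The following is a minimal Node.js sketch, assuming the 40-character hash above is a SHA-1 digest and that the file was saved as sample1.pdf; the helper names `sha1Hex` and `matchesExpected` are made up for this example.

```javascript
const crypto = require("crypto");
const fs = require("fs");

// Hash quoted in the article; treating it as SHA-1 is an assumption based on its length.
const EXPECTED_SHA1 = "ff42379bcf89646613c334f0d6089578f9f230cc";

// Return the SHA-1 digest of a Buffer or string as lowercase hex.
function sha1Hex(data) {
  return crypto.createHash("sha1").update(data).digest("hex");
}

// Read a file from disk and compare its digest with the expected one.
function matchesExpected(path, expected = EXPECTED_SHA1) {
  return sha1Hex(fs.readFileSync(path)) === expected.toLowerCase();
}
```

Calling `matchesExpected("sample1.pdf")` in an isolated analysis VM should return true only if the file matches the hash above.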

ANALYSIS

The analysis starts with locating the obfuscated code within the PDF's internal objects. We will use peepdf in interactive mode (-i), ignoring errors (-f) and catching malformed objects (-l):

peepdf -fil sample1.pdf

peepdf found multiple suspicious objects, which we will go through.

Following the references between the suspicious objects manually leads us to object 7, which contains the obfuscated code.

Now we can export the obfuscated code to a new file using peepdf's built-in js_beautify command.

js_beautify output in the new file — line 10 is trimmed

We can already deobfuscate a few function names, because only character replacement was used:

Line 2: jeroqurul == ‘fromCharCode’

Line 3: sedu == eval

The easy way is to let a JS console do the character replacement for us:
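As a rough illustration of that console trick, here is a minimal sketch. The obfuscated string and the substitution map are hypothetical (the sample's real scheme is not reproduced here), and `substitute` is a helper invented for this example.

```javascript
// Replace every character that has an entry in the map; leave the rest untouched.
function substitute(text, mapping) {
  return Array.from(text, (ch) => (mapping.has(ch) ? mapping.get(ch) : ch)).join("");
}

// Hypothetical obfuscated name and substitution table (not taken from the sample).
const obfuscatedName = "fr0mCh@rC0de";
const table = new Map([["0", "o"], ["@", "a"]]);

const recoveredName = substitute(obfuscatedName, table);
console.log(recoveredName);

// With the method name recovered, numeric char codes become readable text.
// The string is only printed here, never evaluated.
const decoded = String[recoveredName](101, 118, 97, 108);
console.log(decoded);
```

The same pattern explains both findings above: one variable ends up holding a method name such as 'fromCharCode', and another ends up holding a reference to eval.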

Finding the eval call is crucial, because it reveals what code is about to be executed. We can display that code simply by replacing eval with a print function.

Line 16: sedu(x) => print(x)

We replaced eval with print and executed the stage0.js file in the V8 JavaScript interpreter, which printed the next layer of code. We saved that new code to the stage1.js file.
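The eval-to-print swap can be sketched as follows. This is a simplified stand-in, not the sample's code: the alias names `sedu` and `jeroqurul` come from the analysis above, while the char codes and the `captured` array are hypothetical. The key point is that the hook records the string instead of executing it.

```javascript
const captured = [];

// Stand-in for the sample's eval alias: record the string, never execute it.
function sedu(code) {
  captured.push(String(code));
}

// Hypothetical stage-0 tail: builds the next stage from char codes and "evals" it.
const jeroqurul = "fromCharCode";
const nextStage = String[jeroqurul](97, 108, 101, 114, 116, 40, 49, 41);
sedu(nextStage);

// The next stage is now plain text that can be written to a file and inspected.
console.log(captured[0]);
```

Running untrusted script this way should still happen inside an isolated VM, since code before the hooked call executes normally.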

The output of eval function replaced with a print (TRIMMED)

Let's take a closer look at the code we revealed; it has only 90 lines. The first interesting function is util_print() on lines 10–31:

This function contains shellcode that downloads the next stage from an external address, heap spray code that fills a region of memory, and an exploit for a buffer overflow vulnerability in util.printf().
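Shellcode in JavaScript exploits is commonly stored as a %uXXXX-escaped string, where each escape is one little-endian 16-bit word. Assuming this sample follows that convention, a minimal sketch for turning such a string into bytes and pulling out readable strings (such as a download URL) could look like this. The encoded value below is a harmless, hand-built example, not the sample's shellcode, and both helper names are invented.

```javascript
// Convert "%uXXXX" escapes into a byte array (low byte first, then high byte).
// Other escape forms such as "%XX" are ignored in this sketch.
function decodePercentU(encoded) {
  const bytes = [];
  const re = /%u([0-9a-fA-F]{4})/g;
  let m;
  while ((m = re.exec(encoded)) !== null) {
    const word = parseInt(m[1], 16);
    bytes.push(word & 0xff, word >> 8);
  }
  return bytes;
}

// Collect runs of printable ASCII of at least minLen characters.
function printableStrings(bytes, minLen = 4) {
  const found = [];
  let current = "";
  for (const b of bytes) {
    if (b >= 0x20 && b <= 0x7e) {
      current += String.fromCharCode(b);
    } else {
      if (current.length >= minLen) found.push(current);
      current = "";
    }
  }
  if (current.length >= minLen) found.push(current);
  return found;
}

// Hypothetical input: four 0x90 bytes followed by an embedded, null-terminated URL.
const encoded = "%u9090%u9090%u7468%u7074%u2f3a%u612f%u742e%u7365%u2f74%u0078";
const strings = printableStrings(decodePercentU(encoded));
console.log(strings);
```

An embedded URL found this way is a useful network indicator, but real shellcode may build or decode its URL at run time, in which case emulation or disassembly is needed.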

The next function includes similar shellcode and heap spray code, but a different exploit targeting a different vulnerability.

The third function follows the same pattern; this time the exploit targets the getIcon vulnerability.

The last function explains why there are multiple vulnerabilities in place.

As we can see, different exploits are triggered depending on the version of Adobe Acrobat Reader.
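The dispatch logic can be modelled as a small lookup table. In the sketch below, the version bounds and branch labels are placeholders invented for illustration; the real thresholds and their order must be read from the sample's last function. Acrobat JavaScript exposes the reader version as app.viewerVersion, but whether this sample reads that exact property is an assumption.

```javascript
// Placeholder rules: bounds and labels are invented, not taken from the sample.
const placeholderRules = [
  { below: 8.0, label: "exploit-A" },
  { below: 9.0, label: "exploit-B" },
  { below: 9.1, label: "exploit-C" },
];

// Return the label of the first rule whose upper bound exceeds the version,
// or null when the reader version matches no exploit branch.
function selectBranch(viewerVersion, rules) {
  for (const rule of rules) {
    if (viewerVersion < rule.below) return rule.label;
  }
  return null;
}

console.log(selectBranch(7.1, placeholderRules));
console.log(selectBranch(9.2, placeholderRules));
```

Filling such a table with the real bounds makes it easy to tell which exploit a given installed reader version would have received, and which versions were not targeted at all.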

After successful exploitation, the shellcode opens a connection to download the rest of the malicious code. Let's save the first shellcode to a separate file to analyze it with peepdf.

There is no more obfuscated code, and all functionality of the malicious sample has been revealed. We are done here.

Summary of observations

The PDF contains exploits targeting three different known vulnerabilities, triggered based on the installed version of Adobe Acrobat Reader in this order:

END
