Disclaimer: This is not really a write-up, it’s just me taking notes. The content may be chaotic or wrong, or there may be better ways to do things. If you notice that, please point it out.
So far I have mostly looked for bugs in web apps, but the more I did it, the more I felt I was just scratching the surface. I want to dig deeper through the layers of abstraction and get down to the binary stuff.
https://www.youtube.com/watch?v=5tEdSoZ3mmE (part 0x00)
https://www.youtube.com/watch?v=yJewXMwj38s (part 0x01)
The first step would be to get a running V8 install on our machine. The following links will help:
It’s not a completely straightforward process, so don’t worry if it takes you some time. Both tutorials compile with is_debug=false, but you should set that to true because we want a debug build.
Note: As pointed out by Geluchat, a debug build also has disadvantages: it is slow, and bugs that would trigger on a release build may sometimes not trigger on a debug build. Using use_jumbo_build=true and is_component_build=false will speed up the build process.
At one point I was stuck on the last compilation command, “ninja”, which kept returning “Permission denied” errors (even when run as admin). Rebuilding V8 in C:\V8 instead of in My Documents resolved the issue (not sure why). Simply moving the build from one folder to another would be a bad idea because it would mess up your debug symbols and source paths.
For now, I have compiled the latest version, but it will be useful to learn to compile older versions for analyzing past bugs too at some point.
The V8 shell/debugger is called D8. It will be located here:
You should run it with the --allow-natives-syntax flag in order to get access to the native functions like DebugPrint (which is similar to the “describe” function LiveOverflow is using in his videos):
Internal types also appear to be named differently (e.g. an array with integers only is PACKED_SMI_ELEMENTS, while an array with integers and doubles is PACKED_DOUBLE_ELEMENTS, and the array becomes HOLEY instead of PACKED if there is an empty slot in it):
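Here is a small sketch for trying the three cases above yourself. The helper name makeDebugPrinter is my own invention, not part of V8. It compiles the %DebugPrint call lazily, so the script still loads without --allow-natives-syntax (it just falls back to console.log). The elements kinds in the comments are what I would expect d8 to report based on the description above, not captured output.

```javascript
// Hypothetical helper: returns a function that calls %DebugPrint when natives
// syntax is enabled (d8 --allow-natives-syntax) and console.log otherwise.
function makeDebugPrinter() {
  try {
    // The % syntax only parses when the flag is set, so it is compiled here
    // at runtime instead of being written directly in the script.
    return new Function("value", "return %DebugPrint(value);");
  } catch (e) {
    // SyntaxError: natives syntax is not enabled, use a plain fallback.
    return (value) => {
      console.log(value);
      return value;
    };
  }
}

const debugPrint = makeDebugPrinter();

const smis = [1, 2, 3];      // integers only: expect PACKED_SMI_ELEMENTS
const doubles = [1, 2.5, 3]; // integers and doubles: expect PACKED_DOUBLE_ELEMENTS
const holey = [1, , 3];      // index 1 is an empty slot: expect a HOLEY_* kind

debugPrint(smis);
debugPrint(doubles);
debugPrint(holey);
```

Save it to a file and run it with d8 --allow-natives-syntax, then look for the “elements kind” line in each dump.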
DebugPrint is not the only internal function, and there are many more to explore:
LiveOverflow used lldb as the debugger. I’m gonna try to use WinDbg (no idea if that makes sense at all).
The first steps are relatively straightforward. You can either run D8 and attach to that process from WinDbg, or you can run the D8 executable from WinDbg directly. Once done, you can hit F5 to continue running the program. After that, you can start experimenting.
Let’s try to create an array with a couple of strings and do a DebugPrint on it:
Under “elements”, we see the address where the elements of the array are stored - in this case, the elements are two pointers to strings (we see the values of the pointers too). We can try to look at the address of the array elements in WinDbg (memory window — we need to pause execution in order to be able to search for an address).
There we’ll see the pointer to the first string. Don’t forget that memory is little-endian, so the bytes of the pointer appear in reverse order. If we go to that address, we find the string:
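To make the little-endian remark concrete, here is a minimal sketch that turns 8 bytes, in the order the memory window shows them, back into a pointer value. The byte values are made up for illustration (they are not from my session) and readPointerLE is a hypothetical helper name.

```javascript
// Made-up example bytes, left to right as a memory window would show them.
const bytesInMemory = [0x41, 0x9a, 0x0c, 0x08, 0xd3, 0x02, 0x00, 0x00];

// Hypothetical helper: interpret 8 bytes as a little-endian 64-bit value.
function readPointerLE(bytes) {
  const view = new DataView(new Uint8Array(bytes).buffer);
  return view.getBigUint64(0, true); // true = little-endian
}

const pointer = readPointerLE(bytesInMemory);
const formatted = "0x" + pointer.toString(16).padStart(16, "0");

console.log(formatted); // 0x000002d3080c9a41
```

The least significant byte (0x41) comes first in memory, which is why the address reads “backwards” compared to what DebugPrint shows.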
If we repeat the same with an array of integers (well, in JS those are numbers, not integers), the array would not be made of pointers, but of the integers we defined:
If we go to the “elements” address, we would see them:
Interestingly, the integers in the examples LiveOverflow demonstrated in his 0x01 video had some high bytes set to FFFF. That’s because JavaScriptCore uses NaN-boxing, and V8 does not. The way arrays are stored appears to be different too. There are no mentions of butterflies or structure IDs. It seems those are JavaScriptCore concepts. But we’ll still have to find out how V8 internals work.
One more thing to try before we wrap up Part 1: Can we set a breakpoint on Math.max like LiveOverflow did and step through the source code of that function? First we need to find that function in order to set the breakpoint. We can search for symbols like this in WinDbg:
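As a rough illustration of where those FFFF bytes come from, here is a simplified sketch of tagging a 32-bit integer the NaN-boxing way. The tag constant is an assumption based on the FFFF high bytes seen in the video (the exact value may differ between JavaScriptCore versions), and boxInt32 and hex64 are hypothetical names of mine. This is not how V8 stores its integers.

```javascript
// Assumed integer tag: the high 16 bits are set to FFFF.
const INT_TAG = 0xffff000000000000n;

// Hypothetical helper: put the 32-bit integer in the low bits and the tag on top.
function boxInt32(value) {
  return INT_TAG | BigInt.asUintN(32, BigInt(value));
}

// Hypothetical helper: format a 64-bit value as 16 hex digits.
function hex64(value) {
  return "0x" + value.toString(16).padStart(16, "0");
}

console.log(hex64(boxInt32(1)));          // 0xffff000000000001
console.log(hex64(boxInt32(0x41414141))); // 0xffff000041414141
```

So an integer like 1 shows up in memory with FFFF on top, which matches the pattern from the video and not what we saw in the V8 array above.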
We can then set a breakpoint:
We then run Math.max in D8 and the breakpoint is hit:
But I don’t see any source code? If I open the call stack window, I can see that the source code is available for most of the functions in the call stack, but not for the v8!Builtins_* functions that are on top of the stack. But why?
After googling for a bit, I came across this:
It turns out that many builtin functions are not written in C++, but are either hardcoded in assembly or implemented using the CodeStubAssembler infrastructure. Some are still written in C++, though. By browsing the source code a bit, I found that Math.hypot should be one of those functions. So I tried setting a breakpoint with bp v8!Builtins_MathHypot. I ran everything again and hit the breakpoint, but still the same result. No source code. WHY???!!!
I asked on StackOverflow:
I did not expect to get an answer, but I did. As suggested, I set the breakpoint directly in the source code, and when I hit it, I was able to step through the source code! I also noticed that my mistake previously was that I was setting a breakpoint on v8!Builtin_MathHypot instead of v8!v8::internal::Builtin_MathHypot.
Ok, I’m happy with the progress so far. Now I’m gonna watch the 0x02 video of LiveOverflow and figure out what to explore next :)