The WHY behind the WAT: An explanation of JavaScript’s weird type system

So if you’ve been in the JavaScript world for a bit, you’ve probably come across Gary Bernhardt’s “Wat” talk.

For those of you who haven’t seen it, you are seriously missing out and I would highly suggest that you go and check the talk out.

A brief summary of the video:

Bernhardt (the person giving a presentation at CodeMash 2012) discusses some of the unexpected behavior in Ruby and JavaScript. Around two minutes into the presentation, he starts ripping into JavaScript specifically with his hilarious use of sarcasm to get one point across:

JavaScript is…weird (to say the least).

To illustrate this point, he brings up a few examples of illogical operations that produce unexpected results.

[] + [] = ''
[] + {} = '[object Object]'
{} + [] = 0
{} + {} = NaN
Array(16).join('wat' - 1) + ' Batman' = 'NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN Batman'

Now imagine that you are an undergraduate and you’re the instructor for an Intro to JavaScript class. You thought it’d be comical to show Bernhardt’s video in the first lecture to show how quirky and funny JavaScript is, and then go on to talk about basic JS syntax, types, functions, closures, etc. But before you do that, someone asks you:

“Why is array plus array equal to empty string?”

That’s when it hits you that you have no idea what is going on in that video or JavaScript’s type system in general…at least that’s where I found myself about a week ago.

So, in order to prevent developers from giving semi-coherent, hand-wavy explanations of why JS works in the way it does (and of course to educate you, the reader), let’s dive into the Why behind the WAT.

I can do types good, right?

So let’s begin at the basics. There are 5 different kinds of values you can write directly as literals in JS:

Numbers (e.g. 1, 2 , 1.28 , NaN , Infinity, etc.)…note that NaN (not a number) is a number. From the ECMA spec:

4.3.20 Number type: set of all possible Number values including the special “Not-a-Number” (NaN) values, positive infinity, and negative infinity

Strings (e.g. 'xyz' , "abc") Pretty straightforward

Boolean (just true and false)…there’s a whole article that can be written about truthy vs falsy values. But for the moment, we’re going to skip that.

Objects (e.g. {name:'abhi', dob: '1997'})

Array (e.g. [1,2,'hi'])

Of these literals, only booleans, numbers, and strings are primitives. There are also a couple of other primitive values (undefined and null).
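To make that list concrete, here is a small sketch that asks typeof about each kind of primitive. The kinds object is just a name I made up for the example.

```javascript
// typeof applied to one value of each primitive kind mentioned above.
const kinds = {
  number: typeof 1.28,
  nan: typeof NaN,             // NaN really is a number
  infinity: typeof Infinity,
  string: typeof 'xyz',
  boolean: typeof true,
  undefinedValue: typeof undefined,
  nullValue: typeof null,      // long-standing quirk: reports 'object'
};

console.log(kinds);
```

Note the odd one out: null is a primitive, yet typeof reports it as 'object'.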

Let’s do an exercise:
Given that the typeof operator outputs a string representing the type of the value passed to it, what is typeof([1,2,3])?

Well…it’s actually 'object' .

In JavaScript, objects and arrays are handled nearly identically because arrays are just instantiations of objects. The difference is the following:

  • While objects are just an unordered map from string keys to values, arrays are an ordered list of values with integer keys.

Keep that idea in mind: knowing that an array is really an object helps you gain intuition for WAT is happening.
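If you want to poke at this yourself, the following sketch shows an array behaving like an object. The arr and arrayInfo names are only for illustration.

```javascript
const arr = [1, 2, 'hi'];

const arrayInfo = {
  type: typeof arr,                        // 'object'
  isArray: Array.isArray(arr),             // the reliable way to detect an array
  isObjectInstance: arr instanceof Object, // arrays inherit from Object
  keys: Object.keys(arr),                  // the integer indices show up as string keys
};

console.log(arrayInfo);
```

Since typeof can't tell the two apart, Array.isArray is the check to reach for when the difference matters.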

Okay…so I know how to type good, but can i do the adding thingy?

Well…let’s see.

https://www.ecma-international.org/ecma-262/6.0/Ecma_RVB-003.jpg

The addition operator in JavaScript (as formally defined in 11.6.1 of the ECMA spec) is the following:

The production AdditiveExpression : AdditiveExpression + MultiplicativeExpression is evaluated as follows:
1. Let lref be the result of evaluating AdditiveExpression.
2. Let lval be GetValue(lref).
3. Let rref be the result of evaluating MultiplicativeExpression.
4. Let rval be GetValue(rref).
5. Let lprim be ToPrimitive(lval).
6. Let rprim be ToPrimitive(rval).
7. If Type(lprim) is String or Type(rprim) is String, then return the String that is the result of concatenating ToString(lprim) followed by ToString(rprim).
8. Return the result of applying the addition operation to ToNumber(lprim) and ToNumber(rprim). See the Note below 11.6.3.

Oh boy that’s a lot to look through.

https://tgimworklife.files.wordpress.com/2010/10/overwhelmed.jpg

Breaking it down, we call GetValue on the left- and right-hand sides of the addition operator. We then call ToPrimitive on each of the two results. If either primitive is a string, we convert both to strings and concatenate them. Otherwise, we convert both to numbers and add them (according to section 11.6.3).
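Before getting to objects, here is a quick sketch of that string-or-number decision using primitives only. The sums array is just a made-up container for the results.

```javascript
// With primitive operands, the only question is whether either side is a string.
const sums = [
  1 + 2,         // 3     both numbers: numeric addition
  '1' + 2,       // '12'  left side is a string: concatenate
  1 + '2',       // '12'  right side is a string: concatenate
  true + 1,      // 2     no strings, so true becomes 1
  null + 1,      // 1     no strings, so null becomes 0
  undefined + 1, // NaN   no strings, and undefined becomes NaN
];

console.log(sums);
```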

Alright…so what are these GetValue and ToPrimitive functions?

Per the ECMA spec, GetValue will just return the value associated with the variable passed into it (ref section 8.7.1). The more interesting part is what happens next.

ToPrimitive takes in an input argument and a PreferredType optional argument. The addition operation doesn’t specify a second arg to ToPrimitive, so the results are as follows:

Basically, for all of the primitive types (undefined, null, boolean, number, string), we don’t do anything and leave them as is. For everything else (i.e. an object), we call an internal method named [[DefaultValue]] on the object itself, as specified in section 8.12.8 of the ECMA spec.

The spec says that when DefaultValue is called with no hint (what we are doing in this case), it behaves as if the hint were Number (Date objects are the one exception and default to String). In that case, we call the valueOf method on the object. If that result is a primitive value, just return that. Otherwise, we call toString on the object! If that result is a primitive value, return it. Otherwise, throw a TypeError exception.
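Here is a rough sketch of the steps so far written as plain JavaScript. To be clear, toPrimitive, add, isObject, and counter are names I invented for illustration, not real engine APIs, and the sketch assumes ordinary objects only: it ignores Date objects and newer features such as Symbol.toPrimitive.

```javascript
// Anything that is not a primitive counts as an object here.
const isObject = (v) =>
  (typeof v === 'object' && v !== null) || typeof v === 'function';

// Approximates ToPrimitive with no hint: try valueOf first, then toString.
function toPrimitive(value) {
  if (!isObject(value)) return value; // primitives pass through untouched
  for (const name of ['valueOf', 'toString']) {
    const method = value[name];
    if (typeof method === 'function') {
      const result = method.call(value);
      if (!isObject(result)) return result; // first primitive result wins
    }
  }
  throw new TypeError('Cannot convert object to primitive value');
}

// Approximates the + operator once both operands have been evaluated.
function add(lval, rval) {
  const lprim = toPrimitive(lval);
  const rprim = toPrimitive(rval);
  if (typeof lprim === 'string' || typeof rprim === 'string') {
    return String(lprim) + String(rprim); // either side is a string: concatenate
  }
  return Number(lprim) + Number(rprim);   // otherwise: numeric addition
}

// valueOf returns a primitive, so toString is never consulted.
const counter = { valueOf() { return 42; }, toString() { return 'x'; } };
console.log(add(counter, 1)); // 43
console.log(add([], {}));     // '[object Object]'
```

For the inputs used in this article, add should agree with the real + operator, which makes it a handy thing to step through in a debugger.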

The next question is what happens when a value gets converted to a string (I promise we’re almost done here). Well, according to section 9.8 of the spec, the output of ToString depends on the type of the input.

Okay, for an object, it says to go back to ToPrimitive and pass in a hint of String…going back, we see that calling ToPrimitive on an object with a hint (i.e. PreferredType of String) leads us back to the DefaultValue method, but this time with a hint of String as well. In this case, toString is tried before valueOf: if the object has a toString method and it returns a primitive, that value is returned.
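The difference between the two hints is easiest to see with an object that defines both methods. This is only a sketch, and price and conversions are hypothetical names.

```javascript
// An object whose valueOf and toString disagree, to reveal which one gets called.
const price = {
  valueOf() { return 42; },
  toString() { return 'forty-two'; },
};

const conversions = {
  plusEmptyString: price + '',   // no hint: valueOf runs first, then 42 becomes '42'
  stringCall: String(price),     // hint String: toString runs first
  templateLiteral: `${price}`,   // template literals also convert with hint String
};

console.log(conversions);
```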

Whew…I think we’re done. But what does this all mean?

Well…let’s take a look at the first example in Bernhardt’s talk.

Array plus Array equals empty string..right?

Okay so, walking through our line of logic, recall that an array is of type object. What happens when I call ToPrimitive(GetValue([])) on the empty array?

By the line of reasoning I went through above, valueOf is tried first, but for an array it returns the array itself, which is not a primitive, so we end up calling toString on the array. Per section 15.4.4.2 of the spec, we call join on the array with no arguments. And per section 15.4.4.5 (specifying the join operation), we just concatenate all the elements of the array separated by commas. So [].toString() = '' .
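A quick sanity check of that chain, as a sketch (the emptyArray and arrayStrings names are mine):

```javascript
const emptyArray = [];

const arrayStrings = {
  valueOfIsSelf: emptyArray.valueOf() === emptyArray, // valueOf hands back the array, not a primitive
  emptyToString: emptyArray.toString(),               // ''
  emptyJoin: emptyArray.join(),                       // '' since toString defers to join
  filledToString: [1, 2, 3].toString(),               // '1,2,3'
};

console.log(arrayStrings);
```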

Recall that the addition operator actually does call GetValue followed by ToPrimitive on each of the operands. So we’ll end up with an expression as follows

ToPrimitive(GetValue([])) + ToPrimitive(GetValue([])) =
ToPrimitive([]) + ToPrimitive([]) =
[].toString() + [].toString() =
'' + '' = '' (per definition of addition with string operands)
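The same derivation can be checked directly. A small sketch, with emptySum and filledSum as made-up names:

```javascript
// Both arrays become strings first, then the strings are concatenated.
const emptySum = [] + [];
const filledSum = [1, 2] + [3, 4]; // '1,2' + '3,4'

console.log(JSON.stringify(emptySum));  // ""
console.log(JSON.stringify(filledSum)); // "1,23,4"
```

The second result is a good reminder that + never merges arrays; it only ever glues their string forms together.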

And voila, we are done. Moving on.

Array plus Object equals uhh…why is it [object Object]?

http://i0.kym-cdn.com/entries/icons/mobile/000/021/464/14608107_1180665285312703_1558693314_n.jpg

Try looking up Object.prototype.toString() in the ECMA spec. You’ll see that section 15.2.4.2 defines that if toString is called on an object, you output '[object ' + class + ']'. The class value comes from the internal [[Class]] property of the object (for built-in objects, roughly what variable.constructor.name gives you).

For a regular object, this is just {}.constructor.name = 'Object'. So the final toString() output on an object {} is [object Object] .
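You can see the class part change by borrowing Object.prototype.toString and calling it on different built-ins. This is just a sketch, and tag is a helper name I made up.

```javascript
// Calls the original Object.prototype.toString, bypassing any overrides
// (arrays, for example, override toString with the join-based version).
const tag = (value) => Object.prototype.toString.call(value);

const tags = {
  plainObject: tag({}),      // '[object Object]'
  array: tag([]),            // '[object Array]'
  date: tag(new Date()),     // '[object Date]'
  regexp: tag(/wat/),        // '[object RegExp]'
  direct: ({}).toString(),   // a plain object uses it directly
};

console.log(tags);
```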

Going through the whole chain again…

ToPrimitive(GetValue([])) + ToPrimitive(GetValue({})) =
ToPrimitive([]) + ToPrimitive({}) =
[].toString() + {}.toString() =
'' + '[object Object]' = '[object Object]'
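Checking that result in code, as a sketch (arrayPlusObject and filledPlusObject are names picked for the example):

```javascript
// The array contributes its joined elements, the object contributes '[object Object]'.
const arrayPlusObject = [] + {};
const filledPlusObject = [1, 2] + {};

console.log(arrayPlusObject);  // [object Object]
console.log(filledPlusObject); // 1,2[object Object]
```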

Okay, this next one is very weird:

Object plus Object equals [object Object][object Object] right?

Kinda. So in some cases if you type in {} + {} into a web browser console, it will output [object Object][object Object] (it does this in the Chrome REPL and the node REPL). But for some other browsers, you’ll get NaN (e.g. Firefox). What’s going on?

It depends on how the console parses what you typed, not on the addition operator itself. In the first case, the input is treated as an expression, so both the first and second {} are objects; the typical toString methods are called on them and the resulting strings are concatenated.

But the first {} can also be interpreted as a code block which can essentially be thought of as…nothing. So our {} + {} actually boils down to +{}.
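One way to reproduce both readings without switching browsers is eval, which parses its input as a program. In this sketch, wrapping the input in parentheses forces it to be read as an expression. The asStatement and asExpression names are mine.

```javascript
// Parsed as statements: an empty block, followed by the expression +{}.
const asStatement = eval('{} + {}');

// Parsed as one expression: object literal + object literal.
const asExpression = eval('({} + {})');

console.log(asStatement);  // NaN
console.log(asExpression); // [object Object][object Object]
```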

This +{} is referred to as unary plus. The main difference is that it only works with one operand (in this case {}) and does ToNumber(ToPrimitive(GetValue({}))) instead of a ToPrimitive(GetValue({})) call. The ToNumber operation is specified in section 9.3, but I won’t go through explaining everything…because there are a lot of cases. General rules of thumb:

  • If the value is a string that looks like a number, it gets cast to that number.
  • If it is an empty string, it gets cast to 0.
  • true and false get cast to 1 and 0 respectively, and null gets cast to 0.
  • Anything else (undefined, or a string that doesn’t look like a number) is NaN.
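The Number function applies the same conversion, so these rules of thumb can be checked with a short sketch (numbers is just a made-up name for the results):

```javascript
const numbers = {
  numericString: Number('42'),     // 42
  emptyString: Number(''),         // 0
  trueValue: Number(true),         // 1
  falseValue: Number(false),       // 0
  nullValue: Number(null),         // 0
  undefinedValue: Number(undefined), // NaN
  wordString: Number('wat'),       // NaN, even though 'wat' is truthy
};

console.log(numbers);
```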

So let’s evaluate the original expression.

+ToNumber(ToPrimitive(GetValue({}))) =
+ToNumber(ToPrimitive({})) =
+ToNumber({}.toString()) =
ToNumber('[object Object]') = NaN

Ah so Object plus Array equals…0?

Yes! You’re getting it (at least I am hoping you get it…Medium articles are not a great way to get user feedback). Let’s just walk through this quickly.

The first {} is considered a code block (this is surprisingly consistent across browsers). So, we’re now going to apply unary plus to an empty array.

+ToNumber(ToPrimitive(GetValue([]))) =
+ToNumber(ToPrimitive([])) =
+ToNumber([].toString()) =
ToNumber('') = 0
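Here is a sketch that runs unary plus on a few objects, plus the statement form of {} + [] via eval. The unary name is only for illustration.

```javascript
const unary = {
  object: +{},                // '[object Object]' is not numeric, so NaN
  emptyArray: +[],            // '' converts to 0
  oneItem: +[5],              // '5' looks like a number, so 5
  twoItems: +[1, 2],          // '1,2' does not, so NaN
  statement: eval('{} + []'), // empty block, then +[], so 0
};

console.log(unary);
```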

NaNNaNNaN… WATMAN!

http://www.dccomics.com/sites/default/files/GalleryChar_1920x1080_BM_Cv38_54b5d0d1ada864.04916624.jpg

Whee! We’re almost done. This is what we’re looking at next.

Array(16).join('wat' - 1) + ' Batman' = 'NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN Batman'

Walking through this:

  • Array(16) creates an array of length 16 with an empty value in each slot.
  • The join call concatenates all the values of the array, inserting the value passed as join's argument between each pair of neighbors, e.g. [1,2,3].join('hi') = '1hi2hi3'. Empty slots are joined as empty strings, so 16 slots give 15 copies of the separator.
  • Lastly, we just concatenate a string ' Batman'

But what does the subtraction operation do? It looks like the 'wat' - 1 operation yields NaN .

Per section 11.6.2, we just call GetValue and then ToNumber on each of the operands. Using our previous knowledge, this will yield NaN - 1 . Per section 11.6.3, if one of the operands of an additive operator is NaN , the result is NaN .
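Putting the pieces together, here is the whole expression split into its steps. This is a sketch, and separator, slots, and batman are names I chose.

```javascript
const separator = 'wat' - 1; // ToNumber('wat') is NaN, and NaN - 1 is NaN
const slots = Array(16);     // length 16, every slot empty

// join converts the separator to the string 'NaN' and places it between
// the 16 empty slots, which themselves contribute empty strings.
const batman = slots.join(separator) + ' Batman';

console.log(batman);
console.log(batman === 'NaN'.repeat(15) + ' Batman'); // true
```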

Aha! We are done. QED (well…this isn’t a proof…but regardless, you got through the article).

Takeaways

Okay, I’m just going to say this: JavaScript’s type system isn’t all that bad.

Yeah, it may be annoying to deal with random errors, here and there, but this system is an incredibly intuitive way to think about programming (cue screeches of thousands of Java developers).

For example, I teach CIS197 at UPenn, and a number of my students are beginner programmers. While the curriculum in the intro to CS classes stresses Java and OCaml, JavaScript serves as a nice break from the statically typed languages, and I am really glad to see my students amazed when they don’t have to think about making sure a specific variable is declared as an int or a specific type of object. A variable should just be something that can change. By placing restrictions on it, you’re partially defeating the purpose.

While most people think that JS’s type system is incredibly illogical, there is a reason behind the madness for nearly everything you can think of. Yes, it may be frustrating, but that is the price JS developers pay for having a language that allows you to freely interconvert between types. And yeah…I’d pay that price every day.