JavaScript Weekly: Data Types and Mutability

An Exploration of JavaScript’s Data Types and Their Quirks

Photo by Samuel Zeller on Unsplash

One of the most fundamental concepts in programming is the idea of data types. It’s a concept that is shared near-universally across major programming languages. In short, data types are instructions to a program’s compiler (or interpreter) regarding how it should handle a given value. Considering just how fundamental this idea is to most programming languages, you might think that the behavior of a particular data type in one language would be the same as in another. After all, why should a string in Ruby behave any differently from a string in JavaScript?

Well, for those of you who prefer to keep things simple (like me), I have some bad news: data type behaviors are nuanced and often particular to their own language. That’s why it’s important, when you’re using a language, to have a thorough understanding of its quirks, lest you find yourself expecting one set of behaviors and ending up with another. With that in mind, let’s dive into data types in JavaScript and see if we can figure out their quirks.


A Mental Model for Data Types

First of all, if we’re going to talk about data types we had better come up with a definition. I said earlier that data types are instructions to a program’s compiler/interpreter that tell it how to handle a given value, but what does that mean in practice? Let’s see if we can unwrap the concept a bit.

Imagine I gave you the values 2 and 3 and told you to add them together and tell me the result. You would almost certainly just give me a 5 back, never stopping to think what kind of values I gave you to add. But here’s the thing: in your head, subconsciously, you did assign a type to those values. Given the context of my question you probably assumed that I meant for 2 and 3 to be numbers, so of course 5 was the result. But what if I now gave you the values “two” and “three” and asked you to add them together? From the context, I have clearly given you two strings, and how do you add two strings together? Should you assume that I meant for them to be numbers and give me a 5 back? Or, more strictly, should you assume that I wanted a concatenated string in return and give me “twothree”?

Compilers / interpreters are responsible for evaluating the meaning of the code they have been given and identifying an appropriate action. Without data types, it would be much more difficult for them to decide between potentially disparate behaviors, just as in the above example when you had to decide what “add” means for two numbers as opposed to two strings. Data types allow us to understand the values we are using in a more idiomatic way, which in turn leads to more predictable behavior when our code actually runs.
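To make this concrete, here is a minimal sketch of how JavaScript itself answers the two versions of the question. The values come from the example above; the variable names are only illustrative.

```javascript
// For two numbers, + means arithmetic addition...
const numericSum = 2 + 3;
console.log(numericSum); // 5

// ...and for two strings, + means concatenation.
const stringSum = "two" + "three";
console.log(stringSum); // "twothree"
```

The operator is the same in both cases; only the types of the operands tell the interpreter which behavior to pick.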


Types of Types

JavaScript has two kinds of data types: primitive data types, which are immutable (more on that later), and compound data types, which are mutable. The five main primitive data types are string, number, boolean, undefined, and null. (ECMAScript 2015 added symbol to the list of primitives, and ECMAScript 2020 added bigint, but we won’t go into those here.) There are a variety of compound data types, but the most common are object, array, and function. Strictly speaking, array and function are both subtypes of object; however, they have some unique behaviors, so we will address them separately.

So, now that we know what the different data types are, how can we identify them in our programs? To do that, you use the typeof operator: you give it a value and it returns a string naming that value’s type. Let’s see it in action.
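A minimal sketch of such a snippet, using arbitrary sample values, might look like this:

```javascript
// Primitive values
console.log(typeof "JavaScript"); // "string"
console.log(typeof 7);            // "number"
console.log(typeof 7.5);          // "number"
console.log(typeof true);         // "boolean"
console.log(typeof undefined);    // "undefined"
console.log(typeof null);         // "object"

// Compound values
console.log(typeof {});             // "object"
console.log(typeof []);             // "object"
console.log(typeof function () {}); // "function"
```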

As we can see in the above snippet, in most cases the typeof operator gives us the answer we are expecting. String values give us “string”, numbers give us “number”, and booleans give us “boolean”. There are however some quirks:

  • typeof 7 and typeof 7.5 both give us “number” rather than differentiating between an “integer” type and a “float” / “decimal” type, as in some other languages;
  • typeof null gives us “object” rather than “null”, a legacy bug that will almost certainly never be fixed;
  • typeof [] gives us “object” rather than “array”, which isn’t all that surprising since arrays are subtypes of objects (more on how to distinguish arrays later); and,
  • typeof function(){} gives us “function”, even though it too is a subtype of object.

Weak and Dynamic Typing

Of course, once we have a particular value its type isn’t the only thing that concerns us. Most of the time, we also need a variable in which to store the value so we can use it later. This is where two additional aspects of JavaScript typing come in. First, JavaScript is often described as weakly typed. Strictly speaking, that term refers to how readily a language converts values from one type to another (see Coercion below), but it is used here for a related convenience: you don’t have to tell the interpreter what kind of value you plan to store in a particular variable. In C, for example, if you want to store an integer value in a variable, then you must declare that variable as an integer, as in int i = 7. In JavaScript there is no such requirement. You just declare your variable with var, let, or const, and move on!
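As a rough sketch (the variable names and values here are made up for illustration), the same declaration keywords work no matter what kind of value is being stored:

```javascript
// No type annotations: each keyword accepts any kind of value.
var count = 7;               // a number
let label = "seven";         // a string
const flags = [true, false]; // an array

console.log(typeof count); // "number"
console.log(typeof label); // "string"
console.log(typeof flags); // "object"
```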

Not only can you store any kind of value in any variable, you don’t even have to be consistent about it. This is because JavaScript is dynamically typed, meaning that types belong to values rather than to variables, so the type of the value held in a particular variable can change. If you have a variable foo that holds an integer at one point in the program, it’s perfectly valid for foo to hold a string later in the program. Here’s an example:
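A minimal sketch of such an example follows, laid out so that its line numbers line up with the walkthrough after it:

```javascript
let someValue = "Hello, world!";
console.log(typeof someValue); // "string"

// Reassign the same variable to a number
someValue = 2018;
console.log(typeof someValue); // "number"

// Reassign it once more, this time to an empty object
someValue = {};
console.log(typeof someValue); // "object"
```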

In the above snippet we define a variable someValue on line 1 and provide it with the string “Hello, world!”. We can confirm that someValue is holding a string using our trusty friend the typeof operator. Subsequently, on line 5 we reassign someValue to hold the number value 2018, and indeed on line 6 when we check someValue’s type it shows up as “number”. The same thing happens on lines 9/10 when we reassign someValue yet again to an empty object and test its type once more. Note that when we check the type of someValue we are not checking the type of the variable (it doesn’t have one after all), but the type of the current value being held in the variable.


Checking for Special Type Cases

Earlier we discussed a few quirks in the JavaScript type system, such as when arrays are typed as “object” (which is accurate but not super helpful), integers and decimals are both typed as “number” rather than having their own types (which again is accurate, but lacking in specificity), and null types as “object” due to a legacy bug that can’t be fixed without breaking half the Internet. So what do you do if you want to know whether, for example, a particular value is an array or an object? Well, thankfully there are a few utility methods and other tricks you can use in such cases.
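The sketch below illustrates these checks. The variable names are illustrative, and the layout is chosen so that the line numbers match the walkthrough that follows.

```javascript
// null: typeof is misleading, so compare against null directly
const nothing = null;
console.log(typeof nothing);   // "object"
console.log(nothing === null); // true

// Arrays: typeof reports "object", so use Array.isArray instead
const list = [1, 2, 3];
console.log(typeof list);         // "object"
console.log(Array.isArray(list)); // true
console.log(Array.isArray({}));   // false

// Integers: typeof reports "number" either way, so use Number.isInteger
const whole = 7;
const fractional = 7.5;
console.log(Number.isInteger(whole));      // true
console.log(Number.isInteger(fractional)); // false

// NaN: a "number" that is not equal to itself
const notANumber = NaN;
console.log(typeof notANumber);         // "number"
console.log(notANumber === notANumber); // false
console.log(Number.isNaN(notANumber));  // true
console.log(notANumber !== notANumber); // true
```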

The above snippet walks through a few examples of special type cases. On lines 2–4 we can see that a null value types as “object” but thankfully returns true when strictly compared to null. On lines 7–10 we use the Array.isArray built-in method to test whether a value is specifically an array rather than just generically an object. Similarly, on lines 13–16 we use the Number.isInteger built-in method to test if a value is an integer rather than generically a number. And finally, on lines 19–23 we play around with the infuriatingly confusing value of NaN (“not-a-number”), which returns “number” when typed and false when compared to itself. In order to test if a value is NaN you need to either use the Number.isNaN built-in method or check if the value returns false when compared to itself (NaN is the only value in JavaScript that has this odd behavior).


Mutability

As briefly mentioned earlier, one of the key differentiating attributes between primitive data types and compound data types is that the former are immutable. This means that you cannot change a primitive value. It’s important that you internalize this principle because it makes a difference in how your code functions. Moreover, if you’re coming to JavaScript from another language, the rules about mutability may be different.

So what does it mean for a value to be immutable? Imagine that you have a variable called myInt and it holds the number value 5. No matter what methods you call on myInt, the value 5 itself will never change because numbers are one of the primitive types. 5 is always 5. This does not, however, mean that myInt the variable (as distinguished from the value of 5, which it happens to currently hold) can never change. You might run an expression that says myInt += 10, and indeed, myInt will now be 15. This is not mutation, though; it is reassignment. 5 is still 5, but myInt is no longer pointing to it. Consider the following example:
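Here is a minimal sketch of such an example, laid out so that its line numbers match the walkthrough that follows. The appended text ", world" is an assumption chosen for illustration, and the try/catch is there only because assigning to a string index throws in strict mode.

```javascript
let someGreeting = "hello";
let otherGreeting = someGreeting;
console.log(someGreeting);  // "hello"
console.log(otherGreeting); // "hello"

// concat returns a new string, which is discarded here
someGreeting.concat(", world");
console.log(someGreeting);  // "hello"
console.log(otherGreeting); // "hello"

// Index access works, but assigning to an index never changes the string
// (ignored in sloppy mode, TypeError in strict mode, hence the try/catch)
console.log(someGreeting[0]); // "h"
try { someGreeting[0] = "j"; } catch (error) { /* strict mode only */ }
console.log(someGreeting);    // "hello"

someGreeting = someGreeting.concat(", world");
console.log(someGreeting);  // "hello, world"
console.log(otherGreeting); // "hello"
```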

In this snippet, we have a variable, someGreeting, that contains the string value “hello” (a primitive type). We also have a second variable, otherGreeting, pointing to the same string value. If we mutated the string value being pointed to by these two variables, then we would expect both of the variables to reflect that change. On line 7, we try to do this by calling the ostensibly transformative concat method on someGreeting; however, when we then log the values of both someGreeting and otherGreeting, neither has changed. This is because strings are primitive and cannot be mutated. The concat method returned a new string but we didn’t do anything with it. Similarly, on lines 13–15 we can see that individual characters in a string can be accessed by index (as with an array); however, when we attempt to reassign one of those characters it has no effect on the overall string, because again, strings are immutable. (In strict mode the attempted assignment throws a TypeError rather than failing silently.) Finally, on line 17 we call concat once more, this time using its return value to reassign the someGreeting variable, and we do indeed see our expected change. But because this was reassignment, rather than mutation, otherGreeting is still pointing to the original value.

OK, so primitive types cannot be mutated, but what about compound types? Well, they most certainly can be mutated. Note however that compound types, as suggested by their name, are really data structures containing individual elements. Those individual elements could be either compound types themselves or primitive types. As you dig into the data structure, once you reach the primitives at the lowest level you reach data that cannot be mutated. Imagine an array of strings — the array is compound and mutable but the individual strings are primitive and immutable. Let’s look at an example.
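The following sketch shows one way such an example could look, laid out so that its line numbers match the walkthrough that follows. The planet names, the property names, and their values are placeholders chosen for illustration.

```javascript
const favoritePlanets = ["Saturn", "Mars", "Neptune"]; // illustrative values

// sort reorders the array in place, mutating it
favoritePlanets.sort();
console.log(favoritePlanets); // ["Mars", "Neptune", "Saturn"]
// push appends to the same array, mutating it again
favoritePlanets.push("Venus");
console.log(favoritePlanets); // ["Mars", "Neptune", "Saturn", "Venus"]
// concat returns a new string; the string stored in the array is untouched
favoritePlanets[0].concat("!");
console.log(favoritePlanets[0]); // "Mars"
// Property names and values below are illustrative
const lifeDiscovered = {
  mercury: false,
  venus: false,
  earth: true,
  mars: false,
};
console.log(lifeDiscovered.mars); // false

// Reassigning a property mutates the object that holds it
lifeDiscovered.mars = true;
console.log(lifeDiscovered.mars); // true
```

Note that const does not prevent any of this: it only stops the variables themselves from being reassigned, not the array or object they refer to from being mutated.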

Here we have an array called favoritePlanets, which contains several strings. On line 4, we call the Array.prototype.sort method, which sorts an array in place (meaning that it mutates the original array), and as expected, our array changes. On line 7, we push a new string to the array, and once more, we see that the array is mutated. Next, on line 10 we attempt to mutate the string stored as the first element of the favoritePlanets array, but as we learned earlier this doesn’t actually work and so we get no change. This demonstrates that compound types are mutable but the primitive values contained therein are not. However, the slot that holds a primitive value inside a compound type can be reassigned. We can see this in the second part of the snippet when we define an object called lifeDiscovered on line 13 and then reassign one of its properties on line 22. The lifeDiscovered object has been mutated by virtue of one of its parts being reassigned.


Coercion

Before we finish up our discussion of data types, there is one last thing we should cover: type coercion. In the beginning of our discussion we identified data types as being sets of values that have some set of associated behavior rules. Strings act like strings and numbers like numbers. But what happens when you need two different data types to interact with one another? Say, when you try to add a string to a number? This is where type coercion comes into play. Type coercion is a way of changing a value from one data type into another data type so that it can adopt the other’s behavior.

Coercion comes in two forms: implicit and explicit. In the former, the interpreter looks at an expression that uses two different data types and uses a set of internal rules to decide whether one or both of the values should be coerced so that their types match. In the latter, the source code explicitly instructs the interpreter on how it should handle coercion rather than letting it rely on its internal rules. As is so often the case, this is easiest to observe in code:
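A minimal sketch of such a snippet, laid out so that its line numbers match the walkthrough that follows:

```javascript
// Implicit coercion: the interpreter decides how to convert
console.log("20" + 18);   // "2018"
console.log("20" * 18);   // 360
console.log(20 + true);   // 21
console.log("20" == 20);  // true
console.log("20" === 20); // false

// Explicit coercion: the code states the conversion it wants
console.log(Number("20") + 18);         // 38
console.log(String(20) + String(true)); // "20true"
```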

On lines 2–6 we see several examples of implicit coercion, including:

  • Line 2: The number 18 is implicitly coerced into a string so that it can be concatenated onto the string “20”.
  • Line 3: The string “20” is implicitly coerced into a number so that it can be multiplied by the number 18.
  • Line 4: The boolean true is implicitly coerced into a number (1) so that it can be added to the number 20.
  • Line 5: The string “20” is implicitly coerced into a number so that it can be tested by loose equality against the number 20.
  • Line 6: No coercion takes place when the strict equality operator is used.

On lines 9–10 we see two examples of explicit coercion:

  • Line 9: The string “20” is explicitly coerced into a number and added to the number 18 (giving us a different result than the implicit coercion version of this expression on line 2).
  • Line 10: The number 20 and the boolean true are explicitly coerced into strings and concatenated (giving us a different result than the implicit coercion version of this expression on line 4).

TL;DR

Data types are an important concept in most programming languages, but their implementation differs from language to language. A data type is a set of values that share like behaviors, such as strings, numbers, booleans, and more. In JavaScript, data types can be checked using the typeof operator and they come in two flavors: primitive types, which are immutable, and compound types, which are mutable. In cases where typeof returns quirky results, built-in methods can be of use (for example, in determining whether a value is a generic object or an array). Because JavaScript is weakly and dynamically typed, variables can contain values of any data type and that data type can change as the variables are given new values. Finally, in cases where two values of different data types need to interact with one another, type coercion can be used.


That’s it for this week’s JavaScript Weekly — I hope that you enjoyed it. If you would like alerts when a new article is published you can follow me on Twitter or subscribe on my personal blog where these articles are cross-published. Happy coding!