5 Attempts At Type Safety In JavaScript
--
As we develop new features for Searchlight, we’re always thinking about how best to balance the work our application needs to do between the client and server. These days we find that we’re able to put increasing demands on the JavaScript running in a user’s browser as we implement functionality that previously we would have coded in Java on the server. Impressive forward strides in browser performance allow us to manipulate large amounts of data client-side in addition to providing rich, responsive interfaces for our users. Having a robust front end is an important factor in keeping the application scalable, as much of our reporting data can now be loaded once and used efficiently in a number of different ways on a single page.
The challenges inherent to implementing reporting logic in JavaScript code are therefore worth pursuing. One of these challenges is the safe reimplementation of Java code — written to assume that type safety was reliably enforced at compile time — in JavaScript, a language not designed for type safety. Debugging and testing a new system like this, especially one that passes around a lot of data in the asynchronous, callback-based style of modern JavaScript frameworks, benefits from a dose of preventative type-check medicine. Here are some of the major approaches we’ve used and which any developer should consider in order to write robust JavaScript code and reduce bugs.
=== : “All variables are equal but some are more equal than others.”
This is often the first approach when dealing with type ambiguity in JavaScript, and has the primary benefit of helping differentiate number primitives from string primitives:
    var a = 3 + 2;
    a == 5;    // true
    a == '5';  // true
    a === 5;   // true
    a === '5'; // false
Pros: ‘===’ is an easy, readable addition to code, and effectively eliminates a major source of headaches in number-string confusion. Conductor has several independent application feature development teams, and we rely on this method as a first line of defense to keep us disciplined when integrating shared REST endpoints into base objects in our front-end framework (we use Backbone.js).
Cons: If you’ve been doing web development for long enough to have worked with prototypes, you’re probably already chuckling to yourself about the naiveté of the ‘===’ operator. It performs only a shallow check: if the operands have different types, it evaluates to false; if both are objects, it evaluates to true only when they are the identical reference; if both are primitives of the same type, they are compared by value. It is insufficient for complex objects with a prototype chain, and has limited value when coping with boxed forms of primitives, e.g.:
    var a = 5;
    var b = new Number(5);
    var c = 'hello';
    var d = new String('hello');
    a === b; // false
    c === d; // false
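One way to soften the boxed-primitive problem is to unwrap wrapper objects before comparing. The following is a minimal sketch, not something from our codebase: `unbox` and `strictEqualUnboxed` are hypothetical names, and the `instanceof` checks inside assume the wrappers were created in the same global environment as the helper.

```javascript
// Hypothetical helper: unwrap Number, String, and Boolean wrapper objects.
// Anything else (primitives, plain objects, null, undefined) is returned unchanged.
function unbox(value) {
  if (value instanceof Number || value instanceof String || value instanceof Boolean) {
    return value.valueOf();
  }
  return value;
}

// Hypothetical helper: strict equality after unwrapping both operands.
function strictEqualUnboxed(left, right) {
  return unbox(left) === unbox(right);
}

strictEqualUnboxed(5, new Number(5));             // true
strictEqualUnboxed('hello', new String('hello')); // true
strictEqualUnboxed(5, '5');                       // false: still no coercion between types
```

The comparison stays strict, so the number-versus-string protection that makes ‘===’ attractive is preserved.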
typeof: “6 flavors ought to be enough for anybody”
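JavaScript, being an implementation of ECMAScript, inherits that specification’s small set of types. ECMAScript 5 defines six (Undefined, Null, Boolean, Number, String, and Object), and typeof likewise reports one of six strings for native values: "undefined", "boolean", "number", "string", "object", or "function". (Later editions add "symbol" and "bigint".) The typeof operator will (usually) tell you which of these you’re dealing with:
    // typeof binds more tightly than +, so the arithmetic needs parentheses.
    typeof (3 + 2);           // "number"
    typeof (3 + 2 + '');      // "string"
    typeof new String(3 + 2); // "object"
    typeof function(){};      // "function"
    typeof false;             // "boolean"
    typeof nonExistentObject; // "undefined"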
One major flaw is:
    typeof null; // "object"
(This behavior is written into the specification. JavaScript’s null was apparently meant to serve the same purpose as the parallel null value in Java.)
Pros: It is very effective and reliable for us at Conductor when we’re trying to detect whether function references are defined before calling them, and as a guard against the use of undefined values. It can also be used to distinguish number, string, and boolean from their object equivalents when it is important to do so.
Cons: As mentioned, native null in JavaScript cannot be uniquely detected by this method. The third example demonstrates that, like the ‘===’ operator, typeof won’t help you classify prototypes or boxed primitives; all objects are essentially equal.
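The guards described above can be sketched in a few lines. The helper names (`isCallable`, `isNonNullObject`, `invokeIfCallable`) are hypothetical and chosen only for illustration; the explicit comparison against null is what compensates for typeof reporting "object" for null.

```javascript
// Hypothetical guard: true only for values that can be called.
function isCallable(value) {
  return typeof value === 'function';
}

// Hypothetical guard: typeof alone would accept null, so exclude it explicitly.
function isNonNullObject(value) {
  return typeof value === 'object' && value !== null;
}

// Hypothetical helper: call the callback only if it really is a function.
function invokeIfCallable(callback, argument) {
  return isCallable(callback) ? callback(argument) : undefined;
}

isNonNullObject(null);            // false
isNonNullObject({});              // true
isNonNullObject(new String('5')); // true: boxed primitives are objects
invokeIfCallable(function (n) { return n * 2; }, 21); // 42
invokeIfCallable(undefined, 21);  // undefined, and no TypeError is thrown
```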
instanceof: “Fee fi fo fum, I smell the blood of a base prototype”
The instanceof operator has been part of the language since ECMAScript 3 was standardized at the end of 1999 (JavaScript 1.5, which implemented that edition, followed in November 2000), but it has received more attention in recent years as people begin to write more object-oriented JavaScript. The most valuable feature of instanceof is its ability to walk the prototype chain to determine type equality. (There are numerous strategies for implementing inheritance in JavaScript; we prefer the following approach for its simplicity and clarity. See here for more explanation of this particular method.)
    function inherits(Base, Child) {
        var Temp = function() { };
        Temp.prototype = new Base();
        Temp.prototype.constructor = Child;
        Child.prototype = new Temp();
    }
    function A() { }
    function B() { }
    function C() { }
    function D() { }
    inherits(A, B);
    inherits(B, C);
    inherits(A, D);
    var a = new A();
    var b = new B();
    var c = new C();
    var d = new D();
    [ a instanceof A, a instanceof B, a instanceof C, a instanceof D ]; // [ true, false, false, false ]
    [ b instanceof A, b instanceof B, b instanceof C, b instanceof D ]; // [ true, true, false, false ]
    [ c instanceof A, c instanceof B, c instanceof C, c instanceof D ]; // [ true, true, true, false ]
    [ d instanceof A, d instanceof B, d instanceof C, d instanceof D ]; // [ true, false, false, true ]
Pros: instanceof faithfully obeys prototype inheritance. (We would use this at Conductor if we didn’t rely on the inheritance model of Backbone.js, which we extend with the backbone-super library.)
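For comparison, here is a sketch of one of the other inheritance strategies, using ES5’s Object.create. This is not the approach the examples in this post rely on; `inheritsWithCreate`, `Animal`, `Dog`, and `Rock` are hypothetical names. It links the prototypes without invoking the Base constructor, and instanceof walks the resulting chain in the same way.

```javascript
// Hypothetical alternative helper: link Child.prototype to Base.prototype
// directly, without calling the Base constructor.
function inheritsWithCreate(Base, Child) {
  Child.prototype = Object.create(Base.prototype);
  Child.prototype.constructor = Child;
}

function Animal() { }
function Dog() { }
function Rock() { }

inheritsWithCreate(Animal, Dog);

var dog = new Dog();
dog instanceof Dog;      // true
dog instanceof Animal;   // true: found by walking the prototype chain
dog instanceof Rock;     // false
dog.constructor === Dog; // true
```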
Cons: instanceof is limited to objects. Primitive values are not objects and have no prototype chain to walk, and instanceof does not box its left operand in a Java-like manner before checking. Hence:
    var a = 5;
    var b = new Number(5);
    a instanceof Number; // false
    b instanceof Number; // true
constructor: The faithful (?) function
In JavaScript, any function can be used to instantiate an object via the new operator. Using new will create a new object, set that object’s internal prototype (its [[Prototype]] link, not a property named prototype) to the value of the function’s prototype property, and execute the function with the new object bound to ‘this’. Here’s the equivalence matrix from the last example, using the constructor property to check equality:
    [ a.constructor === A, a.constructor === B, a.constructor === C, a.constructor === D ]; // [ true, false, false, false ]
    [ b.constructor === A, b.constructor === B, b.constructor === C, b.constructor === D ]; // [ false, true, false, false ]
    [ c.constructor === A, c.constructor === B, c.constructor === C, c.constructor === D ]; // [ false, false, true, false ]
    [ d.constructor === A, d.constructor === B, d.constructor === C, d.constructor === D ]; // [ false, false, false, true ]
Pros: Though we don’t make extensive use of this in our code, it does exist in a few critical places when seeking to access base methods or fields. It also appears in some of the JavaScript libraries we use. The check it performs is precise when the exact type of an object must be known.
Cons: Unlike instanceof, the constructor property will not trigger a walk of the prototype chain; it is a single reference check. This can produce unexpected behavior when module sandboxing causes objects created by different imports of the same library code to be treated as unrelated. To work around this, developers should consider comparing the name property of the constructors, which exposes the identifier each function was declared with. In AMD, for instance:
    define([ "apple", "shop/bakery" ], function(Apple, Bakery) {
        var apple = new Apple();
        var b = new Bakery();
        // Assume this asynchronous request returns a PieType object fully constructed elsewhere.
        b.getMostPopularPieType({ tags: [ "Thanksgiving" ]}, function(pieType) {
            var another_apple = pieType.getMainIngredient();
            apple.constructor === another_apple.constructor;           // false
            apple.constructor.name === another_apple.constructor.name; // true
        });
    });
The problem illustrated above can also arise when comparing objects between frames. Each global object carries its own copies of the built-in constructors, which means that iframes, Workers, and other independent script execution environments will fail to consider constructor references equal between shared objects (as will instanceof, which checks prototypes). The name property can be quite helpful in these cases.
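The cross-environment problem can be reproduced outside a browser. The sketch below assumes Node.js and uses its built-in vm module as a stand-in for an iframe; that substitution is purely for illustration and is not part of the setup described in this post. Array.isArray is included because it is the standard check for arrays that does not depend on which environment created them.

```javascript
var vm = require('vm');

// Evaluate an array literal in a fresh context that has its own global
// object and therefore its own copy of the Array constructor.
var foreignArray = vm.runInNewContext('[1, 2, 3]');

foreignArray instanceof Array;                // false: different Array.prototype
foreignArray.constructor === Array;           // false: different constructor reference
foreignArray.constructor.name === Array.name; // true: both are named 'Array'
Array.isArray(foreignArray);                  // true
```

Comparing names is a weaker test than comparing references: two unrelated functions can share a name, and a minifier may rename functions, so treat it as a heuristic.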
A “stringly” typed language
Popular JavaScript utility libraries such as Underscore.js or Lodash seek to supersede the basic type detection with a peculiar approach, adapted in the following example.
    function whatType(obj) {
        return new RegExp('\\[object (.*)]').exec(Object.prototype.toString.call(obj))[1];
    }
    function G() { }
    G.prototype.toString = function() { return '[object G]'; };
    whatType(5);                              // 'Number'
    whatType(new Number(5));                  // 'Number'
    whatType('5');                            // 'String'
    whatType(new String('5'));                // 'String'
    whatType(new Date());                     // 'Date'
    whatType(new RegExp());                   // 'RegExp'
    whatType(null);                           // 'Null'
    whatType(document.createElement('link')); // 'HTMLLinkElement'
    whatType(new G());                        // 'Object'
The technique on display here is the use of the native Object prototype’s toString method, which every ordinary object inherits. Invoking it through call applies the original implementation to any value, regardless of whether that value’s own prototype overrides toString.
Pros: In a browser environment, this approach can identify native object types quite effectively, and provides the often-desired auto-boxing normalization. You can also use it in this context to identify all of the different DOM node types. We also get robust null checking. The Badoo Tech Blog has some interesting ideas about consolidating this into a useful standalone utility.
Cons: toString is another vestige of making JavaScript more Java-like, and the importance of faithfully overriding it in new prototypes is, at best, a matter of mixed opinion among developers. Unfortunately (as the last line of the example above shows), this method ignores an explicit toString override on a new prototype: because it calls the native implementation directly, every instance of a user-defined type is reported as a plain object. What a pity.
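One possible way around that limitation is to combine the toString tag with the constructor name discussed in the previous section. The following is only a sketch: `describeType` is a hypothetical helper, not how Underscore.js or Lodash work, and it inherits the caveats of the name property.

```javascript
// Hypothetical helper: use the native toString tag for built-in types and
// fall back to the constructor's name for user-defined ones.
function describeType(value) {
  // '[object Number]'.slice(8, -1) yields 'Number'.
  var tag = Object.prototype.toString.call(value).slice(8, -1);
  if (tag === 'Object') {
    var ctor = value.constructor;
    if (typeof ctor === 'function' && ctor.name) {
      return ctor.name;
    }
  }
  return tag;
}

function G() { }

describeType(5);                   // 'Number'
describeType(new Number(5));       // 'Number'
describeType(null);                // 'Null'
describeType({});                  // 'Object'
describeType(new G());             // 'G'
describeType(Object.create(null)); // 'Object': no constructor to consult
```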
Some extensions to JavaScript suggest other approaches worth checking out if your project has a dependency on a particular framework or environment. Google’s Closure compiler offers comment-based annotations. If your needs are strictly numerical, typed arrays may be helpful, and offer impressive performance. GWT, Script#, TypeScript, Emscripten, Mandreel, and Duetto provide opportunities to sidestep type-checking problems entirely by treating JavaScript as a compilation target for more type-safe source languages. The topic of type-checking in JavaScript has even received attention in multiple academic circles. In typical web development scenarios, especially when working with a preexisting codebase written in pure VanillaJS, the techniques described above, all native to the language, should be what you try first.
There’s no one-size-fits-all solution (yet) to the problem of type checking in JavaScript. If I can leave you with one piece of advice, however — one warning that can be almost universally applied — it would be to eschew the == operator and thus avoid the following coding horrors:
    [0] == true;       // false
    !![0] == true;     // true
    "potato" == true;  // false
    "potato" == false; // false
If you’re looking for more information on the details of type coercion and equality, this informative post goes very deep on these topics. Whichever approach to type-checking you take, consider the merits and drawbacks of each, as well as the context of the code you’re writing. Consistent and effective use of a good unit testing framework and linting tool, like the ones we use at Conductor, will help you reinforce best practices.