Solving conflicts between form inputs and DOM APIs

Dima Voytenko
Dec 21, 2016


The problem

The problem involves how HTML form inputs are exposed on the parent <form> element. This behavior was introduced in early browsers, as far back as JavaScript 1.0, and was eventually copied by most browsers. The W3C DOM Level 2 spec addressed this by introducing the HTMLFormElement.elements collection. Unfortunately, the old behavior was left intact for backward compatibility.

To describe this in more detail, let’s start with the simplest form definition:

<form
id="form1"
action="https://example.org/submit"
style="border: 1px solid black">
</form>

Here’s how you’d access properties in JavaScript:

form1.id == 'form1'
form1.action == 'https://example.org/submit'
form1.style.borderWidth == '1px'
form1.getAttribute('style') == 'border: 1px solid black'
form1.submit() // Submits the form.

Now, let’s add some inputs:

<form
id="form1"
action="https://example.org/submit"
style="border: 1px solid black;">
<input name="id">
<input name="action">
<input name="style">
<input name="submit">
<input name="getAttribute">
</form>

Just because we added a few inputs, the JavaScript above no longer works as expected. The form’s properties id, action, style, submit, and getAttribute have been overridden and now return the corresponding inputs instead. In other words:

form1.id != 'form1'
form1.id == form1.elements.id // Reference to the `id` input.
form1.action // Reference to the `action` input.
form1.style // Reference to the `style` input.
form1.submit // Reference to the `submit` input.
form1.getAttribute // Reference to the `getAttribute` input.

Thus, for instance, calling form1.submit() throws:

Uncaught TypeError: form1.submit is not a function

You can read more about this problem in this post. One quote from this post:

As a result, the convenience feature introduced in JavaScript™ 16 years ago still bites you like a bug in client-side DOM scripting today.

Typical advice goes something like this:

  • Use safe input names, e.g. “_id” instead of “id”, “_action” instead of “action”, etc. However, this is a hard rule to enforce in AMP — we try to support any valid HTML markup and it feels wrong to prohibit input names such as “id” and “action” that have very sensible business meaning.
  • Call form1.getAttribute('id') instead of form1.id. However, this is not always possible. For instance, form1.getAttribute('action') could return a very different value from form1.action, which is a fully resolved URL. Similarly, form1.getAttribute('style') returns a string as opposed to the CSSStyleDeclaration object returned by form1.style. And the form1.getAttribute method itself could be overridden by an input named “getAttribute”.
  • Call the DOM APIs via the original prototype. E.g. instead of form1.submit() do HTMLFormElement.prototype.submit.call(form1). However, this turns our source code into a mess, especially when we have to resort to it for the most basic APIs such as append or getAttribute. It’s also slower.

The solution

Unfortunately, hard as we tried, we could not find a generic runtime solution that would restore the original APIs on the HTMLFormElement itself. So, instead, we decided to adopt a combined approach:

  1. We would create a proxy object that exposes HTMLFormElement's DOM APIs as properties/methods and directs the calls to the original prototypes. The proxy object is set on the form as a $p property, e.g. form1.$p.
  2. We would rewrite our JS at the compiler level to use the $p object when available.

Form proxy object

The form proxy object re-implements the HTMLFormElement DOM API via the original prototypes.

A method call would look like this:

form.$p.submit = function() {
  return HTMLFormElement.prototype.submit.call(form);
};

A property definition would look like this:

Object.defineProperty(form.$p, 'action', {
  get: function() {
    return Object.getOwnPropertyDescriptor(
        HTMLFormElement.prototype,
        'action').get.call(form);
  }
});

And so on. This way we expose a complete HTMLFormElement API on the $p object.
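As a rough illustration of how the "and so on" could be automated, here is a minimal sketch that walks a prototype chain and forwards every method and accessor it finds. The helper name createProxy and the FakeForm stand-in class are hypothetical and exist only so the snippet runs outside a browser; in a page you would pass a real form and HTMLFormElement.prototype instead. This is not necessarily how the AMP implementation is written.

```javascript
/**
 * Builds an object whose methods and accessors always resolve against the
 * prototype chain starting at `proto`, ignoring anything that shadows them
 * on `target` itself. (Hypothetical helper, for illustration only.)
 */
function createProxy(target, proto) {
  const proxy = Object.create(null);
  for (let p = proto; p && p !== Object.prototype; p = Object.getPrototypeOf(p)) {
    for (const name of Object.getOwnPropertyNames(p)) {
      if (name === 'constructor' || name in proxy) {
        continue; // Keep the most-derived definition.
      }
      const desc = Object.getOwnPropertyDescriptor(p, name);
      if (typeof desc.value === 'function') {
        // Method: call the original implementation on the real target.
        proxy[name] = function(...args) {
          return desc.value.apply(target, args);
        };
      } else if (desc.get || desc.set) {
        // Accessor: forward to the original getter/setter.
        Object.defineProperty(proxy, name, {
          get: desc.get ? () => desc.get.call(target) : undefined,
          set: desc.set ? (v) => desc.set.call(target, v) : undefined,
          enumerable: true,
        });
      }
    }
  }
  return proxy;
}

// Stand-in for HTMLFormElement so the sketch runs without a DOM.
class FakeForm {
  constructor(action) {
    this._action = action;
  }
  get action() {
    return this._action;
  }
  submit() {
    return 'submitted to ' + this._action;
  }
}

const form = new FakeForm('https://example.org/submit');
// Shadow the API the way inputs named "action" and "submit" would.
Object.defineProperty(form, 'action', {value: {name: 'action'}});
Object.defineProperty(form, 'submit', {value: {name: 'submit'}});

form.$p = createProxy(form, FakeForm.prototype);
console.log(typeof form.submit); // The shadowing object, not a function.
console.log(form.$p.action);
console.log(form.$p.submit());
```

Plain data properties such as numeric constants are skipped here, since only methods and accessors need forwarding.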

Code rewrite in compile phase

Now that we have the form.$p proxy object with the “fixed” API, we need to call it. I.e. instead of form.id we need to call form.$p.id. We could, of course, make these calls directly in the source code. But this is error-prone. Plus, we don’t always know whether a node we work with is a form or some other DOM element.

Instead, we introduced a compiler pass that rewrites a normal node.id to use $p.id. We use Closure Compiler, which makes it possible to rewrite the final JavaScript code at the AST level.

The rewrite we actually want to do would look like this for node.id:

(node && node.$p || node).id

Similarly, for a method, node.getAttribute(name) would be rewritten as:

(node && node.$p || node).getAttribute(name)

In other words, when the node.$p property is present, it is used to access properties and methods. Otherwise, the original node is used.

The rewrite can be done for all known HTMLFormElement APIs or only for the most critical subset.
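To make the shape of this rewrite concrete, here is a toy version over a hand-built, ESTree-like AST. Closure Compiler passes are written in Java against the compiler's own AST, so this is not the actual pass. The REWRITTEN set, the node builders, and the tiny printer are all assumptions made for illustration.

```javascript
// Hypothetical "critical subset" of APIs that gets rewritten.
const REWRITTEN = new Set(['id', 'action', 'style', 'submit', 'getAttribute']);

// Minimal ESTree-like node builders (simplified: no computed members).
const ident = (name) => ({type: 'Identifier', name});
const member = (object, property) =>
    ({type: 'MemberExpression', object, property: ident(property)});
const logical = (operator, left, right) =>
    ({type: 'LogicalExpression', operator, left, right});

/** Rewrites `x.prop` to `(x && x.$p || x).prop` for props in REWRITTEN. */
function rewrite(node) {
  if (node.type !== 'MemberExpression' ||
      !REWRITTEN.has(node.property.name)) {
    return node;
  }
  const obj = node.object;
  const guarded = logical('||', logical('&&', obj, member(obj, '$p')), obj);
  return {type: 'MemberExpression', object: guarded, property: node.property};
}

/** Prints the three node types used in this sketch. */
function print(node) {
  switch (node.type) {
    case 'Identifier':
      return node.name;
    case 'MemberExpression': {
      const obj = print(node.object);
      const wrapped =
          node.object.type === 'LogicalExpression' ? `(${obj})` : obj;
      return `${wrapped}.${node.property.name}`;
    }
    case 'LogicalExpression':
      return `${print(node.left)} ${node.operator} ${print(node.right)}`;
    default:
      throw new Error('Unsupported node type: ' + node.type);
  }
}

console.log(print(rewrite(member(ident('node'), 'id'))));
// (node && node.$p || node).id
console.log(print(rewrite(member(ident('node'), 'className'))));
// node.className
```

Note that the rewritten expression mentions the object up to three times, so a real pass would have to make sure the object expression has no side effects, or store it in a temporary first.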

By the way, the expression above could be made simpler by just doing (node.$p || node).id. However, one benefit of the longer expression is that it keeps the original errors intact. For instance, if node is null, the error message will be “Cannot read property ‘id’ of null” instead of “Cannot read property ‘$p’ of null”. The first message is, of course, much more representative of the underlying error and thus it’s better to preserve it.
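The difference between the two forms is easy to check with plain objects. In this sketch, readId, readIdShort, and errorMessage are hypothetical helpers, and the objects merely stand in for DOM nodes. The exact error wording varies by JavaScript engine and version, so the comments only note which property name appears in the message.

```javascript
// The long form, as the compiler would emit it for `node.id`.
function readId(node) {
  return (node && node.$p || node).id;
}

// The shorter alternative.
function readIdShort(node) {
  return (node.$p || node).id;
}

// Returns the message of the error thrown by `fn`, or null if none.
function errorMessage(fn) {
  try {
    fn();
    return null;
  } catch (e) {
    return e.message;
  }
}

const plainNode = {id: 'div1'};
// A "form" whose own `id` is clobbered but whose proxy is intact.
const clobberedForm = {id: {name: 'id'}, $p: {id: 'form1'}};

console.log(readId(plainNode));      // 'div1'
console.log(readId(clobberedForm));  // 'form1'

// With a null node, the long form fails on `id`, as the source code would.
console.log(errorMessage(() => readId(null)));
// The short form fails earlier, on the `$p` lookup.
console.log(errorMessage(() => readIdShort(null)));
```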

Conclusion

This solution is fairly complicated. It involves proxying the original API and rewriting JavaScript code at the compiler level.

However, this complication does have significant benefits:

  1. It’s a generic and fairly error-proof solution. We leave it up to the compiler to call the right APIs instead of relying on the source code being correct.
  2. We keep our source code otherwise clean. The best way to get an ID is to call node.id and the best way to read an attribute is to call node.getAttribute(name). This solution allows our source code to continue using the original DOM APIs everywhere.
  3. The cost stays contained. We incur the cost of an additional $p lookup, and proxy access for properties and methods is also slower. However, forms are relatively rare, and this solution allows access to all other nodes to remain fast.

The real conclusion

Inputs overriding the form’s DOM API is a really old and long-deprecated feature. The W3C spec has addressed the underlying need via the HTMLFormElement.elements API. At this point the old behavior is unneeded and almost never expected. However, it causes a lot of pain. It could also cause XSS.

A much better solution would be for the Web spec to provide an opt-out for this feature, e.g. <form do-not-expose-inputs-on-form>. It would be even better to make it the default, but backward compatibility will likely preclude that. I filed Issue 2212 on WHATWG to request this opt-out mechanism.
