Lesser-known browser APIs

Is it possible to execute a shell script, from a user’s web browser?

This inflammatory question from a much younger version of myself might have been rooted in naivety, but it definitely piqued my interest in the overall capabilities of modern web browsers.

Browsers, naturally, are capable of a lot more than just rendering markup. In fact, in recent years it’s become increasingly clear that browsers are pretty darn powerful and versatile, whether it’s the immense popularity of Electron (a Chromium-based wrapper for building cross-platform desktop applications with web technologies), or someone recreating the entire Wolfenstein 3D in HTML5.

I’ve done some scratching and decided to compile a list of interesting HTML5 APIs, in no particular order. Note: some of these are still experimental, so don’t rely on any of this for your next push to production.

1. Web Speech

Imagine that: have web content read out to you, or have a microphone-enabled web browser recognise your voice in order to do something useful! Obviously a far cry from Amazon Echo or Google Home, but still impressive given that this functionality is natively supported.

There are currently two major feature sets for the Web Speech API: SpeechSynthesis (Text-to-Speech), and SpeechRecognition (Asynchronous Speech Recognition).
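
To give a feel for the text-to-speech half, here is a minimal sketch. The `speak` helper and its `env` parameter are my own names (not part of the API); `env` only exists so the function can be tried outside a browser. The locale is an assumption too.

```javascript
// Sketch: read a string aloud via SpeechSynthesis, if the browser offers it.
// `env` defaults to the global object so a stand-in can be passed in for testing.
function speak(text, env = globalThis) {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  if (!synth || typeof Utterance !== "function") {
    return false; // API not available (older browser, non-browser runtime)
  }
  const utterance = new Utterance(text);
  utterance.lang = "en-GB"; // assumption: pick whatever locale suits your content
  utterance.rate = 1; // 1 is the default speaking rate
  synth.speak(utterance); // queues the utterance for playback
  return true;
}
```

In a page you would probably call this from a user gesture, for example `button.addEventListener("click", () => speak("Hello from the browser"))`, since some browsers are reluctant to play audio without one.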

2. Basic cryptography

Keen on some encrypting and decrypting in the browser? Enter the Web Crypto API. Here’s an example of how to generate an RSASSA-PKCS1-v1_5 key pair:

window.crypto.subtle.generateKey(
  {
    name: "RSASSA-PKCS1-v1_5",
    modulusLength: 2048, // can be 1024, 2048, or 4096
    publicExponent: new Uint8Array([0x01, 0x00, 0x01]), // 65537
    hash: { name: "SHA-256" }, // can be "SHA-1", "SHA-256", "SHA-384", or "SHA-512"
  },
  false, // whether the key is extractable (i.e. can be used in exportKey)
  ["sign", "verify"] // can be any combination of "sign" and "verify"
)
  .then(function (key) {
    // resolves with a key pair object
    console.log(key);
    console.log(key.publicKey);
    console.log(key.privateKey);
  })
  .catch(function (err) {
    console.error(err);
  });
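
A key pair on its own is not much use, so here is a rough sketch of what signing and verifying with it could look like. `toHex` and `signAndVerify` are helper names I made up for illustration; the `subtle` argument is expected to be `window.crypto.subtle` in a browser.

```javascript
// Convert an ArrayBuffer (such as a signature) to a hex string for logging.
function toHex(buffer) {
  return Array.from(new Uint8Array(buffer))
    .map((byte) => byte.toString(16).padStart(2, "0"))
    .join("");
}

// Generate a key pair, sign a message with the private key,
// then verify the signature with the public key.
async function signAndVerify(message, subtle) {
  const algorithm = { name: "RSASSA-PKCS1-v1_5" };
  const keyPair = await subtle.generateKey(
    {
      ...algorithm,
      modulusLength: 2048,
      publicExponent: new Uint8Array([0x01, 0x00, 0x01]),
      hash: "SHA-256",
    },
    false, // keys are not extractable
    ["sign", "verify"]
  );
  const data = new TextEncoder().encode(message);
  const signature = await subtle.sign(algorithm, keyPair.privateKey, data);
  const valid = await subtle.verify(algorithm, keyPair.publicKey, signature, data);
  return { signature: toHex(signature), valid };
}
```

Something like `signAndVerify("hello", window.crypto.subtle).then(console.log)` should then log the hex signature along with whether it verified.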

Links:

  • Read more here
  • View list of examples here

3. Page Visibility

Ever had to write a JS metric plugin that would only record the time users actually spent looking at a web page, and not the time the page spent softly humming in a background tab? Check out the Page Visibility API.
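
As a rough illustration of that metric use case, the sketch below accumulates visible time using `document.visibilityState` and the `visibilitychange` event. `createVisibleTimeTracker` is a hypothetical helper; `doc` and `now` are injectable purely so the logic can be exercised outside a browser.

```javascript
// Accumulates the time a page has actually been visible.
function createVisibleTimeTracker(doc, now = () => Date.now()) {
  let visibleMs = 0;
  // Start the clock immediately if the page is visible right now.
  let visibleSince = doc.visibilityState === "visible" ? now() : null;

  doc.addEventListener("visibilitychange", () => {
    if (doc.visibilityState === "visible") {
      visibleSince = now();
    } else if (visibleSince !== null) {
      visibleMs += now() - visibleSince;
      visibleSince = null;
    }
  });

  return {
    // Total visible time so far, including the current visible stretch.
    total() {
      return visibleSince === null ? visibleMs : visibleMs + (now() - visibleSince);
    },
  };
}
```

In a page this would be created once with `createVisibleTimeTracker(document)` and read with `tracker.total()` whenever the metric is reported.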

4. sendBeacon

Here’s an interesting one that I came across while developing a JavaScript SDK at a previous company. The problem statement:

Imagine you have to dispatch an Ajax call before page unload (typically an analytics API call, or some equivalent). The issue is that you don’t want to block the browser’s main thread, wait for the Ajax call to execute and get a response from the server, and only then unload/redirect.

Well, turns out that there’s a browser API that caters for this: Navigator.sendBeacon(). The idea is that "the data is transmitted asynchronously to the web server when the User Agent has an opportunity to do so, without delaying the unload or affecting the performance of the next navigation" (from the MDN Web Docs).
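
A minimal sketch of how this might be wired up follows. The `/analytics` endpoint, the payload shape, and the `sendAnalytics` helper are all placeholders of mine. I am hooking into `visibilitychange` here, which MDN suggests as a more reliable trigger than `unload`.

```javascript
// Queue an analytics payload for delivery without blocking navigation.
// Returns true if the browser accepted the data for sending.
function sendAnalytics(url, payload, nav = globalThis.navigator) {
  if (!nav || typeof nav.sendBeacon !== "function") {
    return false; // no Beacon support in this environment
  }
  return nav.sendBeacon(url, JSON.stringify(payload));
}

// Browser-only wiring: fire the beacon when the page is hidden.
if (typeof document !== "undefined") {
  document.addEventListener("visibilitychange", () => {
    if (document.visibilityState === "hidden") {
      // "/analytics" is a hypothetical endpoint
      sendAnalytics("/analytics", { event: "page-hidden", at: Date.now() });
    }
  });
}
```

Note that `sendBeacon` only tells you whether the data was queued, not whether the server received it, so it suits fire-and-forget calls rather than anything that needs a response.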

5. Web Notifications

Okay, so Web Notifications are actually pretty well known, but it would be a sin to not include them on this list. Especially considering how versatile (configurable) they are.
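
For completeness, a small sketch of the usual permission-then-notify flow. `notify` is a helper name I made up, and the constructor is passed in as a parameter only so the function can be tried without a browser.

```javascript
// Show a notification, asking for permission first if it has not been decided.
// Resolves with the Notification instance, or null if it could not be shown.
function notify(title, options, Ctor = globalThis.Notification) {
  if (typeof Ctor !== "function") {
    return Promise.resolve(null); // API not available
  }
  if (Ctor.permission === "granted") {
    return Promise.resolve(new Ctor(title, options));
  }
  if (Ctor.permission === "denied") {
    return Promise.resolve(null); // the user said no; don't ask again
  }
  // Permission is still "default": ask, then show only if granted.
  return Ctor.requestPermission().then((permission) =>
    permission === "granted" ? new Ctor(title, options) : null
  );
}
```

A call such as `notify("Build finished", { body: "All tests passed" })` would cover the common case; the `options` object is where most of the configurability lives (body text, icon, tag, and so on).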

6. Vibration (mobile)

Want to introduce some haptic feedback to your mobile application? Try out the Vibration API!
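
The API is essentially a single call, `navigator.vibrate()`, which takes either a duration in milliseconds or an array alternating between vibration and pause durations. A tiny sketch, with `buzz` being my own wrapper name:

```javascript
// Trigger a vibration pattern if the device and browser support it.
// `pattern` is a duration in ms, or an array of [vibrate, pause, vibrate, ...].
function buzz(pattern, nav = globalThis.navigator) {
  if (!nav || typeof nav.vibrate !== "function") {
    return false; // unsupported (most desktop browsers, for instance)
  }
  return nav.vibrate(pattern);
}
```

For example, `buzz([200, 100, 200])` asks for two 200 ms pulses separated by a 100 ms pause.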

7. Network Information

Even though it’s an experimental feature, you can actually access information about the user agent’s network connectivity.

Don’t worry, nobody’s going to be able to sniff out too much. Here’s a breakdown of the information you do have access to — and even then, these properties will only be available on certain user agents:

[Exposed=(Window,Worker)]
interface NetworkInformation : EventTarget {
  readonly attribute ConnectionType type;
  readonly attribute EffectiveConnectionType effectiveType;
  readonly attribute Megabit downlinkMax;
  readonly attribute Megabit downlink;
  readonly attribute Millisecond rtt;
  readonly attribute boolean saveData;
  attribute EventHandler onchange;
};

typedef unrestricted double Megabit;
typedef unsigned long long Millisecond;
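
In JavaScript these attributes hang off `navigator.connection`, where available. Below is a sketch of reading them defensively. `describeConnection` and `shouldLoadHeavyAssets` are hypothetical helpers, and the "skip heavy assets on 2G or when Save-Data is on" policy is just one possible use.

```javascript
// Read a snapshot of the connection info, or null if the API is missing.
function describeConnection(nav = globalThis.navigator) {
  const connection = nav && nav.connection;
  if (!connection) {
    return null;
  }
  return {
    effectiveType: connection.effectiveType, // "slow-2g", "2g", "3g" or "4g"
    downlinkMbps: connection.downlink,
    rttMs: connection.rtt,
    saveData: Boolean(connection.saveData),
  };
}

// Hypothetical policy: avoid heavy assets on slow or data-saving connections.
function shouldLoadHeavyAssets(info) {
  if (!info) {
    return true; // unknown connection: behave as normal
  }
  const slow = info.effectiveType === "slow-2g" || info.effectiveType === "2g";
  return !info.saveData && !slow;
}
```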

8. Battery

Here’s a fun little one: if your application is running on a mobile phone (and even on a laptop!), you can make use of the Battery API.

Please note: the use of this API is discouraged and support for it will likely be deprecated over time.
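
With that caveat in mind, here is a sketch of reading the battery state where `navigator.getBattery()` still exists. `formatBattery` and `readBattery` are names of my own choosing.

```javascript
// Turn a BatteryManager-like object into a short human-readable string.
function formatBattery(battery) {
  const percent = Math.round(battery.level * 100); // level is between 0 and 1
  return `${percent}% (${battery.charging ? "charging" : "on battery"})`;
}

// Resolve with a description of the battery, or null if unsupported.
async function readBattery(nav = globalThis.navigator) {
  if (!nav || typeof nav.getBattery !== "function") {
    return null;
  }
  const battery = await nav.getBattery();
  return formatBattery(battery);
}
```

`readBattery().then(console.log)` would log something along the lines of a percentage plus charging state, or `null` in browsers that have dropped the API.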

9. Application Cache

Application cache is closely linked to the rise in popularity of Progressive Web Apps.

Hmm. What are PWAs?

Here’s a good definition of the core features associated with PWAs:

  • Progressive — Work for every user, regardless of browser choice because they’re built with progressive enhancement as a core tenet.
  • Responsive — Fit any form factor: desktop, mobile, tablet, or forms yet to emerge.
  • Connectivity independent — Service workers allow work offline, or on low quality networks.
  • App-like — Feel like an app to the user with app-style interactions and navigation.
  • Fresh — Always up-to-date thanks to the service worker update process.
  • Safe — Served via HTTPS to prevent snooping and ensure content hasn’t been tampered with.
  • Discoverable — Are identifiable as “applications” thanks to W3C manifests and service worker registration scope allowing search engines to find them.
  • Re-engageable — Make re-engagement easy through features like push notifications.
  • Installable — Allow users to “keep” apps they find most useful on their home screen without the hassle of an app store.
  • Linkable — Easily shared via a URL and do not require complex installation.

Source: Wikipedia

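A core feature of PWAs is cache-first networking. The ApplicationCache interface was the original way to get offline caching, but it has since been deprecated in favour of service workers and the Cache API, and browsers such as Chrome and Firefox have removed it.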

For an excellent breakdown on how to set this up in your application, check out this tutorial.
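
To make "cache-first" a bit more concrete: in current PWAs the strategy is usually implemented in a service worker with the Cache API rather than with ApplicationCache. The following is only a sketch under that assumption; the cache name and the `cacheFirst` helper are placeholders of mine.

```javascript
const CACHE_NAME = "static-v1"; // hypothetical cache name

// Cache-first: answer from the cache when possible, otherwise go to the
// network and keep a copy of successful responses for next time.
async function cacheFirst(request, cacheStorage, fetchFn) {
  const cache = await cacheStorage.open(CACHE_NAME);
  const cached = await cache.match(request);
  if (cached) {
    return cached;
  }
  const response = await fetchFn(request);
  if (response.ok) {
    // A response body can only be read once, so store a clone.
    await cache.put(request, response.clone());
  }
  return response;
}

// Only wire up the fetch handler when running inside a service worker.
if (
  typeof ServiceWorkerGlobalScope !== "undefined" &&
  self instanceof ServiceWorkerGlobalScope
) {
  self.addEventListener("fetch", (event) => {
    if (event.request.method === "GET") {
      event.respondWith(cacheFirst(event.request, caches, fetch));
    }
  });
}
```

The trade-off is the usual one: cached assets load instantly and work offline, but they stay stale until the cache name changes or the entries are otherwise refreshed.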

10. NPAPI vs Native Messaging (Chrome)

Lastly, let’s revisit the question I posed at the start of this article: “Is it possible to execute a shell script, from a user’s web browser?”.

Is it, though? Seems like it would be a terrible idea. The answer is… sort of. Chrome used to allow NPAPI plugins to be bundled with Chrome Extensions, which essentially let applications call native binary code from within the browser/JavaScript ecosystem.

Ultimately, NPAPI support was removed from Chrome, due to obvious security concerns.

Is this the end of the line? Not entirely. Enter: Chrome’s Native Messaging API.

Here’s a good breakdown of how it works, from Chrome’s Wiki:

Extensions and apps can exchange messages with native applications using an API that is similar to the other message passing APIs. Native applications that support this feature must register a native messaging host that knows how to communicate with the extension. Chrome starts the host in a separate process and communicates with it using standard input and standard output streams.
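
From the extension side, a one-off message to a native host might look roughly like the sketch below. The host name `com.example.shell_host` and the `{ command: "uptime" }` message format are invented for illustration. The real host has to be registered on the machine through a native messaging host manifest, and the extension needs the `nativeMessaging` permission.

```javascript
const HOST_NAME = "com.example.shell_host"; // hypothetical native messaging host

// Send a single message to the native host and hand the reply to `onReply`.
// `runtime` is expected to be `chrome.runtime` inside an extension.
function askNativeHost(message, runtime, onReply) {
  if (!runtime || typeof runtime.sendNativeMessage !== "function") {
    return false; // not running inside an extension with native messaging
  }
  runtime.sendNativeMessage(HOST_NAME, message, onReply);
  return true;
}
```

From an extension's background script this could be called as `askNativeHost({ command: "uptime" }, chrome.runtime, (reply) => console.log(reply))`. What the host does with the message (running a shell script, say) is entirely up to the native application on the other end.
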
Like what you read? Give Ruaan van der Spuy a round of applause.
