Storing Elm History Safely

Alex Koppel
Aug 29, 2018

This is the third in a series of posts about Elm’s exportable app history. You can read the first post here and the second post here.

Update September 16, 2018: both the ElmRings library and the example in this post have been updated to reflect Elm 0.19.

As web developers, one of our top goals is to make it as easy as possible to solve the problems our users face. One of our top challenges is understanding in enough detail exactly what those problems are.

Earlier this summer I wrote about how Elm gives us great tools to see what our users see. Since all data in an Elm program lives in one global model, if we record that model and its history we can review everything a user did and how they ended up in a mess. More than that, we can import that data in our own browser session and see exactly what they saw — all without having to do a screen sharing session or be on site.

Storing all data in Elm’s model and recording that history has implications, though. Sensitive information like passwords and personally identifiable information gets recorded as users enter it or as API calls return it; even unfinished data like half-written messages and incomplete forms will show up there!

Any support solution that uses Elm’s history data must be both effective and safe to be considered successful. We have to sanitize that information before we store or display it.

To do that, we first have to understand the format Elm uses to export history data, a subject we explored in the second post in this series. With that, we can look at how we clean up the data we collect.

Setting up a History Sanitizer with ElmRings

The ElmRings JavaScript library lets you capture the history of a user’s Elm session remotely and upload it for your support or development teams to work with.

At first, we uploaded the raw history data and sanitized it on the server, but we quickly realized we never want to send sensitive information over the network, even over secure connections. Since version 0.2.0, ElmRings has included a HistorySanitizer in the JavaScript package so you can clean that data before it ever leaves the browser.

In an earlier post, we saw how ElmRings can be initialized:

To this we now add two new options, watchWords and historySanitizer:

watchWords lets you identify the data that might need to be scrubbed. Every Elm object constructor and every Elm record key is checked against this list, whose entries can be strings (matched case-insensitively) or regular expressions.
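To make the matching rule concrete, here is a minimal sketch of how a name could be checked against a list of watch words. The matchesWatchWord helper is hypothetical and is not ElmRings’ actual implementation; it assumes string entries match anywhere in the name, which is consistent with ‘password’ also catching a key like hasPassword.

```javascript
// Hypothetical helper, not part of ElmRings: returns true when `name`
// (a constructor name or record key) matches any entry in `watchWords`.
function matchesWatchWord(name, watchWords) {
  return watchWords.some((word) => {
    if (word instanceof RegExp) {
      // Reset lastIndex so global or sticky regexes give consistent results.
      word.lastIndex = 0;
      return word.test(name);
    }
    // Assumption: plain strings match case-insensitively anywhere in the name.
    return name.toLowerCase().includes(word.toLowerCase());
  });
}

const watchWords = ["password", /^MySecret/];

matchesWatchWord("password", watchWords);     // true
matchesWatchWord("hasPassword", watchWords);  // true: a false positive
matchesWatchWord("MySecretData", watchWords); // true, via the regular expression
matchesWatchWord("username", watchWords);     // false
```

A broad match like this errs on the side of flagging too much, which is the safer failure mode for a blocklist.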

historySanitizer is a function that takes each of those flagged records in turn, allowing you to make any updates appropriate before returning the new value of the object or record.

Every Elm application is different, with a unique set of types describing the problem space that app operates within. On top of that, Elm history exports are primarily intended to be imported back into Elm, so any changes must adhere to the type definitions.

If we have a complex type like “User Username AuthToken” and AuthToken is itself a complex type or record, we can’t just replace it in the JSON with a string like “[filtered]”, the way you would with an off-the-shelf tool like Rails’ ParameterFilter. Elm would reject that on import because the data would no longer match the type.
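As a rough illustration of keeping the shape intact, the sketch below scrubs only the leaf values of a nested value and leaves its structure alone. The ctor/args layout is a simplified, hypothetical stand-in for an exported Elm value (the real export format differs), and scrubLeaves is an illustrative helper rather than anything ElmRings provides.

```javascript
// Simplified, hypothetical stand-in for an exported Elm value of type
// "User Username AuthToken"; the field names are illustrative only.
const user = {
  ctor: "User",
  args: ["alice", { ctor: "AuthToken", args: ["abc123", 1535500800] }],
};

// Replace each leaf with a placeholder of the same JavaScript type, leaving
// constructors and nesting untouched so the value still fits its type.
function scrubLeaves(value) {
  if (typeof value === "string") return "[filtered]";
  if (typeof value === "number") return 0;
  if (typeof value === "boolean") return false;
  if (Array.isArray(value)) return value.map(scrubLeaves);
  if (value !== null && typeof value === "object") {
    const copy = {};
    for (const [key, inner] of Object.entries(value)) {
      // Keep the constructor tag itself; it identifies the variant.
      copy[key] = key === "ctor" ? inner : scrubLeaves(inner);
    }
    return copy;
  }
  return value;
}

// Scrub only the AuthToken; the username is left as it was.
const safeUser = {
  ctor: "User",
  args: [user.args[0], scrubLeaves(user.args[1])],
};
// safeUser.args[1] is { ctor: "AuthToken", args: ["[filtered]", 0] }
```

Swapping the whole AuthToken for the string “[filtered]” would be shorter, but it would change the shape of the value, which is exactly what an import would reject.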

There’s no universal algorithm that can sanitize our data. Instead, we’ll need to write code unique to our application to scrub the uniquely-structured history of a user’s session appropriately. With the watchWords and historySanitizer options, ElmRings gives you a tool that’s both thorough enough to catch all sensitive entries and flexible enough to remove private information appropriately.

An Example

Here’s an example. We have two secret values, MySecretData (a type) and password (a field on a record). For the type, we replace the contents of the type, keeping its shape intact; for the record, we replace the value.

If you want a refresher on how Elm history data is structured, click here.

Unlike a system that requires you to specify the exact Elm message types to change, the watchWords system aims to be broad and flexible. As you add new types or record fields, they’ll still be covered.

In the historySanitizer function, you’ll identify which message or object was caught by the watchWords and return the appropriately modified value for that type. The returned object can be different from or the same as the one passed in; it’s up to your code to determine whether a change is needed. (There will be false positives: for example, a watchWord of ‘password’ will capture both the sensitive information above and also a record like {hasPassword : Bool}.)
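A sanitizer along these lines might look like the sketch below. It assumes each flagged value arrives as a plain JavaScript object, that MySecretData wraps string values, and that Elm types are tagged with a hypothetical ctor/args layout; the real export format and the exact arguments ElmRings passes may differ, so treat this as an outline rather than the library’s API.

```javascript
// Sketch of a sanitizer under the assumptions above. The `ctor`/`args` layout
// is a hypothetical stand-in for how an Elm type appears in the export.
function historySanitizer(flagged) {
  // A type caught by a watch word: keep the constructor, replace its contents.
  // Assumes MySecretData only wraps strings.
  if (flagged.ctor === "MySecretData") {
    return { ...flagged, args: flagged.args.map(() => "[filtered]") };
  }
  // A record with a string password field: replace only that value.
  if (typeof flagged.password === "string") {
    return { ...flagged, password: "[filtered]" };
  }
  // False positives such as { hasPassword: true } pass through unchanged.
  return flagged;
}

historySanitizer({ ctor: "MySecretData", args: ["s3cret"] });
// { ctor: "MySecretData", args: ["[filtered]"] }
historySanitizer({ username: "alice", password: "hunter2" });
// { username: "alice", password: "[filtered]" }
historySanitizer({ hasPassword: true });
// { hasPassword: true }
```

Each clause is small enough to cover with a unit test, which is worth doing since the input is machine-generated.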

This may look a bit complicated, and to be honest it can be. Your historySanitizer function has to take machine-generated output and process it before another program consumes it. In practice, though, it should be manageable — most implementations I’ve seen have only required a few clauses and associated tests.

An Example

In the ElmRings repo, I’ve included a sample app with history sanitization — check it out to see the code in action!

Why block rather than permit?

For this kind of security, it would be abstractly better to list which terms are permitted rather than which ones are forbidden. That way, if you missed anything or added new fields, there’d be no risk that sensitive data would leak.

Unfortunately, such an approach would be pretty unmanageable: your JavaScript would need to know the entire shape of your Elm program. Every time you added a field to a record or changed a type anywhere in your model, you’d have to update the sanitizing data (and as the library maintainer, I’d have to figure out how to make it easy for you to manage that data).

Such an approach would be a huge burden on development. Enumerating the disallowed terms definitely has risks, but it seems better than a system that can’t be maintained.

(I’d love feedback! There might well be a better way to do this.)

What’s next?

At eSpark, we’re using ElmRings to get greater insight into student and teacher issues. I also have a personal project using ElmRings focused on recording and sharing quotes and thoughts from books I’m reading.

Keep an eye out for future blog posts as we develop our support process further to use the detailed data Elm produces.

If you try this out, let me know! I’d love to hear what project you’re working on and how your experience with ElmRings has been.

eSpark Engineering Blog

A collection of writings from our team on challenges, successes, and other (sometimes) insightful things

Thanks to Caroline Artz, Blake Wesley Thomas, and Xavier Shay

Written by Alex Koppel: book reader, principal ☃ engineer at @esparklearning, Chicago Awesome Foundation trustee
