Open sourcing our #Budget2017 work
A conversation between Times Development team members about the tradeoffs between code quality and working against a deadline
TL;DR: We’re hiring, and this is how we work and some of the things we do.
Wednesday 22nd November was, as you might expect, a busy day in the newsroom. Philip Hammond unveiled the government’s budget for the year to come, and there was a lot to write to cover it properly.
The paper included a budget supplement the following morning, and the Digital team had to provide a few things for the day. Set pieces like this are — to an extent — prepared ahead of time, but it’s impossible to know how everything is going to look before the chancellor’s speech ends.
We then planned a few things, and as we’re hiring for several positions (see the bottom of this article) we thought potential candidates would like to know how it’s been here over the last few days.
Here is a rundown of two projects we selected:
- A bespoke treemap visualising where the money comes from and how it is spent based on the figures released
- And our interactive tax calculator based on PWC data telling you if you’ll be better or worse off after the budget.
We are also opening access to the code powering these digital components. They’re only one thing among many others that our team does, but they provide examples of some of the challenges of working in news and coding against a deadline.
This post represents the discussion and review process that we go through when releasing code in the newsroom. Chris and I highlight specific lines or blocks of code in these two projects, and consider the decisions made in the moment, in order to achieve the rapid turnaround required in the newsroom. Each decision is assessed for its value and purpose, and with the benefit of hindsight (and more time ⏰), we provide neater, more maintainable examples of writing this code. It’s worth noting that much of our code is written to work within the structures of The Times website, which in these cases rely on using Web Components (via Polymer), although the code samples are applicable to all front end development, no matter your framework of choice.
Our interactive budget calculator
Unhappy with our third-party calculator from last year, the Business desk asked if we could build our own to replicate the tables published in print every year. PWC provides us with this data, and our goal was to display simple figures from it.
Polymer conditional template and naming conventions
Basile: “Render this thing if there is not no headline” is really what I’m telling the app here. You can read that sentence again: it will make sense but is highly illogical. My purpose here was simple: by calling this parameter this way, I give the production people two options:
- Omit the parameters and the component will render as expected: with a headline
- Or, for those in the know, pass in noheadline="true" and it will remove it. This makes sense and is intuitive.
Chris: Writing web components is like creating custom HTML elements. To make the user experience as easy as possible, passing a property like <my-component no-headline> makes a lot of sense. It clearly defines the behaviour, especially to non-technical users. Unfortunately, this can create confusing double negatives in your codebase. This can be mitigated by creating an internal-only property (often referred to as a computed property), which is more meaningfully named. In this case, we might create a property called showHeadline, which stores the reverse boolean value of noHeadline. This means our template can be much simpler, and can eliminate the double negative. The below code example demonstrates how this could work:
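A minimal sketch of that idea, assuming a Polymer 1.x element whose public property is noHeadline. The function name computeShowHeadline and the wiring shown in the comments are illustrative, not the code that shipped:

```javascript
// Hypothetical helper: derive a positively named flag from the public
// `noHeadline` property, so the template never reads a double negative.
function computeShowHeadline(noHeadline) {
  // A missing value (undefined or null) means nobody opted out,
  // so the headline is shown by default.
  return !noHeadline;
}

// In a Polymer 1.x element this could be wired up roughly like so
// (a sketch only; the property names are assumptions):
//
//   properties: {
//     noHeadline:   { type: Boolean, value: false },
//     showHeadline: { type: Boolean, computed: 'computeShowHeadline(noHeadline)' }
//   },
//   computeShowHeadline: computeShowHeadline
//
// The template can then test the positive flag:
//
//   <template is="dom-if" if="[[showHeadline]]"> ... </template>
```

Production staff keep the convenient no-headline attribute, while everything inside the component reads showHeadline.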
A very manual <form>
Basile: It’s not the best form in the world. After faffing with formData() for a while, I had to give up when I noticed its rather poor browser support and wasn’t sure how well polyfilled it was. Plus, you know, iterators are pretty but still hurt my head.
Chris: FormData is a great way of managing data in forms, and can make it much easier to access your data where you have multiple fields. Unfortunately, as Basile has raised, the browser support for all of FormData’s features isn’t great, and in the newsroom context, loading in and testing an untested polyfill can be a recipe for disaster. In this case, where we only had one or two inputs, accessing their values directly (using .value) is not a bad approach. If the form were to grow in any way or need to be submitted multiple times, I’d advise putting more time into researching available FormData polyfills.
Chris: Here the submit handling is set up manually in JavaScript. It would be clearer to declare, on the <form> tag itself, what method handles a submit. When we revisit this code in the future, it’ll be much easier to see, just by looking at the <form>, what happens when it’s submitted.
Defending against a free text input
Basile: Two minutes before deadline, we opened the calculator on the computer of one of the editors, who typed in her income with a comma. Like Brits do. Oh my, I thought, no way that’s going to work, I’ve never tried with a comma in there. Turns out I did defend against this and cleaned the input when I wrote this line the first time round. It also turns out that I didn’t plan that some people would put a £ sign before their income.
As a bonus, note the very artisanal so-called error handling process below.
Chris: Handling user input is always an area worth extra consideration. If your user input is being delivered to an external service (possibly an API or database) you should be validating and escaping it, to prevent any malicious activity. In this case, it was all handled internally on the client (in the browser). The code here that processes and cleans the input, and sets the error states, could be tidied up. We might consider breaking this up into a set of smaller, well named functions, which might run something like this:
- Reset error state
- Clean input data
- Validate input data
- If invalid, show error
- If valid, show results and send analytics event
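The steps above can be sketched as a handful of small functions. This is an illustration under assumptions rather than the calculator’s actual code: the function names, the error message and the returned state object are all hypothetical, and the DOM updates and analytics call are left to the caller.

```javascript
// Strip the characters people naturally type around an income:
// pound signs, thousands separators and stray whitespace.
function cleanIncome(raw) {
  return String(raw).replace(/[£,\s]/g, '');
}

// A cleaned value is valid if it is a plain non-negative number,
// such as "27000" or "27000.50".
function isValidIncome(cleaned) {
  return /^\d+(\.\d+)?$/.test(cleaned);
}

// Runs the steps in order and returns a plain state object, so the
// caller decides how to render errors, show results or send analytics.
function processIncome(raw) {
  const state = { error: null, income: null }; // 1. reset error state
  const cleaned = cleanIncome(raw);            // 2. clean input data
  if (!isValidIncome(cleaned)) {               // 3. validate input data
    // 4. if invalid, report an error (hypothetical wording)
    state.error = 'Please enter your income as a number, for example 27000';
    return state;
  }
  state.income = Number(cleaned);              // 5. if valid, hand back the number
  return state;
}
```

With this shape, processIncome('£27,000') yields an income of 27000 and no error, so both the comma and the £ sign are handled in one place. The component would show results and fire its analytics event only when error is null.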
Semantic HTML where art thou
Chris: Writing semantic HTML can sometimes be a chore, but it’s really worthwhile, and can help in more ways than just improving the readability of your code. In this example, a <div> was used, despite this being a clickable element. In almost all cases, a <button> would be preferred here, because:
- It’s natively clickable
- Your intention is much clearer in your HTML
- It’s got accessibility support built in 🙌
- You can tab to it, and use your keyboard to press it, without writing a single line of code
Again, as with the form earlier, we would also want to bind our click handler to it directly, using an
Writing things by hand
Basile: I didn’t know what I’d get from PWC. I had dummy data, but… you simply never know. That error-prone copy-pasting was done when I received the data, and it’s dirty but works. ¯\_(ツ)_/¯
Chris: This one is always tricky due to the nature of not knowing your source data until the last minute. In this case there was little that could have been done to make it easier; however, writing a mapping array like this at least means we have a structured approach to handling the incoming data. One thing to highlight: the notes key in each example is empty, and actually wasn’t used in the final product. In reality, we’d want to remove this from our codebase, as it’s unnecessary, and adds complexity if we wanted to come back and look at this code in the coming weeks and months.
A magical dropdown
Chris: This is another case where we can use our template / HTML code to be clearer about the behaviour and intention of our application. If you only look at the HTML, there’s no understanding that the <select> box will ever have more than one item in it. In many templating languages you can use a loop or map to create your options inside your HTML, based on some dynamic data (this is, for example, how you would do this in React). However, in Polymer 1.x, this isn’t actually possible, and so the way it was done here is best.
Scrolling like you don’t know where you’re going
Basile: Design wants the app to nudge down on mobile when searching, to make sure readers see something’s going on. Let me check if I can do that properly in five minutes. Wait no I can’t, it’s deadline day, I’m struggling to target the element I want, oh sod it I’ll scroll them 200px down. Seems about right: 100px is too little, 300px is too much. Commit, push, deploy. Typical lazy deadline work.
Chris: Here we were able to use a great polyfill for the window.scrollBy() method, which allows us to implement smooth browser scrolling with little effort. Unfortunately, this method only takes exact pixel values, so when working to deadline a rough estimate is usually good enough. That estimate, however, might break down following further testing across a wider variety of device types. To get the scroll distance right, we might first calculate the offset between our current scroll position and the element we want to scroll into view. Once we have this value, we would then pass it to scrollBy() to scroll the correct amount.
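A rough sketch of that calculation, assuming the target element is already known. The helper name scrollDistanceTo and the margin parameter are hypothetical, and the smooth behaviour relies on the browser or a polyfill supporting the options form of scrollBy():

```javascript
// Distance to scroll so that the element's top edge ends up `margin`
// pixels below the top of the viewport.
function scrollDistanceTo(element, margin = 0) {
  // getBoundingClientRect().top is measured from the top of the viewport,
  // so it is already relative to the current scroll position.
  return Math.round(element.getBoundingClientRect().top - margin);
}

// Browser usage (sketch; `results` is a hypothetical element reference):
//
//   window.scrollBy({ top: scrollDistanceTo(results, 16), behavior: 'smooth' });
```

Because the distance is derived from the element itself, it adapts to each device instead of relying on a fixed 200px guess.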
Unclear requirements that we forgot to check
Basile: PWC gave us figures for £10,000, £20,000 annual income, and so on. Not the maths in between. Which means that we don’t know what to tell readers who say they make £27,000 per annum. Initially, we rounded things down. It became apparent after deadline that this was wrong, and that we should give readers the closest possible number.
So we refactored and it took me way too long because I’m no good at algos since I studied law at uni. I didn’t even realise it was useless to keep the bit above that sorts the array of objects in ascending order of income. Or maybe I didn’t dare to remove it.
Chris: This sort of issue is really common with short deadline work. We can only do our best to try and get everything right first time, but naturally things will be missed, or it’s difficult to recognise the importance of the calculation we perform until we’ve built and started testing the end product.
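For illustration, picking the closest income band does not need a sorted array at all: a single pass that keeps the best candidate so far is enough. The sketch below assumes each row carries an income field; the function name and the sample figures are placeholders, not PWC’s data.

```javascript
// Return the bracket whose income is nearest to the reader's income.
// Works on unsorted input; on a tie, the bracket that appears first wins.
function closestBracket(brackets, income) {
  if (brackets.length === 0) return null;
  return brackets.reduce((best, candidate) =>
    Math.abs(candidate.income - income) < Math.abs(best.income - income)
      ? candidate
      : best
  );
}

// Placeholder rows, deliberately out of order.
const sampleBrackets = [
  { income: 30000 },
  { income: 10000 },
  { income: 20000 }
];

const match = closestBracket(sampleBrackets, 27000); // the £30,000 row
```

A reader on £27,000 is matched to the £30,000 row, not rounded down to £20,000, which is the behaviour described above.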
For loops and conditionals
Basile: I didn’t know how to manipulate these integers and render them properly in Polymer. So I went for a fragile set of conditionals and iterations that rely solely on my knowledge of the dataset. It shouldn’t be like that.
Chris: Let’s admit, this code isn’t that fun. It’s not as clear as it could be, there are a bunch of nested
elses, all wrapped in a glorious
forloops, which we can implement here to help with some of the complex formatting requirements.
Our money in/out treemap
I use automation, except when I’ve got to type in JSON datasets
Basile: This commit is simply shocking. But for me, the quickest way to have the budget data was to type this in once the documents got released. I had the Spring budget to model this, but still. Let me know if there is a typo, please.
Chris: This, again, is often a necessary evil. Hosting the CSV or JSON file separately, and loading it in would be an option here, but it introduces another HTTP request we might otherwise choose to avoid. I find that across the different projects we do, managing external (but otherwise core) data is an ongoing challenge. Thankfully in this case, it’s not a huge amount of data in the first place, so while hard coding it might not be pretty, it may be the lesser of two evils.
I use automation, except when I’m using a template
Basile: The key is manually created. One of these legacy things. Oh, and it was the wrong way round when it hit the website.
Chris: This would benefit from having a function at the beginning of our code that automatically helps us generate things like keys, based on the incoming data. For one-off projects, that’s unrealistic, but as we continue to use this codebase, we should start to look at how we can add these convenience methods, which will help introduce more consistency in the code, reduce bugs, and make it easier to maintain.
Taking the time to rewrite
Basile: These treemaps were created for the January 2017 transfer window… and they used to be a bit complicated. When it became clear we’d be using them for the budget again we felt it was necessary to spend some time cleaning up that code and making it better. For example, by using d3v4’s modular approach instead of shipping the whole library…
Chris: Making ongoing improvements to our code is great, and finding ways to use what we’ve built already is an excellent way of doing this. Naturally it’s hard to see from the outset how code might be reused, but as we’ve used this template a number of times now, it’s clear to see the improvements we’ve already made, and can continue to make. Next steps here might include:
- Breaking the code up into smaller helper functions, such as
- Moving our D3 dependencies into HTML imports, a feature used extensively in web components (and Polymer) to allow for caching of our external resources
- Separating our transitions and interactions away from our core business logic
- Optimising for use in responsive context (relying on percentage widths, rather than fixed values)
Basile: To decide whether we’ll label an area of the treemap, we’re checking that there’s actually enough space. Regardless of the label length or other practical consideration… I end up tweaking these numbers to find a sweet spot.
Chris: Setting classes, then checking them later (and relying on nearby DOM elements) is incredibly fragile, and could break for any number of unexpected reasons. In addition, this sort of tight coupling makes it really hard to know if a change in one place is likely to cause a bug or regression in another part of the codebase. In this case, we might look to:
- Always render the label text, and use CSS to conditionally hide it
- Make each cell check its own size, and determine whether or not it is large enough to show the text
- Create a helper function that can encapsulate some of this logic, so it’s easier to change across the application
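As a sketch of the helper-function option, the check could live in one place and take the label into account. It assumes cells shaped like D3 v4 treemap nodes (x0, y0, x1, y1); the thresholds and the per-character width are made-up starting values that would still need tuning against the real chart.

```javascript
// Illustrative thresholds only; tune against the real chart and font.
const LABEL_LIMITS = { minWidth: 60, minHeight: 24, charWidth: 7 };

// Width and height of a treemap cell in layout units.
function cellSize(cell) {
  return { width: cell.x1 - cell.x0, height: cell.y1 - cell.y0 };
}

// True when the cell is tall enough and wide enough for this label,
// using a rough per-character estimate of the label's width.
function labelFits(cell, label, limits = LABEL_LIMITS) {
  const { width, height } = cellSize(cell);
  const neededWidth = Math.max(limits.minWidth, label.length * limits.charWidth);
  return width >= neededWidth && height >= limits.minHeight;
}
```

Each cell can then call labelFits(d, labelText) when it renders, instead of setting a class and inspecting neighbouring DOM elements later, and the sweet-spot numbers only ever change in LABEL_LIMITS.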
Basile: After several iterations of the chart, we’re still not at a fully responsive, percentage-based version. For shame.
Chris: This isn’t something that’s common in D3, perhaps because of the complexity involved. However, I’m keen to see more of the charts we produce work using percentage values, rather than fixed pixel values. By doing this, we’ll be in a better position to render our charts in the best way across all the devices we serve our readers on, including those devices and screen sizes we might support in the future. In most cases, this involves writing calculation functions that turn our bound data into percentages, and correctly setting the width and transform properties of our D3 selections. I think this chart is a prime candidate for this as we continue our rewrite.
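One possible shape for such a calculation function, assuming the treemap layout was computed against a nominal width and height and that nodes expose x0, y0, x1 and y1 as D3 v4 treemap nodes do. The name toPercentBox is hypothetical:

```javascript
// Convert a treemap node's pixel box into percentages of the layout size,
// so the rendered cell scales with its container.
function toPercentBox(node, layoutWidth, layoutHeight) {
  const pct = (value, total) => (value / total) * 100;
  return {
    left: pct(node.x0, layoutWidth),
    top: pct(node.y0, layoutHeight),
    width: pct(node.x1 - node.x0, layoutWidth),
    height: pct(node.y1 - node.y0, layoutHeight)
  };
}

// With HTML cells, a D3 selection could apply it along these lines
// (sketch; `cells`, `w` and `h` are assumed to exist):
//
//   cells.style('left',  d => toPercentBox(d, w, h).left + '%')
//        .style('width', d => toPercentBox(d, w, h).width + '%');
```

The layout is still computed once in arbitrary units; only the final styling step changes, which keeps the conversion easy to test on its own.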
The legendary manual legend
Chris: This legend is, effectively, a UI element. If this code wasn’t written using D3, the legend would live in HTML, and arguably we should be doing the same here. This would mean we could use normal DOM elements (<aside>), and CSS to provide our styling. Not only would this simplify our D3 code (we would rely on our templating engine to bind our data to our legend’s DOM), but it would also make it much easier to understand how the legend is created, and how we can make changes to it. HTML is a templating language, and as such, we should use it where it is best suited.
👋 If you made it to the bottom of this long article, you probably should check out the three job ads we’ve got going to join our team!
Junior newsroom developer:
👇👀 Here are the two repos containing all the code powering these two projects, for your perusal.
budget-calculator (github.com)