Progressive Enhancement is a good practice. When we try to apply it to digitised objects of cultural heritage, we run into some interesting problems. These in turn force us to think about what we want web addresses — the URLs of our content — to represent. In March my team will be joining The Royal Society at a workshop to consider content, user experience and technical next steps for a pilot project we (Digirati) have been developing. Here, I’ll explore a way for a collections site to present complex digital objects, which we can explore further in the workshop.
On the web, it is good practice to provide something that everyone can see or read, regardless of their browser capabilities, the device they are using, or the connection speed available to them.
Graceful Degradation attempts the same outcome from the other direction: build your ideal UI, taking full advantage of the state of the art, but then carefully ensure that you don’t break older browsers.
Although they may sometimes appear to amount to the same thing, Progressive Enhancement is generally preferred. It’s quicker at producing working software that real users can try out. It’s a better fit for iterative development and testing. Graceful Degradation can descend into browser compatibility hell as you start trying to untangle complex features from their dependencies on specific browser capabilities.
Sometimes this doesn’t matter. It may not be important to surface deep links into your application for search engines to provide to their users. You may just want to bring people to the front door, rather than direct to a particular room.
But sometimes, it really does matter. URLs are important, for accessibility, findability, and the significance of your content as distinct resources on the web, accessed through a browser, through a stable address. These resource concerns can be a deal-breaker even if you don’t care how your content appears to older or unconventional web browsers and non-visual clients such as screen readers (although, you should care).
Decisions about application style are often obvious. Wikipedia is not a good candidate for a single page web app. But a web site that offers a summary dashboard view of trending news stories might be.
Books, manuscripts and other compound objects
Applications like the Universal Viewer deliver many and varied rich viewing experiences. You can get an overview of a complex object through thumbnails and structural navigation. Sometimes you can read the text, or even search within it. Some are interactive and allow annotation, sharing, embedding and more besides. Some will even plaster the pages of your object over the walls of a virtual gallery, in which you can wander from room to room, chapter to chapter.
If the viewer has an API, it can communicate with the external page to notify it of events as the user navigates around. It can update the browser’s address bar to facilitate deep links into content. It can participate in more complex user experiences, built around the viewer component. It can show the text of each page alongside the image view. All this is very compelling, and viewers like those above are the de facto user experience of many of the world’s cultural heritage collections.
But what about progressive enhancement?
We could render an HTML-only presentation of the object. The viewer’s job on the page is to render navigation around the different views (e.g., book pages), and a large enough image of each view to be useful. We can try to do this with regular HTML.
But we run into a problem of scale.
If we only have HTML, we need to generate a lot of it, and the bigger the book, the more HTML we’ll need. More pages mean more thumbnails, more structural navigation, and more of those large images. Without client-side logic that allows us to ask only for what’s needed, we’re going to have to build the whole book in HTML — and if it has hundreds of pages, that means hundreds of repeated blocks of HTML. That might not be so bad, but we also need hundreds of fairly large images of each book page — they need to be big enough to see properly. We’re throwing an awful lot of content at our baseline, non-enhanced user, and they are the ones we’re supposed to be helping!
Our library catalogue web page that uses only HTML ends up with a titanic page weight (the total of all the file sizes of all the content the page has to load) if the job of this one page is to present a large book.
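To make the scale problem concrete, here is a rough back-of-envelope sketch. The byte sizes are illustrative assumptions, not measurements from any real site, and `single_page_weight` is a hypothetical helper name; the point is only that the weight grows in direct proportion to the number of views.

```python
# Illustrative figures only: none of these sizes come from a real site.
BLOCK_HTML_BYTES = 500        # assumed markup per view (thumbnail, link, image tag)
THUMBNAIL_BYTES = 10_000      # assumed size of one thumbnail image
LARGE_IMAGE_BYTES = 300_000   # assumed size of one readable page image


def single_page_weight(view_count: int) -> int:
    """Total bytes loaded if one HTML page presents every view of the work."""
    per_view = BLOCK_HTML_BYTES + THUMBNAIL_BYTES + LARGE_IMAGE_BYTES
    return view_count * per_view


if __name__ == "__main__":
    for views in (10, 100, 800):
        megabytes = single_page_weight(views) / 1_000_000
        print(f"{views:>4} views: roughly {megabytes:.0f} MB")
```

Whatever the real numbers are for a given collection, the large images dominate, and nothing in plain HTML lets the page ask for only the ones the reader actually looks at.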
This viewer is an experiment in how-small-can-you-get rather than a serious candidate for a viewing experience. But adding a few more KB of code and CSS to it would make it prettier and even more efficient (e.g., only loading thumbnails visible in the left panel and deferring the others until they are scrolled into view). A much bigger jump in capability (but still no larger in its contribution to the page weight than a typical medium sized JPEG) would be to add deep-zoom support, and/or the ability to tailor the sizes of the requested images to the user’s screen. It’s then better for mobile users, much faster to load, much friendlier to low-bandwidth users (deep zoom is also a great bandwidth conserver, allowing hi-res access to images of enormous size by supplying only those image tiles required to populate the viewport).
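The bandwidth saving from deep zoom comes from requesting only the tiles that overlap the viewport. A minimal sketch of that selection, assuming square tiles on a regular grid and a viewport expressed in pixels of the image at the current zoom level (the function name and the 256 pixel default are assumptions, not any particular viewer’s API):

```python
def tiles_for_viewport(x, y, width, height, tile_size=256):
    """Return (column, row) pairs for the tiles that overlap the viewport.

    x, y, width and height describe the visible region in image pixels.
    This sketch does not clamp to the image bounds.
    """
    first_col = x // tile_size
    first_row = y // tile_size
    last_col = (x + width - 1) // tile_size
    last_row = (y + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


if __name__ == "__main__":
    # A 1024 x 768 viewport at the top-left corner needs a 4 x 3 grid of tiles,
    # however large the full image is.
    print(len(tiles_for_viewport(0, 0, 1024, 768)))
```

The number of tiles depends on the viewport size, not on the size of the image, which is why an enormous image can still be cheap to view.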
A simplistic approach to progressive enhancement means our HTML version of a large object is just too big, and will result in a worse user experience for all users (because of the initial page load). The experience will likely be especially bad for those we’re trying to help most by adopting the practice!
So why not break it up into separate pages, one for each view?
HTML-only viewing experiences (aka, a web site)
As we would expect from simple HTML, this page of Persuasion is a first-class citizen of the web. For me, it’s the second hit in Google, and there’s the text I searched for, as a snippet in the result:
That is, I was able to find a search result specifically for this page of Persuasion (distinct from the manuscript it belongs to). It’s very findable. To explore the issue a little more, the first hit in Google for this same query is also a distinct web address for a page at the British Library that contains, as plain HTML, the text of the entire 33 pages of the item; images of these pages are launched in a separate viewer, which I need to navigate to for the deep zoom experience. But I can’t get a search result that leads into that viewer.
Page per View vs Page per Work
We started from a purely practical concern for reducing bandwidth consumed, and ended up with a different approach that spreads the “viewer” across many web pages. We now have a different user experience — we’ve added a web-page-centric experience that elevates each view within the work to its own page with an address on the World Wide Web.
I have written elsewhere about the user’s different focus of attention when looking at objects and their views. The problem of focus is that you don’t know whether a web page per view or a web page per work is going to feel more natural and useful to a user, because you don’t know what they are there for. Even the same user may have a different focus at different times. For one user, a single manuscript page of Newton’s Principia may involve months of scholarship. All of the transcriptions and annotations available for that page definitely warrant a distinct web address; the page is a rich island of web content on its own. It should turn up in search results as a resource in its own right, and it should have the status on the web of a page, all to itself.
But that same publishing mechanism would result in every page of every digitised printed book getting its own web address too. This might be overwhelming, as search results, or as a navigation experience. For the user riffling through (digitally) or just reading, the web page belongs to the Persuasion manuscript as a whole, the work, and they are inside it, somewhere inside a viewer on the Persuasion manuscript’s web page. Separate web pages, image after image, could seem unnecessarily clunky, especially if there is little or no additional content (transcriptions, annotations). “Why didn’t they use a viewer?”
Is it possible to construct a user experience that allows both kinds of focus at the same time? To have all the benefits of separate web pages and all the benefits of a single page viewer, without the user having to think about that distinction at all, or suffer the drawbacks of each approach?
I think it is.
A partial experiment: one third of the answer
I work for Digirati. In a pilot project we developed for the Royal Society called Science in the Making, we decided that the archival material, and the interactions offered to users, warranted a web page per view rather than a web page per object. The site features archive material connected with published articles in the Philosophical Transactions of the Royal Society, the world’s oldest scientific journal. These items are original manuscripts, drawings, referee reports, photographs and correspondence. Although the pilot was aimed at the general public (leaning towards a viewer), users can transcribe and comment on individual images within a work (leaning towards one web page per image). How much of a viewer-like experience could we deliver for most site users, while keeping each view a distinct web page, with its own URL? We wanted to make sure any transcription text for an image was discoverable by search engines, as part of the HTML content of a specific page. If someone is working on transcribing a page of a manuscript, can it be a real web page? And if they are just browsing, skimming through images in a work, can it feel like a viewer?
We built two distinct views. A web page for the work looks like this:
This page carries catalogue data for the object, some tags from users that belong to the object, and a strip viewer that gives an overview of the work. You can scroll or swipe it.
If you click on an individual view, you go to another web page for that image. It’s a whole new page, but the default behaviour is to scroll that page down slightly, to the “viewer”, and expand this viewer to fill the viewport height:
The thumbnail strip and the back/forward arrows feel viewer-like, but any navigation within the work is a web page navigation; the browser takes you to a new URL. You skip straight past the page furniture on the new page, and the viewer-like parts expand to fill the vertical viewport. This trick depends on the site being fast, of course; otherwise you end up with what feels like a very sluggish viewer as it loads in whole new pages.
If we view the transcription, we’re seeing text that’s basic HTML content of the page. Search engines can find it and index it:
The Exploded Viewer
If we rethink what we are progressively enhancing, we can be cleverer about server side page composition to avoid the page weight problem. If we have control of the server end and can do more work there, we can eliminate some of the design and implementation constraints of a self-contained viewer. We can make both ends cooperate. Both server and client are capable of generating specific views of an object with the right amount of contextual information and navigation.
This means a server side viewer, generating web pages per view within a work, with enough navigation to get to any other view (but not always in one step). It doesn’t have to provide the entire work, and it doesn’t have to provide every possible thumbnail. It provides just enough of an HTML window on the work, at that view, with perhaps the thumbs around the view, and perhaps the start and end thumbs as well, but not necessarily all the thumbs in between. And similarly for a table of contents: a tree opened to the current section, but not opened (or even open-able) to all sections. No matter what page you land on, you can see the page image, other relevant page content, commentary and editorial; you can navigate up, down and around. This HTML experience is just fine: it should be a good one, not an afterthought.
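One way to picture the “window on the work” is as a function from the current view to a short list of thumbnails. The sketch below is an assumption about how such a window might be chosen, not the logic of any existing site; `thumbnail_window`, `radius` and `edge` are hypothetical names, and `None` stands for a gap that needs a further navigation step.

```python
def thumbnail_window(current, total, radius=2, edge=1):
    """Return the 0-based view indexes to link from the page for `current`.

    The window holds the first and last `edge` views plus `radius` views
    either side of the current one. None marks a gap between runs.
    Assumes 0 <= current < total.
    """
    if total <= 0:
        return []
    wanted = set(range(0, min(edge, total)))
    wanted.update(range(max(total - edge, 0), total))
    wanted.update(range(max(current - radius, 0), min(current + radius + 1, total)))

    window = []
    previous = None
    for index in sorted(wanted):
        if previous is not None and index > previous + 1:
            window.append(None)  # a gap: not every view is one click away
        window.append(index)
        previous = index
    return window


if __name__ == "__main__":
    print(thumbnail_window(400, 800))
    print(thumbnail_window(1, 5))
```

The length of the result is capped by `radius` and `edge`, not by `total`, so the markup for the thumbnail strip stays roughly the same size for a short pamphlet and a very long book.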
Now the Exploded Viewer steps in, if the browser supports it (almost all will). It loads the source data — probably the very same source data the server used to generate the HTML window on the work at this page — and bootstraps itself as a client-side viewer, open at the current page.
With the Exploded Viewer, and a server-side rendering of a tailored, contextual view of each page, each address is a real and distinct HTML resource on the web, delivering all the accessibility, addressability and findability benefits mentioned.
There is more design flexibility in this approach, too. The viewer is more adaptable to the content, and there’s no need to confine application functionality to a box on the page. It can spread itself out over the web page… exploded on each page, as well as exploded across multiple web pages.
To restate, the principles of the exploded viewer are:
- The server can provide one web page per view (e.g., each image of a book page). That is, each view has a distinct URL.
- The server’s generated HTML for that view does not have to provide the means of accessing all the other possible views. It could maybe render a window view, a subset of all possible thumbnails. It can render links around the current view, but not necessarily links to all possible other views. Maybe it shows thumbnails like this, with gaps that would need an additional navigation step to land in:
  [start thumbnails] … [thumbnails around the current view] … [end thumbnails]
And if structural content such as chapter information is available, the page HTML provides a partially expanded navigation tree, but not the whole tree if it’s going to be too big. A user can navigate upwards and downwards and around, but might need two page navigations to get to every possible other view, via additional aggregate views of the content. The server end of the exploded viewer is capable of rendering these supporting, aggregate views to aid navigation around the work.
- There is then a natural upper limit to how much HTML needs to be delivered for a workable, server-side-rendered viewer, no matter how many different views the object has. The HTML for page 800 of War and Peace need not be significantly larger than the HTML for page 7 of The Tiger Who Came To Tea — beyond a certain point it ceases to rise in proportion to the whole object size.
- Make this basic HTML version elegant and fast…
- The server-side rendered viewer gets progressively enhanced into a client-side viewer that manages its own resource loading, just like a real viewer. The client-side viewer simulates the address-bar changes that would result from equivalent navigation in the server-side rendered viewer.
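The partially expanded navigation tree described in the principles above can be sketched in a few lines. This is an illustration under assumptions: the `Section` structure, its field names and the `visible_outline` function are hypothetical, and a real site would emit HTML links instead of indented text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    label: str
    first: int                      # index of the first view in the section
    last: int                       # index of the last view (inclusive)
    children: List["Section"] = field(default_factory=list)


def visible_outline(sections, current, depth=0):
    """List every section at this level, expanding only the branch that
    contains the current view."""
    lines = []
    for section in sections:
        lines.append("  " * depth + section.label)
        if section.first <= current <= section.last:
            lines.extend(visible_outline(section.children, current, depth + 1))
    return lines


if __name__ == "__main__":
    book = [
        Section("Part One", 0, 99, [
            Section("Chapter 1", 0, 49),
            Section("Chapter 2", 50, 99),
        ]),
        Section("Part Two", 100, 199, [
            Section("Chapter 3", 100, 149),
            Section("Chapter 4", 150, 199),
        ]),
    ]
    print("\n".join(visible_outline(book, 60)))
```

A reader on view 60 sees both parts but only the chapters of Part One; reaching a chapter in Part Two takes one extra page navigation, which is the trade the principles accept in exchange for a bounded page size.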
What’s the start page?
There’s one problem here. What does the home page for the work look like? What is its URL, and does it have a different URL from the web page for the view of the first image (or whatever view in the work is deemed the starting, initialisation view, such as the title page)?
I think this is an implementation decision to be made, rather than a show-stopper. For example, page URLs like /war-and-peace/cover and /war-and-peace/page-1 imply the existence of a homepage for War and Peace at /war-and-peace/, but it is up to you whether you actually provide such a page, distinct from, say, /war-and-peace/cover. You can do one of:
- provide /war-and-peace/ as a distinct page, about the work as a whole in some way
- make /war-and-peace/ redirect to /war-and-peace/cover as the canonical URL of the work: you have to start somewhere.
- make /war-and-peace/cover redirect to /war-and-peace/; the canonical URL for the page chosen to represent the initial view of the object is the object’s URL.
In the Royal Society example, we chose the first option. This works well for multi-image items, because (as seen in the examples above) the view on the work page is a strip of large thumbnails, rather than a full view of the first page. But it isn’t a perfect solution for items that only have one image (such as a photograph as a distinct archival item). In this case, we merged the functionality of the “work” page and the “view” page, and the “view” page never appears. If there is a transcript available, it appears on the same work page as the work metadata.
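The three start-page options can be expressed as a small routing decision. The sketch below is hypothetical: the policy names, the `resolve` function and the choice of a 301 (permanent) redirect are assumptions made for illustration, and the URLs are the War and Peace examples used above.

```python
def resolve(path, work="/war-and-peace/", start_view="/war-and-peace/cover",
            policy="work-page"):
    """Return (status, target) for a request, under one of three policies.

    "work-page":              the work URL is its own page; nothing redirects
    "work-redirects-to-view": the work URL redirects to the starting view
    "view-redirects-to-work": the starting view redirects to the work URL
    """
    if policy == "work-redirects-to-view" and path == work:
        return (301, start_view)
    if policy == "view-redirects-to-work" and path == start_view:
        return (301, work)
    # Every other combination is served as an ordinary page.
    return (200, path)


if __name__ == "__main__":
    for policy in ("work-page", "work-redirects-to-view", "view-redirects-to-work"):
        print(policy, resolve("/war-and-peace/", policy=policy),
              resolve("/war-and-peace/cover", policy=policy))
```

Whichever policy is chosen, every other view keeps its own address; only the relationship between the work URL and the starting view changes.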
The Holy Grail
- Solves the problem of focus — it can be used for page focus and work focus
- …when it can, it feels like a viewer — really feels like a viewer, for search and other functions too
- Is search-engine-friendly, at the page level
- Makes sense for single-image items as well as multiple image items
- …but does not yield a heavy HTML page with markup for the whole object at once (unless the object is small enough).