Digitised books: reading or viewing?

Tom Crane
Published in digirati-ch, May 13, 2019

You often curl up with a book. You might even read your Kindle in the bath. But have you ever curled up with a viewer? Would you even consider it? If not, why not?

When a library photographs the pages and obtains the text of printed books, and presents those books online in a digital object viewer, how well does this serve the reader who wants to spend the next three hours with one text? Digitised books can offer interactions that a pile of books on a desk cannot. But what exactly are those interactions for the reader, and are they used? Do readers stay in the viewer, or do they feel better served by choosing a format appropriate to their needs and taking the digital object elsewhere, when they know they have a few hours of reading ahead of them?

In and out of the Internet Archive BookReader

Would a potential reader use a viewer like the Internet Archive BookReader or Universal Viewer for a rapid appraisal of a book, maybe search a few terms and scan the index — but then shift to a different format to settle in for a longer read? Maybe that format has text re-flowing and resizing, highlighting for copy and paste, augmentation for visual impairments, bookmarks, note taking and other features that the reader would prefer to use.

In and Out of the Universal Viewer

This example from the Internet Archive offers for download a variety of formats in which the relationship of the text to its original printed form is dispensed with to greater or lesser degrees, depending on the nature of the work and the needs of the reader.

Thinking about digitised printed books in particular, is there more work to do in the Universal Viewer to bring some of this functionality inside? If it comes down to priorities, should providers of digitised printed books invest more in:

  • The capture of more data (text, structure) at digitisation time
  • …leaving the door open to capture more data later, and the means to add it to already-published items
  • The provision of that data via open standards for others to build different ways of interacting with it
  • The transformation of that data into established publication formats for different audiences and devices, as seen above
  • The further development of browser-based viewers/readers to better serve multiple audiences for text and text-centric functionality (as well as image-centric functionality)

Is it worth trying to build a better reading experience than today’s viewers offer, or is it better to work on making the best possible ePubs, PDFs and other formats to offer alongside the on-page viewer? And if the former, what does that reading experience look like?

Raw material

OCR technologies are rapidly moving from hard problems to solved problems. Machines get better at segmentation. They are increasingly able to understand the visual grammar of layout beyond simple lines of text. They understand which bits of text go together and the sequence in which the bits of text flow on from one another. They recognise footnotes and glosses. Machines can also identify typefaces, and relative point sizes, and typographical emphasis. They can identify illustrations, tables and figures, and analyse them.

Digitisation is not just taking pictures. It also means the capture of text, of structure, and of typographical and layout information: the means of later reconstruction, in some form. Those various forms of reconstruction should serve the needs of the reader with 15 monographs and a couple of biographies to plough through before Monday, as well as serve the visitor for whom the book is some sort of digital exhibit, in its elegant deep-zoom capable case, to pause and gaze at before moving on to something else.

Should both these needs be served by the same tool? Is it worth trying to do that?

In the former Penguin English Library and Penguin Classics, a reproduction of the title page of the first edition would appear before the work got under way.

For most readers, this is just an interesting diversion, before the printed book reverted to its modern (or in this case, 1965) typography, spelling and punctuation. But some readers need to see the title page and everything else exactly as-was, and not just the first edition but the second and third editions too. And that’s what they can now find in digitised collections, as well as on the shelves.

So some readers might stay in the viewer, where they can zoom in on every detail of the printing and paper of that 1818 edition. Some might download the PDF, where the images aren’t quite as detailed, but the text is readable, and (ideally) is searchable and reusable. Some might just care for the words rather than their form on the original page, and pick the ePub version where they can choose whatever typeface makes them a happy reader.

Formats, applications and devices

A deep zoom viewer of digital objects offers some of the functionality of a desktop electronic magnifier, but without the need for the presence of the physical object. These devices are a means to reading, beyond mere enlargement.

And in other formats and applications, the same digital object, with its rich captured text, can be read out loud, or have its text re-flowed and formatted, increasing the font size and cleaning the layout so you can see it without your glasses.

In a PDF, zooming preserves the spatial layout and the appearance of the printed original while still (if the information has been put into the PDF) offering document affordances like search and text selection for copying and using elsewhere.

There are other possibilities here too, for PDF-like formats. A 500 page PDF of digitised images is a very large document, because it has 500 photographs embedded in it. If the digitisation process captures layout and typographical information accurately — font styles, sizes, emphasis, even colour — then the textual layout of the page can be used to create a PDF without the original captured images at all, to offer as an additional download option. The relationship between the physical text-bearing object and that textual content is loosened, introducing more ways of interacting with the text.
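As a rough illustration of rebuilding a page from captured layout data alone, here is a minimal sketch. It swaps in SVG for PDF so that it needs no third-party library, and the `PlacedLine` fields are hypothetical stand-ins for whatever position and typography data a digitisation workflow actually records.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass
class PlacedLine:
    """One captured line of text. These field names are assumptions."""
    text: str
    x: float            # left edge of the line, in page units
    baseline_y: float   # vertical position of the text baseline
    font_size: float    # captured point size
    italic: bool = False


def page_to_svg(lines: list[PlacedLine], width: float, height: float) -> str:
    """Render a page as positioned text only, with no photograph embedded."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">'
    ]
    for line in lines:
        # Carry captured emphasis across as a presentation attribute.
        style = ' font-style="italic"' if line.italic else ""
        parts.append(
            f'<text x="{line.x}" y="{line.baseline_y}" '
            f'font-size="{line.font_size}"{style}>{escape(line.text)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

Each page becomes a handful of short text elements rather than an embedded photograph, which is where the saving in file size would come from. A real PDF version would need a PDF library and a decision about which fonts to substitute for the originals.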

Should the viewer have a mode where it can do that too, right there on the web page, with an HTML representation of the text? Client applications can even create new OCR data to do this, either by invoking external web services, or by running a client-side OCR engine like tesseract.js and making it themselves. But ideally, providers of digital objects publish these formats, as well as an open standards representation of the digital object from which all these derivatives, and more, can be constructed.
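To make the idea of an HTML representation concrete, here is a minimal sketch of turning line-level OCR output into re-flowable markup. The `OcrLine` structure, the `ends_paragraph` flag and the 1.3 heading threshold are all assumptions made for illustration; real OCR formats differ, and this is not how any particular viewer does it.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class OcrLine:
    """One line of captured text. The field names are hypothetical."""
    text: str
    font_size: float       # point size reported by the OCR engine
    ends_paragraph: bool   # True where the engine marks a paragraph break


def lines_to_html(lines: list[OcrLine], body_size: float) -> str:
    """Join OCR lines into re-flowable headings and paragraphs."""
    blocks: list[str] = []
    words = ""

    def flush() -> None:
        # Emit the paragraph collected so far, if there is one.
        nonlocal words
        if words:
            blocks.append(f"<p>{escape(words)}</p>")
            words = ""

    for line in lines:
        text = line.text.strip()
        # Treat markedly larger type as a heading (the factor is a guess).
        if line.font_size >= body_size * 1.3:
            flush()
            blocks.append(f"<h2>{escape(text)}</h2>")
            continue
        if words.endswith("-"):
            # Naive de-hyphenation: rejoin a word split across two lines.
            words = words[:-1] + text
        elif words:
            words += " " + text
        else:
            words = text
        if line.ends_paragraph:
            flush()
    flush()
    return "\n".join(blocks)
```

Once the line breaks of the printed page are discarded like this, the browser can re-flow and resize the text freely. The de-hyphenation step is deliberately crude: it would also join genuinely hyphenated compounds that happen to break at a line end, so a real implementation would need a dictionary check or similar.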

Is there an evolution of the Universal Viewer that should try to do these things? To make for a reading experience that you want to curl up with? Or are efforts better focused elsewhere, leaving other applications to serve their respective groups of users?

---

Notes

The text borne by the page is a special case of the content of a digital object — the transcription of text visible in the image is the raw material for a bookreader. The Universal Viewer has to grapple with all sorts of other content too. Some of it textual, some of it structured. These concerns are explored further in these slides.
