Books, in a browser

Libelli Portatiles, “Aldus Manutius: A Legacy More Lasting Than Bronze,” The Grolier Club, New York

In October 2014, the World Wide Web Consortium (W3C) and the International Digital Publishing Forum (IDPF) stood together on stage in San Francisco at Books in Browsers, a conference on the future of publishing that I convene annually (usually) with the Frankfurt Book Fair. The W3C’s and IDPF’s bold message: the next ebook standard will marry itself fully to the Open Web Platform, and books, magazines, and pamphlets, online or off, will be the peers of any document on the web.

There had already been a vast amount of work to get to that point, with years of joint technical collaboration, striving together to combine distinct concepts of document structure, presentation, navigation, annotation, and analytics. This investment hit a turning point this past week, when the W3C and IDPF announced at BookExpo America in Chicago that the two organizations are exploring a combination in which the ongoing activities of the IDPF would be folded into the W3C. If approved by both organizations, the merger will be completed by January 2017, and the final revision of EPUB, the open ebook format shepherded by the IDPF, born from the roots of the Open Ebook Forum, will be released before the end of 2016.


What is this “book”?

On the face of it, there doesn’t seem to be much in common between the open web and a book, whether digital or paper. A “book” seems like a tightly packaged object with well-elaborated components including, e.g., a table of contents, chapters, and sometimes an index, with an occasional photograph or map thrown in. The web, in contrast, appears to be a richly interactive, media-laden, heavily networked skein of short documents referencing each other.

Yet appearance belies heritage. When the very first ebook formats were established, they had basic requirements: they had to embed a structure list (Chapter 2 follows Chapter 1, and so forth); the text needed to incorporate essential information about its presentation (This chapter title is in bold); and all components had to be packaged in a single file to make transmission from a retailer to a user a simple matter (ebooks are actually Zip files with a different extension). The solutions were predicated on an early Internet in which the web was just getting started. But the markup and presentation of content was always HTML and CSS in one form or another.
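To make the packaging point concrete, here is a minimal Python sketch, not a valid EPUB builder. It assembles a toy ebook-like archive in memory and then reads it back with the standard zipfile module. The chapter and stylesheet names are hypothetical, chosen only for illustration; the one detail borrowed from real EPUB files is the leading, uncompressed "mimetype" entry.

```python
import io
import zipfile


def build_sample_ebook():
    """Return the bytes of a tiny, illustrative ebook-like Zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Real EPUB files begin with an uncompressed 'mimetype' entry.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        # Hypothetical content files: the text is HTML, the styling is CSS.
        archive.writestr(
            "chapter1.xhtml", "<html><body><h1>Chapter 1</h1></body></html>"
        )
        archive.writestr(
            "chapter2.xhtml", "<html><body><h1>Chapter 2</h1></body></html>"
        )
        archive.writestr("style.css", "h1 { font-weight: bold; }")
    return buffer.getvalue()


def list_components(data):
    """List the entries packaged inside the archive, in stored order."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


ebook_bytes = build_sample_ebook()
# Whatever extension it is saved under, the payload is an ordinary Zip file.
print(zipfile.is_zipfile(io.BytesIO(ebook_bytes)))
print(list_components(ebook_bytes))
```

Pointing list_components at the bytes of an actual .epub file should show the same pattern at larger scale: a bundle of HTML, CSS, and supporting files inside one archive.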

However, because ebooks became popular in an early-web environment, they had to be read using specialized reading software that often existed only on dedicated hardware readers; web browsers were not well standardized. The HTML used to mark up ebook texts and control their display had to be tightly controlled to make reading ebooks even possible. As Micah Bowers of Bluefire Reader, a longtime contributor to ebook development, notes: “In the early days of ebooks, the web standards adopted in ebooks were strict and basic subsets that … enabled reading system developers to write their own specialized rendering engines.”

Those predicates are long vanished. Online publications can take full advantage of the open web not simply for distribution, but for presentation and interaction as well. Referencing educational textbooks, Sir Tim Berners-Lee commented at the merger announcement, “The book content we know today is becoming highly interactive and accessible with links to videos and images from actual historical events and original research data. This provides greater authenticity and a more engaging learning environment … .” Such new forms of engagement are possible with non-fiction ebooks as well, and even with fiction, where, e.g., a Romance or Sci-Fi title can be reimagined to include different media, structures, or even support immersive experiences.


Getting there.

Replicating the highly styled features of a beautifully designed book, textbook, or magazine requires additional web standards work. Annotation management; mathematics and symbolics representation; support for international writing forms such as right-to-left or vertical text; and responsive reflow for styles such as columns and panels all need more refinement. And one of the most important features of the W3C’s vision for a Portable Web Document (that documents are available online, handle degraded network environments gracefully, and remain usefully accessible offline) will require further engineering of web browsers and possibly even how they communicate with web servers.

Beyond that, we still need to reproduce a core definitional aspect of the traditional book: its self-containment. As the W3C notes in the FAQ for the standards work, “The high level technical objective is to specify a portable publication format that identifies a collection of web resources as one conceptual unit on the Web, with all the components that are necessary so that the collection can be handled by traditional Web browsers as well as hybrid applications and specialized eReaders, both online and offline.”

But the promise is so amazing. In his BEA address, Berners-Lee noted four crucial affordances for publications on the web. Permanence: the open web is built on our most reproducible standards; materials published using HTML5 and its successors stand an excellent chance of remaining available in future networked environments. Seamless: ebook and digital magazine content can flow across different kinds of devices, from laptops to mobiles, with no fundamental disruption. Linked: publications can incorporate content that resides across the globe, in whatever form, as long as it is addressable on the web. Trackable: publications embodying standard identifiers such as DOIs or ISBNs, and persistent author identification, can be identified, and their use analyzed around the web.


Ebook Barn-raising.

Delivering on this vision requires something more than wishful thinking. It requires serious engagement and commitment to the development of these standards. Micah Bowers notes, “I believe that the ebook industry has much value to bring to the open web standards efforts, and that the open web has much value to bring to digital publishing. To realize this value will require a far deeper and more sustained cooperation and collaboration than has happened in the past.”

Unfortunately, thus far there has been a dearth of involvement from the organizations that most need to contribute to these standards: publishers. Only staff from two large trade publishers have contributed to the W3C/IDPF effort: J. Wiley’s Tzviya Siegman, and Hachette’s Dave Cramer, and they have done so tirelessly. These two individuals, by dint of their continued participation in the W3C’s Digital Publishing Interest Group, have made defining contributions to the evolution of the Portable Web Publications format that will succeed EPUB.

It’s not enough. The absence of greater publisher participation is stark. It is an abdication of a publishing house’s fundamental business obligation to maintain the continued viability of its press. For publishing, that mission has always been, at least in part, to help find the voices that must be heard, and make sure their stories are told. At a time when publishers have the greatest opportunity to bring greater diversity and awareness to a global audience than ever before, most have excused themselves from a place at the table with rote demurrals:

1) “We don’t have anyone on staff with the required knowledge and skills.”

Well, hire them, for God’s sake. This is not rocket science, although it does necessitate familiarity with web standards and software development. Yet there are thousands of software engineers who would love to make a lasting contribution to this goal: help create a way of reaching everyone across the globe to communicate ideas, deliver education, and provide entertainment.

2) “We are too busy operating our core business. We are not a software firm.”

Oh, please. Your essential business is to stay in business. If technology companies like Apple or Google can pivot and invest billions of dollars in creating software and hardware infrastructure for virtual and augmented reality storytelling, which didn’t exist beyond crazy-assed science fiction just a couple of years ago, publishers can make time to contribute to standards for publications. Publishing books on the web isn’t a risk: it’s the future.

3) “Others will do the work for us.”

Not your work. I am certain that Kodansha Japan recognizes the need to ensure that Japanese cultural expressions such as manga are accessible and display well on the web, on mobile devices and tablets. McGraw-Hill Education and Macmillan surely acknowledge the importance of joining the W3C to ensure that their requirements for enabling interaction and presentation of learning resources within K-12 and Higher Ed platforms are met. Indexed discovery firms like ProQuest and EBSCO know their effort to integrate ebook readers into their discovery services must have better articulation. Community not-for-profits like Benetech have been making ebooks accessible for readers of different abilities for years, helping to define functional requirements, and then meeting them. Trade publishers? What say you?

The combination of the IDPF with the W3C is one of the most significant developments in the last 500 years of human effort to distribute information to as many people as possible, wherever they live, and whatever their circumstances. As Pierre Danet, the Group Manager for Digital Innovation at Hachette Livre, said, “Publishing will fully utilize web technologies. The IDPF and W3C combination is galvanizing, providing us with an essential strategic vision for the future of publishing.”

This strong embrace of the future by one of the world’s leading publishers could augur well for the broader future of online publishing. Only through working together can we deliver an omnibus future of books, in a browser. The W3C is open to new members, and holds its arms open for engagement. Let’s build this barn.