Why the web should take a look at eBooks

An English translation of a (loose) transcript of my Paris Web talk. Please bear in mind that I am taking a trip down (my random access) memory lane.


Disclaimer: some pieces might be missing and that’s alright since human beings can’t remember absolutely everything. Please feel free to correct me if I’m wrong or if something is missing in this transcript.

Additional notes and remarks are enclosed in square brackets. Snippets of code have been added to illustrate some points.

Extended slides are available here (English) and there (French).

For your information, the conference organizers recorded this talk and will upload the video in a few weeks/months.


Introduction

Hi!

I am here today to talk about eBooks. I’ll try to convince you that you should take a look at eBooks. This won’t be easy but I truly wanted to take on this challenge.

Uninteresting mini bio.

A talk on documents feels somewhat “obsolete” in this age of WebAssembly, React.js and Progressive Web Apps. At least that’s my impression, especially when browsing, say, Medium. It feels like very few people are still discussing documents.

And yet, there are big companies investing in documents…

Apple News. Google AMP. Facebook Instant Articles.

[OK, it’s about articles but hey, articles are documents.]

And oh yes, there is one website doing pretty well…

It’s Wikipedia. Wikipedia is all about documents.

Documents are not going anywhere.

There’s even a Digital Publishing Interest Group at the W3C. But since there are few publishers interested in W3C membership and we’ve borrowed HTML, CSS and JavaScript for eBooks, this Interest Group is naturally turning to our very own W3C, the IDPF.

There are currently merger talks but it looks like there are some cultural differences which might be tough to deal with (e.g. publishing wants pagination implemented into browsers, at any cost).

Part One

Well, let’s talk about culture, at least at my own level: eBook Production. Let’s start with the issues that cause anger…

We didn’t learn anything from the web.

Let’s list all our current issues. That might even bring back memories if you’ve been working in web dev for a long time.

Unsemantic tag soup

Been there, still there.

We’re still doing stuff like

<div class="heading">
<span class="h1">Heading 1</span>
</div>

instead of

<h1>Heading 1</h1>

Fixed layout

When Responsive Web Design made a splash, we standardized fixed-layout.

Well, obviously, this offers a terrible user experience on small screens.

And if you combine fixed-layout with unsemantic tag soup, what you kind of recreate is a PDF-like format with HTML and CSS.

<div style="width:5675px; height:4955px; position:absolute; top:13.86px; left:13.86px; -webkit-transform-origin: 0% 0%; -webkit-transform: rotate(0deg) scale(0.05); transform-origin: 0% 0%; transform: rotate(0deg) scale(0.05);">
<p class="fuckyou ParaOverride-1"><span id="_idTextSpan028" class="CharOverride-12" style="position:absolute; top:10.58px; left:1497.15px;">This </span><span id="_idTextSpan029" class="CharOverride-12" style="position:absolute; top:10.58px; left:1793.93px; letter-spacing:-0.4px;">book </span></p>
</div>

[Yeah OK, that feels like bashing fixed-layout, which has its purposes but also its limitations: it wasn’t designed for longform, and eBookStores strongly discourage using it when the content is mostly text. The biggest issues in real life are its uninformed uses and the way authoring software developers/designers see it, i.e. as a replacement for PDF (a format a lot of eBookStores won’t accept), and so try to brute-force web rendering into behaving like PDF rendering, which is terribly misguided.]

Fragmentation

We’ve got an awful lot of optional specs. You guys will freak out… JavaScript is optional.

Proprietary technologies

Not only formats but also authoring software and tools.

eBooks made for [platform]

This platform is iBooks. The file might work elsewhere like, say, Readium… with any luck.

Quite a few brutal specs

For instance, we quickly realized fixed-layout offered terrible UX on mobile devices so we kinda designed a spec reimplementing media queries but for fixed-layout.

As far as I can tell, there’s no implementation to date.

[Correct me if I’m wrong but please also take into account I’m talking about implementations in Reading Systems which the general audience has access to i.e. the ones Average Joe is using.]

Internet Explorer 6

[Some laughter.]

And oh yeah, we’ve also remade that. It’s called Adobe RMSDK and I think it may be even worse than Internet Explorer 6.

eBook dev?

As for the dev ecosystem, I did a health check two weeks ago. Well, it speaks volumes.

About 2500 repos, that’s very few. And take into account a large part of those projects are not about eBook production but distribution and reading systems.

Two active frameworks. Mine. I know frameworks have a bad rep right now because performance but if you think about it, frameworks are also a good indicator people are interested in and committed to an ecosystem.

Contributions are nowhere in sight. For the record, I’m the only one opening, fixing and closing issues on my projects.

As regards dev tools, it’s so bad I’ve started reimplementing them in JavaScript because I can’t cope with their absence anymore — how exactly do you debug JavaScript without a console?
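[To give an idea of what “reimplementing a console” can mean, here is a minimal JavaScript sketch. It is not the actual tool mentioned above: `createPageConsole` and `renderToPage` are hypothetical names, and the sketch assumes a Reading System that runs JavaScript and exposes a regular DOM. It simply keeps log entries in memory and prints them into the page.]

```javascript
// Hypothetical helper (not an existing library): keeps log entries in
// memory and hands each one to an optional `render` callback.
const createPageConsole = (render) => {
  const entries = [];

  // Turn any logged value into readable text.
  const format = (value) => {
    if (value instanceof Error) return `${value.name}: ${value.message}`;
    if (typeof value === "string") return value;
    try {
      const json = JSON.stringify(value);
      return json === undefined ? String(value) : json;
    } catch (error) {
      return String(value); // e.g. circular structures
    }
  };

  // Build one logging method per level (log, warn, error).
  const record = (level) => (...values) => {
    const entry = { level, text: values.map(format).join(" ") };
    entries.push(entry);
    if (typeof render === "function") render(entry);
    return entry;
  };

  return {
    entries,
    log: record("log"),
    warn: record("warn"),
    error: record("error"),
  };
};

// Assumed renderer: appends each entry to the page when a DOM is available.
const renderToPage = (entry) => {
  if (typeof document === "undefined" || !document.body) return;
  const line = document.createElement("pre");
  line.textContent = `[${entry.level}] ${entry.text}`;
  document.body.appendChild(line);
};

const pageConsole = createPageConsole(renderToPage);
pageConsole.log("current page", 3, { spread: true });
```

[Because the entries stay in `pageConsole.entries`, they could also be dumped into a dedicated page of the eBook, which may be handy when the Reading System gives you no other way to inspect what happened.]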

Part Two

So yeah, this is terrible. I know I won’t convince you like that, I would shoot myself in the foot.

Now, there is stuff we’re using and making, and which might be interesting to you. And maybe if we build some synergy, we could all benefit.

I picked 3 because I’ve got 15 minutes but there’s a lot more than that.

Columns

First, columns!

Who played a little bit with multi-columns in 2010?

[Actually a lot fewer than I thought there would be.]

OK, you probably realised pretty quickly that they were limited.

The spec hasn’t evolved much since 2011, there are old bugs yet to be fixed, etc.

Truth is we are using columns extensively, to achieve paginated spreads for instance.

But there’s a small problem…

This is a note in the EPUB spec. It basically says that because some Reading Systems are using columns, we can’t rely on the viewport, which affects absolute positioning and… media queries.

This is where you tell yourself “If only I could use…”

Container queries!

They’re currently being discussed in the RICG, and it seems designing them is a little bit difficult… but if they need one [practical] use case [impacting an entire industry], here it is.
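[To make the viewport problem concrete, here is a rough JavaScript sketch of the arithmetic. The function names and pixel values are made up for illustration, and it assumes a Reading System that splits the viewport into equal-width columns separated by a gap; real Reading Systems add margins and their own rules on top of that.]

```javascript
// Hypothetical helper: width of one column when a Reading System splits
// the viewport into `count` equal columns separated by `gap` pixels.
const columnWidth = (viewportWidth, count, gap) =>
  (viewportWidth - gap * (count - 1)) / count;

// What a media query like (min-width: 600px) tests: the viewport.
const viewportQueryMatches = (viewportWidth, minWidth) =>
  viewportWidth >= minWidth;

// What a container query would test: the box the content actually gets.
const containerQueryMatches = (boxWidth, minWidth) => boxWidth >= minWidth;

// A 1024px-wide tablet showing a two-page spread with a 40px gap.
const viewport = 1024;
const box = columnWidth(viewport, 2, 40); // 492

console.log(viewportQueryMatches(viewport, 600)); // true
console.log(containerQueryMatches(box, 600)); // false
```

[The media query reports there is room for the wide layout while the text is actually poured into a 492px column. That mismatch is what a query on the container would avoid.]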

And since we’re talking columns, let’s talk about CSS figures.

This brings “float: top|bottom” and integers for “column-span” — currently, it’s all or none.

That’s a spec which was implemented in Presto in late 2011. And it was actually pretty good.

@media -o-paged {
  html {
    height: 100%;
    overflow: -o-paged-x;
  }
  article {
    overflow: -o-paged-x-controls;
    columns: 25em;
  }
  figure {
    column-span: -o-integer(2);
    float: -o-top-corner;
  }
}

Apparently, nothing came of it but I’m pretty sure that if Google or Apple had made that, and we had had a -webkit- prefix instead of an -o- prefix, it would have been a lot more popular. And maybe the activity around those specs would have skyrocketed.

Latin text layout and pagination

Number two, Latin text layout and pagination.

This is a document edited by Dave Cramer (Hachette). As a matter of fact, if you take a look at W3C documents related to layout, typography, generated content and pagination, you’ll come across this name.

Even if you don’t care about pagination, you’ll find interesting stuff in there, especially if you do care about typography.

This doc is partly responsible for “initial-letter” in WebKit (Safari 10), which allows for easy drop caps. No more “negative margins crap”, just numbers (the size of the letter as a number of lines, then the number of lines the drop cap should sink).

.first-para::first-letter {
  -webkit-initial-letter: 3 2; /* prefixed form used by Safari */
  initial-letter: 3 2; /* size = 3 lines, sink = 2 lines */
}

But there’s also using the “content” property directly on elements (and not only on pseudo-elements), which might come in handy when separating sections with asterisms, which we use extensively in books.

hr {
  content: "* * *";
}

Or aligning text based on a string (character) like, say, in tables…

  1240.59
   730.79
 12045.05
142890.90
    22.99
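[Until aligning on a character is something you can rely on in CSS, one workaround is to pad the values yourself. The JavaScript sketch below is only an illustration: `alignOnCharacter` is a hypothetical helper, and the result only lines up visually if you render it with a monospaced font and preserved white space.]

```javascript
// Hypothetical helper: left-pads each value so that the first occurrence
// of `character` sits in the same position on every line.
const alignOnCharacter = (values, character) => {
  // Split each value into what comes before the character and the rest.
  const parts = values.map((value) => {
    const index = value.indexOf(character);
    return index === -1
      ? { before: value, after: "" }
      : { before: value.slice(0, index), after: value.slice(index) };
  });

  // The longest "before" part decides how much padding the others need.
  const widest = parts.reduce(
    (max, part) => Math.max(max, part.before.length),
    0
  );

  return parts.map((part) => part.before.padStart(widest, " ") + part.after);
};

const prices = ["1240.59", "730.79", "12045.05", "142890.90", "22.99"];
console.log(alignOnCharacter(prices, ".").join("\n"));
```

[Values without the character are treated as if it came right after them, so whole numbers still line up with the integer part of the others.]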

This doc, which is actually a list of requirements, is relatively short at first sight. But truth is it could have a lot of ramifications…

Say for grids, for which a lot is happening right now.

CSS Regions, for which Jen Simmons has just made a proposal at TPAC — the idea is to redesign them on top of grids. If you don’t know about Regions, it’s just reflowable text frames like the ones you’ll find in Word or InDesign.

And there’s everything typography, especially APIs for Houdini — so that’s the CSS Object Model.

Portable web publications

Finally, portable web publications is a draft from the DPUB IG. The idea is to build a bridge between EPUB and the web: as of today, you can’t expect your browser to display the contents of an EPUB file natively… while it does for PDF.

Use cases have been published recently so maybe it would be a good idea to check that document and report issues in the GitHub repo.

Obviously this is about archiving. Could be cool if you’re doing webzines for instance.

We already know the portable part might be an issue raised by browser implementers. But that’s not all, a lot more could come out of it.

Like semantic inflections to say that a section is a chapter or a title page. We’ve already done that for EPUB but it’s difficult to port because XML.

Or relationships between the resources in a publication. Think about a table of contents for instance.

And there are two specs the DPUB IG is likely to kick start: annotations and user settings. It needs them.

By the way, user settings are not standardized in eBooks, which makes them a nightmare to manage.

Conclusion

[I screwed this up big time 🙌]

What is important, what I would like you to understand is that we still need to improve documents.

The web was created for documents… There are millions and millions of PDF files online. And for the French press, we learned a week ago that PDF is up 25% from a year earlier.

It shows we haven’t finished with documents yet. And we shouldn’t set them aside. We mustn’t set documents aside.


Thanks

I really would like to thank the Paris Web staff, who turned this conference into an outstanding experience. Add to that the positive vibe of the event (thanks to participants) and that made this talk the most enjoyable I’ve ever given. 👍

Thanks to Luce and Jean-Pierre, who positively enriched my knowledge and helped me relax the previous evening. That helped a lot. 😉

A huge thank you to Daniel Glazman, who helped clarify some technicalities of EPUB in plain words, thus extending my answer to one question for which I may have proved to be unprepared. 15-minute talks normally don’t have Q&A sessions but that’s OK. 🙏

Finally, kudos to Gaël Poupard, who referred to this Medium article in his own talk, which was a stellar talk, a few hours later. 👌