Browser-Design Experiments (2010–2017) #4

“Ubiquitous Firefox” Revisited (2012)

Building a zoomable Web operating system.

David Regev
Aug 29, 2018 · 31 min read

This post is part of a series of experiments in redesigning the Web browser. This design, from 2012, evolved from the previous ones, and so it answers some of the same questions, but goes even further: How do we solve the ‘too many tabs’ problem? How do we design a zoomable Web browser? How do we build an entire operating system around the Web?



Last year, I presented my Ubiquitous Firefox concept for redesigning the browser. (Don’t worry! Reading that is not necessary for this discussion.) The discussion proved insightful and thought-provoking. Towards the end, we discussed a number of interesting modifications. Since that time, those ideas have been slowly developing in my mind, and I would like now to revisit the issue. I will first list the ideas that we will keep from the previous discussion, the lessons we learned, and the principles that will inform our design. Then I shall present a rough sketch of my new concept: a true Web operating system. There are still a number of unanswered questions, and I hope the community can help me answer them and improve the concept, yielding something truly revolutionary.

Lessons & Principles

Administrative debris is bad, content is good. Administrative debris is anything that is there for the purpose of administrating your computer (buttons, toolbars, indicators, and so on — what is called ‘chrome’), rather than content (the stuff you actually care about). The interface should have as little debris as possible. Instead of using buttons and chrome to manipulate the content on the screen indirectly, we should try to design the interface so that direct manipulation of the content is the primary way of using it. Similarly, instead of popping up dialogues that appear out of nowhere and presenting information in ways that do not fit in with your mental model of what’s on the screen, information should preferably be presented inline within the way content is presented. The way to do this is with good information design. This lesson is especially important with modern touch interfaces, where screen real estate is precious and where we can finally have real direct manipulation — with our fingers!

The fewer the mental models and metaphors, the better. Modern computers have a dizzying number of different concepts that we must grasp. Just within Firefox, how many different ways are there of issuing a command? Menu bars, toolbars, context menus, and all of the above duplicated within web apps — that’s quite a lot of different ways of telling your computer to ‘do this’! Mozilla Labs’ Ubiquity attempted to unify all these ways into one mental model, and its model may yet succeed. What about the number of separate locations for different representations of pages in Firefox? The tab bar, tab history, bookmarks (including bookmarks toolbar, bookmarks menu, and unsorted bookmarks), history menu, browsing history, location bar — isn’t it too much? Let’s reduce the number of mental models we need to internalize.

The location bar still has to be replaced. Despite sensationalist headlines, I still think the location bar needs to go. It is administrative debris that is forced in your face at all times. Worse, it doesn’t fit in well with any mental model of how pages are displayed (yes, even with tabs-on-top). (Those interested can read in detail my arguments against the location bar’s standard design.) It should be replaced by two mechanisms: (1) displaying the page title/address/metadata inline, attached to the page, where it makes sense; and (2) using a wholly separate command mechanism for telling your device where you want to go, which would not take up much screen real estate, if any at all.

Design with an upgrade path for Ubiquity in mind. Ubiquity is still brilliant and has more potential than any other mechanism for executing commands I’ve ever seen. Nonetheless, it’s not ready yet. I’ve, therefore, made sure my design made sense with or without Ubiquity, leaving a clear path for incorporating it in the future.

Unrelated pages should be separate; spawned pages should be together. Say you’re browsing in a standard browser, and you have 3 random tabs open: ABC. You then open three links from tab A in new tabs. Now your tab bar looks like this: ADEFBC. You continue this way throughout the day and you find that your tab bar is a jumble of pages with no simple way of finding the page you want or seeing what’s in front of you. This is the problem of tab proliferation. In my previous concept, we learned a general solution to this problem: when you open pages from another page, these pages should be grouped together; when you open a new page from scratch, it should be separate from other pages. In other words, A should be grouped with D, E, and F, separate from both B and C.
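The grouping rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated above; the `Workspace` class and its `open_new` and `open_from` methods are hypothetical names, not part of any existing browser.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """Hypothetical model: a list of groups, each a list of page names."""
    groups: list[list[str]] = field(default_factory=list)

    def open_new(self, page: str) -> None:
        # A page opened from scratch starts its own group.
        self.groups.append([page])

    def open_from(self, parent: str, page: str) -> None:
        # A spawned page joins the group that contains its parent.
        for group in self.groups:
            if parent in group:
                group.append(page)
                return
        raise KeyError(f"unknown parent page: {parent}")


ws = Workspace()
for page in "ABC":
    ws.open_new(page)          # three unrelated pages
for page in "DEF":
    ws.open_from("A", page)    # three links opened from A
print(ws.groups)  # [['A', 'D', 'E', 'F'], ['B'], ['C']]
```

Instead of the jumbled ADEFBC tab bar, A stays with D, E, and F, while B and C remain on their own.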

The Back/Forward system is broken. Similar to tab proliferation, there is a lot of redundancy between each tab’s back/forward history and the tab bar. When you follow a link, sometimes you stay within the current tab, adding a page to the tab’s history, and sometimes you open a new tab, adding a page to the tab bar. Conversely, the previous page in the tab history is sometimes one that was viewed in the same tab and is sometimes from a different tab altogether. Essentially, when you spawn new tabs, the tab bar replaces your tab history. Can’t we unify these two ways of browsing?

Open vs. Open In New Tab is not monotonous. Related to the previous issue is the choice we must make every time we click on a link: should I open it here or in a new tab? This choice creates a small delay every time you click on a link. Over time, these delays add up. They also contribute to a mental burden that builds up over time, especially every time you realize you should have opened a link in a new tab, so you must go back and lose even more time undoing that mistake. Worse, for those who are not comfortable with opening links in new tabs, the benefits of this form of browsing are out of reach. What if we removed this “choice” and optimized the interface for one form of browsing? Then the interface would be more monotonous (in a good way): you don’t have to think about using it — you just use it.

Note: The very first browser, WorldWideWeb (later renamed ‘Nexus’), actually didn’t have the previous two issues. Instead of relying on a Back button, clicking on a link created a new window. Of course, such a system quickly leads to too many windows, which is why the Back button was later created in the first place. An alternate solution to that problem, however, could have involved better window/document management.

The History Scroller is probably too much. My previous solution to the issues outlined in the previous paragraphs involved showing all spawned pages within the same tab, merged into the tab history. Although this seemingly solved the problems, it created a new widget that must be learned. This would have contributed to the proliferation of mental models: the tab bar (or Panorama) for organizing tabs, and the History Scroller for organizing tab histories. This actually isn’t very different from today, where we have the Back/Forward system in addition to the tab bar. On top of that, the tab history model fits in even less with Panorama (tab groups): it looks like you’re arranging all your pages on a flat plane but, in fact, each page you see comes with many others hidden. What if we truly flattened this hierarchy into one consistent interface?

Panorama rocks. In my previous concept, Panorama (formerly Tab Candy) was more of an afterthought, but it should not have been. Panorama is a simplified ZUI (Zooming User Interface). It has a lot of potential (nearly infinite!), but it currently has many limitations. For one, there are only two zoom levels, limiting how much you can put on screen. If you could view the canvas at any arbitrary magnification, though, you could place objects at different levels, at different relative sizes, and also place an arbitrary number of pages there. You could even use it to replace bookmarks! Secondly, Panorama competes with the tab bar as a way of organizing pages. There should be only one interface, not two.

If you haven’t done so before, I highly recommend trying out Panorama prior to moving on to the next section.

The browser is the operating system. Mozilla is already working on Firefox OS, which allows devices to boot directly into the Web, without the increasingly-unnecessary baggage of foreign applications and the operating-system environment. Firefox should be the operating system. Although I had this in mind all along, we can now focus on creating a design that will truly make this work. My goal is nothing short of creating a concept so compelling that people will want it to replace their operating systems.

Design for touchscreens first. In order to make this concept compelling as an interface for an operating system for the foreseeable future, it will be designed with touchscreens, and especially tablets, in mind as the primary environment. I will also, however, have desktops and smaller devices in mind, making sure the interface works well on most types of devices.


Keeping all of the above in mind, I will first present a design for the environment of Firefox-the-Operating-System. Then, I will present the details of how tab proliferation and related issues are addressed in this environment.

I. Panorama Enhanced

A Firefox Tablet (annotated version)

The above mockup represents a direction in which Firefox could evolve were its Panorama feature used as the basis for an entire Web-based environment (operating system interface). We see here a tablet displaying the home screen. You see pages arranged spatially at various locations and various sizes. Unlike Panorama, this is a true Zooming User Interface (ZUI): you may zoom in or out from here as much or as little as you want, and you may place objects at any zoom level. (Compare the original design that eventually led to Panorama, or a much earlier ZUI demo, all of which were created by Aza Raskin.) Once you zoom in enough for a page to take up the entire screen, you may use it like any other Web page. The interface includes the following elements:

  1. Command button. The interface has virtually no administrative debris — it is chromeless. This is one of the few buttons in the entire interface. Its purpose is to bring up commands or actions for you to run. In most software, commands are exposed in different ways, each using a different conceptual model: menu bars, buttons, drop-down menus, and right-click menus. Worse, all these actions are placed in different spots all over the screen, forcing you to hunt and peck for the command you want. With the Command button, however, all commands get added here in a unified manner. In the beginning, it would work similarly to the current Firefox button. It should, over time, be evolved to invoke a Ubiquity-like command system. These commands would work on any page you like (unlike today, where commands are stuck within each application). People should be able to add or remove commands here as they please. If you hold down the button, you can use your own voice to command the system. For example: select some text, hold down the button, say ‘highlight this’, let go, and your selected text is now highlighted. Ideally, this would be a hardware button somewhere on the device, but here is where it would go otherwise.
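To make the idea of one unified command mechanism more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `commands` registry, the `register` decorator, and the `run` function are hypothetical names, and the way ‘highlight this’ marks up text is invented.

```python
from typing import Callable

# Hypothetical registry: the one place where every command lives.
commands: dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that adds a command to the shared registry."""
    def wrap(func):
        commands[name] = func
        return func
    return wrap


@register("highlight this")
def highlight(selection: str) -> str:
    # Illustrative only: wrap the selection in <mark> tags.
    return f"<mark>{selection}</mark>"


def run(spoken_or_typed: str, selection: str) -> str:
    """Look up a command by name and apply it to the current selection."""
    name = spoken_or_typed.strip().lower()
    if name not in commands:
        raise KeyError(f"no such command: {name}")
    return commands[name](selection)


print(run("Highlight this", "some selected text"))
```

Because commands are plain entries in one registry, adding or removing one is a single operation, and nothing ties a command to a particular page or application.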

To recap: how many separate metaphors and paradigms have we simplified into one? Instead of icons, widgets, tabs, and windows, we have semantically-zoomable pages. Instead of bookmarking, you simply leave a page around in the environment, and possibly move it to a better location (like with objects in the real world) and better zoom level (like you wish you could do with objects in the real world). Instead of downloading, we just open (and leave the new page around if we want to keep this “download”). Finally, instead of both the tab bar and individual tab histories, we have stacks. Pages are the sole metaphor.

II. Stacks: An Alternative to Tabs

Browsing with Stacks (annotated version)

Say you created a new page by tapping anywhere in your zoomable canvas, such as was done in the tablet mockup. What happens next? Everything zooms in on the area where the new page will be created, at the spot you chose…

  1. Create a page. You then get something like the Awesome Screen — a page designed to get you to where you want as fast as possible, including a spot to type in text. This page appears partially zoomed out so as to introduce you to the stack view. This is the zoom level where the current page and other pages in its stack will appear together, and where the interaction is designed for working within the stack of pages. Showing the stack view at this point will serve to reinforce the idea that you can always zoom out to this view. Once you’ve chosen your destination, the new page loads in place, replacing the Awesome Screen in its entirety (including the input area). When the page has loaded, the view is zoomed in even further, showing you the page and absolutely nothing else: 100% content.

Did you notice how we solved the tab proliferation problem? The solution here is conceptually identical to the one in my original Ubiquitous Firefox concept, but this one is far more discoverable and obvious (and hopefully more enjoyable). Moreover, this one has the advantage of only keeping around pages you care about. By always placing new pages in a fresh space, you never have unrelated pages grouped together. By presenting all spawned pages together in a logical order in one stack, that stack in essence becomes your reading list, ordered the way you naturally would want it. By merging tab history and the tab bar, we’ve made it a much simpler task to find your page within a stack of related pages. And this was all done by removing metaphors until we were left with just one: the page.

Questions

  • How should the stack be visually designed? I had a lot of trouble trying to figure this out. It needs to fit within the general visual metaphor (a 2-dimensional infinite space with flat objects placed on it). It needs to be tight enough so that you can see many pages at a glance without needing to flip through the stack too much. It needs to work for very large stacks as well. And it should generally be fun and easy to work with, while also looking pretty in the overview. After all, if it looks too messy, people might have a strong urge to “clean” it up when they don’t really want to.

The Future!

Very little of all of this is genuinely new. Quite a lot of the concepts here are based on The Humane Interface (and most of this should be familiar to anyone who has read this book), together with design patterns found in many different digital environments (such as webOS). True zooming interfaces have been around in research form for a long time, but have never quite made it to the consumer space in any large form, outside specialized applications (such as Google Maps). I’ve spent several years thinking about how a ZUI browser would work (including one mockup), but I’ve never come across a truly satisfying answer. It was only once I finally got an idea of what the essential problems with browsers were and some possible solutions that I finally figured out how a zooming browser might look — one that has a real chance of succeeding. In such an environment, we no longer need the classic Desktop Metaphor, no windows, icons, menus, or pointers, no opening or saving, no downloading, no applications, and no other outdated counter-productive concepts. Nor do we need the metaphors that Web browsers have had for a long time: the Back button, bookmarks, and tabs. (Wouldn’t it be great if we could get rid of Reload too?) Finally, we don’t need chrome any more either. All we have is content. Everything is unified in this environment: the browser is the operating system. (And, unlike Firefox OS, there is no odd browser-app–within–browser-operating-system.)

Let’s make it work!

Afterword: Making It Work

Thanks to a fruitful and insightful discussion, I’d like to revisit the concept with some answers to the questions I asked at the end of my proposal, along with some new ideas to modify and complement the design. I will also discuss how we might implement this system, and what needs to be done now.

Some Answers

Stack Design. Instead of the physically-confusing way in which I originally placed cards in a stack, the stack should look more like a left-handed fan of cards, the topmost one at the left.

Additionally, the fan would be parted in the middle, with the currently-focused page sticking out. Thus, two pages are always fully visible: the first page and the focused page. Since the first page is often the page that originated the entire stack, and since that page is often a Web app (such as when you’ve opened many links from Gmail), having this page visible is useful.
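One way to read the fan layout is as a list of horizontal offsets, one per card. The following Python sketch is my own interpretation rather than a finished visual design: the function name `fan_offsets` and the `card_width` and `peek` values are assumptions. Each card sits on top of the one to its right, so ordinary cards show only a narrow strip, while the gap before the focused card leaves it uncovered.

```python
def fan_offsets(count: int, focused: int,
                card_width: float = 200.0, peek: float = 24.0) -> list[float]:
    """Return the left edge of each card in the fan, first card at 0.

    Cards overlap so that only `peek` units of each show, except that
    the focused card starts a full card width after its predecessor,
    which parts the fan and leaves the focused card fully visible.
    """
    if not 0 <= focused < count:
        raise IndexError("focused card is not in the stack")
    offsets = []
    x = 0.0
    for i in range(count):
        if i > 0:
            x += card_width if i == focused else peek
        offsets.append(x)
    return offsets


print(fan_offsets(5, focused=2))  # [0.0, 24.0, 224.0, 248.0, 272.0]
```

With these numbers, the first card and the focused card are the only two shown in full, which matches the intent described above.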

How do we make sure the stack isn’t too cluttered? Semantic zooming. In the canvas view, the stack is compressed tightly all within the space of the original single page. As you zoom in, the stack magically fans out within that space, revealing more detail, as the view seamlessly transitions to stack view. Initially, when zooming in to stack view, you see all the pages in the stack. From there, you can continue to zoom in, until you have reached the point where the system seamlessly switches to page view.

Making stack view zoomable (like with canvas view and page view) has several benefits. First, large stacks become easier to browse. Instead of slowly flipping through each page until you find the one you want, you can just zoom out a little to get more of an overview of the entire stack, arranged in a way ideal for that zoom level. Imagine if zooming out from page view (or zooming in on a stack from the canvas) showed you the stack arranged as an easy-to-browse grid (kind of like zooming in on an album on the iPad). This need not be the exact visual design, but the point is the same: semantic zooming allows us to make the stack as easy to browse as the overview. With proper design, we don’t need to introduce a button or gesture in order to show a different view of the stack.

A second benefit is what happens when you open a new page from a link. Instead of the display zooming out to show the entire stack, we zoom out only enough to show the new page’s placement within the stack. This ensures that the current page is given the maximum screen real-estate, leaving it usable even without zooming back in. Similarly, the currently-focused page in the stack should be made usable as soon as the stack view’s zoom level is deep enough such that we see the page and only a few nearby pages.
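Semantic zooming boils down to choosing a presentation from the current zoom level. A rough Python sketch follows; the function name `view_for_zoom` and the threshold values are assumptions made up for illustration, and a real design would tune them and animate the transitions.

```python
def view_for_zoom(page_fraction: float) -> str:
    """Pick a presentation from how much of the screen one page fills.

    page_fraction is the focused page's width divided by the screen
    width. The thresholds are assumptions, not part of the design.
    """
    if page_fraction <= 0:
        raise ValueError("page_fraction must be positive")
    if page_fraction >= 1.0:
        return "page"    # the page fills the screen: 100% content
    if page_fraction >= 0.25:
        return "stack"   # the stack fans out around the focused page
    return "canvas"      # each stack is compressed into one page's footprint


for fraction in (0.05, 0.5, 1.0):
    print(fraction, view_for_zoom(fraction))
```

The point of the sketch is that no button or mode switch is involved: the view is a function of zoom level alone.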

History Area. In order to solve the history area issue along with some others, we first need to redesign the system area. The system area should appear as a strip across the top of the canvas. All system objects should appear there. The history area should be placed in the middle of the system strip, rather than at the corner of the content area. The rest of the space below the system strip is 100% reserved for your content, organized as you wish, without limitation. This solves the page-deletion problem, because we can visualize throwing a page upwards as the history object sliding in from the top of the screen while the deleted page gets smaller and is made history, as it were. It should also be possible to throw a page (or stack or group) away from the overview (like on Android). The system strip should also include a command area by the Command button. If you zoom in on that area of the system strip, you can manage your commands from there. Search should probably also be moved there as another command.

This separation of content and system information allows us to present the environment as a clear physical hierarchy. The system strip and the content area are both subsets of everything that belongs to the user. Zooming out from here will show you other users along with other devices on the network. Perhaps zooming out from there will also show you other computers elsewhere.

Besides cleaning up the system area, we also introduce a new element: a status bar that appears at the top whenever we are not in page view. Although this kind of breaks the no–administrative-debris rule, this is a necessity due to the items within it: Command button at the top left (as before), and notifications at the right end of the bar. These notifications can come from any open page, both system pages (like the ones that display time, connection, and volume status) and regular pages (such as your messages). When you access one of these items by pulling it down from the top, the item should increase in size and slide down, appearing as a page overlaid on the screen. These pages can also be swiped away just like a page in a stack. Finally, notifications can temporarily appear on the status bar, sliding down the status bar into view if you’re in page view.

Multitasking. My hunch is that the main types of multitasking that we should concentrate on are side-by-side tiles and a small floating window in the corner, though research is needed. Each window is a separate viewport onto the environment. Theoretically, two people could use two viewports simultaneously, each viewing a separate user account. What we absolutely do not want is partially-overlapping windows (nor tabs).

There were several ideas about how we can facilitate multitasking. First, if you drag a page to any edge of the screen, you create a tile on that side displaying that page zoomed in. The other half of the screen will hold another tile displaying whatever zoom level you were at before this action. If you drag a page to a corner instead, you get a floating window in that corner.
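The edge-versus-corner rule could be expressed as a small classifier over the point where the page is released. This Python sketch is only a guess at how it might work: `drop_target` is a hypothetical function, and the `margin` that defines the hot zone along each edge is an assumed value.

```python
def drop_target(x: float, y: float, width: float, height: float,
                margin: float = 40.0) -> str:
    """Classify where a dragged page was released on the screen.

    margin (in pixels) is an assumed size for the edge hot zone.
    """
    horizontal = "left" if x <= margin else "right" if x >= width - margin else ""
    vertical = "top" if y <= margin else "bottom" if y >= height - margin else ""

    if horizontal and vertical:
        return f"float:{vertical}-{horizontal}"  # corner: floating window
    if horizontal or vertical:
        return f"tile:{horizontal or vertical}"  # edge: side-by-side tile
    return "none"                                # elsewhere: no new viewport


print(drop_target(5, 300, 1024, 768))     # tile:left
print(drop_target(1020, 760, 1024, 768))  # float:bottom-right
```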

Alternatively, we can take advantage of multitouch to allow you to create a “paradox” for the system: hold a page in place (at any zoom level) and then perform an action that would have moved the page. For example, tap and hold on the screen in page view with one finger and perform an edge swipe with another finger; the system must then interpret this gesture as displaying the page in one side of the screen while showing you stack view in the other side. Similarly, you can hold a page in place in stack view but then tap on another page. Or, you can tap on two pages at the same time.

Another possibility is to use a special gesture, such as dragging the screen from the centre outwards with both hands, as if you were ripping the screen into two pieces. This proposal, however, is the least discoverable. Alternatively, we could simply add a New Window command. Such a method would be much more discoverable, but would not follow the direct manipulation feel of the other proposals (nor would it be as much fun). Whatever method is chosen in the end, it should be discoverable, quick, consistent with how the system works (especially with direct manipulation), and, importantly, fun (so it’s not a mental burden whenever you think about multitasking).

Special Groups. As I mentioned briefly in the concept outline, you should be able to install online services such that all their objects automatically appear in your environment. For example, if you install the Google Drive service, all your Drive documents appear as pages automatically synchronized within a special group. We can take this idea further: what if we allowed online services to escape the boundaries of the page? Web apps could then manipulate their own groups to display their content in any way they like. So, for example, Flickr could give you not a page, but a whole group that displayed all your photos in a way optimized for photos, together with proper controls or whatever else is desired for that specific task. A music service, on the other hand, would organize your music in a wholly different manner. The system could take advantage of this functionality by showing you your history of pages in a manner optimized for viewing your history and restoring old pages. Meanwhile, the main unit of content (pages) from these services would continue to be exposed to the system, allowing you to search for these pages and to copy them elsewhere. Thus, the system is built with extensibility in mind, allowing you to customize it for different types of content beyond the base design of the ZUI.

Implementation

The following is my best estimation for how this system should be implemented. Ultimately, though, implementation will be up to the programmers. (I hope my description makes sense to the programmers out there.)

At first, the system should be coded as an application, rather than solely as an operating system. Doing so will allow us to gain as much exposure as possible. The core of the application should be written such that it can also easily be loaded as an entire operating system environment (such as on top of Firefox OS’s kernel). The application should mostly be written as a Web app. It should be available as an Open Web App, so that it can be run anywhere Firefox runs: desktop operating systems, Android, and Firefox OS. Hopefully, with some extra code, the app should also run on iOS.

On the filesystem level, each page should automatically be saved either as a single file (perhaps .html), as a special folder (similar to .app), or as an archived folder (similar to OpenDocument). All metadata related to the page, such as location on the canvas and the page’s state, are saved along with it in the filesystem. Groups would simply appear as folders. If a page appears in more than one location, we use a hard link for each instance. Previous versions of each page are also stored, so that you do not lose content when refreshing a page. If the user already has files (such as when running this system as an app), they would appear as pages within the environment. Ultimately, it should be easy for anyone with more direct access to the filesystem to copy pages and to share them with other operating systems.
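For the programmers, here is a rough Python sketch of the single-file variant, with groups as folders and a hard link for a page that appears in a second group. The function names `save_page` and `place_copy`, the folder names, and the convention of embedding metadata as a JSON block at the top of the file are all assumptions for illustration. Only standard-library calls are used.

```python
import json
import os
import tempfile
from pathlib import Path


def save_page(group_dir: Path, name: str, html: str, meta: dict) -> Path:
    """Write one page as a single .html file inside its group folder."""
    group_dir.mkdir(parents=True, exist_ok=True)
    # Assumed convention: metadata rides along as an embedded JSON block.
    meta_block = ('<script type="application/json" id="page-meta">'
                  + json.dumps(meta) + "</script>\n")
    path = group_dir / f"{name}.html"
    path.write_text(meta_block + html, encoding="utf-8")
    return path


def place_copy(page: Path, other_group: Path) -> Path:
    """Make the same page appear in a second group via a hard link."""
    other_group.mkdir(parents=True, exist_ok=True)
    link = other_group / page.name
    os.link(page, link)  # both names now refer to the same file on disk
    return link


root = Path(tempfile.mkdtemp())
original = save_page(root / "Reading", "article", "<p>Hello</p>",
                     {"title": "Article", "scroll_y": 0})
copy = place_copy(original, root / "Favourites")
print(os.path.samefile(original, copy))  # True
```

One design wrinkle the sketch exposes: because a hard link shares one file, anything that differs per instance (such as a position on the canvas) cannot live inside that file and would have to be stored elsewhere, perhaps with the group folder. The sketch therefore embeds only state that all instances share.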

From there, one possible goal is to get this system incorporated into Firefox OS (as well as the Firefox browser). Firefox OS’s system of apps could co-exist within our environment: when creating a new page, installed apps will be some of the objects suggested by the New Page screen. Moreover, apps could all appear in the canvas as pages. Beyond Firefox (OS), perhaps the design could even get incorporated into other Web-based systems, like Open webOS or Chrome OS.

The Next Step

I, with countless others, have been frustrated by the inhumane design of most digital environments for far too long. Alas, it is only recently that I got a more concrete idea of how to fix this mess. For me, there is a moral imperative to try to give the world a better alternative to the interfaces that confuse people and scare them away from computers. The time has come to do something about it.

In order to make this happen, I need someone to help me create an interactive mockup of the concept so that people can play around with it, fall in love with it, and want to help turn it into a reality. I also need developers to help code the actual app. More importantly, I need eyes: as many people as possible should see this. Ideally, I would love to generate interest within Mozilla.

Finally, for any potential employers out there, I am available to work on such a project (and available for UX jobs as well).

Now, let’s make it work!

David Regev on UX