Hypertext
As Semantic
Tailorability

Russell Okamoto

Published in

New Media: Art & Science

14 min read · Apr 10, 2015

Part 2 of 5
Characters of the Net, Unite!

Futures Of Text Through The Looking Glass Of Tailorability

In 1965, Ted Nelson introduced hypertext in a paper titled, “A File Structure for the Complex, the Changing, and the Indeterminate”

“Let me introduce the word ‘hypertext’ to mean a body of written or pictorial material interconnected in such a complex way that it could not conveniently be presented or represented on paper.”

Nelson chronicled Xanadu, his pre-Web vision (1960) for a worldwide hypertext publishing network, in Computer Lib / Dream Machines (1974) as well as in Literary Machines (1980). Nelson dedicated the 1987 version of Literary Machines to fellow computing pioneer Doug Engelbart and cited a few of Engelbart’s inventions, including the text link.

Dedication by Ted Nelson to Doug Engelbart in Literary Machines

Early hypertext pioneers and systems including Xanadu and NLS showcased the power of semantic tailorability — the ability for ordinary users to weave together ideas by specifying links between documents.

history of hypertext

Over 30 years later, hypertext is omnipresent. The world is full of connections. Networks. Graphs. We surf a Web of related ideas with browsers that fit in our pocket. Semantic Web frameworks, languages, and standards abound. So much innovation had occurred on the Web by 2001 that Frank Halasz proclaimed the challenge of hypertext tailorability solved:

“Moreover, many of these wonderful facilities (HTML and Javascript, for example) are easy enough to learn and use that they meet the challenge that I put forward in “Seven Issues” of enabling “nonprogrammers” to build interesting hypertext applications.” — Reflections on “Seven Issues”: hypertext in the era of the web (2001)

As a programmer who has struggled with cross-browser layout challenges and issues of scope, closures, and functional programming, I respectfully disagree that HTML and JavaScript are easy enough for nonprogrammers to learn.

But I do agree that one modern hypertext innovation empowers nonprogrammers to tailor semantic text in a very simple way.

It’s called the hashtag.

Hashtags are radically tailorable semantic links. Hashtags let anybody group messages into shared topics quickly and seamlessly creating a de-facto Semantic Web. Nonprogrammers can use hashtags to enrich messages or photos or videos instantly. Hashtags are used across social media platforms. They permeate Twitter, Instagram, Facebook, YouTube, Pinterest…Periscope. Hashtags are today’s newswires and public conversation streams.

And hashtags pave the way for a future of semantic tailorability.

Journalists protest against rising violence during march in Mexico (2010)

A future of text
is basis tags

Semantic tailorability empowers people to add meaning and expressiveness to their messages with minimal syntax. To expand hypertext beyond hashtags, what’s needed is a finite set of symbols that add the most relevancy to any communication scenario. From a mathematical viewpoint, this is like identifying a set of linearly independent vectors — a basis — that span the universe of all conversations.

A future of text defines a basis — a minimal set of tags — for marking up the semantic dimensions of any conversation.

Resource Description Framework (RDF) organizes semantic data into three axes — subject, predicate, and object. So RDF triples could be a compact basis; however, it’s hard (at least for me) to readily distinguish subjects, predicates, and objects within messages.

IMHO, a basis that’s most likely to evolve beyond hashtags will emerge from the world of Journalism. Journalism captures stories with the 5Ws (who, where, what, when, and why) and four sentence types (declarative, interrogative, imperative, and exclamatory). These news reporting dimensions create a familiar basis that can span any conversation.

Eight years ago, Chris Messina suggested # for the what dimension. Email and social media conventions use @ for the who dimension. Questions or interrogatives are suffixed by ?. Whereas apps like Twitter/Foursquare already supply geolocation for social media posts, explicit use of a slashtag “/” can demarcate informal location names, “/HomeOfTheBlazers”, or even pathways through nested locations, “/Pdx/ThePearlDistrict/VerdeCocina” (the particular restaurant Verde Cocina in the Pearl District in Portland).

A future of text will declare semantic basis tags for all conversational dimensions. A mapping between dimensions, tags, examples, and interpretive meanings is summarized here:

Regardless of what symbols — or perhaps emojis :-) — standardize into a preferred basis, an eigenbasis will let people tailor semantically rich messages with a simple, compact syntax.

From a knowledge discovery viewpoint, basis tags will enable a drill-down navigational experience that lets you start at a root tag like # (topics) and recursively narrow down messages by specifying intersecting dimensions — /, @, ?, !, *, emoticons, as well as other topics. This fractal-like search is akin to how product search at Amazon.com lets you discover what you are looking for by specifying facets/dimensions one at a time.
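The drill-down described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a specification: the regular expression, the `extract_tags` and `drill_down` names, and the sample messages are all invented for illustration, and only the #, @, and / prefixes are handled.

```python
import re

# Assumed token shapes: "#" topic, "@" person, "/" location path.
# A tag must start the message or follow whitespace.
TAG_PATTERN = re.compile(r"(?<!\S)([#@/])([A-Za-z0-9_/]+)")


def extract_tags(message):
    """Return the set of (prefix, value) basis tags found in a message."""
    return set(TAG_PATTERN.findall(message))


def drill_down(messages, *facets):
    """Keep only the messages that carry every requested facet."""
    wanted = set(facets)
    return [m for m in messages if wanted <= extract_tags(m)]


# Hypothetical sample messages.
messages = [
    "Great tacos #food /Pdx/ThePearlDistrict/VerdeCocina",
    "@alice is cooking tonight #food",
    "Game night #sports /HomeOfTheBlazers",
]

# Start at a topic, then narrow by intersecting a second dimension.
food = drill_down(messages, ("#", "food"))
food_with_alice = drill_down(messages, ("#", "food"), ("@", "alice"))
```

Each extra facet passed to `drill_down` intersects one more dimension, which is the one-facet-at-a-time narrowing the product search analogy describes.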

Buckminster Fuller World Game: “Make the world work, for 100% of humanity, in the shortest possible time, through spontaneous cooperation, without ecological offense or the disadvantage of anyone.”

A future of text
is the Taskchain

Celly helps over a million users participate in tens of thousands of social learning networks. Organizations range from mobile communication networks that amplify student voices in the classroom to professional development networks for teachers to international AIDS/HIV education programs to social enlightenment campaigns against worldwide economic inequity (#OccupyWallStreet) and racial injustice (#BlackLivesMatter).

Turning #words into !actions

Observing user feedback and requirements across diverse collaboration scenarios, we’ve found lifecycle patterns essential to learning networks regardless of size, complexity, or social terrain.

What’s common across every learning network is the need for task coordination: a simple, scalable, cross-platform way to let people with any device, from SMS featurephones to browsers to smartphones, manage tasks in emergent (and non-emergent) collaborative scenarios.

During Hurricane Sandy, for example, many Celly networks (aka “cells”) were created and joined by thousands of #OccupySandy volunteers within 48 hours of landfall. A lot of messaging was about coordinating help and supplies — requests for water, transportation, food, medical and health supplies, drivers, clean-up crews.

What was missing, however, was a simple way to uniquely tag tasks so volunteers, organizers, and aid groups could readily address, track, and search for messages about specific incident requests.

Large-scale task coordination remains an unsolved challenge today not only in natural disasters and humanitarian aid, but also in community building, political movements, and the enterprise (more in Part 4). Tasks need to be uniquely labeled and accessible from any device, connect resources and requestors, and store searchable message streams for stakeholder coordination.

In other words, we need a simple, fast, device-agnostic way to harness not only the wisdom of crowds, but also the workforce of crowds.

Inspired by outcomes and reflections from #OccupySandy to #OWS to #edtech, the Taskchain is a global ledger of tasks where anybody can share, discover, and collaborate on issues simply by posting messages with tasktags — words prefixed by “!” to identify tasks. The goal of the Taskchain is to create World Wide Workflow.

How the Taskchain works.

To open a task, anybody can prefix a message/tweet/post with the tasktag “!” like “We !needWater”; an ID generator then canonicalizes the tasktag by appending a sequence identifier, “We !needWater123”. Participants can then refer to the unique tasktag in subsequent messages and track status in an ad-hoc discussion thread. When the task is fulfilled, the poster suffixes the tasktag with an ending “!” to indicate closure, “Water arrived !needWater123!”

Tasktags can be private, scoped locally for an enterprise, organization, or movement. Blockchain technology can generate sequence IDs for a globally distributed and serialized list of public tasktags. Hence the moniker Taskchain.
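The open, reference, and close steps described above can be sketched as follows. This is a minimal sketch, assuming tag shapes inferred from the examples in the text; the `TaskLedger` class, its method names, and the regular expressions are hypothetical, and a local in-memory counter stands in for the ID generator (no blockchain is involved).

```python
import itertools
import re

# Assumed tag shapes (illustrative, not a published grammar):
#   "!needWater"      opens a task (letters only, no ID yet)
#   "!needWater123"   refers to an existing task
#   "!needWater123!"  closes it
NEW_TAG = re.compile(r"(?<!\S)!([A-Za-z]+)(?![A-Za-z0-9!])")
CANONICAL_TAG = re.compile(r"(?<!\S)!([A-Za-z]+\d+)(!?)(?![A-Za-z0-9])")


class TaskLedger:
    """In-memory stand-in for the ledger; a counter replaces the ID generator."""

    def __init__(self, start=1):
        self._sequence = itertools.count(start)
        self.status = {}   # canonical tag -> "open" or "closed"
        self.threads = {}  # canonical tag -> messages that mention it

    def post(self, message):
        """Canonicalize new tasktags, file the message, and apply closures."""

        def canonicalize(match):
            tag = f"{match.group(1)}{next(self._sequence)}"
            self.status[tag] = "open"
            self.threads[tag] = []
            return f"!{tag}"

        message = NEW_TAG.sub(canonicalize, message)
        for tag, closing in CANONICAL_TAG.findall(message):
            if tag in self.status:  # ignore IDs this ledger never issued
                self.threads[tag].append(message)
                if closing:
                    self.status[tag] = "closed"
        return message


ledger = TaskLedger(start=123)
opened = ledger.post("We !needWater")
ledger.post("Truck leaving with bottles for !needWater123")
ledger.post("Water arrived !needWater123!")
```

The thread for each tag includes the opening message itself, so a search on the canonical tasktag returns the whole conversation from request to closure.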

Last May, I attended a conference in DC about the Role of Big Data in Improving Security and Resilience to Catastrophic Events. During a morning Q&A session, I briefly mentioned how experiences during Hurricane Sandy underscored the need for tasktags. Disaster relief experts — panelists from the United Nations Office for the Coordination of Humanitarian Affairs, American Red Cross, FEMA, and The White House — expressed interest in a lightweight task management system.

At the conference, I also met Patrick Meier of Qatar Computing Research Institute (QCRI). Recently I read his team’s latest report about the role of social media in informing UN needs assessment during disasters. Specifically, QCRI studied 2 million tweets resulting from Typhoon Haiyan (known locally in the Philippines as Yolanda). They identified tweet clusters through supervised machine learning and applied Latent Dirichlet Allocation (LDA) to group tweets into topic models.

Below is a mockup of sample tweets showing tasktags juxtaposed with the original tweets from Typhoon Yolanda to illustrate how the Taskchain can spontaneously create ad-hoc conversation threads per incident request according to tasktag:

Operation on low-cost SMS featurephones is critical to inclusion, especially in the developing world where smartphones are less prevalent. Here’s a simulation of how the Taskchain can work with text messaging:

The Taskchain is an example of how semantic tailorability can create World Game benefits — huge, positive impact with the least amount of resources in the shortest amount of time. Tasktags are a simple, tailorable way to make needs explicit rather than implicit. This can improve accuracy of Multi Cluster/Sector Initial Rapid Assessment (MIRA) situational reports and improve cost and efficiency of resource allocations.

Cyclone Pam recently devastated the islands of Vanuatu. The Taskchain is a low-friction, practical tool for “Supporting improvisation work in inter-organizational crisis management” for Vanuatu, locally in Portland, and places in between. Taskchain applications can span from disaster relief to business workflow to alliance workflow to community building.

In summary, the Taskchain is a way for people to tailor global to-do lists. Whether the list items are tasks, questions, conversation threads, instructions, options, or issue inventories, the notion of a list command gives people a means of enumeration in which list items are conveniently captured and made searchable in the cloud. By extending its lexicon with a single character, “!”, Twitter (and other sites) could unlock a future of text powering global collaboration: the Twitterchain.

http://reliefweb.int/report/world/hashtag-standards-emergencies

A future of text
is folksonomy

Ideas of conversational basis tags and the Taskchain naturally lead to folksonomy — bespoke vocabulary of tags created by community members to share collaborative information for a particular movement, enterprise, or organization.

OCHA’s standardized social media hashtags for disaster response are a sample folksonomy. OCHA prescribes a hashtag vocabulary to be used for social media during disasters to help organizations coordinate relief efforts.

Stowe Boyd’s microsyntax project, introduced five years ago, was a repository for emergent folksonomies found in the wild. Many Semantic Web ontologies exist that prescribe vocabularies for RDF triples. Just a few weeks ago, Square announced $cashtags to send payment to individuals via messaging. And earlier this week Snapchat announced a vocabulary for friend emojis.

But what’s still missing is a way for people to tailor their own folksonomies: canonical tags for user-specific domains.

Celly designed a conceptual Professional Development (PD) network for K12 teachers with the Oregon Department of Education (ODE) based on a repository of Common Core Standards documents. Like many states, Oregon is concerned about Common Core Standards for education. Teachers, students, and parents have questions about the efficacy of particular standards, where to find standards, what standards apply to specific classes, what learning resources are aligned with standards, and which resources have been proven to improve learning outcomes. In short, PD networks that share best practices are needed.

Our PD network architecture drives PD conversations by auto-generating hashtags for Common Core items. To tag messages with Common Core Standards, teachers just start typing keywords that trigger autocomplete suggestions based on words found in the Common Core Standards specifications. When a teacher types “#geom”, a menu pops up that lists the specification entries for all Common Core Standards that include the phrase “geom”. When a specification is selected, a canonical hashtag associated with the Common Core specification is automagically appended to the teacher’s original message (see animation below). This folksonomy-driven PD network provides key benefits: (1) teachers don’t have to memorize Common Core Standards; (2) Common Core Standards hashtags are auto-generated and canonicalized, so teachers don’t have to create and remember them; and (3) teachers can connect with peers by tracking Common Core Standards tags that match their teaching area and expertise.
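The autocomplete flow described above might look roughly like this in Python. It is a sketch, not Celly’s implementation: the `Standard` record, the `suggest` and `tag_message` functions, and the hashtag format are assumptions, and the three entries are placeholders rather than real Common Core identifiers or text. The sketch also assumes the typed fragment is swapped for the canonical hashtag at the end of the message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standard:
    hashtag: str      # canonical hashtag (hypothetical format)
    description: str  # specification text that autocomplete searches


# Placeholder entries for illustration only; not real Common Core content.
STANDARDS = [
    Standard("#ExGeom1", "Geometry: define angle, circle, and line segment"),
    Standard("#ExGeom2", "Geometry: prove theorems about triangles"),
    Standard("#ExAlg1", "Algebra: solve linear equations in one variable"),
]


def suggest(fragment, standards=STANDARDS):
    """Return the standards whose description contains the typed fragment."""
    needle = fragment.lstrip("#").lower()
    if not needle:
        return []
    return [s for s in standards if needle in s.description.lower()]


def tag_message(message, standard):
    """Swap the trailing '#fragment' (if any) for the canonical hashtag."""
    head, separator, _ = message.rpartition("#")
    base = head if separator else message
    return f"{base.rstrip()} {standard.hashtag}"


draft = "Anyone have a triangle proof activity? #geom"
menu = suggest("#geom")             # entries shown in the popup menu
tagged = tag_message(draft, menu[1])  # the teacher picks the second entry
```

Because the canonical hashtag comes from the repository rather than from the teacher, two teachers who type different fragments still land on the same tag.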

Every teacher who has seen this PD network design has been impressed by how easy and fast it is to co-create knowledge with their peers and subscribe to topics of interest organized by Common Core folksonomy tags.

A future of text transcends #edtech into enterprise messaging systems and movement building where user-tailorable folksonomies are seeded by document repositories and messages stored in organizational memories.

A future of text
is structural search

In “Reflections on NoteCards” Frank Halasz cited two types of search in hypertext systems: content search and structural search.

Content search is basic keyword search where users enter words and phrases as queries and matching nodes are returned. Google dominates today’s $50B US digital advertising industry, whose revenue is largely derived from keyword searches.

Structural search, on the other hand, divines results from patterns within a semantic graph. In 1991, Halasz mentioned how structural search had not gained much traction. In 2001, he re-emphasized that point.

“a search engine that looks for structural patterns makes little sense given the lack of meaningful node-link structure inherent in the Web.” — Reflections on “Seven Issues”: hypertext in the era of the web (2001)

Despite Halasz’s observations 14 years ago, node-link structure in the Web and semantic search are alive and growing today.

Programming languages exist to query structure contained in tables, data cubes, object graphs, XML documents, and Linked Data: from SQL to OLAP to Object Query Language (OQL) to XQuery to SPARQL. Structural search is used billions of times a day by jQuery to identify nodes in a DOM tree (“find all links in this paragraph of this document”).
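To make the “find all links in this paragraph” query concrete, here is a small sketch using Python’s standard `xml.etree.ElementTree` module in place of a browser DOM. The sample markup, the `example.org` URLs, and the `links_in_paragraph` helper are assumptions for illustration, and the sketch only works on well-formed markup, which real Web pages often are not.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample document.
DOCUMENT = """
<html>
  <body>
    <p id="intro">See <a href="https://example.org/a">A</a> and
      <a href="https://example.org/b">B</a>.</p>
    <p id="other">Unrelated <a href="https://example.org/c">C</a>.</p>
  </body>
</html>
"""


def links_in_paragraph(markup, paragraph_id):
    """Return the href of every <a> nested inside the <p> with the given id."""
    root = ET.fromstring(markup.strip())
    # Structural step 1: locate the paragraph node by attribute.
    paragraph = root.find(f".//p[@id='{paragraph_id}']")
    if paragraph is None:
        return []
    # Structural step 2: walk only that paragraph's subtree for links.
    return [anchor.get("href") for anchor in paragraph.iter("a")]


intro_links = links_in_paragraph(DOCUMENT, "intro")
```

The query never looks at the words in the paragraph; it matches on the shape of the tree, which is what separates structural search from content search.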

From a machine learning viewpoint, the entire industry of Search is built from the node-link structure inherent in the Web. PageRank creates the world’s largest adjacency matrix formed by hyperlinks connecting Web pages. Facebook Graph, Google Now predictive search, and Google Knowledge Graph/Vault harvest semantic knowledge from the vast, linked landscape of social media and the Web.
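As a rough illustration of how link structure alone can rank pages, here is the textbook power-iteration form of PageRank on a toy graph. It is a sketch of the published idea, not a description of any production search engine; the four-page graph, the 0.85 damping factor, and the iteration count are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages from a dict that maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its out-links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no out-links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy Web: each key links to the pages in its list.
LINKS = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(LINKS)
best = max(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links ends up on top, and the page nobody links to ends up at the bottom, without any keyword being consulted.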

As users and machines add basis and folksonomy tags to social media, Web pages, and Internet Of Things device output, search engines will ingest these semantic clues and become even smarter.

I built database engines for over a decade — including object, XML, SQL, stream, and distributed databases — and appreciate the formidable challenges of semantic heterogeneity. Creating semantic mappings between independent schemas is hard. Recovering structured tables from unstructured and semi-structured Web and social data is hard. Nevertheless, I anticipate Search experts will continue to solve these problems and utilize basis tags and folksonomy tags to improve schema-level attribute extractors, speech acts, topic models, sentiment analysis, data source quality rankings, and Natural Language Processing (NLP) models.

Improved semantic insight will in turn improve Natural Language Interfaces (NLIs) for Search. OK Google and Siri already provide basic NLIs that let non-technical users voice simple queries over semantic repositories. As mentioned in Futures of Text, NLIs will evolve from single request/response answer bots into always-on conversational companions.

For users who want hands-on direct semantic queries, languages like SPARQL can slice and dice RDF triples; however, the barriers to entry are high for non-programmers. What’s missing is a way to make queries reusable so beginners can easily tailor existing queries by plugging in their own parameters. A future of structural search for non-programmers will be graphical Query-By-Example (QBE), rather than one-shot keyword result pages.

QBE interfaces are essentially dynamic forms for search. They present nested search fields that are auto-generated from inferred schemas gathered on-the-fly from functions like Web Service APIs and IOT endpoints, content feeds like Twitter and RSS, and structured facts from Semantic Web repositories like Freebase and Google Knowledge Graph. Users start a query by typing keywords into a text box and autocomplete options appear with associated query parameters. QBE guides the user along, offloading them from having to know the underlying query language syntax.

QBE can produce a hyperquery — a “query sentence” that can be searched and easily reformulated with new parameter values by less sophisticated users. As NLI to SQL translators improve, hyperqueries can meld into the messaging experience letting you type while parts of speech — nouns, verbs, prepositions, adjectives — are dynamically embellished with hypertext dropdowns and autocomplete suggestions.
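One way to picture a hyperquery is as a saved query sentence with named slots that a beginner can refill. The sketch below is an assumption-laden illustration, not a SPARQL engine: the `Hyperquery` class, the `$slot` and `?variable` conventions, and the sample triples (borrowed from the restaurant example earlier in this article) are all hypothetical, and only a single triple pattern is matched.

```python
from dataclasses import dataclass
from string import Template

# Hypothetical sample facts as (subject, predicate, object) triples.
TRIPLES = [
    ("VerdeCocina", "locatedIn", "ThePearlDistrict"),
    ("ThePearlDistrict", "locatedIn", "Portland"),
    ("VerdeCocina", "serves", "Mexican"),
]


@dataclass(frozen=True)
class Hyperquery:
    sentence: str  # human-readable form with $slots
    pattern: tuple  # one triple pattern: "$name" = slot, "?name" = result

    def describe(self, **params):
        """Render the query sentence with the user's parameter values."""
        return Template(self.sentence).substitute(params)

    def run(self, triples, **params):
        """Fill the slots, then return bindings for every matching triple."""
        bound = [params[t[1:]] if t.startswith("$") else t for t in self.pattern]
        results = []
        for triple in triples:
            if all(b.startswith("?") or b == v for b, v in zip(bound, triple)):
                results.append(
                    {b: v for b, v in zip(bound, triple) if b.startswith("?")}
                )
        return results


where_is = Hyperquery(
    sentence="Which things are $relation $place?",
    pattern=("?thing", "$relation", "$place"),
)
in_portland = where_is.run(TRIPLES, relation="locatedIn", place="Portland")
in_pearl = where_is.run(TRIPLES, relation="locatedIn", place="ThePearlDistrict")
```

The same saved query answers both questions; only the parameter values change, which is the reuse that would let less experienced users tailor a query without touching its syntax.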

Richer, more natural hyperquery interfaces will appear in products from Gmail to Evernote to Messenger to Slack, enabling non-programmers to tailor structural queries that drill deeper into higher dimension worlds of the public Semantic Web and private corpora.

So much of querying is currently about data in the Social Graph or the Knowledge Graph. Along with all this data, there’s a Function Graph too. Alan Kay once mentioned the need for a universal interface language. I’m not sure what he meant exactly, but as IOT devices and REST endpoints grow, machines will need their own search engines to discover each other and to make sense of so many interface APIs, nouns, and verbs. Beyond human-machine interactions, machines will autonomously chat with other machines.

As the 5Vs of Big Data (Volume, Variety, Velocity, Veracity, and Value) continue to grow, ad-hoc queries will return increasingly stale query results. To keep up with this challenge, prospective search agents will track the Internet continuously and push the freshest answers to us in situ, as they happen and in the context where they occur (Minority Report-like targeted ads will regrettably come along for the ride too).

Digital labyrinth carved on a pillar of the portico of Lucca Cathedral, Tuscany, Italy.
The Latin inscription says “HIC QUEM CRETICUS EDIT. DAEDALUS EST LABERINTHUS . DE QUO NULLUS VADERE . QUIVIT QUI FUIT INTUS . NI THESEUS GRATIS ADRIANE . STAMINE JUTUS”, i.e. “This is the labyrinth built by Dedalus of Crete; all who entered therein were lost, save Theseus, thanks to Ariadne’s thread.”

Summary

Hypertext is half a century old yet is still today’s most influential text application. And will remain so. Because hypertext is the Web.

Hypertext lets us weave ideas, information, people and machines together across space and time, creating portals and pathways that propel us to new worlds.

Hypertext is Ariadne’s ball of thread that we each carry through the maze of our own lives, documenting our personal adventures, tracing our collective lineage, recording fruitful routes in a shared map of human understanding — a consilience that guides us collectively forward, step-by-step, link-by-link, while honoring and remembering our past.

“William Whewell, in his 1840 synthesis The Philosophy of the Inductive Sciences, was the first to speak of consilience, literally a ‘jumping together’ of knowledge by the linking of facts and fact-based theory across disciplines to create a common groundwork of explanation.” — Chapter 2, Consilience, Edward O. Wilson

Simply put, hypertext enables us to learn together.

As the foundation for tailorability, I think hypertext will create many futures of text — from basis tags that empower richer storytelling, global workflow via a Taskchain, folksonomies for collaboration, to metaphors for structural search and beyond.

Paired with today’s mobile devices, subsequent text innovations will expand tailorability literally into new dimensions. In Part 3, I’ll explore a key derivative application, Spatial Hypertext, and futures of text inspired by contextual tailorability.

This article continues in Parts 3 to 5:

Preamble
Part 2: Hypertext as Semantic Tailorability
Part 3: Spatial Hypertext as Contextual Tailorability
Part 4: IRC as Collaborative Tailorability
Part 5: BBS as Programmatic Tailorability

Thanks for reading! And deepest thanks to all people, products, and projects I’ve cited and hyperlinked :-) !

With Gassho!

Russell Okamoto
A Public Cellyzen
