Hashtags and Confidence
By David Weinberger
AI Outside In is a column by PAIR’s former writer-in-residence, David Weinberger, who offers his outsider perspective on key ideas in machine learning. His opinions are his own and do not necessarily reflect those of Google.
Making online objects more understandable?
I love how many machine learning systems require explicit decisions about what level of confidence we want them to assume as they make their classifications and correlations. That makes these machines more consistently humble than we humans are.
As I said in my previous post, I even hold the hope that we’re going to learn from these machines to attach confidence levels to much of what we say. Perhaps we’ll even use the 0.0 to 1.0 scale that machine learning has adopted from statisticians.
So, in my fever dream of the future, we routinely say things like, “That celebrity relationship is going to last, 0.7 for sure!” and “Leo is a 0.6 shoo-in for an Oscar!” Oh, sure, expressions of confidence probably (0.8) won’t take exactly that form. But, then, a decade ago, many were dubious about the longevity of tagging online objects with a word or phrase to make them more findable and understandable. Yet the brief history of tagging perhaps shows that ways of contextualizing our statements can change more rapidly than we might think.
Personal, perspectival filtering
In the mid-2000s, tags were odd creatures. They’d been introduced to the Internet by techies, and after an initial splash, seemed destined to fade away like yesterday’s YouTube sensation. Perhaps (0.3) explicit confidence levels will follow the same trajectory.
But tags, like The Dude, abide.
Photo tagged “the_dude” at Flickr. Photo by Joe Polletta, licensed as CC-BY-SA
Remember word clouds, the bell-bottom pants of the 2000s? This one, made for this article, is courtesy of wordclouds.com.
Tags and explicit confidence levels are similar sorts of modifiers. A public tag tells the world what you think the tagged object is about. Degrees of confidence tell the world how likely you think what you just said is true. Both acknowledge that our experience of the world is a personal, perspectival filtering of it. Both acknowledge that our statements reflect our standpoint, not the world itself.
And both have entered discourse from the technological end of the pool. Tagging first hit public consciousness in the early 2000s at del.icio.us, a social bookmarking site popular with the technically literate at the time. It’s now offline, but originally it let you maintain a list of URLs that you might want to find again. Since those lists could quickly get very long, del.icio.us let users attach brief labels (tags) to the URLs. Then, in an act of genius, it let people share their lists and tags. If you were interested in, say, 35mm photography, you could search for the “photography” and “35mm” tags at del.icio.us and find thousands of pages that had been tagged that way by readers.
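For readers who think in code, here is a minimal sketch of that idea: a shared store in which any user can attach any tags to any URL, and anyone can search by a combination of tags. It is an illustration only, not how del.icio.us was actually built. The `TagIndex` class, its method names, and the example.com URLs are all hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Toy shared-bookmark store (hypothetical; not del.icio.us's real design)."""

    def __init__(self):
        # tag -> set of URLs that at least one user has given that tag
        self._urls_by_tag = defaultdict(set)
        # url -> {tag: number of distinct users who applied it}
        self._tag_counts = defaultdict(lambda: defaultdict(int))
        # (user, url, tag) triples already recorded, to avoid double counting
        self._seen = set()

    def add(self, user, url, tags):
        """Record that `user` bookmarked `url` with the given tags."""
        for tag in tags:
            tag = tag.lower()
            key = (user, url, tag)
            if key in self._seen:
                continue
            self._seen.add(key)
            self._urls_by_tag[tag].add(url)
            self._tag_counts[url][tag] += 1

    def search(self, *tags):
        """Return the URLs that carry every one of the given tags."""
        sets = [self._urls_by_tag.get(t.lower(), set()) for t in tags]
        return set.intersection(*sets) if sets else set()

    def tags_for(self, url):
        """Return {tag: user count} for one URL, across all users."""
        return dict(self._tag_counts.get(url, {}))


index = TagIndex()
index.add("alice", "https://example.com/rangefinders", ["photography", "35mm"])
index.add("bob", "https://example.com/rangefinders", ["photography", "cameras"])
index.add("carol", "https://example.com/darkroom", ["photography"])

matches = index.search("photography", "35mm")
crowd_view = index.tags_for("https://example.com/rangefinders")
```

Nothing in the sketch decides in advance which tags are allowed. The only structure is whatever the readers themselves supply, and the per-URL counts in `tags_for` are a crude picture of what a page means to the people who saved it.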
By readers. That was the controversial thing about tags, for tags themselves were not new. Humans have been labeling things pretty much since we began writing — in fact, quite possibly using images as labels before we began writing. But as organizational systems scaled up, the categories were usually made up by experts who devised entire hierarchies of categories, sometimes of great beauty and satisfying neatness, in which everything had its place, and each thing only had one place. This was because labels were initially applied to physical things, and a physical thing can’t be in two bins at the same time.
From that point of view, user-based, shared tagging looks like a nightmare. Most tags are created by non-experts who very likely will disagree about how to label something. They are ad hoc and unsystematic. And tagged things can have as many different, even contradictory, tags as the people of the world want to apply to them. To a professional cataloguer or classifier, tags look like a mess.
They are. Messiness is part of tags’ true beauty, though, for it makes them a source of emergent meaning: the variety of tags the world applies to a simple object reveals much about what that object means to the world.
Statements of belief(s)
Del.icio.us peaked and faded, but tags remain. The Apple file system encourages users to use them. The Steam game platform applies them to the games it offers for sale, and lets users invent them for their own use. Some photo sites encourage their use. Gmail lets us create labels, which are a type of tag. (As a user, I wish Google Photos had tags; machine learning is mind-blowingly good at identifying images, but it will probably never anticipate that I want to tag that one beach scene as “saw dolphin” or “lost wallet.”)
And then there’s the Big Dog of tags: Twitter. The tech industry’s Chris Messina invented the hashtag in 2007 by suggesting that attendees at a techie “unconference” put “#barcamp” into their tweets so that other attendees could find them by searching. By now hashtags have driven tags into the syntax of conversation itself. They are used not merely to identify tweets but commonly as a meta-commentary on a tweet, shading it, contextualizing it, sometimes even taking it back, as in “Sorry! #NotSorry.” You’ll even sometimes still hear a young person on TV saying something like, “Sure, I’ll come with you, Dad. Hashtag: DadTrip. Hashtag: HideUnderMyHoodie.” At this point, such a use of hashtags is actually more likely intended to tag a character as a silly bourgie rebel.
It is quite fascinating that what began as a way of making something more findable so quickly became a way of commenting on one’s comment. More startlingly, we have with remarkable speed overcome millennia of cultural and institutional assumptions about the importance of neatly categorizing things as a way of capturing what they essentially are. Now we’ve come to believe that categories are not statements of a thing’s eternal essence, but are expressions of what things mean to us as individuals.
In a similar way, statements of one’s level of confidence make it clear that knowledge is always imperfect, or, more exactly, can never be known to be perfect. Our level of confidence in a statement of belief is an integral part of that belief. The rise of tags shows us that we are willing to introduce conventions for contextualizing our statements. So perhaps there’s hope that levels of confidence will also become a more common part of how we routinely express ourselves.
0.8 for sure.