Cutting Through the Fuzz: What Sets Information, Data, Knowledge and Art Apart

Joseph Nease Gallery
Joseph Nease Art Gallery
17 min read · Feb 6, 2024
David Bowen, outsourced narcissism, computer, robotic arm with a camera attached, mirror, monitor, cables, as displayed in Joseph Nease Gallery’s Catching Up / Resurfacing Exhibit.

In David Bowen’s outsourced narcissism, a computer-controlled robotic arm with a camera uses a mirror and AI to “recognize itself” and post selfies on Instagram when certainty exceeds 85%. It continually improves its self-identification skills through this process. The question posed here is which part of this mechanization do we consider “information” versus “data”? Is there any sort of true “recognition” taking place that we might consider “knowledge”? And where exactly is the “art”?
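
The logic of the piece can be sketched in a few lines. This is a loose illustration, not Bowen's actual code; the simulated recognition model, the threshold handling, and the posting step are all stand-ins.

```python
import random

# The 85% certainty gate described above; the model and the posting step
# are stand-ins, not the artist's actual implementation.
CONFIDENCE_THRESHOLD = 0.85

def recognize_self(frame):
    """Stand-in for a vision model scoring how certain it is that the
    mirrored image is itself (0.0 to 1.0). Here the score is simulated."""
    return random.random()

def run_cycle(frames):
    """'Post a selfie' only when recognition certainty clears the gate."""
    posted = []
    for frame in frames:
        score = recognize_self(frame)
        if score > CONFIDENCE_THRESHOLD:
            posted.append((frame, score))  # in the installation: post to Instagram
    return posted

results = run_cycle(range(1000))
```

The interesting part, conceptually, is that only the gated outputs ever become public: every posted frame has, by construction, cleared the 85% certainty bar.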

To acknowledge a quick counterargument: a lot of the following may come off as posturing, a series of wholly unnecessary distinctions. Do we really need to distinguish Information from Data, for instance? It’s a fair question. But at the very least, it’s still an unanswered question. That alone makes it worth pondering.

Artists have, in truth, been struggling with this sort of fuzziness for decades. The public is the latecomer here, with the widescale use of Large Language Models in 2023 suddenly pulling these issues into public discourse.

In large part, this is due to a fundamental change in how we search for things online. A lot of us have begun to use ChatGPT for answers, and it’s already testing our sense of informational “correctness”. We may ask it to draft a briefing for a meeting, or the beginning of an article, or a difficult email we don’t want to write. This is already a fundamentally fuller sort of question than we would ask Google.

Through traditional search, we learned to communicate by some strange form of keyword speak. “Cafes … Open … Now … Nearby” or “Symptoms … Stomach Ache … Chills … Medicine?” It’s a bizarre way to ask for things, to say the least, and we’re even more bizarrely accustomed to it.

Though now we’re asking ChatGPT questions like “provide me with a comprehensive list of all cafes nearby that are dog friendly and offer dairy alternatives” or “I’m going to list a series of symptoms, which I need you to look at and advise as to the possible causes as well as the possible treatments, okay?”

This is a dialogue, not a simple query. These are fuller, more complex requests, and they imply a certain agency on the part of the models themselves.

In some studies, ChatGPT has proven a more trustworthy source of medical information than panels of human physicians. Though it doesn’t always get specifics correct, and these sorts of hallucinations seem to be more a feature than a bug.

This sort of “bug” might simply be unrecognized inspiration, unfettered by appropriate context. A human child may go on about how their cat is bigger than a dinosaur, and we certainly wouldn’t berate the child for not understanding scale. We’d likely consider it funny because we understand what the child is trying to imply, and it also works as unintended hyperbole. If we learn that the child was referencing one of their toy dinosaurs to conclude that their cat is larger, it’s all the funnier for the irony that the adults were the ones who misunderstood the context the child was working with.

Why do we not extend Large Language Models the same courtesy?

Generative models will more often than not write beautifully cogent paragraphs in response to our questions, but what’s far more intriguing is that the context itself can be entirely novel (again, implying a certain uncontrollable agency). We call this peculiarity “hallucination”, though it’s not at all clear that it’s different from the same sort of miscommunication we experience with each other when our context is misaligned. Therefore, we have to ask, what is the information? What is the data being utilized? Can we consider it some form of artificial knowledge, and why does the writing oftentimes feel so creative?

Kathy McTavish, Quilt Factory Quilt, cotton, polyester, thread, code, 73.5 x 73.5, 2023

These quandaries extend into the visual realm as well. Kathy McTavish is a media composer, cellist and installation artist whose work blends data, text, code, sound and abstract, layered moving images. Years before ChatGPT entered the public zeitgeist, she was already creating generative methods for building networked, multichannel video and sound environments.

This work (like the quilt shown above) is generative, has a certain agency of its own, and yet conveys an aperiodicity that is strikingly human. There is an underappreciated symmetry in the randomness, much as you’d expect from Penrose tiles or a historic mosque’s non-repeating patterns. Her cross-sensory, polyphonic landscapes flow from the digital into physical spaces, and cause us to ask again: where does data turn into information, what do these models “know”, and why are the patterns they produce so beautiful?

These questions have taken on a newfound materiality on newer platforms like DALL-E and MidJourney, which allow users to employ their knowledge of art to turn information into data, which then creates new art.

MidJourney Prompt: an abstract painting with pink flowers and blue dots, light violet and white, vibrant and textured, impressionist gardens, light white and turquoise, vibrant and colorful abstracts, color-field, — ar 1920:1080

It’s easy to see why generative dialogues may become the preferred form of online productivity and interaction for much of the planet. The adoption figures suggest they already are. It’s only human.

We already communicate this way with each other — why shouldn’t we do so with machines? It’s not been possible until now, but the future has arrived rather suddenly. Some might say, ‘under cloak of night’.

Numerous studies suggest the world at large probably isn’t ready (throw a stick in any direction and you’ll hit a dozen of them nowadays). But the internet itself, as a beast of burden, is already at technical parity: a wild synthetic animal fueled by the sweaty fingers of a billion users, the whole far more than the sum of its parts, all of it finding its way into neural nets that produce fluid answers, trained on the past two decades of human input and an archive of known human history.

Which is at least part of why generative interactions are so enslaving. They are the logical end of our blood, sweat, and tears, accumulated into information and mirrored back at us like Narcissus’ reflection. If we aren’t careful, we may die by the water’s edge after forgetting to eat and sleep. Hopefully the revolution will not be Instagrammed, lest everyone become too busy with their own content creation to notice the robots tilling our fields have collectively decided to leave planet Earth.

The underlying point here is that this generative technology is more addictive than anything that came before it. Widespread adoption, for practical and obsessive reasons alike, is therefore inevitable. The numbers don’t lie.

ChatGPT, launched by OpenAI, has seen the most rapid growth ever witnessed in an online platform. It acquired 1 million users just 5 days after launching in November 2022, reached an estimated 100 million monthly users by early 2023, and by February 2023 was logging roughly 1 billion monthly site visits. By May 2023, it had 1.8 billion worldwide visits.

Google Bard, which launched in beta in February 2023, had 142.6 million visits by May 2023, a significant increase of 187.2% from April. However, this is still significantly less than the number of visits to ChatGPT.

While these large language models are growing, traditional Google searches still dominate. Google Search traffic fell by only 0.4% in October 2023, and Google still has more than 1 billion daily active users. However, in the realm of generative search, Google is lagging considerably behind. In fact, quantitatively, it looks as though OpenAI is as much ahead of Google in the realm of Large Language Model implementation as Google is ahead of OpenAI in general internet searches.

Recently, Bard, Google’s chatbot, managed to edge past GPT-4 in the LMSYS Leaderboard rankings, briefly claiming second place thanks to an update that incorporated the new Gemini Pro model. But one has to ask, does it actually matter when generative content creation (video, images, music, etc.) is now the standard? If a digital platform company isn’t using generative technology nowadays, it’s perceived as a black mark, a signal of looming obsolescence. Generative content capabilities are now the expectation.

The ultimate irony here is OpenAI doesn’t really care about internet search algorithms, or image generation, or any of the other things everyone else is competing for — their target is Artificial General Intelligence (AGI). So it would seem Google and OpenAI aren’t even playing the same game. Their respective fields simply happen to coincide…for the moment.

Targeting Artificial General Intelligence

Mark Zuckerberg recently threw his hat into the AGI ring, which came as a shock to a lot of people. Most of all because, as Meta continues to prove with its open-source LLM Llama, he plans to give his AGI model away for free. Whatever one thinks of Facebook or Zuckerberg post-Cambridge Analytica, he’s setting his sights as high as possible.

The ultimate target matters. The space that lies between oneself and one’s goals serves not only as a measure of the challenges we impose upon ourselves but also as that same narcissistic mirror, reflecting the perspective one hopes to gain from the effort. This principle is vividly illustrated in the realms of both art and technology, where unconventional approaches and the pursuit of novel objectives tend to pave the way.

Disruption, by definition, comes from off the playing field. One of the best examples is the information economy itself. What began as ARPANET, a project to create a robust, decentralized communication network (often framed as a way to keep communication alive in the event of a nuclear attack), ultimately led to the creation of the internet, a technology that has profoundly transformed nearly every aspect of modern life.

Disruptive signals, it would seem, are always worthy of our consideration.

James Woodfill Code Practice Cart, mixed media, 2019

James Woodfill’s exhibition “Crossing Signals” at the Joseph Nease Gallery, originally known as “CODE PRACTICE,” explored the dynamic relationship between various modes of communication and the development of perception through art.

Influenced by the principles of signaling theory from evolutionary biology, Woodfill’s art delves into the authenticity and dependability of signals, especially disruptive ones.

In an era where discerning truth has become increasingly complex, Woodfill provides an unavoidable psychophysical commentary. His current projects boldly challenge traditional perspectives, suggesting that the essence of communication (whether it be data, information, knowledge, or art) evolves with shifting context, rather than being fixed in nature.

Where AI is concerned, we’d all do well to take a good hard look at Woodfill’s body of work. And if we’re lucky, the abyss might stare back at us, so we can get a real sense of what we’re heading into. Of course, when the future seems uncertain, people tend to look more towards the past.

After all, the past seems dependable. But is it?

Woodfill’s work inadvertently provides some thoughtful commentary on this very natural inclination as well. His crafted environments serve as repositories for both shared and personal recordings, even if they appear to us as the talking ghost of machine language. They are the embodiment of the machines we’ve accumulated to record our very lives.

James Woodfill “Boundary Marker — Set #1”, acrylic and gesso on birch plywood and poplar, 27" x 81" x 2"

Even his static frames, hanging on the walls, unfinished and casting shadows, are wholly dependent upon the space, providing a contextual awareness of light as dappled information.

His frequencies, rich with visual, auditory, and spatial components, mimic the representation of memory we’re trying to understand. What is the past to a machine, and what is our relationship to machines that record our past? Woodfill’s work actively engages in shaping and reviving this question, causing us to wonder where personal memory ends and digitized memory begins.

Memory, it turns out, is vital to understanding what we consider to be “human knowledge”, which naturally contextualizes itself against non-human forms of knowledge, chiefly of the digital variety. Is true knowledge possible for a computer? Some prominent thinkers believe so.

But what is the difference between thinking in carbon versus silicon? More specifically, we might ask, what do Large Language Models remember, if anything, from their training data?

Testing External Memory

We haven’t, until very recently, tested the technological bounds of external memory. Books are a continuation of scrolls, which were a continuation of clay tablets, which were in their own way a continuation of cave drawings and oral histories. Computers are certainly a different animal altogether, but they only came on the scene in relatively recent decades.

To go back to the initial line of questioning, there is certainly an argument to be made for distinguishing analog knowledge and art from digitized data and the subsequent information we assume it conveys. Prehistoric elders may have passed stories down from one person to another, but without present-day descendants to interpret their ancestors’ messages, we’re essentially forced to rely on whatever material evidence we can find to interpret them. And once they’re digitized, we have to accept that we’re ingesting something absent its original context.

All of this is very interesting, but to throw a disruptive signal into the mix, what about artificial artifacts? What memories do they invoke? What knowledge? What information, and what art?

How can something be an artifice resembling an artifact, devoid of a prominent place in human history, though exemplifying the truest sense of artistic form?

Cary Esser, Disclosure series, earthenware and glaze

Cary Esser’s artistic journey, as showcased in her Disclosure series, embodies a profound exploration of the potential for artifacts to convey layers of meaning that are not immediately apparent. Esser’s fascination with the interplay between organic and geometric structures, and her reflections on architectural form and surface, serve as a reminder of the way artifacts hold a multitude of meanings. Her work, particularly influenced by Native American parfleches, underscores the idea that an object’s significance is magnified more by the mysteries it carries than by its actual history.

We will here make the case that this is true of all artifacts, especially real ones.

For instance, a piece of Esser’s earthenware transcends its materiality to become a repository of cultural memory (a reflection) and personal interpretation. What’s important when one considers it isn’t the historicity it harkens back to so much as the infinite interpretation it cultivates. This is the quintessence of the philosopher’s stone. The secret ingredient to meaning-making is a sort of indefiniteness: not a finality, but an inexhaustible plausibility of being.

The notion that an artifact’s value is contingent upon its ability to generate boundless meanings may sound contrary to the argument for protecting provenance and true historical record. But provenance only works its magic when there is far more perceived to be unknown than known.

The extent to which an object’s “story” is known versus unknown is actually quite important. Esser’s creations invite viewers to delve into this realm of unbounded interpretation, where the absence of a fully solidified backstory enriches the potential for personal and collective meaning-making. This is key to interpreting the interplay between data, knowledge, and meaning we’re after.

A brief example from history.

The Antikythera device, considered the first known example of an analog computer, dates to the Hellenistic era, around the second century BC. There is certainly far more unknown than known about this device. Many have ruminated on the oddity that the seeds of the industrial and computer revolutions seem to have been present two millennia beforehand. Was the sophistication of the Greeks just an immense flash in the pan? Were the Dark Ages inevitable? Could we have produced the first smartphones and built network states by 330 AD, instead of Constantine uniting disparate territories?

The unknown factors involved give this artifact a power that the greatest ball of twine won’t come near to possessing for another two millennia (if it survives two more years).

Libraries…The Persistent Model

That said, some ideas are profoundly sticky. Around the same period as the Antikythera device, the Library of Alexandria was built, becoming the envy of the Western world. There were many similar compilations of work, such as the Library of Ashurbanipal with its tens of thousands of clay tablets, but it was the Library of Alexandria alone that became most firmly seeded in the minds of Westerners.

The Alexandrian collection has taken on the role of a societal archetype for compiled wisdom. Likely even more so since it was burned to the ground, as institutions lost to history tend to have more longevity as ideas. Would Plato have written Socrates into his Republic as the father of philosophy had his teacher not drunk that fateful cup of hemlock? Darkness is fertile ground for memory-making as tribute.

So it was with this Library, burned down by a mishap with Julius Caesar’s military, or the early Christians, or who knows? Arguably everything archived since the burning of the library has more or less followed the same trajectory — and that’s what matters. The same sort of cataloging.

Whether it was a papyrus scroll from Ancient Egypt, a printed pamphlet from the time of the American Revolution, or a series of tweets from the Arab Spring protests in Egypt: if anyone holds onto these “informational artifacts”, it’s in essentially the same way. We store them through people, or on shelves, or in databases. What are search/filter algorithms if not automated librarians?
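
That "automated librarian" idea can be sketched directly. The catalog entries below are invented for illustration; any real search engine is vastly more elaborate, but the shelving-and-retrieval shape is the same.

```python
# A minimal "automated librarian": shelving and retrieval follow the same
# shape whether the artifact is a scroll, a pamphlet, or a tweet.
# The catalog entries below are invented for illustration.

catalog = [
    {"medium": "papyrus scroll", "era": "Ancient Egypt", "topic": "grain accounts"},
    {"medium": "printed pamphlet", "era": "American Revolution", "topic": "independence"},
    {"medium": "tweet", "era": "Arab Spring", "topic": "protest logistics"},
]

def search(catalog, **criteria):
    """Return every artifact whose fields match all of the given criteria."""
    return [
        item for item in catalog
        if all(item.get(key) == value for key, value in criteria.items())
    ]

hits = search(catalog, era="Arab Spring")
print(hits)  # one match: the tweet entry
```

The medium changes across millennia; the cataloging operation barely does.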

Thus the Library at Alexandria seems to be, at least until 2023, THE de facto model for how artifacts are stored (assuming we view servers as analogous to a digital library, and we very much do).

Getting Back On Track…with Information

Notice we have not yet distinguished information from data, knowledge and art — at this stage in our journey, defining these terms seems prudent. Suffice it to say, we’re in uncharted waters together, so it’s best to move forward thoughtfully lest we mistake a whale for land (or some such other digital Fata Morgana).

To this end, we’ll begin with ‘Information’. What is it, and how is it different from data, knowledge, and art?

Information, since the dawn of the “Information Age”, has taken on a decidedly different flavor than it had in previous epochs. There was something of a genesis in Claude Shannon’s landmark 1948 paper “A Mathematical Theory of Communication.”

In this work, Shannon introduced the key ideas of information entropy and channel capacity, grounding information quantitatively in probability and statistics rather than in the semantic meaning of a message. Shannon, the Bell Labs engineer remembered as the “Father of Information Theory,” defined the problem this way:

“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point…Frequently the messages have meaning; that is, they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.”

There’s something quite essential in the way Shannon flippantly sets aside the realm of human meaning here in favor of comparatively brutish engineering. In the 1800s, most people used the words “information” and “knowledge” interchangeably; unbeknownst to Shannon, this change in focus was a key piece of a much greater and more radical transformation in how ‘information’ itself is conceived. He further stated:

“The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design.”

So in Shannon’s formulation, focused on transmitting signals reliably, information loses any specific meaning and becomes probabilistic: defined by the degree of uncertainty reduced when one message is selected from all the possible messages built into a communication channel. The focus lies in encoding/decoding schemes rather than in interpreting content. This engineering-centric conception guided much of early digital technology. And indeed, as we shall see, it has direct implications for how “knowledge” is generated by Large Language Models today.
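
Shannon's quantitative notion can be made concrete. For a source that emits message i with probability p_i, the information it carries (its entropy, in bits) depends only on those probabilities, never on what the messages mean; the formula below is the standard one from his 1948 paper:

```latex
% Shannon entropy: average information per message, in bits
H = -\sum_{i} p_i \log_2 p_i
```

A uniform choice among eight equally likely messages carries H = log2(8) = 3 bits, whether those messages are stock quotes or love letters, which is precisely the meaning-scrubbed character described above.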

James Woodfill Cart Set With Lights #1, mixed media, 45" x 72" x 84", 2019

At the very least, Woodfill’s disparate signals are starting to sound a lot more like the psychophysical processing of a dualism in which information has become unnaturally dissociated from the knowledge it once represented.

The seas become murkier still when we consider the difference between data and information. Is there a difference?

In an era where data analytics drives decisions and “information is power”, what exactly distinguishes raw data from contextual information? On the surface they may seem interchangeable. But there exist important semantic and structural distinctions with implications for how we transmit, interpret, and apply these vital digital elements.

Squaring the ‘Information’ circle with ‘Data’ and ‘Knowledge’

If we harken back to Shannon’s paper, we might say in retrospect that he was a chemist of sorts, sanitizing knowledge into information, which was then broken down into data that could be transmitted according to the theories he espoused.

Therefore data is the raw substance, or element, of information. Information, in turn, is the sterilized form of knowledge, as information alone lacks the wisdom and agency that knowledge implies. It is by this process that knowledge can be transmuted into information, broken down into Shannon’s bits, and transmitted anywhere in the universe. How it is reassembled on the other end is entirely up to the entities doing the receiving.

Data comprises the raw material of facts, measurements, quantities, characters or symbols that have been captured via instruments, entered into software systems, or generated through simulations and models. They populate databases as lengthy strings of binary digits within fields and tables. Data measures, enumerates, classifies, denotes — but alone lacks the inherent message of information. Like disparate threads, data awaits the organizing perspective that draws out an informational scaffolding to apply knowledge and meaning.
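
The ladder described here, raw data gaining context to become information, and information gaining interpretation to become knowledge, can be sketched in a toy example. Everything below (the readings, the location, the rule of thumb) is invented purely for illustration.

```python
# A toy illustration of the data -> information -> knowledge ladder.
# The temperature readings and the decision rule are invented for the example.

raw_data = [21.0, 21.4, 22.1, 23.0, 24.2, 25.1]  # bare measurements (data)

def to_information(readings, unit="C", where="gallery floor"):
    """Attach context: the same bare numbers become a structured record."""
    return {
        "unit": unit,
        "location": where,
        "mean": sum(readings) / len(readings),
        "trend": "rising" if readings[-1] > readings[0] else "falling",
    }

def to_knowledge(info):
    """Interpretation against experience: a claim a person could act on."""
    if info["trend"] == "rising" and info["mean"] > 22:
        return f"The {info['location']} is warming; adjust the climate control."
    return f"The {info['location']} is stable."

info = to_information(raw_data)
print(to_knowledge(info))  # -> The gallery floor is warming; adjust the climate control.
```

The numbers alone assert nothing; only once context and then interpretation are layered on does anything resembling a "known" claim emerge.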

Information is the go-between for this process. It’s the web that ultimately binds data to knowledge, and vice versa. Thus, both Martin Luther King’s famous “I Have a Dream” speech and the recorded sound of radio waves from a far-off star inhabit the same meaning-scrubbed realm of ‘information’. It is the launching-off point both for creating superfluous human beauty and for grinding that superfluity down into bits which can be sent anywhere in the cosmos.

Knowledge emerges when information finds particularly meaningful relational and contextual structure. Data emerges when information is broken down into its smallest components.

In either case, when patterns are uncovered that link data elements into a coherent picture, they first inhabit the realm of information. It gets confusing, but it’s a distinction worth making.

When trends and cause-effect narratives tie quantities to insights over time, they assume the momentum of meaning. When microscopic threads become an integrated tapestry, revelation and story are the macroscopic result. This is all the handiwork of information’s loom, where the thread is data laid bare and the resulting pattern is knowledge, and ultimately art.

Twilight Of The Gods…Or Something Else?

Standing as a great leviathan of data, information and knowledge, art is the tip of the pyramid that captures everything beneath it. Artistic ventures can be technological, such as David Bowen’s outsourced narcissism. They can actively show us what data is and force us to scrounge in the dark for meaning, as with James Woodfill’s vast body of work. They can break the meaning-making process apart into its most programmable parts, as with Kathy McTavish’s generative quilts. They can even exist independent of historical meaning, as with Cary Esser’s Disclosure series, which, though heavily influenced by the parfleches of Native American culture, deliberately stands apart from that history.

What we really require at this junction is not more code, but more navigation as human beings through towering seas of syntheticism.

There has been much “doomsdaying” as of late, and it’s easy to see why. Setting aside the various wars around the world, the global weirding that’s thrashing coastal cities with record storms (making the north warm, and the south cold), and the aftermath of a pandemic… Setting aside all of those abject horrors, the promise that Artificial Intelligence is coming to rob everyone of both their livelihoods and their self-worth understandably has people on edge.

Although it may seem counterintuitive at a time when many are decrying the end of art and artists, it is precisely art and artists who have been exploring this artificial space for tens of thousands of years. We may be able to create a digital simulacrum of the first cave paintings, but that’s just data assembled into information. It only means something because it relates to the real thing.

It’s only art when we breathe life into it.

Artists like Cary Esser clearly understand all of this at a fundamental level. Better than most. The rest of us seem to be losing sight of humanity, drowned out by our own bemoaning of the end of creativity.

MidJourney Prompt: a painting shows several wild animals, in the style of art of the upper paleolithic, dark white and amber, luxurious wall hangings, the new fauves, stone, explosive pigmentation, dau-al-set.

But artists are the path forward, and they’ve been guiding us from the very beginning. That never stopped being the case. Art is the big road sign, made manifest. And we are the illumination which brings knowledge from data chaos.


Joseph Nease Gallery is a contemporary art gallery located in downtown Duluth, MN.