FactMiners’ Fact Cloud & Witmore’s Text as Massively Addressable Object

A “You Might Like…” recommendation in a recent routine email from Academia.edu led me to Michael Witmore’s 2010 blog post with the titillating title, “Text: A Massively Addressable Object” (Academia.edu publication link). At the time of its writing, Witmore was busy organizing and encouraging multi-disciplinary involvement in the Working Group for Digital Inquiry at the University of Wisconsin-Madison. Dr. Witmore is currently Director of the Folger Shakespeare Library, only the seventh in a distinguished line of Directors to have the awesome opportunity and responsibility to lead the world-renowned research center.

The title alone of Dr. Witmore’s piece assured me of an interesting read. But it was this sentence that unleashed a torrent of idea-synthesizing thoughts so intense as to be felt throughout my body in what can best be described as an “ideagasm”:

“In an earlier post, for example, I argued that a text might be thought of as a vector through a meta-table of all possible words.”

As goosebumps subsided, I was sure that I was on the trail of a Kindred Spirit whose words were “killing me softly with his song.”


Incubation: FactMiners’ Page Segmentation and Internet Archive Digital Collections

I was predisposed to a flood of synthesizing ideas upon exposure to Witmore’s thought-provoking article. As co-founder of The Softalk Apple Project and its affiliated FactMiners digital humanities applied research project, I have been working with our Internet Archive partners to specify file and directory naming conventions, together with public and private metadata to be associated with the items in the Softalk Magazine Collection. Our project takes a big leap forward this month when our collection is digitized at the Archive’s Midwest Regional Scanning Center in Fort Wayne, Indiana, and enters the Archive’s vast collections.

My soulmate wife Timlynn Babitsky and I are recent stage 4 cancer survivors. We envisioned The Softalk Apple Project and FactMiners as a “Pay It Forward” tribute to Softalk magazine and as a way to help us rationalize why we had lived while so many of our treatment buddies died. As a 25th wedding anniversary gift to ourselves on behalf of our projects, we have funded the complete scan by the Internet Archive of the full run of 48 issues of Softalk magazine. Once scanned and on-line at the Internet Archive, the “serious fun” begins as we will now have the “raw materials” that will be the “playing fields” for FactMiners’ Digital Humanities social games for LAMs (Libraries, Archives, and Museums).

FactMiners.org — our grassroots Citizen Science project — is working through our affiliated Citizen History project, The Softalk Apple Project, to provide a permanent, archival-quality home for the preservation of Softalk magazine. Published between 1980–84, Softalk uniquely covered, in almost obsessively broad and deep detail, the dawn of the microcomputer and digital revolutions that shape our 24/7 lives today.

But our project does not stop at page-wise digital images or OCRed text in PDF file formats as our digitization objective. Rather, we’re working with the Archive to ensure that FactMiners’ applied research has a supportive infrastructure to delve into page segmentation and fine-grained document structure and content modeling based on a ‘metamodel subgraph’ design pattern. This approach to domain modeling is intended to take maximum advantage of the ‘schemaless’ nature of widely available graph database software technologies. (See “Where Facts Live”, the Neo4j GraphGist Edition of my #MCN2014 presentation.)
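To make the ‘metamodel subgraph’ idea concrete, here is a minimal sketch of the pattern using plain Python dictionaries in place of an actual schemaless graph database such as Neo4j. The node names, partitions, and relationship types below are illustrative assumptions, not the actual FactMiners schema:

```python
# Minimal sketch of the 'metamodel subgraph' design pattern. The schema
# (metamodel) lives IN the graph as ordinary nodes, alongside the
# instance data it describes. All names here are hypothetical.

nodes = {}   # node_id -> properties
edges = []   # (from_id, relationship_type, to_id)

def add_node(node_id, **props):
    nodes[node_id] = props

def add_edge(src, rel, dst):
    edges.append((src, rel, dst))

# -- Metamodel subgraph: document-structure schema as graph data --
add_node("META:Issue",   partition="META:Structure")
add_node("META:Article", partition="META:Structure")
add_edge("META:Issue", "HAS_PART", "META:Article")

# -- Instance subgraph: actual document data, typed by edges pointing
#    back into the metamodel rather than by a fixed external schema --
add_node("issue-1980-09", title="Softalk, September 1980")
add_node("article-42",    title="Premiere issue editorial")
add_edge("issue-1980-09", "INSTANCE_OF", "META:Issue")
add_edge("article-42",    "INSTANCE_OF", "META:Article")
add_edge("issue-1980-09", "HAS_PART", "article-42")

# Because the schema is itself data, a query can discover the model
# at run time -- the 'self-descriptive' property of a Fact Cloud:
def types_of(node_id):
    return [dst for src, rel, dst in edges
            if src == node_id and rel == "INSTANCE_OF"]

print(types_of("article-42"))  # ['META:Article']
```

The design choice to keep the metamodel in-graph is what lets a schemaless store evolve its model without migrations: adding a new `META:` node extends the schema with a single write.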

A Digital Rose is a Digital Rose…

Unless a cultural artifact is “born digital,” all on-line digital cultural artifacts are copies of the original. Does that matter? What is “the original” of a mass-produced cultural artifact? Is the first issue off the presses of a magazine the “original”? The one hundredth? In what sense does the act of creating an artifact transform the creator’s ideal into a rendered work that can never be anything but an approximation of the creator’s intent? Questions like these have animated in-person and on-line discussions by the International Council of Museums CIDOC Special Interest Group responsible for the development of the ISO standard Conceptual Reference Model for Museums (Libraries and Archives, etc.).

Through my recent experience getting prepared for the ingestion of Softalk magazine into the “V’ger-like” scanning and preservation service of the Internet Archive, I have evolved my thinking about the essential nature of a “fully-realized” FactMiners’ Fact Cloud.

My initial thoughts were of a Fact Cloud as a semantically rich, “self-descriptive” graph database (that is, incorporating a ‘metamodel subgraph’ design pattern) comprised of all of the source document’s ‘elementary facts’ — or ‘assertions’ in the context where ‘fact’ is too strong a word for the atomic conceptual modeling elements to be studied. The very name, FactMiners, was chosen to connote the discovery and extraction of these elementary units of study from the object of study, that is, the source document.
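An ‘elementary fact’ in this sense can be pictured as a tiny assertion record tied back to its place in the source document. The sketch below is purely illustrative — the field names, the example facts, and the citation format are hypothetical, not FactMiners’ actual representation:

```python
# A hedged sketch of 'elementary facts' as minimal assertion records.
# All names, facts, and citation strings are invented for illustration.
from collections import namedtuple

Fact = namedtuple("Fact", ["subject", "predicate", "obj", "source"])

facts = [
    Fact("article-42", "mentions",   "Apple II", "Softalk 1(1), p. 4"),
    Fact("article-42", "written_by", "J. Doe",   "Softalk 1(1), p. 4"),
]

# A 'fact cloud' is then simply the queryable collection of such
# assertions mined from the source document:
def about(subject):
    return [f for f in facts if f.subject == subject]

print(len(about("article-42")))  # 2
```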

No matter how comprehensive that collection of facts/assertions ‘fact-mined’ from the source, my notion of a FactMiners’ Fact Cloud was of it being “something different” — a companion — to the digitized source document collection. But as I read Witmore’s article, it gelled for me that FactMiners’ Fact Clouds are a prime example of the newfound levels of abstraction that open up for study upon adopting a more expansive understanding of Text as Massively Addressable Object.

You can’t read much these days without bumping into the topics of Big Data, Open Data (just the data, not linked), and Linked Open Data — in our case, #LODLAM, Linked Open Data in Libraries, Archives, and Museums. The global tidal wave of our collective efforts to digitize and openly publish vast stores of cultural heritage material has contributed to the challenges and opportunities of Big Data and Open Data — that is, now that we’ve “piled it up,” how do we know what we’ve got, and how do we make it accessible and useful? That’s where the Linked part comes into play. In anticipation and response to these growing challenges, the International Council of Museums (ICOM) chartered a Special Interest Group within its International Documentation Committee to develop what has evolved into the ISO Standard Conceptual Reference Model for Museums (CIDOC-CRM, or #cidocCRM hereafter). Follow this link for more on FactMiners’ #cidocCRMgraph and #cidocCRMdev applied research.

Within Witmore’s perspective, a FactMiners’ Fact Cloud is no longer a companion to the source text but simply another modality of faithful digital preservation. In CIDOC-CRM modeling terms (see sidebar cartoon and caption), a FactMiners’ Fact Cloud is simply an instance of E73 Information Carrier of a digitized cultural artifact. And this Carrier will be sufficient to reconstruct and present a faithful human-readable and/or viewable digital version of the source cultural artifact.

This sense of the “first class nature” of a FactMiners’ Fact Cloud was reinforced by my perusing a number of digital collections already preserved within the Internet Archive collections. What you find in a typical digital collection is an organized collection of TIFF or JPEG image files, often complemented by one or more PDF files — the PDF being essentially a proprietary bundle of #cidocCRM’s E33 Linguistic Objects and E36 Visual Items within its underlying segments/blocks that make up each page of the source document. Seeing these various representative digital collections, I thought:

“How is a collection of image files or PDFs any different than the collection of files and folders that will be generated and stored through FactMiners’ Digital Humanities-inspired social gameplay?”

Thankfully preserved for generations to come, the 80-page “Shaving Made Easy” — an essential reference for retro-urban metrosexual males — is accessed at the Internet Archive through a reader that “reconstitutes” the book-reading experience by presenting a collection of high-resolution JPEG images through a book-simulating image-rendering client viewer in your browser. Follow this link, for example, to the single image of the right-hand page above.

For RESTful and similar Linked Open Data interactive access, a FactMiners’ Fact Cloud will be persisted in a live datastore such as an Open Source graph database like Neo4j. This is what we will do to host FactMiners’ #Play2Learn social games.

When it comes to archival storage of Fact Cloud data, however, we plan to simply write #cidocCRM-compliant FactMiners’ data into the TEI header section of our versioned page segment files to be stored within The Softalk Apple Project collection at the Internet Archive.
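As a rough illustration of that archival plan, the sketch below builds a TEI header carrying one fact-cloud statement, using only the Python standard library. Beyond the `<TEI>` and `<teiHeader>` elements themselves, the element choices and the fact text are simplified assumptions — a validated encoding would follow the TEI Guidelines and the actual #cidocCRM mapping:

```python
# Sketch: embedding fact-cloud data in a TEI header. Element usage below
# <encodingDesc> is a simplified assumption, not validated TEI/CIDOC-CRM.
import xml.etree.ElementTree as ET

tei = ET.Element("TEI", {"xmlns": "http://www.tei-c.org/ns/1.0"})
header = ET.SubElement(tei, "teiHeader")
desc = ET.SubElement(header, "encodingDesc")

# Record the producing application, then one hypothetical fact as a label:
app_info = ET.SubElement(desc, "appInfo")
app = ET.SubElement(app_info, "application",
                    {"ident": "FactMiners", "version": "0.1"})
label = ET.SubElement(app, "label")
label.text = "E73 Information Carrier: Softalk 1(1), page 4, segment 2"

xml_text = ET.tostring(tei, encoding="unicode")
print("teiHeader" in xml_text)  # True
```

The point of the pattern is that the page-segment file remains an ordinary archival object — any TEI-aware tool can read the document, while FactMiners-aware tools can additionally reconstitute fact-cloud data from the header.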

When you access an existing document in a Collection at the Internet Archive today, an image-based slideshow app or browser plug-in is “smart” about showing you the Collection’s associated TIFF or JPEG image files. Or a PDF reader app or browser-plugin does the same for associated PDF format files. In each case, the client viewer faithfully reconstitutes a human readable presentation of the original source document. These apps or browser-plugins even simulate page-turning and other visual cues to enhance your reading or viewing experience.

FactMiners’ client app games and their associated Fact Cloud admin and research tools and viewers will do the same. When not interacting with a FactMiners’ Fact Cloud through a persisted live datastore, our apps and plugins will be able to reconstitute all or part of a Fact Cloud from TEI-encoded archived data. This reconstitution not only makes the source document available for traditional reading and viewing, but also enhances the document through the conceptual lens provided by the Fact Cloud’s interpretive metamodel.

With this recent experience and associated thoughts rumbling around, I was ideally prepared for an ideagasm…

Witmore’s Challenge to Consider Text as Massively Addressable Object

Michael Witmore has a permanent smile since becoming Director of the world-renowned Folger Shakespeare Library. Unless you are feeling particularly self-confident today, I urge you NOT to read Dr. Witmore’s bio unless you can afford to be a deflated shadow of yourself until the “cup half full” part of your self-image kicks back in.

As mentioned earlier, the sentence in Witmore’s article that triggered my ideagasm was this one:

“In an earlier post, for example, I argued that a text might be thought of as a vector through a meta-table of all possible words.”

The implication of Witmore’s base assertion was amplified in further statements:

“What distinguishes this text object from others? I would argue that a text is a text because it is massively addressable at different levels of scale. Addressable here means that one can query a position within the text at a certain level of abstraction.”

And he draws our attention to the analytic continuity that can be achieved by abstracting to higher and lower levels of conceptual reference:

“The book or physical instance, then, is one of many levels of address. Backing out into a larger population, we might take a genre of works to be the relevant level of address. Or we could talk about individual lines of print; all the nouns in every line; every third character in every third line. All of this variation implies massive flexibility in levels of address.”

These two quotes bring into focus a distinction between the general possibilities — where this line of thinking can lead — and the more specific goal of broadening the fields of literary and historical interpretation to include statistical analysis taken in concert with more traditional qualitative interpretive analyses.

In “Text as Object II: Object-Oriented Philosophy. Criticism?” Witmore introduces the simplified example of a computational linguistics model in which each model instance of a Text can be expressed as a unique vector through the hypothetical matrix of all possible words. Witmore uses this hand-drawn figure to illustrate his point.

How far can we take Witmore’s idea of “text as massively addressable object”? In particular, how do we identify and focus on the “trajectories” of the multitude of vectors through meta-tables, not just of all possible words, but through the multi-matrices of atomic model elements of all possible conceptual models that can be brought to the examination of text?
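Witmore’s “vector through a meta-table of all possible words” can be rendered in a few lines of toy code: the vocabulary is the table, and a text is the ordered sequence of table positions it visits. The tiny vocabulary and example sentence here are, of course, stand-ins for the “all possible words” of his thought experiment:

```python
# Toy rendering of a text as a vector through a meta-table of words.
# The vocabulary is deliberately tiny; Witmore's table holds every word.

vocabulary = sorted({"the", "cat", "sat", "on", "mat", "dog"})
coord = {word: i for i, word in enumerate(vocabulary)}

def as_vector(text):
    """Express a text as its trajectory of positions in the word table."""
    return [coord[w] for w in text.split()]

t1 = "the cat sat on the mat"
print(as_vector(t1))  # [5, 0, 4, 3, 5, 2]
```

Two texts drawn over the same table become directly comparable trajectories — which is exactly the door Witmore’s abstraction opens for statistical study.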

To understand “massively addressable,” it is helpful to explore Witmore’s “Text as Objects: Object Oriented Philosophy. And Criticism?” (Part I and Part II).

OOPS! This is not your Mother’s Object-Oriented

Tom Tom Club: “Wordy Rappinghood” — They say you remember your first. My first ontologist was a fellow consultant in the IBM Object Technology Practice that we called “Doug the Librarian” because he didn't code. He “just” modeled. But with his Library and Information Sciences training and business process modeling experience, Doug was a pleasure to “pair program” with on client engagements. In order for our “executable business model” skunkworks to create specific instances of client’s business models, we had to first be very rigorous about understanding EXACTLY what everybody meant by the words they used. And with words agreed upon, to next understand the concepts, relations, and processes that were expressed by these words. Driving out unspoken assumptions and contradictions was required to create executable models. If I had to truly understand the deep implications of OO Philosophy rather than glean its superficial similarities to OO Software Design, I’d want Doug along for the ride.

A word of caution… A dive into Witmore’s two-part “Text as Objects” posts may wend you into familiar-yet-strange territory. Familiar, that is, if you have experience designing and writing Object-Oriented software as I do. Strange, in the sense that Witmore dips into the esoteric realm of contemporary philosophy to examine the implications and distinctions between so-named Object-Oriented Philosophy and that of the Speculative Realists as articulated by the competing respective perspectives of Bruno Latour and Graham Harman. (For an unfiltered Deep Dive into this debate, enjoy an Open Access copy of Harman’s “Prince of Networks: Bruno Latour and Metaphysics.”)

Suffice it to say, object-oriented philosophy and object-oriented software design have roots in our uniquely human evolved set of core cognitive “ways of thinking.” Exploring the interesting distinction between the OO-ness of these domains is beyond the scope of this article. I will, however, borrow Witmore’s “T1-T2” simplified example from “Text As Object II” to unwind the way in which we can provide a basis vector to organize the types and interplay of the various levels of Text’s “massively addressable” levels of scale and abstraction.

Text Bonanza! On Sahle Now!

Patrick Sahle presented his pluralistic metamodel of Text as part of a marathon workshop in 2012 at which 40+ scholars surveyed and commiserated on the ways they might collaborate and communicate to advance the domain of Digital Humanities. Sahle used this graphic to array the various lenses, or points of view, that researchers may “look through” when studying texts. His six “T(sub X)” metamodel partitions the study of Text within six primary categories: T(sub L) being ‘Text as Language’, together with ‘Text as Work’, ‘Text as Semantics’, ‘Text as Visual’ (object), ‘Text as Document’, and ‘Text as Graph’. Although the audio is not the best, here is a link to the video of his inspiring “Modeling Transcription” presentation. (Note: Low audio first 6 minutes, all brilliant.) The video can be supplemented by direct viewing of his slides (PDF) or a (sometimes rough) transcription of his insightful presentation.

Before looking at how we can place Witmore’s “T1-T2” example in context, it is helpful to take a hop back to March 2012 to virtually attend the 3-day “Knowledge Organization and Data Modeling in the Humanities” workshop held at Brown University. Fortunately, the workshop organizers did a great job of capturing and freely publishing the full video presentations, together with slides and transcriptions. One insightful presentation at this Digital Humanities modeling marathon is especially relevant to our reading of Witmore’s “Text as Objects” articles.

Patrick Sahle’s “Modeling Transcription” presentation provides a pluralistic model of Text — a metamodel of sorts, graphically depicted in the annotated sidebar image. Sahle’s model provides an insightful basis vector along which we can enumerate and organize the multitude of modeling abstractions that cover the many ways we humans (and increasingly, our agent-based soft machines) access and understand Text.

This is the most evocative image I could find to reflect Sahle’s interest in digital paleography. UCSC Experimental Theater 2010 production “Stop the Press!” was an audience-interactive piece about the demise of print and the rise of digital media. I wish I could share a link to a video of the performance which sounds tailor-made for this context. We can instead enjoy this amazing photo, “Cyber-Illuminated Manuscript,” by Steve DiBartolomeo.

Sahle is an energetic member of the Cologne Center for eHumanities (CCEH) and the Data Center for the Digital Humanities (DCH), and a teacher/researcher at the Institute of Documentology and Scholarly Editing (IDE). While rooted in his focus on digital paleography — thereby explaining “Transcription” in the session’s title — Sahle’s wide-ranging context-setting remarks are inspirational to the broad domain of Digital Humanities.

Sahle enumerates six “degrees” along which we can radiate and inter-relate the ways we think about, study, create, manipulate, etc., text as cultural artifact: 1) Text(sub L) as Linguistic code, 2) Text(sub W) as a Work, 3) Text(sub S) as Semantic meaning, idea, content, 4) Text(sub V) as Visual object, 5) Text(sub D) as Document, and 6) Text(sub G) as a version or set of Graphs. While each of these sub-domains stands on its own as a concentration of research activity, Sahle uses his pictographic radial model within his presentation to beautifully showcase the interplay and overlaps of examples of applied research within these framing concentrations.

Witmore vis-à-vis Sahle

Let’s “unwind” Sahle’s model to lay it along a Z-axis in relation to Witmore’s “T1-T2” example. This will help us to visualize and organize those “massively addressable” levels of abstraction at the many levels of scale that open up as cultural heritage texts are increasingly available as digital computational objects.

The first thing you might notice is that we’ve provided some instant context, placing Witmore’s highly simplified “T1-T2” example as a “first page” in the “ream” of model-instance “slices” that are found in the T(sub L) portion of Sahle’s scale covering all possible levels of abstraction to which we might subject Witmore’s T1 and T2 hypothetical 1,000-word texts.

It is instructive to read some more of Witmore’s comments in this context:

Now a mathematically-minded critic might say the following: Table 1 (That is, the first “page” in our “ream” of Sahle’s T(sub L) ways of looking at the T1 document.) is a topologically flat representation of all possible words in English, arrayed in a two-dimensional matrix. The text T1 is a vector through that table, a needle that carries the “thread” through various squares on the surface, like someone embroidering a quilt. One possible way of describing the text, then, would be to chart its movement through this space, like a series of stitches.
Generalizations about the syntax and meaning of that continuously threading line would be generalizations about two things: the sequence of stitches and the significance of different regions in the underlying quilt matrix. I have arranged the words alphabetically in this table, which means that a “stitch history” of movements around the table would not be very revealing. But the table could be rendered in many other ways (it could be rendered three- or multi-dimensionally, for example).

In the figure at the top of this section, I accentuated Witmore’s “stitching thread” metaphor by shading the “pierce points” of the “vector through all possible words” being the plane of the metaphorical quilt. From a graph theoretic perspective, the “thread” between these shaded cells maps relationship vectors/edges between the word nodes in a graph that complements this visualization.

Under-couch fuzzball or Instance-specific Metamodel “fingerprint”? Yan Cui, inset left, is a brilliant game designer/developer at GameSys. His recent awesome blog post, “Modelling game economy with Neo4j,” is a beautiful example of a metamodel graph. It is not running as a subgraph within the game, nor is it dynamically connected to monitor and adjust the global player economy. The game developers do periodic static analysis to visualize and balance the game’s economy. Here, Cui shares an image showing the interconnection of Quests with the game economy, revealing a multitude of “impact paths” that any tweak to an item in the economy might have. This image is an “instance fingerprint” of one “version” of the game’s economy based on the metamodel described in his article. When developers tweak the economy, a new “fingerprint” is generated. Graph theoretically, there is a “delta” — an easily computed, exact difference — between any instance of the economy metamodel and another. This is the same way that FactMiners’ Fact Clouds will have metamodel subgraph “fingerprints.” In the case of the Softalk magazine archive, each issue will have its distinct fingerprint. “Delta dives” will be among the new pastimes of Digital Humanities Researchers of the Future. :-)

Using this simplified example, Witmore goes on to describe how more realistic computational linguistics conceptual models can be used to add new layers of exploration and interpretation of Text, especially Text as digital computational object. In these and any other use cases that you can imagine, any particular Text as Object (of study) will have a unique “fingerprint”/signature when expressed/abstracted through the lens of any specific conceptual reference or metamodel.

We might, for example, examine enough sample documents with a proposed metamodel to allow us to reliably distinguish Italian sonnets from English ones. In any case, the metamodel is used to constrain the view onto the Text as Object of study, and accumulated studies of multiple texts yield an aggregated dataset of comparative signatures/fingerprints. Understanding the delta between these signatures is at the heart of the new avenues of inquiry opened by Witmore’s encouragement to embrace massive addressability.
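Graph theoretically, such a delta is easy to compute if we treat each instance fingerprint as a set of typed edges — echoing the game-economy “versions” in Cui’s example. The nodes and edge types below are hypothetical:

```python
# Sketch of the 'delta' between two model-instance fingerprints,
# treated here simply as sets of typed edges. All names are invented,
# loosely echoing a game-economy metamodel instance before and after
# a designer tweaks an item's price.

fingerprint_a = {("sword", "REWARD_OF", "quest-1"),
                 ("sword", "SELLS_FOR", "100g")}
fingerprint_b = {("sword", "REWARD_OF", "quest-1"),
                 ("sword", "SELLS_FOR", "80g")}

def delta(a, b):
    """Edges added and removed going from fingerprint a to fingerprint b."""
    return {"added": b - a, "removed": a - b}

d = delta(fingerprint_a, fingerprint_b)
print(d["added"])  # {('sword', 'SELLS_FOR', '80g')}
```

The same set arithmetic would compare two issues of Softalk expressed through one metamodel, or one text expressed through two competing metamodels — the raw material for the “delta dives” imagined above.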

As the “devil is in the details,” virtually any metamodel with sufficient descriptive and/or explanatory power is going to be a multi-dimensional hypergraph that stretches the visual metaphors helping to frame this discussion.

For example, as we build the metamodel subgraph to model the complex document structure and depicted content of Softalk magazine, the “fingerprint” of each issue will be a Gordian Knot of monumental proportion. Each instance — an issue of the magazine — will interweave massive numbers of model element instances prescribed by the META:Structure and META:Content partitions of the metamodel subgraph.

Hold that multi-dimensional, unique model “fingerprint” idea in mind for a moment. We can use this mental image to illustrate the kind of previously untenable research and interpretation that will open up when we start to address Witmore’s cited massive levels of abstraction and scale now open to research — the kind of research, for example, that will be possible through FactMiners’ Fact Cloud of Softalk magazine.

Is there a Softalk “Selfie”?

Let’s take, for example, the proposition that Softalk magazine was such a singularly influential voice in the ecosystem of early microcomputing that early content may well have “Future Echoes” in later issues… In other words, can we find a “selfie” of Softalk in Softalk? Is it the editorial content, advertising, open community opinions, or some other modelable “ripple” in the multi-issue dataset that predicts future issue content? Once we can statistically predict such future changes, does our metamodel have explanatory power or is it merely descriptive?

FactMiners’ Fact Cloud metamodel will put our “fingerprint pages” in the T(sub S), Text as Semantic meaning, portion of Sahle’s range of text study perspectives. It will be the rich inter-connection of our Document Structure and Content Depiction models that will dramatically reduce the “time and effort to result” for the kind of scholarly research studies described in this section of my article.

Consider a hypothetical research study by Journalism historians as part of our Softalk Selfie Search. Let’s examine the performance of software in the magazine’s famous Top 30 lists. Does the movement of software within these lists have any relationship to the coverage of these products in editorial and advertising content? Let’s be sure to account for a reasonable number of subtle advertising placement and frequency dynamics; the type of editorial coverage and reputations of writers, reviewers, etc. And we’ll want to take into account external influences in the overall economy by incorporating available economic Open Data.
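In fact-cloud terms, that study reduces to a handful of queries over mined assertions. Here is a deliberately small sketch — the records, field names, and the single product are all invented; a real study would fold in the advertising dynamics and external economic data mentioned above:

```python
# Hedged sketch of the hypothetical Top 30 study as fact-cloud queries.
# All records and field names are invented for illustration only.

facts = [
    {"issue": 1, "product": "VisiCalc", "chart_rank": 1},
    {"issue": 1, "product": "VisiCalc", "ad_sq_inches": 60},
    {"issue": 2, "product": "VisiCalc", "chart_rank": 2},
    {"issue": 2, "product": "VisiCalc", "ad_sq_inches": 30},
]

def coverage_vs_rank(product):
    """Pair each issue's ad coverage with that product's chart rank."""
    by_issue = {}
    for f in facts:
        if f["product"] == product:
            by_issue.setdefault(f["issue"], {}).update(f)
    return [(i, d.get("ad_sq_inches", 0), d.get("chart_rank"))
            for i, d in sorted(by_issue.items())]

print(coverage_vs_rank("VisiCalc"))  # [(1, 60, 1), (2, 30, 2)]
```

The square-inch measurements that would take a graduate assistant months to compile by hand become, once fact-mined, a one-line aggregation.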

Even just the prerequisite data gathering and analysis requirement of this proposed study is daunting — a strict square-inch measure of the extent of all editorial and advertising coverage in nearly 10,000 pages of commercial magazine content. Imagine the extent of investigative activity ardent graduate students or research assistants would need to tirelessly measure, compile, and synthesize in order to pursue this study’s research objective.

With a fully-realized FactMiners’ Fact Cloud of Softalk magazine available, however, this study becomes an afternoon exploration consisting of a series of insightful queries into the Fact Cloud. This hypothetical researcher is experiencing the tangible benefit of having ready access to Witmore’s “massively addressable” levels of scale and abstraction that are now available for exploration within Digital Humanities research.

Rainman Meet Sherlock — Cognitive Computing & the Digital Humanities

“Okay,” the skeptical among you might say, “Other than eliminating drone-work by graduate assistants at research universities, and casting a more-appreciative light on the data-geeks who sit in the back of department meetings, is that all there is to be gained by developing the means to map and explore these nether-regions of Text’s massive addressability?”

Thankfully the benefit of exploring these new frontiers of Digital Humanities research will not be limited to cost-savings in domain-specific research activities. The confluence of emerging artificial intelligence technologies with the Humanities’ many and varied explicit conceptual reference models provides fertile ground for the development of innovative software technologies that will transform Life in the 21st Century.

I’ll wrap this piece up with an example describing the intersection of the two seemingly disparate domains of Witmore’s T(sub L) computational or statistical literary linguistic studies with FactMiners’ T(sub S) focus on semantics, meaning, and content depiction. The proposed integration of these two “emerging levels of massive addressability” of Text is just one example of potential contributions by the Digital Humanities to the multi-disciplinary Computer Science domain of Cognitive Computing.

Fortunately, we don’t have to delve too deeply into the details of this final example as I have written an earlier piece that describes this applied research agenda more fully. “Inside the FactMiners’ Brain — Rainman Meet Sherlock” describes the underlying “yin-yang” of cognitive processes that I metaphorically cast as characters drawn from film and literature. I then encourage readers to consider the synergy of these legendary characters’ complementary skills and Way of Being as Rainman and Sherlock Holmes work together on an investigation of, for our purpose here, a staggering load of Texts intimately related to some Victorian crime scene.

In considering the implications of this prior article, I’ll weave the final thread in the tale explaining my body’s readiness for that “Killing Me Softly” ideagasm I experienced when reading Michael Witmore’s “Text: A Massively Addressable Object.”

From the Visualizing English Print project overview, a splatterplot positioning the acts of Shakespeare’s plays along the axes of “Tragedyness” and “Comedyness.” (My annotation to emphasize the axes assignment along with an homage to Terry Gilliam.)

Witmore’s article is found on the WineDarkSea.org blog which he co-authors with fellow computational literary linguist, Jonathan Hope of the University of Strathclyde, Glasgow. Together Witmore and Hope are deeply involved in a diverse group of researchers doing amazing work, primarily through the Visualizing English Print (VEP) project. This project’s work is at the forefront of T(sub L) research creating software tools to perform, together with best practices to interpret, the results of Rainman-style investigation of our “crime scene” corpus of Texts.

Rainman here is not a simple savant with limited utility, but rather a necessary Superhero with special abilities to perform the daunting first orders of business in cognition, that is, perception leading to action. These essential first steps are the ‘Observe’ and ‘Orient’ parts of the cognitive modeling concept of the OODA Loop — Observe, Orient, Decide, and Act. Initially developed as an effective metamodel for articulating military strategy, the OODA Loop has found “kinder, gentler” application in diverse fields, most notably in business and, for our purposes, cognitive science.

Our T(sub L) Superhero Rainman fits comfortably in this cognitive computing role. His superpowers allow him to perform the essential, mind-numbingly voluminous job of obsessively, statistically, and algorithmically classifying, counting, and reporting the staggering sense of order and meaning that he experiences when exposed to new information.

In addition to the common visualization of the cyclic nature of the OODA Loop, I’ve included a related interpretation, the OODA Funnel to reflect its “plunge” and “squish” operations. These viscerally-appropriate terms were first used by David Gelernter, in Mirror Worlds (1993), to describe the incredible challenge of intake and reduction involved in real-time processing of a relentless stream of raw information into a tractable form in order to transition to the culminating Decide and Act steps in the OODA Loop.

In order to plunge and squish order out of chaos, Rainman could benefit from consultation with Sherlock; T(sub L) Language exploration informed by, and in a virtuous circle informing, T(sub S) semantic models — for example, FactMiners’ Fact Clouds — of the Texts under investigation. That is, Rainman meets, and works hand-in-hand with, Sherlock; T(sub L)<=>T(sub S).

The Digital Humanities are bursting with models. The #cidocCRM, the Conceptual Reference Model for Museums, is an obvious example. However, the full range of refined ontologies, metadata standards, text encoding guidelines, and other formalized semiotic systems in the Humanities are unique and important resources. Digital Humanities’ “subject matter experts” are in an ideal position to become Cognitive Computing researchers’ new BFFs (Best Friends Forever).

In addition to all manner of domain-specific conceptual models and subject matter experts in these domains, we have huge and growing digital datasets specific to those domains. Whether the digitized complete works of Shakespeare or the run of Softalk magazines, our on-line digital datasets are much-studied “non-moving targets” ideal as “proving grounds” for the development of Cognitive Computing technologies.

A Holy Grail for the Cognitive Computing community is the creation of an agent-based software exoskeleton of the Empowered Individual — a software “assist” to speed up our OODA cycle. We’ll need this increasingly fast and efficient activity loop to cope with, and participate in, the hyper-speed 24/7 world of Life in the 21st Century.

It will certainly be exciting to be a member of the Digital Humanities community that helps to envision and create the Cognitive Computing technologies that will improve our daily lives in the 21st Century… A Journey that we will take by first forging ahead to wherever our ability to identify and understand “massively addressable” Text may take us! :-)

Now if you will excuse me, I feel another ideagasm coming on…

-: Jim Salmons :-
9 April 2015
Cedar Rapids, Iowa


Afterword: I Can See for Miles…and Miles…

If I had to boil my thesis down to a Tweet regarding the potential for the Digital Humanities to be the incubator for breakthrough social and technology innovations that will help shape our daily lives as Empowered Individuals in the 21st Century, it would be this:

We grind the Lens to see the Future by first turning it on the Past.