Greg Williams
Published in Voice Tech Podcast
Apr 28, 2019


Inside AI

Natural Language Generation

Practical Considerations for Product Managers

Natural Language Generation (“NLG”) promises to raise analytical and explanatory power to levels economically impossible to achieve by headcount alone. When imbued with carefully crafted paragraphs, sentences, and phrases that are conditionally invoked by data, NLG systems should create an unlimited number of stories that are indistinguishable from the work of trained analysts.

When approaching an NLG project, product managers should bear in mind that the four most important ingredients of a successful NLG program are writing talent, subject matter expertise, a complete and accurate set of data, and technological capability.

The Project Begins and Ends with Writing

Considering that most traditional writers reflexively resist NLG, it is understandable that product managers, rarely professional writers themselves, might approach NLG as a technological puzzle rather than as what it is: a writing assignment. Writing is the final and, indeed, the only expression of the entire endeavor.

And yet, press coverage of NLG tends to betray a view that writing as a physical, human activity, with fingers on a keyboard and coffee rings on printed drafts, is an antiquated trade, akin to breaking rocks with a hammer.

In the American folk song “John Henry,” the protagonist clings to his hammer even upon the advent of the steam drill. Likewise, writers tend to adhere to old school work habits, especially in the face of an apparent sentiment that their skill sets have grown quaint. They may observe that the production of an analytical paper that draws upon multiple disaggregated sources requires a human researcher, and conclude (here is their mistake) that no professional writing can be orchestrated programmatically. This conclusion does not contemplate a scenario where all necessary research is contained within a structured data set, upon which writers may build modular linguistic components to be paired with rules for assembly. Under such a circumstance, traditional writers have the opportunity to flex their skills in a way that multiplies their output. NLG is not a steam drill that eliminates the writer by making writing easy. On the contrary, it makes the writer more important by requiring ever more creative ways to swing the hammer.


Subject Matter Expertise is Essential

In any NLG project, writing talent must be paired with subject matter expertise, ideally within the same individuals. But frequently, as when an NLG project is outsourced, the product manager must work with skilled writers who are not experts on the required subject matter, and with subject matter experts who are not skilled writers. When this happens, it is important to bring these three-legged racers together at least daily, if not more frequently, to review the prose output as it evolves. Over time, frequent meetings may result in the transfer of a modicum of subject matter expertise to the writers, but expectations here should be limited. Even if the writers are not working on multiple NLG projects for other clients, it is not reasonable to expect them to absorb a career’s worth of subject matter expertise during a series of relatively brief meetings over a few months. One might just as well expect their writing skills to be contagious like the common cold.

Within the context of NLG, the beguiling terms “machine learning” and “artificial intelligence” can lead to the misconception that if we pour enough data about any given subject into the computer hopper, the computer will become a subject matter expert. Futurists may suggest that we will someday turn to machine learning and artificial intelligence not only to do most of our writing in various business fields, but also to produce creative work, such as plays and novels. For the sake of argument, let’s concede that the futurists may eventually turn out to be right . . . but not within the timeline of any current or foreseeable NLG project, just as the next car we buy will not fly over traffic jams. Therefore, NLG product managers with ambitions beyond simple declarative sentences that compare one number to another should furrow their brows at assurances that expertise can be gleaned instantly from data itself, without the extensive involvement of their companies’ subject matter experts. Indeed, an outsourced NLG project requires their constant engagement. Daily schedules should be planned accordingly.

Data Forms the Foundation

A successful NLG project has at its foundation a structured set of data that, ideally, is complete and free of errors. This is why the earliest NLG efforts began with sports statistics. With their relentless obsession with numbers and completeness, sports statisticians and record keepers reliably produce data sets that contain neither null nor misplaced values.
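
Completeness can be measured before any writing begins. The following is a minimal sketch, assuming a hypothetical list of game records with invented field names; it reports the fill rate of each field, meaning the share of records that carry a usable value.

```python
from typing import Any


def fill_rates(records: list[dict[str, Any]], fields: list[str]) -> dict[str, float]:
    """Return the fraction of records with a non-empty value for each field."""
    if not records:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        # A value counts as filled unless it is None or an empty string;
        # a score of 0 is a legitimate value and is counted as filled.
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        rates[field] = filled / len(records)
    return rates


# Hypothetical sample data; team names and fields are illustrative only.
games = [
    {"home": "Hawks", "away": "Owls", "home_score": 3, "away_score": 1},
    {"home": "Owls", "away": "Hawks", "home_score": None, "away_score": 2},
]
rates = fill_rates(games, ["home", "away", "home_score", "away_score"])
```

A field whose fill rate falls below whatever threshold the team agrees on is a candidate for exclusion from the story, or for a fallback sentence that does not depend on it.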

As with subject matter expertise, the effort required to create a structured database, too, is sometimes dismissed with the suggestion that it can be done automatically. May we not programmatically scrape and parse web content? Yes, many terabytes of data can be amassed this way, but such efforts are not foolproof, and may result in the publication of an absurdity such as this: “Restaurant owner passed away on Friday, March 1, 2019. Restaurant owner was a resident of 07040, New Jersey, at the time of his passing.” For now, an NLG product manager should either hesitate to tie the fate of a project to the promise of a programmatically created database, or should lengthen the project timeline to account for the significant additional QA required to standardize the collected data.
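
Part of that QA can be automated by screening each scraped record before it reaches a sentence template. This is a rough sketch, assuming hypothetical field names (`name`, `city`, `state`) and two illustrative rules; a real project would need rules written with its own subject matter experts.

```python
import re

# Hypothetical placeholder values a scraper might leave in a name field.
GENERIC_NAMES = {"restaurant owner", "unknown", "n/a"}


def record_problems(record: dict[str, str]) -> list[str]:
    """Return the reasons a record should not be published as-is."""
    problems = []
    name = record.get("name", "").strip()
    city = record.get("city", "").strip()
    if not name or name.lower() in GENERIC_NAMES:
        problems.append("name is missing or generic")
    # Five digits in a city field suggest a ZIP code was parsed into the wrong column.
    if not city or re.fullmatch(r"\d{5}", city):
        problems.append("city is missing or looks like a ZIP code")
    return problems


# The record behind the absurd sentence quoted above, as a scraper might store it.
scraped = {"name": "Restaurant owner", "city": "07040", "state": "New Jersey"}
problems = record_problems(scraped)
```

Records that return a non-empty list would be held back for human review rather than published.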

Technology Conducts the Assembly

The possibility of NLG was never going to arise from the imaginations of traditional writers. Our current discussion is a tribute to the creativity and vision of computer scientists. For external publication purposes, however, NLG remains the result of applying writing talent and subject matter expertise to a structured database, with the goal of producing prose that should feel far removed from the computer science labs where it was first conceived. Technology in its current state can supplant neither human expertise nor writing talent; nor does it provide a shortcut to creating a reliably structured database with sufficient fill rates and quality levels to support NLG. Nevertheless, technology is as vital to the final prose output as stitches are to clothing. A flexible, creative technologist may use any of a number of programming languages to configure the story components and invocation rules that writers have created in cooperation with subject matter experts. The result should be prose that is suitable for external publication, enabling a company to project insights voluminously, in easily digestible text rather than in spreadsheets or data tables.
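
The pairing of story components with invocation rules can be sketched in a few lines. This is one possible arrangement rather than a description of any particular NLG product; the `Component` class, the sample sentences, and the game data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Component:
    """A writer-authored sentence template plus the rule that invokes it."""
    rule: Callable[[dict], bool]
    template: str


# Hypothetical components for a one-game recap, listed in authoring order.
COMPONENTS = [
    Component(lambda d: d["home_score"] > d["away_score"],
              "{home} beat {away} {home_score}-{away_score}."),
    Component(lambda d: d["home_score"] < d["away_score"],
              "{away} beat {home} {away_score}-{home_score}."),
    Component(lambda d: d["home_score"] == d["away_score"],
              "{home} and {away} drew {home_score}-{away_score}."),
    Component(lambda d: abs(d["home_score"] - d["away_score"]) >= 3,
              "It was a lopsided result."),
]


def assemble(data: dict) -> str:
    """Join every component whose rule fires, in authoring order."""
    return " ".join(c.template.format(**data) for c in COMPONENTS if c.rule(data))


story = assemble({"home": "Hawks", "away": "Owls", "home_score": 4, "away_score": 1})
# story == "Hawks beat Owls 4-1. It was a lopsided result."
```

The division of labor is visible in the structure: writers and subject matter experts own the templates and the conditions, while the technologist owns `assemble` and the plumbing around it.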

About the Author:

As Senior Vice President at Reis, Greg Williams led the Product Management and Web Content teams, working closely with the Technology, Client Services, Sales, and Economic Research teams. He is a twice-published novelist (Boomtown, Younger than Springtime) and experienced long-form journalist (New York Magazine). A graduate of the University of Virginia with a B.A. in English, and Johns Hopkins University with an M.A. from the Department of Writing Seminars, he has studied writing with National Book Award Winners John Casey and John Barth, and with Nobel Laureate J.M. Coetzee.
