Designing a templating system using Natural Language Generation

Somya Anand
Published in MindTickle
3 min read · May 12, 2019

Have you ever felt the need to design a templating system that delivers content which reads as human and dynamic rather than as computerized, synthetic text?

With the growing demand for personalized online content, automated text generation is becoming increasingly important. For instance, text generation for interactive fiction games such as The Dreamhold requires the ability to vary the information content of a text in a fine-grained and flexible way, based on the situation the gamer chooses. In this scenario, the generated text has to be highly personalized and empathetic to the gamer's context.

Another use case is data-to-text generation, which is becoming popular in data analytics. Turning raw data into meaningful insights may require a more dynamic and flexible templating system, one that can be extended with new variables as the data and its outliers grow.

Some existing templating systems, such as web templating languages, essentially embed the template inside a general-purpose scripting language that supports complex conditionals, loops, access to code libraries, etc. Business rule systems, including most document composition tools, take a similar approach but focus on writing business rules rather than scripts. Adding general-purpose programming to a template language certainly makes it much more powerful and useful, and this is a sensible approach in many contexts. But the lack of any linguistic capabilities makes it difficult to build systems that reliably generate complex, high-quality texts.

The following are a few drawbacks of a static templating system that does not take natural language generation into account:

  • A template system that correctly handles agreement, morphology, punctuation, reduction and other low-level phenomena, so that every generated sentence is grammatical, is very expensive to design in terms of programming effort.
  • A static template system may fail to build paragraphs dynamically from a representation of the meaning to be conveyed and/or the desired linguistic structure.
  • Static templates are not very flexible: only the predefined variables can change, which leads to maintainability issues.
  • Static templating systems cannot readily serve as a sentence-planning module that incorporates readability enrichers such as aggregation, referring-expression generation, sentence formation and lexicalization.
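
To make the first drawback concrete, here is a minimal sketch of a purely static template in Python. The template text, slot names and sample values are made up for illustration; the point is only that plain slot filling has no notion of number agreement or article choice.

```python
# A plain slot-filling template with no linguistic knowledge.
# The sentence and slot names are hypothetical, chosen only for illustration.
TEMPLATE = "{user} has completed {count} modules and earned a {badge} badge."


def render_static(user: str, count: int, badge: str) -> str:
    """Fill the slots verbatim; nothing checks agreement or articles."""
    return TEMPLATE.format(user=user, count=count, badge=badge)


print(render_static("Asha", 3, "gold"))
# Asha has completed 3 modules and earned a gold badge.
print(render_static("Ravi", 1, "expert"))
# Ravi has completed 1 modules and earned a expert badge.
```

The second sentence is ungrammatical twice over ("1 modules", "a expert"). Fixing this inside the template means adding a conditional for every slot that can affect the surrounding words, which is where the programming effort goes.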

The question then arises: what can we do to generate more dynamic and syntactically correct sentences?

The answer lies in designing an NLG-based templating system. With growing research in natural language generation and open-source resources such as SimpleNLG (a Java API that functions as a “realization engine” for natural language generation architectures) and TextWorld (a Python framework for generating text-based games), it is possible to design a natural-language-based templating system without much domain knowledge.

However, switching to a full NLG system can be disadvantageous if:

  • The application does not already have a declarative domain knowledge base and/or a syntactic representation of the output text.
  • Writing a linguistic realizer takes significant time, and the result may be so application-specific that it is not modular enough to reuse.
  • NLG is still an experimental technology, so existing tools are likely to have gaps and quirks in how they handle morphology.

So, what can we do to avoid the time and complexity of a pure NLG-based system while still building a templating engine that generates grammatically correct sentences?

I propose hybrid systems that combine existing NLG software with a static templating system in the following ways:

  • Embed NLG-generated fragments into template slots, or insert canned phrases into an NLG-generated matrix sentence.
  • Use NLG techniques for “high-level” operations such as content planning, but templates for the low-level realization.
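
The first strategy can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the helpers `pluralize`, `with_article` and `join_list` are hypothetical, hand-written stand-ins for what a real realizer engine would do far more robustly, and none of them is part of the SimpleNLG API.

```python
from typing import List


def pluralize(noun: str, count: int) -> str:
    """Naive number agreement: add 's' unless count is 1 (regular nouns only)."""
    return noun if count == 1 else noun + "s"


def with_article(phrase: str) -> str:
    """Pick 'a' or 'an' from the first letter (a spelling heuristic, not phonetics)."""
    article = "an" if phrase[0].lower() in "aeiou" else "a"
    return f"{article} {phrase}"


def join_list(items: List[str]) -> str:
    """Aggregate items into one coordinated phrase: 'x, y and z'."""
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


# The static template only fixes the sentence frame (hypothetical wording).
TEMPLATE = "{user} has completed {modules} and earned {badges}."


def render_hybrid(user: str, count: int, badges: List[str]) -> str:
    """Fill each slot with a generated fragment instead of a raw value."""
    return TEMPLATE.format(
        user=user,
        modules=f"{count} {pluralize('module', count)}",
        badges=join_list([with_article(b + " badge") for b in badges]),
    )


print(render_hybrid("Ravi", 1, ["expert"]))
# Ravi has completed 1 module and earned an expert badge.
print(render_hybrid("Asha", 3, ["gold", "early-bird"]))
# Asha has completed 3 modules and earned a gold badge and an early-bird badge.
```

The template stays readable and easy to edit, while agreement, article choice and aggregation live in small reusable functions. In a real system those functions would be the natural place to call an existing realizer.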

Designing a system like this is a good trade-off between development effort and the correctness of dynamically generated sentences. Such systems are easy to implement, require less research expertise, and can be made modular enough to handle new variables or to be reused elsewhere. Using an existing realizer engine alongside static templating techniques can improve maintainability, text readability, or other important attributes of the target application.
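
The second hybrid strategy, NLG-style content planning on top of template-based realization, might look like the following data-to-text sketch. Everything here is an assumption made for illustration: the sales figures, the message types, the template wording and the "below half the average" outlier rule are all invented.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, Dict[str, object]]

# Low-level realization: one canned template per message type (hypothetical wording).
TEMPLATES = {
    "summary": "Average weekly sales were {avg:.1f} units.",
    "peak": "Sales peaked in week {week} at {value} units.",
    "outlier": "Week {week} was unusually low at {value} units.",
}


def plan_content(weekly_sales: List[int]) -> List[Message]:
    """High-level step: decide which messages to include, and in what order."""
    avg = sum(weekly_sales) / len(weekly_sales)
    messages: List[Message] = [("summary", {"avg": avg})]
    peak = max(weekly_sales)
    messages.append(("peak", {"week": weekly_sales.index(peak) + 1, "value": peak}))
    # Assumed outlier rule: flag any week below half the average.
    for week, value in enumerate(weekly_sales, start=1):
        if value < 0.5 * avg:
            messages.append(("outlier", {"week": week, "value": value}))
    return messages


def realize(messages: List[Message]) -> str:
    """Low-level step: turn each planned message into text via its template."""
    return " ".join(TEMPLATES[kind].format(**slots) for kind, slots in messages)


print(realize(plan_content([120, 130, 40, 150])))
# Average weekly sales were 110.0 units. Sales peaked in week 4 at 150 units. Week 3 was unusually low at 40 units.
```

Supporting a new variable then means adding one message type to the planner and one template to the table, which is the kind of modularity described above.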

What’s next?

  1. A hands-on tutorial on designing a hybrid data-to-text templating system using SimpleNLG.
  2. An article on key hacks and grammatical realization of sentences for designing a custom sentence realizer.


Published in MindTickle

MindTickle is the world’s leading sales readiness platform that gives you the power to ramp up new reps faster, coach them effectively, keep them updated and create a culture of sales excellence. MindTickle is also home to one of the world’s most transparent and unique cultures.

Written by Somya Anand

Machine learning engineer@TextIQ, I am passionate about applied research and AI interpretability. I like to create things. Love for history, travel and art.