Introducing RosaeNLG, an open-source Natural Language Generation library

Ludan Stoecklé
Published in Voice Tech Podcast
3 min read · Sep 18, 2019

RosaeNLG is an open-source (Apache 2.0) Natural Language Generation (NLG) library written in JavaScript and based on the Pug template engine. It supports any language and currently ships with resources for English, French, German, Italian and Spanish.

RosaeNLG’s logo, thanks to Denis Aulas

RosaeNLG is the first open-source NLG library that is easy to use and complete enough to write real-life NLG applications. It runs both server-side (Node.js) and client-side, in a browser.

What is NLG?

NLG stands for Natural Language Generation. The goal of NLG is to automatically generate texts from structured data, with the same quality as if a human being had written them.

Typical NLG use cases automate the production of repetitive reports and texts based on structured data. For instance:

  • describe a product and its features based on its characteristics
  • produce structured reports in the financial industry e.g. risk reports, financial fund performance
  • generate personalized emails
  • generate annotated training data for NLP and chatbots!

You do not need a Natural Language Generator to generate simple texts (a standard template engine does the job), but generating sophisticated texts quickly becomes tricky. The practical issues of NLG are:

  • the use of synonyms and referring expressions to avoid repetitions
  • the proper agreement of verbs, nouns, adjectives (depending on the output language)
  • proper punctuation and spacing (also depending on the output language)
  • the ability to properly list things (xxx, yyy and zzz)
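The last two issues can be sketched in a few lines of plain JavaScript. This is only an illustration of the problem, not RosaeNLG code: `listItems`, `toSentence` and the `separator` / `lastSeparator` options are hypothetical names, and the rules shown are English-only.

```javascript
// Hypothetical helper (not part of RosaeNLG): joins items as "xxx, yyy and zzz".
function listItems(items, { separator = ",", lastSeparator = "and" } = {}) {
  if (items.length === 0) return "";
  if (items.length === 1) return items[0];
  // Everything except the last item is joined with the regular separator.
  const head = items.slice(0, -1).join(`${separator} `);
  return `${head} ${lastSeparator} ${items[items.length - 1]}`;
}

// Hypothetical helper: collapses extra spaces, capitalizes the first letter
// and makes sure the sentence ends with punctuation (English rules only).
function toSentence(text) {
  const trimmed = text.trim().replace(/\s+/g, " ");
  if (trimmed === "") return "";
  const capitalized = trimmed[0].toUpperCase() + trimmed.slice(1);
  return /[.!?]$/.test(capitalized) ? capitalized : `${capitalized}.`;
}

const fruits = ["apples", "bananas", "tomatoes"];
const sentence = toSentence(`i  love ${listItems(fruits)}`);
console.log(sentence); // I love apples, bananas and tomatoes.
```

Even this toy version has to special-case empty and single-item lists, and none of it carries over to a language with different spacing or agreement rules, which is where a dedicated NLG library becomes useful.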

Commercial NLG systems

NLG has existed for a long time as an academic subject and has been commercially available for about a decade. The main commercial NLG vendors are Narrative Science, Arria NLG, Automated Insights, Yseop (I have been working for and with Yseop for a long time), AX Semantics and probably some others.

There are open-source offerings for NLG, but they are generally outdated or unmaintained, or they focus on one very specific NLG feature. SimpleNLG is the most advanced one. For more on that, see Ehud Reiter’s blog post about why there are no open-source generators.

That’s why I decided to write my own Natural Language Generator, and make it open-source.

RosaeNLG’s characteristics as a Natural Language Generator

There are various techniques to generate texts. Template-based generators use templates, which are a mix of static content (plain text) and dynamic content. Think of PHP etc. In a template-based NLG system, most of the time you don’t really care about the exact grammatical structure of the text (subject, verb, etc.), and therefore you don’t need to be a linguist to use those systems.
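To make the idea of mixing static and dynamic content concrete, here is a minimal sketch that uses a plain JavaScript template literal rather than Pug or RosaeNLG. The `product` record, its field names and the `describeProduct` function are assumptions made up for this illustration.

```javascript
// Hypothetical product record; the field names are illustrative only.
const product = { name: "Aurora 5", screenInches: 6, colors: ["black", "red"] };

// A tiny "template": static text with slots filled from the data.
function describeProduct(p) {
  const screen = `The ${p.name} has a ${p.screenInches}-inch screen.`;
  // Dynamic part: the color sentence is only produced when colors are known.
  // The " and " join is naive and only reads well for two colors.
  const colors = p.colors.length > 0
    ? ` It is available in ${p.colors.join(" and ")}.`
    : "";
  return screen + colors;
}

console.log(describeProduct(product));
// The Aurora 5 has a 6-inch screen. It is available in black and red.
```

The author of such a template never has to think in terms of subject or verb, only about which text appears under which conditions.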

The characteristics of RosaeNLG are:

  • template-based: based on Pug, a templating engine.
  • based on modern & mature technologies: JavaScript & Pug
  • complete enough to build real life projects
  • fast
  • open-source of course (Apache 2.0)
  • works both in Node.js and in the browser, for client-side NLG
  • supports any language and currently ships with resources for English, French, German, Italian and Spanish

RosaeNLG templates are basically Pug templates in which RosaeNLG structures and functions complement the standard Pug syntax.

[Embedded code sample: RosaeNLG quickstart for node.js]

The quickstart outputs <p>I love apples, bananas and tomatoes.</p>. You can also test code directly in your browser.

Yseop template generator

Yseop is a well-known NLG software vendor. While RosaeNLG is production-grade NLG software, some users may prefer to switch to a commercial product like Yseop, for reasons such as:

  • Support, Consulting, Professional Services
  • Additional features, friendly user interface
  • Additional linguistic resources (Spanish, Dutch, Japanese etc.)

RosaeNLG ships with a migration feature aimed at users who have started creating RosaeNLG templates and wish to move their existing templates to Yseop.
