Automatic Natural Language Generation, the new “normal”
2016 marked the 400th anniversary of the death of Miguel de Cervantes, the greatest writer in the Spanish language. It was also the year in which a book written using Artificial Intelligence passed the first round of a literary competition.
Most people see this as science fiction, and many still don’t believe it is possible. However, few people know that today more than 50% of the content published online is already generated by computers and that in a few years this percentage will reach 90%.
Associated Press, Forbes, Yahoo and, more recently, the Washington Post are already using these technologies. Beyond the media market, e-commerce, logistics, energy, pharmaceutical and real estate companies are also automating the generation of content. Even in the field of intelligence and security services, we have seen In-Q-Tel (the CIA’s venture capital arm) make strategic investments in Natural Language Generation (NLG) technologies.
Back in 2008, MarketBrief, a NYC-based startup, developed the first commercial technology to automatically generate content from data. Last year we had the opportunity to meet Chris Auer, co-founder of MarketBrief, and he told us that they were too early to the market and had to battle companies’ resistance to adopting their technology.
NLG industry today
Nowadays, the market for Natural Language Generation is booming and is expected to reach $1 billion by 2020. The industry is dominated by a handful of companies using different approaches and technologies. Narrativa (Spain), Narrative Science (US) and Arria (Scotland) are using Artificial Intelligence (AI). Automated Insights (US), Syllabs (France) and AX-Semantics (Germany) use a programmatic or rule-based approach.
All the NLG companies using AI have spun off from university research, largely because many companies are not willing to take on the intrinsically high risks of working with new and highly complex technologies.
Adoption of NLG technologies has risen fast, although the news industry might be the exception. The human mind is wired to prefer avoiding loss over maximizing gain, and this is definitely true of the news industry, which, with some important exceptions, is still slow to adopt these kinds of technologies.
Investors have been very active in recent years: we have seen significant investments, acquisitions and even IPOs, and we expect this trend to continue over the coming years.
Future of NLG
Currently, most automated content is generated from structured data. However, only a small fraction of the world’s information is structured, which limits the possibilities of NLG technologies.
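To make the idea of generating text from structured data concrete, here is a minimal sketch of the rule-based style of generation. It is only an illustration: the record fields (company, quarter, revenue_musd, prior_revenue_musd), the function name describe_quarter and the sample figures are all hypothetical, and it does not represent how any of the companies mentioned in this article build their products.

```python
def describe_quarter(record: dict) -> str:
    """Turn one structured record into a one-sentence narrative.

    All field names here are hypothetical and chosen for illustration.
    """
    current = record["revenue_musd"]
    prior = record["prior_revenue_musd"]
    change_pct = (current - prior) / prior * 100

    # A simple rule picks the wording based on the direction of the change.
    if change_pct > 0:
        trend = f"rose {change_pct:.1f}%"
    elif change_pct < 0:
        trend = f"fell {abs(change_pct):.1f}%"
    else:
        trend = "was unchanged"

    return (
        f"{record['company']} reported revenue of ${current:.1f} million "
        f"for {record['quarter']}, which {trend} from the prior quarter."
    )


# Made-up sample record, not real financial data.
example = {
    "company": "Acme Corp",
    "quarter": "Q3 2016",
    "revenue_musd": 120.0,
    "prior_revenue_musd": 100.0,
}
print(describe_quarter(example))
```

The same function can be applied to thousands of records, which is why structured data lends itself so well to automation. It also shows the limitation: without clean fields to fill the template, there is nothing to write about.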
Today, companies such as Yatrus Analytics* and Dataminr are able to monitor social media to find and report events as they happen. These companies work in the field of Natural Language Processing (NLP), and we foresee a future where NLG and NLP go hand in hand to generate narratives in real time from unstructured data (text, pictures and videos).
The future will tell, but we believe that automated content generation will be the “new normal” in a few years, and that NLG, together with other technologies, will help companies, governments and society in general to be better informed about what is happening in real time.
*Narrativa and Yatrus Analytics are already working together to bring real-time news discovery and generation to the market.