cube with letters N L G
Automatically transforms data into written narrative

Generate newsletter automatically using NLG: Part 1

Rajesh Gudikoti
IBM Data Science in Practice
5 min read · Mar 18, 2019


In this post, I will be showcasing the power of natural language generation, or NLG, with the business case of generating a newsletter.

The general practice when we create a newsletter is to employ a tool with individual entries that include:

1. name of the event

2. type of event

3. topic

4. speaker name/s

5. location where the event is held

6. number of people who attended the event

There are other fields, such as the expected number of attendees, but for simplicity I will refer only to the six fields above.

When a publishing team asks for a narrative of an event we have handled (completed), we can refer to the tool and write a brief text from its data. Normally, when such a request comes in, I provide a description like the one below.

We conducted a meetup event on “March 20, 2019” in “Bangalore”. “Rajesh Gudikoti” handled a session on “Natural Language Processing”. The number of people who attended was “40”.

The details (data) in quotes are picked from the tool or database.

Problem statement: How can we generate an event narrative by automatically picking the data from a database?

This is where Natural Language Generation (NLG) helps. NLG is a technique for producing a narrative from a structured dataset.
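
To make the idea concrete, here is a minimal sketch of the simplest way to turn one structured record into the narrative quoted above: a fixed template filled with field values. This is plain Python, not the SimpleNLG approach used later in this post, and the field names in the record are hypothetical, chosen only for illustration.

```python
# Hypothetical event record; the field names are illustrative and are
# not the actual schema of the event tool or database.
event = {
    "event_type": "meetup",
    "date": "March 20, 2019",
    "location": "Bangalore",
    "speaker": "Rajesh Gudikoti",
    "topic": "Natural Language Processing",
    "attendees": 40,
}

# Fixed sentence template with one placeholder per field.
TEMPLATE = (
    "We conducted a {event_type} event on {date} in {location}. "
    "{speaker} handled a session on {topic}. "
    "The number of people who attended was {attendees}."
)


def narrate(record: dict) -> str:
    """Fill the fixed template with the values of one event record."""
    return TEMPLATE.format(**record)


print(narrate(event))
```

A template like this breaks down as soon as the wording has to vary with the data (tense, optional phrases, plurals), which is where a realisation library becomes useful.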

sticky notes detailing differences between NLP, NLU, and NLG
NLP, NLU and NLG (Courtesy: towardsdatascience.com)
pyramid showing increasing business value thresholds for natural language generation
NLG variants

For simplicity I will be accessing data from an Excel sheet instead of from a database.

screenshot of a snippet of an Excel spreadsheet
Event Records
# Load the event records into a pandas DataFrame
import pandas as pd

xl = pd.ExcelFile("/Users/rajeshgudikoti/Documents/rajesh/learning/nlp/nlg/Event Report-Feb 25.xls")
df = xl.parse()  # parses the first sheet by default

If I need to generate narratives for multiple events in bulk, I can pick only the events that are completed. When generating a narrative for a single event, this filtering may not be required.

# Keep only the rows whose status is 'Completed'
data = df.loc[df['event status'] == 'Completed']
print(data.head(2))
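
For readers without the spreadsheet, the same filtering step can be tried on a small in-memory DataFrame. This is only a sketch: the rows and every column name except 'event status' are made up for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for the Excel sheet. Only the
# 'event status' column name comes from the snippet above.
df = pd.DataFrame(
    {
        "event name": ["NLP Meetup", "Cloud Webinar", "AI Conference"],
        "event status": ["Completed", "Planned", "Completed"],
        "attendees": [40, 0, 120],
    }
)

# Boolean mask: True for rows whose status is exactly 'Completed'.
completed = df.loc[df["event status"] == "Completed"]
print(completed.head(2))
```
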
cartoonish image of person with beard in bed on a computer
Lazy Man thinking Automation of his task

Now it is time to use NLG (Natural Language Generation). Instead of typing a narrative out, I would like to generate it.

I have provided some pseudocode for reference.

clause_1 = nlgFactory.createClause()
sub1 = nlgFactory.createNounPhrase(speakers[index])
verb1 = nlgFactory.createVerbPhrase("conduct")

If the event is a webinar, I do not need the location. For other event types, like meetups or conferences, I need to mention where the event was held.

if project_event_type[index] != 'Webinar':
    at_preposition = nlgFactory.createPrepositionPhrase("at")

Below is the generated text using NLG:

Rajesh Gudikoti <xxx@in.ibm.com> conducted "Natural Language Processing" Meetup on 2019-02-20T00:00:00.000000000 at Bangalore.
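
The webinar rule can be illustrated without the NLG library. The sketch below uses plain string building as a stand-in for the SimpleNLG phrase objects, so it shows only the branching logic; the function name and parameters are hypothetical.

```python
def first_sentence(speaker, topic, event_type, date, location=None):
    """Build the first sentence; add 'at <location>' unless it is a webinar."""
    sentence = f'{speaker} conducted "{topic}" {event_type} on {date}'
    # Webinars have no physical venue, so the location phrase is skipped.
    if event_type != "Webinar" and location:
        sentence += f" at {location}"
    return sentence + "."


print(first_sentence("Rajesh Gudikoti", "Natural Language Processing",
                     "Meetup", "2019-02-20", "Bangalore"))
print(first_sentence("Rajesh Gudikoti", "Natural Language Processing",
                     "Webinar", "2019-02-20", "Bangalore"))
```

With a realisation library, the same decision controls whether a preposition phrase is attached to the clause rather than whether a string is concatenated.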

Similarly, I can generate a second sentence.

sub2 = nlgFactory.createNounPhrase('number of developers')
sub2.setDeterminer("the")

And, our next generated text:

The number of developers attended is 30.0.
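
The generated sentences carry the raw values from the DataFrame: a nanosecond timestamp and a float count. One possible clean-up, sketched here under the assumption that the values are formatted before they are handed to the NLG library, is shown below. The helper names are hypothetical, and the month name assumes an English locale.

```python
import pandas as pd


def readable_date(value) -> str:
    """Format a timestamp-like value as, for example, 'February 20, 2019'."""
    ts = pd.Timestamp(value)
    # Month name via strftime; day and year taken directly to avoid zero padding.
    return f"{ts.strftime('%B')} {ts.day}, {ts.year}"


def readable_count(value) -> str:
    """Render a float count such as 30.0 without the trailing '.0'."""
    return str(int(round(float(value))))


print(readable_date("2019-02-20T00:00:00.000000000"))
print(readable_count(30.0))
```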

I will join the two sentences above into a paragraph to get a complete narrative.

Multiple Sections

We can create multiple sections, and multiple paragraphs within a section. If I handled two events and need to generate a narrative for both of them, I would use one section per event. Sample code is given below; the demo code is available at the GitHub link later in this post.

# ******** paragraph 1 *****************
para1 = nlgFactory.createParagraph()
para1.addComponent(sentence_1)
para1.addComponent(sentence_2)
#print('\n \n developer advocacy newsletter creation *************\n')
print(realiser.realise(para1).getRealisation())
# ******** paragraph 2 *****************
para2 = nlgFactory.createParagraph()
para2.addComponent(sentence_3)
para2.addComponent(sentence_4)
#print('\n \n developer advocacy newsletter creation *************\n')
print(realiser.realise(para2).getRealisation())
# *************** Section 1 **********
section1 = nlgFactory.createSection()
#print(type(para1), type(para[0]))
section1.addComponent(para1)
print(realiser.realise(section1).getRealisation())
# ********** Section 2 ***************
section2 = nlgFactory.createSection()
#print(type(para1), type(para[0]))
section2.addComponent(para2)
print(realiser.realise(section2).getRealisation())
testDocument = nlgFactory.createDocument("Developer Newsletter Document")
#testSection = nlgFactory.createSection("Event 1-----")
#testParagraph = nlgFactory.createParagraph()
#testSentence = nlgFactory.createSentence('This is first developer newsletter')
# ************** Multiple sections **************
#testParagraph.addComponent(testSentence)
#testSection.addComponent(testParagraph)
#testDocument.addComponent(testSection)
testDocument.addComponent(section1)
testDocument.addComponent(section2)
print(realiser.realise(testDocument).getRealisation())

Sample output after creating multiple sections:

##### Developer Newsletter Document

Rajesh Gudikoti<rg@in.ibm.com> conducted "Generation of Newsletter using Natural Language Generation " Webinar on 2019-02-20T00:00:00.000000000. The number of developers attended is 120.

Rajesh and Ramesh conducted IBM Cloud Paks Meetup on 12/22/2019. The number of developers attended is 30.
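
The document, section, and paragraph nesting can be pictured with a few plain Python classes. This sketch only mimics the containment structure (a document holds sections, a section holds paragraphs, a paragraph holds sentences); it is not SimpleNLG, and its rendering rules (blank line between paragraphs, title on top) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paragraph:
    sentences: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Sentences in one paragraph are joined with a single space.
        return " ".join(self.sentences)


@dataclass
class Section:
    paragraphs: List[Paragraph] = field(default_factory=list)

    def render(self) -> str:
        # Paragraphs are separated by a blank line.
        return "\n\n".join(p.render() for p in self.paragraphs)


@dataclass
class Document:
    title: str
    sections: List[Section] = field(default_factory=list)

    def render(self) -> str:
        body = "\n\n".join(s.render() for s in self.sections)
        return f"{self.title}\n\n{body}"


# One section per event, each with a single two-sentence paragraph.
doc = Document(
    "Developer Newsletter Document",
    [
        Section([Paragraph(["First event sentence.", "Attendance sentence."])]),
        Section([Paragraph(["Second event sentence.", "Attendance sentence."])]),
    ],
)
print(doc.render())
```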

Additional Features

If a speaker wants to add comments such as “Even though it was a rainy day, people showed interest to attend the session. It was good turnout to the event”, these can be captured and appended to the narrative.

Rajesh Gudikoti <xxx@in.ibm.com> conducted "Natural Language Processing" Meetup on 2019-02-20T00:00:00.000000000 at Bangalore. The number of developers attended is 30. Even though it was rainy day, people showed interest to attend the session. It was good turnout to the event.

Assume that you expected 100 people at the event, but 120 actually turned up. NLG can interpret this data and add a summary sentence: “It was overwhelming response to the event.”

Rajesh Gudikoti <xxx@in.ibm.com> conducted "Natural Language Processing" Meetup on 2019-02-20T00:00:00.000000000 at Bangalore. The number of developers attended is 120. It was overwhelming response to the event.
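
Both additional features reduce to a small amount of logic around the base narrative. The following is a sketch in plain Python rather than the post's SimpleNLG code; the function names are hypothetical, and the rule "more attendees than expected means an overwhelming response" is an assumed threshold.

```python
from typing import Optional

# Summary wording as it appears in the sample output of this post.
OVERWHELMING = "It was overwhelming response to the event."


def turnout_remark(expected: int, attended: int) -> Optional[str]:
    """Return a summary sentence when attendance beats the estimate."""
    # Assumed rule: any attendance above the estimate counts as overwhelming.
    if expected > 0 and attended > expected:
        return OVERWHELMING
    return None


def with_extras(narrative: str, expected: int, attended: int,
                comment: Optional[str] = None) -> str:
    """Append the speaker's comment and the turnout remark, when present."""
    parts = [narrative]
    if comment:
        parts.append(comment)
    remark = turnout_remark(expected, attended)
    if remark:
        parts.append(remark)
    return " ".join(parts)


print(with_extras("The number of developers attended is 120.", 100, 120))
```

A real newsletter would probably want more than one threshold, for example a different remark when attendance falls well short of the estimate.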

To keep it simple, I used only a few sentences and put them in plain text format. You can also create this narrative in HTML.

You can find the code developed for demo purposes at the GitHub link below; refer to the SimpleNLG section.
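
As one way to produce HTML, the generated paragraphs can be wrapped in tags by hand. This is a minimal sketch using only the Python standard library, with a hypothetical helper name; it is not taken from the demo code.

```python
from html import escape


def to_html(title: str, paragraphs: list) -> str:
    """Wrap a title and generated paragraphs in basic HTML tags."""
    parts = [f"<h1>{escape(title)}</h1>"]
    # Escape each paragraph so characters like '<' and '>' display literally.
    parts.extend(f"<p>{escape(p)}</p>" for p in paragraphs)
    return "\n".join(parts)


print(to_html(
    "Developer Newsletter Document",
    ["Rajesh Gudikoti <xxx@in.ibm.com> conducted a Meetup at Bangalore."],
))
```

Escaping matters here because the narratives contain an email address in angle brackets, which a browser would otherwise treat as an unknown tag and hide.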

https://github.com/ragudiko/nlg

cartoon image of a person writing and thinking
write myself or use nlg

Conclusion: Next time you need to provide such a narrative, consider whether NLG could write it for you. Some other NLG use cases are:

  1. NLG can provide product descriptions and categorization for online shopping and e-commerce and help personalize customer communication via chatbots.
  2. NLG could help in the filing of a report by producing regulatory content in a specific narrative.

My other articles

http://bit.ly/2x23ye3

ibm.biz/BdZtFt — ML in Procurement — Assist Procurement Expert


Rajesh Gudikoti
IBM Data Science in Practice

I have spent two decades in the software industry working on Java, J2EE, BPM and AI/NLP/ML. I enjoy writing articles, especially on NLP.