Wikipedia Factoid Bot (1 of 6): Intro and Configure Demo Code

Anthony Stevens
IBM watsonx Assistant
Jan 10, 2017

This is the first in a six-part series on training a bot to answer factoids. In this post, we’ll describe the basic components of answering factoids, then download and configure the demo code. The full series:

Post 1: You are here
Post 2: Identify famous people as entities using Alchemy Language
Post 3: Initialize the factoid bot’s connection to Watson Conversation
Post 4: Train a factoid bot to classify the intent of a user’s query
Post 5: Extract answers from DBpedia (Wikipedia)
Post 6: Finalize the conversation flow

You can download the NodeJS factoid bot source from GitHub now, or simply follow the tutorial and download it when you reach that step. In the meantime, here’s a working demo of the factoid bot running on Bluemix:

Screenshot of Factoid Bot running on Bluemix

Factoids About Famous People: A Starting Point

How old is Brad Pitt?

When you want to know a quick fact about nearly anything, where do you go? With 15.6 billion page views in October 2016, it seems most people go to Wikipedia, so if we could connect our bot to Wikipedia, it could access 38 million articles in 280 languages to provide impressive factoid answering skills. But to simplify our initial solution, we’ll begin with a subset of Wikipedia: the 800,000 articles about living famous people, with details such as birthdates, birthplaces, spouses and more.

However, downloading an entire Wikipedia page and doing a text search would be very inefficient. Fortunately, the crowd-sourced community at DBpedia regularly extracts and structures Wikipedia’s information into RDF format, so properties from every Wikipedia page can be accessed over HTTP using a SQL-like query language called SPARQL.

DBpedia captures Wikipedia’s content and makes it available via SPARQL
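
To make the idea concrete, here is a minimal sketch of what a SPARQL request for Brad Pitt’s birth date could look like from NodeJS. It only builds the query text and the request URL; it does not send anything. The function names are hypothetical and not taken from the demo code, and the sketch assumes DBpedia’s public endpoint at https://dbpedia.org/sparql, the dbo:birthDate ontology property, and a format parameter for requesting JSON results.

```javascript
// Assumption: DBpedia's public SPARQL endpoint.
const DBPEDIA_ENDPOINT = 'https://dbpedia.org/sparql';

// Build a SPARQL query asking for one property of one DBpedia resource.
// resourceUri: for example 'http://dbpedia.org/resource/Brad_Pitt'
// property:    for example 'dbo:birthDate' (a DBpedia ontology property)
// Hypothetical helper, not part of the demo code.
function buildFactQuery(resourceUri, property) {
  return [
    'PREFIX dbo: <http://dbpedia.org/ontology/>',
    `SELECT ?value WHERE { <${resourceUri}> ${property} ?value } LIMIT 1`
  ].join('\n');
}

// Turn the query into a GET URL. URLSearchParams handles the URL encoding.
// Assumption: the endpoint accepts a "format" parameter for JSON results.
function buildQueryUrl(query) {
  const params = new URLSearchParams({
    query: query,
    format: 'application/sparql-results+json'
  });
  return `${DBPEDIA_ENDPOINT}?${params.toString()}`;
}

const query = buildFactQuery('http://dbpedia.org/resource/Brad_Pitt', 'dbo:birthDate');
const url = buildQueryUrl(query);

console.log(query);
console.log(url);
```

Fetching that URL with any HTTP client should return a small JSON document containing the requested value, which is the part Post 5 covers in detail.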

Great! We know where our answers will come from and only need to match these answers to the textual queries submitted by users. This simplifies our natural language processing (NLP) challenge to:

A. Identify the famous person of interest (entity) in our user’s text query,
B. Classify which personal details our user wants to know (intent),
C. Get the correct DBpedia link for that person and run a SPARQL query against DBpedia,
D. Format the result into a response and, where needed, send error messages to users.

We’ll use Alchemy Language’s Entity Extraction for A with the added benefit that it also returns a link to the DBpedia page for that entity (famous person). For B and D, we rely on the Watson Conversation service while C requires nothing more than standard HTTP requests executed in NodeJS. The diagram below represents these as steps 1–9 of our factoid bot’s workflow.

High Level Workflow for Factoid Conversation Bot
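
Steps A through D can be sketched as a single function. This is only an illustration of the flow, not the demo’s actual code: answerFactoid and the three service functions are hypothetical names, and the stubs return canned values in place of real calls to Alchemy Language, Watson Conversation, and DBpedia. Real service calls would be asynchronous; the stubs here are synchronous to keep the sketch short.

```javascript
// Hypothetical orchestration of steps A-D. The service functions are passed
// in so that stubs can stand in for the real Watson and DBpedia calls.
function answerFactoid(text, services) {
  // A. Identify the famous person (entity) and their DBpedia link.
  const person = services.extractPerson(text);
  if (!person) {
    return "Sorry, I couldn't tell who you are asking about.";
  }

  // B. Classify which personal detail the user wants (intent).
  const intent = services.classifyIntent(text);
  if (!intent) {
    return `Sorry, I'm not sure what you want to know about ${person.name}.`;
  }

  // C. Look up the detail on DBpedia using the person's link.
  const value = services.lookupFact(person.dbpediaLink, intent);
  if (value === null || value === undefined) {
    return `Sorry, I couldn't find the ${intent} of ${person.name}.`;
  }

  // D. Format the answer for the user.
  return `The ${intent} of ${person.name} is ${value}.`;
}

// Stub services with canned values, for illustration only.
const stubServices = {
  extractPerson: (text) =>
    text.includes('Brad Pitt')
      ? { name: 'Brad Pitt', dbpediaLink: 'http://dbpedia.org/resource/Brad_Pitt' }
      : null,
  classifyIntent: (text) => (/how old|born/i.test(text) ? 'birthdate' : null),
  lookupFact: (link, intent) => (intent === 'birthdate' ? '1963-12-18' : null)
};

console.log(answerFactoid('How old is Brad Pitt?', stubServices));
```

Passing the services in as an argument is a design choice for this sketch: it keeps the flow testable without credentials and makes each failure path (no entity, no intent, no answer) easy to see.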

If intents, entities, SPARQL, or these Watson services are unfamiliar, don’t worry. We’ll go into each step in detail during the next four posts of this tutorial. For now, though, let’s get your instance of the bot running.

Prior Skills/Experience Required
You’ll need an IBM Bluemix account, and you’ll need to know how to create services from the Watson Developer Cloud, since instances of the Watson Conversation and Alchemy Language services are required. You should have completed the introductory Conversation tutorial and be able to create intents/entities using the Watson Conversation tooling.

If any of that sounds unfamiliar, go here for the basics of getting started with Watson Conversation and perhaps complete this tutorial on creating a basic Watson chat bot in 10 minutes. You can also learn more about intents and entities in Watson Conversation through reading these excellent articles written by Simon Burns and Joe Kozhaya.

Lastly, please post comments below about anything missing from this tutorial. I monitor comments daily and appreciate advice on improvements.

Configure Your Factoid Bot
Let’s start by running your own instance of the factoid bot, then explore how it works. Download the factoid bot source code from GitHub, then open your browser and go to the Workspaces page of your Conversation service’s tooling. Click the import workspace button (to the right of the green “Create” workspace button).

Choose the Conversation workspace file that you downloaded at:
{your factoid bot directory}/conversation/factoid_bot_workspace.json, ensure “Everything (Intents, Entities, Dialog)” is selected, and press Import.

You’ll need the workspace ID next, so click View Details in the popup that appears when you click the workspace information button (three vertical dots).

Drop-down menu for workspaces
Workspace Details

Open {your bot directory}/config/watson_config.json and replace YOUR WORKSPACE ID with your workspace id:

Conversation
"workspace_id" : "YOUR WORKSPACE ID"

Running Factoid Bot on Bluemix (skip if running locally)
If you want to run the factoid bot on Bluemix, you’ll need to ensure the names of both your Alchemy and Conversation services in Bluemix match the names given in {your bot directory}/manifest.yml. The default names are “conversation_demo_service” and “alchemy_demo_service”, so either update the manifest.yml or update your service instance names in Bluemix.

EITHER replace the Alchemy and Conversation service names with your own in the manifest.yml
OR update your Conversation and Alchemy service names in Bluemix

Now go to {your bot directory} in the Terminal window and “cf push” your app to Bluemix. If you didn’t update the service names properly in the manifest.yml, then you’ll see an error like this.

“Could not find service” error due to not naming your Conversation service properly in manifest.yml

Your factoid bot should now be running at the web page indicated at the conclusion of your successful cf push. Skip the next section unless you want to run the bot locally as well.

Running Factoid Bot Locally (skip if you pushed to Bluemix)
To run the bot locally (rather than pushing to Bluemix), first execute npm install to download the required packages. If you get the error Please try running this command again as root/Administrator, then execute sudo npm install instead. Next, you need to make sure the bot can access your services, so open {your bot directory}/config/watson_config.json and replace the USERNAME/PASSWORD/API_KEY text shown below with credentials for your own services. Go here for instructions on how to access your service credentials.

Conversation
// OPTIONAL PARAMS: Used to run app locally and NOT on Bluemix
"username" : "USERNAME",
"password" : "PASSWORD"
Alchemy Language
// OPTIONAL PARAMS: Used to run app locally and NOT on Bluemix
"api_key": "YOUR API KEY"

Now go to {your bot directory} in the Terminal window and run “npm start”. Your app should now be running at http://localhost:3000.
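
If the app starts but cannot reach your services, one thing worth checking is whether any placeholder text is still in the config. The helper below is a hypothetical sketch, not part of the demo code: it walks an already-parsed config object and reports the paths of values that still equal one of the placeholder strings shown above. The nesting and key names in exampleConfig are assumptions made for illustration; the real watson_config.json may be organized differently.

```javascript
// Placeholder strings as they appear in the config snippets of this post.
const PLACEHOLDERS = ['YOUR WORKSPACE ID', 'USERNAME', 'PASSWORD', 'YOUR API KEY'];

// Recursively collect dotted paths to values that are still placeholders.
// Hypothetical helper, not part of the demo code.
function findPlaceholders(value, path = '') {
  if (typeof value === 'string') {
    return PLACEHOLDERS.includes(value) ? [path] : [];
  }
  if (value !== null && typeof value === 'object') {
    return Object.keys(value).flatMap((key) =>
      findPlaceholders(value[key], path ? `${path}.${key}` : key)
    );
  }
  return [];
}

// Assumed config shape, with a made-up workspace ID already filled in.
const exampleConfig = {
  conversation: {
    workspace_id: 'abc-123',
    username: 'USERNAME',
    password: 'PASSWORD'
  },
  alchemy_language: {
    api_key: 'YOUR API KEY'
  }
};

// Lists the three credential fields that have not been replaced yet.
console.log(findPlaceholders(exampleConfig));
```

An empty array means no placeholder strings remain, although it cannot tell you whether the credentials you entered are valid.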

Next Steps
Now let’s dig into the code and learn how the factoid bot works. Go to the second post to learn how we identify famous people as entities using Alchemy Language.
