Build an AI Assistant with Wolfram Alpha and Wikipedia in Python

Salisu Wada
5 min read · Nov 25, 2017


Wolfram Alpha is a computational search engine that evaluates what the user asks. Imagine asking a question like “What is the current weather in London?” or “Who is the president of the United States of America?”. Wolfram Alpha can evaluate the question and respond with an answer like “15 degrees centigrade” or “Donald Trump”.

Wikipedia, unlike Wolfram Alpha, does not compute or evaluate the question; its search matches the keywords in the query. For example, Wikipedia cannot answer questions like “What is the current weather in London?” or “Who is the president of the United States of America?”, but it can look up keywords like “Donald Trump” or “London”.

In this tutorial, these two platforms (Wikipedia and Wolfram Alpha) are combined to build an intelligent assistant in the Python programming language.

Things we need

  • Make sure you have Python installed

If you prefer using a virtual environment, you can find a tutorial here on how to create one.

Get Wolfram Alpha App ID

You can register on the Developer’s Portal to create an App ID. (Note: the App ID used in this tutorial will be deleted.)


Application Workflow

The user’s input is passed to Wolfram Alpha for processing. If a result is obtained, it is returned to the user. If no result is obtained, Wolfram Alpha’s interpretation of the input is used as the keyword(s) for a Wikipedia query.
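The fallback flow described above can be sketched without any network access. This is only an illustration: `answer`, `fake_wolfram` and `fake_wikipedia` are hypothetical names that do not appear in the tutorial's code, and the two stand-in backends return made-up values.

```python
def answer(text, ask_wolfram, ask_wikipedia):
    """Try Wolfram Alpha first; fall back to Wikipedia with the interpretation."""
    # ask_wolfram is assumed to return a (result, interpretation) pair,
    # where result is None when no answer was found.
    result, interpretation = ask_wolfram(text)
    if result:
        return result
    # No result: use the interpretation (or the raw text) as the keyword.
    return ask_wikipedia(interpretation or text)


# Hypothetical stand-ins so the flow can be run offline.
def fake_wolfram(text):
    known = {"2 + 2": ("4", "2 + 2")}
    return known.get(text, (None, text.title()))


def fake_wikipedia(keyword):
    return "Wikipedia summary for " + keyword


print(answer("2 + 2", fake_wolfram, fake_wikipedia))
print(answer("london", fake_wolfram, fake_wikipedia))
```

Passing the two backends in as arguments keeps the decision logic separate from the actual API calls, which are implemented in the sections that follow.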


Let’s start coding

Let’s begin by installing all the required Python packages using pip

pip install wolframalpha
pip install wikipedia
pip install requests

  • Create a Python file and open it with a code editor of your choice
  • Import the installed packages

import wolframalpha
import wikipedia
import requests

Implementing Wikipedia Search

Let’s create a function “search_wiki” that takes the keyword as a parameter

# Search Wikipedia for a keyword and print the summary of the best match
def search_wiki(keyword=''):
    # Run the query
    searchResults = wikipedia.search(keyword)
    # If there is no result, say so and stop
    if not searchResults:
        print("No result from Wikipedia")
        return
    # Fetch the page for the top hit
    try:
        page = wikipedia.page(searchResults[0])
    except wikipedia.DisambiguationError as err:
        # Ambiguous title: select the first option in the list
        page = wikipedia.page(err.options[0])
    # page.title and page.summary are already text in Python 3,
    # so no utf-8 encoding step is needed
    wikiTitle = page.title
    wikiSummary = page.summary
    # Print the result
    print(wikiSummary)

The wikipedia.DisambiguationError is raised when the keyword matches several Wikipedia pages, as shown below. In that case, the first option (at index 0) is selected.

wikipedia.DisambiguationError:

“Trump” may refer to:
Donald Trump
Trump (card games)

Tromp (disambiguation)

Implementing Wolfram Alpha Search

Create an instance of the Wolfram Alpha client by passing the App ID to its class constructor

appId = 'APER4E-58XJGHAVAK'
client = wolframalpha.Client(appId)

The image below shows a sample response returned by Wolfram Alpha. The important keys are “@success”, “@numpods” and “pod”:

  1. “@success”: indicates whether Wolfram Alpha was able to resolve the query
  2. “@numpods”: the number of pods (results) returned
  3. “pod”: a list containing the different results; each pod can also contain “subpods”

Result Sample from Wolfram Query

  • The first element of the pod list, “pod[0]”, is the query interpretation; its first subpod has a key “plaintext” containing the interpreted query
  • The second element, “pod[1]”, is the response with the highest confidence value (weight). Similarly, it has a subpod with a key “plaintext” containing the answer.

Note: “pod[1]” is treated as the result only if its “primary” key is “true” or its “title” contains “Result” or “Definition”

You can read more about the “pods” and “subpods” here
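To make the structure concrete, here is a hand-written dictionary that mimics the keys described above. It is an illustrative assumption, not captured Wolfram Alpha output, and `is_answer_pod` is a hypothetical helper name used only to show the selection rule from the note.

```python
# Hand-written sample that imitates the shape of a response (not real output).
sample = {
    "@success": "true",
    "@numpods": "2",
    "pod": [
        {"@title": "Input", "subpod": {"plaintext": "2 + 2"}},
        {"@title": "Result", "@primary": "true", "subpod": {"plaintext": "4"}},
    ],
}


def is_answer_pod(pod):
    """Apply the rule from the note: primary=true, or a Result/Definition title."""
    title = pod["@title"].lower()
    return (
        "result" in title
        or "definition" in title
        or pod.get("@primary", "false") == "true"
    )


print(is_answer_pod(sample["pod"][0]))  # interpretation pod
print(is_answer_pod(sample["pod"][1]))  # answer pod
```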

So, let’s create a function “search” that takes the search text as a parameter.

def search(text=''):
    res = client.query(text)
    # Wolfram cannot resolve the question
    if res['@success'] == 'false':
        print('Question cannot be resolved')
    # Wolfram was able to resolve the question
    else:
        result = ''
        # pod[0] is the question
        pod0 = res['pod'][0]
        # pod[1] may contain the answer
        pod1 = res['pod'][1]
        # check whether pod1 has primary=true or a title containing result/definition
        title = pod1['@title'].lower()
        if 'definition' in title or 'result' in title or pod1.get('@primary', 'false') == 'true':
            # extract the result from pod1
            result = resolveListOrDict(pod1['subpod'])
            print(result)
        else:
            # extract Wolfram's interpretation of the question from pod0
            question = resolveListOrDict(pod0['subpod'])
            # remove the parenthesised part
            question = removeBrackets(question)
            # search Wikipedia for a response
            search_wiki(question)

Extracting Item from Pod — Resolving List or Dictionary Issue

If the pod has several subpods, “subpod” is a list, so we take its first element and return the value of its “plaintext” key. Otherwise “subpod” is a single dictionary and we return its “plaintext” value directly.

def resolveListOrDict(variable):
    if isinstance(variable, list):
        return variable[0]['plaintext']
    else:
        return variable['plaintext']
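As a quick check of both branches, the function can be exercised with two hand-made inputs. The `single` and `several` values below are invented for illustration and are not Wolfram Alpha output.

```python
def resolveListOrDict(variable):
    # A list means several subpods: use the first one.
    if isinstance(variable, list):
        return variable[0]['plaintext']
    # Otherwise it is a single subpod dictionary.
    return variable['plaintext']


single = {'plaintext': '4'}                               # one subpod
several = [{'plaintext': 'first'}, {'plaintext': 'second'}]  # many subpods

print(resolveListOrDict(single))
print(resolveListOrDict(several))
```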

Remove Parentheses (Brackets)

Here, we split the text at the opening bracket, keep the first part and strip the trailing space, e.g. “Barack Obama (Politician)” will return “Barack Obama”

def removeBrackets(variable):
    return variable.split('(')[0].strip()
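A short usage sketch, assuming the variant that also strips surrounding whitespace so the keyword passed to Wikipedia has no trailing space:

```python
def removeBrackets(variable):
    # Keep everything before the first "(" and drop surrounding whitespace.
    return variable.split('(')[0].strip()


print(removeBrackets('Barack Obama (Politician)'))
print(removeBrackets('London'))  # text without brackets is returned unchanged
```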

Enhancing the Search Result with Primary Image

It would be better to attach a primary image to the search result. For example, searching for “Albert Einstein” would return both the text and his image. To get the primary image for a query, call the Wikipedia REST endpoint below, setting “titles” to the keyword:

https://en.wikipedia.org/w/api.php?action=query&titles=Nigeria&format=json&piprop=original&prop=pageimages
Result format

The “pages” dictionary may contain zero or more items. Usually, the first item holds the primary image.

def primaryImage(title=''):
    url = 'https://en.wikipedia.org/w/api.php'
    data = {'action': 'query', 'prop': 'pageimages', 'format': 'json', 'piprop': 'original', 'titles': title}
    try:
        res = requests.get(url, params=data)
        pages = res.json()['query']['pages']
        # take the first page entry (keys() is not indexable in Python 3)
        key = list(pages.keys())[0]
        imageUrl = pages[key]['original']['source']
        print(imageUrl)
    except Exception as err:
        print('Exception while finding image:= ' + str(err))
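The parsing step can also be tried offline. The sketch below assumes the response has the “query” → “pages” → “original” → “source” nesting that primaryImage reads. The `sample` payload, its page id and its URL are placeholders invented for illustration, and `image_url_from_response` is a hypothetical helper name.

```python
def image_url_from_response(payload):
    """Return the first available original image URL, or None if there is none."""
    pages = payload.get('query', {}).get('pages', {})
    for page in pages.values():
        original = page.get('original')
        if original:
            return original.get('source')
    return None


# Placeholder payload (not a real API response).
sample = {
    'query': {
        'pages': {
            '12345': {
                'title': 'Example',
                'original': {'source': 'https://example.org/image.jpg'},
            }
        }
    }
}

print(image_url_from_response(sample))
print(image_url_from_response({'query': {'pages': {}}}))
```

Returning None instead of raising lets the caller decide what to show when a page has no image.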

Full Code Listing

Full code can be found on GitHub
