Build an AI Assistant with Wolfram Alpha and Wikipedia in Python

5 min read · Nov 25, 2017


Wolfram Alpha is a computational search engine that evaluates what the user asks. Imagine asking a question like “What is the current weather in London” or “Who is the president of the United States of America”. Wolfram Alpha evaluates the question and responds with an answer like “15 degrees centigrade” or “Donald Trump”.

Wikipedia, by contrast, does not compute or evaluate the question; it searches for the keywords in the query. For example, Wikipedia cannot answer questions like “What is the current weather in London” or “Who is the president of the United States of America”, but it can search for keywords like “Donald Trump” or “London”.

In this tutorial, these two platforms (Wikipedia and Wolfram Alpha) are combined to build an intelligent assistant in the Python programming language.

Things we need

Make sure you have Python installed

If you prefer using a virtual environment, you can find a tutorial here on how to create one

Get Wolfram Alpha App ID

You can register on the Developer’s Portal to create an AppID. (Note: the AppID used in this tutorial will be deleted, so create your own.)

Application Workflow

The user’s input is passed to Wolfram Alpha for processing. If a result is obtained, it is returned to the user. If no result is obtained, Wolfram Alpha’s interpretation of the input is used as the keyword(s) for a Wikipedia query.
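
The fallback described above can be sketched without touching either service. In this minimal sketch, answer, fake_wolfram and fake_wikipedia are hypothetical names, and the two lookups are passed in as plain functions so the control flow can be read and tried on its own:

```python
def answer(query, ask_wolfram, ask_wikipedia):
    """Return Wolfram's result if there is one, else fall back to Wikipedia.

    ask_wolfram(query) is assumed to return a (result, interpretation) pair,
    where result is None when there is no direct answer.
    ask_wikipedia(keyword) is assumed to return a summary string.
    """
    result, interpretation = ask_wolfram(query)
    if result is not None:
        return result
    # No direct answer: use the interpretation as the Wikipedia keyword.
    return ask_wikipedia(interpretation)


# Stand-in lookups for illustration only (no network calls are made).
def fake_wolfram(query):
    if query == '2 + 2':
        return '4', '2 + 2'
    return None, 'Barack Obama'


def fake_wikipedia(keyword):
    return 'Summary of ' + keyword


print(answer('2 + 2', fake_wolfram, fake_wikipedia))
print(answer('Obama', fake_wolfram, fake_wikipedia))
```

The real lookups built in the rest of this tutorial print their output instead of returning it, so this sketch only mirrors the decision logic.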

Let’s start coding

Let’s begin by installing the required Python packages using pip

pip install wolframalpha
pip install wikipedia
pip install requests

Create a Python file and open it with any code editor of your choice
Import the packages installed above

import wolframalpha
import wikipedia
import requests

Implementing Wikipedia Search

Let’s create a function “search_wiki” that takes the keyword as parameter

# function that searches Wikipedia for a keyword and prints the summary
def search_wiki(keyword=''):
  # running the query
  searchResults = wikipedia.search(keyword)
  # If there is no result, print no result
  if not searchResults:
    print("No result from Wikipedia")
    return
  # Search for page... try block 
  try:
    page = wikipedia.page(searchResults[0])
  except wikipedia.DisambiguationError as err:
    # Select the first item in the list
    page = wikipedia.page(err.options[0])
  # in Python 3 the title and summary are already text (str), so no encoding is needed
  wikiTitle = page.title
  wikiSummary = page.summary
  # printing the result
  print(wikiSummary)

The wikipedia.DisambiguationError is raised when the title is ambiguous and Wikipedia returns multiple candidate pages, as shown below. In that case, the first candidate (index 0) is selected.
wikipedia.DisambiguationError:
“Trump” may refer to:
Donald Trump
Trump (card games)
…
Tromp (disambiguation)

Implementing Wolfram Alpha Search

Create an instance of the Wolfram Alpha client by passing the AppID to its class constructor

appId = 'APER4E-58XJGHAVAK'
client = wolframalpha.Client(appId)
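
Since an AppID is tied to a developer account, it may be preferable to keep it out of the source file. A minimal sketch, assuming the ID is stored in an environment variable; the variable name WOLFRAM_APP_ID and the helper load_app_id are hypothetical names chosen for this example:

```python
import os


def load_app_id(env_var='WOLFRAM_APP_ID'):
    """Read the AppID from an environment variable instead of the source file."""
    app_id = os.environ.get(env_var, '').strip()
    if not app_id:
        raise RuntimeError('Set the ' + env_var + ' environment variable first')
    return app_id
```

The client could then be created with wolframalpha.Client(load_app_id()).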

A sample response returned by Wolfram Alpha contains three important keys: “@success”, “@numpods” and “pod”.

“@success”: whether Wolfram Alpha was able to resolve the query
“@numpods”: the number of results returned
“pod”: a list containing the different results. Each pod can also contain “subpods”

The first element of the pod list, “pod[0]”, is the query interpretation; its first subpod has a key “plaintext” containing the interpretation.
The second element, “pod[1]”, is the response with the highest confidence value (weight). Similarly, it has a subpod with a key “plaintext” containing the answer.

Note: “pod[1]” is treated as the result only if its “primary” key is “true” or its title contains “Result” or “Definition”.
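
To make the structure concrete, here is a minimal sketch that applies the rule from the note to a hand-written dictionary. The dictionary is an assumption shaped after the keys described above, not captured output, and is_answer_pod is a hypothetical helper name:

```python
# Hand-written stand-in for a parsed response; real responses carry more keys.
sample = {
    '@success': 'true',
    '@numpods': '2',
    'pod': [
        {'@title': 'Input interpretation',
         'subpod': {'plaintext': 'Barack Obama (politician)'}},
        {'@title': 'Result', '@primary': 'true',
         'subpod': {'plaintext': 'example answer text'}},
    ],
}


def is_answer_pod(pod):
    """True for a primary pod or one titled Result/Definition."""
    title = pod.get('@title', '').lower()
    return (pod.get('@primary', 'false') == 'true'
            or 'result' in title
            or 'definition' in title)


interpretation = sample['pod'][0]['subpod']['plaintext']
second_pod = sample['pod'][1]
print(interpretation)
print(is_answer_pod(second_pod), second_pod['subpod']['plaintext'])
```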

So, let’s create a method “search” and pass the “search text” as a parameter.

def search(text=''):
  res = client.query(text)
  # Wolfram cannot resolve the question
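  # the flag may be the string 'false' or the boolean False depending on the library version (assumption)
  if str(res['@success']).lower() == 'false':
    print('Question cannot be resolved')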
  # Wolfram was able to resolve question
  else:
    result = ''
    # pod[0] is the question
    pod0 = res['pod'][0]
    # pod[1] may contain the answer
    pod1 = res['pod'][1]
    # checking if pod1 has primary=true or title=result|definition
    if ('definition' in pod1['@title'].lower()) or ('result' in pod1['@title'].lower()) or (str(pod1.get('@primary', 'false')).lower() == 'true'):
      # extracting result from pod1
      result = resolveListOrDict(pod1['subpod'])
      print(result)
    else:
      # extracting wolfram question interpretation from pod0
      question = resolveListOrDict(pod0['subpod'])
      # removing unnecessary parenthesis
      question = removeBrackets(question)
      # searching for response from wikipedia
      search_wiki(question)

Extracting Item from Pod — Resolving List or Dictionary Issue

If the pod has several subpods, “subpod” is a list, so we select its first element and return the value of its “plaintext” key. Otherwise “subpod” is a single dictionary and we return the value of its “plaintext” key directly.

def resolveListOrDict(variable):
  if isinstance(variable, list):
    return variable[0]['plaintext']
  else:
    return variable['plaintext']
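
A quick check of both cases, with the function repeated so the snippet runs on its own. The sample values are made up for illustration:

```python
def resolveListOrDict(variable):
    # Several subpods arrive as a list; a single subpod arrives as a dict.
    if isinstance(variable, list):
        return variable[0]['plaintext']
    return variable['plaintext']


single = {'plaintext': 'first answer'}
several = [{'plaintext': 'first answer'}, {'plaintext': 'second answer'}]

print(resolveListOrDict(single))
print(resolveListOrDict(several))
```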

Remove Parentheses (Brackets)

Here, we split the text at the opening bracket and keep the first part, e.g. “Barack Obama (Politician)” will return “Barack Obama”.

def removeBrackets(variable):
  # keep the text before the first '(' and trim the space left in front of it
  return variable.split('(')[0].strip()
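
A short usage check, with the function repeated here with a .strip() call so that the space in front of the opening bracket is trimmed. The input strings are made up for illustration:

```python
def removeBrackets(variable):
    # Keep the text before the first '(' and trim surrounding whitespace.
    return variable.split('(')[0].strip()


with_brackets = removeBrackets('Barack Obama (Politician)')
without_brackets = removeBrackets('London')
print(with_brackets)
print(without_brackets)
```

Text without any bracket passes through unchanged, so the function is safe to call on every interpretation.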

Enhancing the Search Result with Primary Image

It would be better if we could attach a primary image to the search result. For example, searching for “Albert Einstein” would return both text and his image. To get the primary image for a page from Wikipedia, query its REST endpoint with the page title passed as the “titles” parameter (titles=Nigeria in the example below):

https://en.wikipedia.org/w/api.php?action=query&titles=Nigeria&format=json&piprop=original&prop=pageimages

The “pages” dictionary in the response may contain zero or more items, keyed by page ID. Usually, the first item holds the primary image under “original” and then “source”.

def primaryImage(title=''):
    url = 'https://en.wikipedia.org/w/api.php'
    data = {'action':'query', 'prop':'pageimages','format':'json','piprop':'original','titles':title}
    try:
        res = requests.get(url, params=data)
        pages = res.json()['query']['pages']
        # take the first page ID; dictionary views cannot be indexed in Python 3
        key = next(iter(pages))
        imageUrl = pages[key]['original']['source']
        print(imageUrl)
    except Exception as err:
        print('Exception while finding image:= '+str(err))
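
The lookup can also be tried offline against a hand-written response of the same shape. This is a minimal sketch: the page ID and image URL below are made up, and extract_image_url is a hypothetical helper that returns None instead of raising when a page has no image:

```python
# Hand-written stand-in for res.json(); the page ID and URL are made up.
sample = {
    'query': {
        'pages': {
            '12345': {
                'pageid': 12345,
                'title': 'Nigeria',
                'original': {'source': 'https://example.org/flag.svg'},
            }
        }
    }
}


def extract_image_url(payload):
    """Return the first original image URL found among the pages, or None."""
    pages = payload.get('query', {}).get('pages', {})
    for page in pages.values():
        original = page.get('original')
        if original and 'source' in original:
            return original['source']
    return None


print(extract_image_url(sample))
```

Returning None for a missing image lets the caller decide whether to skip the image, rather than relying on a broad exception handler.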

Full Code Listing

Full code can be found on GitHub