A guide to building your own A.I. Voice Assistant using Python!

Saurav Pawar
May 12 · 9 min read
Voice assistants are a boon for lazy people!

Have you ever wondered how cool it would be to have your own virtual A.I. assistant (just like J.A.R.V.I.S.)? Imagine how much easier it would be to send emails without typing a single word, search Wikipedia without opening a web browser, and perform many other daily tasks with a single voice command.

In this tutorial, we are gonna learn the exact same thing, i.e. how to code and build your own A.I. virtual voice assistant using Python.

Before jumping into the tutorial, let’s look at the tasks our virtual voice assistant will be able to perform for us:

  • It can send emails for you.
  • It can play music for you.
  • It can do Wikipedia searches for you.
  • It can open websites like Google, YouTube, Stack Overflow, freeCodeCamp, etc., in a web browser.
  • It can open your code editor or IDE with a single voice command.

And all these things can happen without you typing a single word!

Now let’s get into the actual building of our voice assistant 😎

And yes, don’t forget to decide on a name for your voice assistant beforehand :P

Setting up the environment to code:

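I used PyCharm to code this up, but please feel free to use any other IDE you are comfortable with.

Firstly, we will import/install the necessary libraries. Three of them (pyttsx3, SpeechRecognition and wikipedia) are installed with pip; the rest ship with Python:

  • pyttsx3
  • datetime
  • SpeechRecognition (imported as speech_recognition)
  • wikipedia
  • webbrowser
  • os
  • smtplib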
Defining a speak function:

The first and foremost thing for an A.I. voice assistant is to speak. To make our bot speak, we will code a speak() function which takes text as input and says it out loud.

def speak(audio):
    pass  # for now; we will write the body later

To turn that text into sound, we need a text-to-speech engine. So, we are going to install a module named pyttsx3.

What is pyttsx3?

  • It is a Python library which converts text into speech.
  • It works offline, and it is compatible with both Python 2 and Python 3.

Installation:

pip install pyttsx3

After successfully installing pyttsx3, import this module in your program.

Usage:

import pyttsx3

engine = pyttsx3.init('sapi5')  # 'sapi5' is the Windows speech driver
voices = engine.getProperty('voices')  # list of the voices installed on the system
engine.setProperty('voice', voices[1].id)  # index 0 is usually a male voice, 1 a female voice

What is sapi5?

  • Microsoft Speech API (SAPI5) is the technology for voice recognition and synthesis provided by Microsoft.
  • pyttsx3 uses it as its speech driver on Windows, so the 'sapi5' argument only works there.

What Is VoiceId?

The voice id lets us select between the voices installed on the system. On a typical Windows setup:

  • voices[0].id = Male voice
  • voices[1].id = Female voice
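Not every machine has two voices installed, and 'sapi5' exists only on Windows, so indexing the voice list directly can fail. Here is a minimal sketch of a more forgiving setup. The helper names choose_voice_id and make_engine are my own, not part of pyttsx3, and the sketch assumes that calling pyttsx3.init() without a driver name lets the library pick the driver for the current operating system.

```python
from types import SimpleNamespace


def choose_voice_id(voices, preferred_index=1):
    """Return the id of the preferred voice, falling back to the first one."""
    if not voices:
        raise ValueError("no speech voices are installed")
    if 0 <= preferred_index < len(voices):
        return voices[preferred_index].id
    return voices[0].id


def make_engine(preferred_index=1):
    """Create a pyttsx3 engine and select a voice that actually exists."""
    import pyttsx3  # third-party; imported here so the helper above stays dependency-free

    engine = pyttsx3.init()  # no driver name given: pyttsx3 chooses one for this OS
    voices = engine.getProperty("voices")
    engine.setProperty("voice", choose_voice_id(voices, preferred_index))
    return engine


# Stand-in objects that only mimic the .id attribute of real pyttsx3 voices.
fake_voices = [SimpleNamespace(id="voice-a")]
print(choose_voice_id(fake_voices, preferred_index=1))  # voice-a (fallback)
```

With this in place, engine = make_engine() can replace the three setup lines, and the rest of the tutorial works unchanged.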

Writing our own speak() function:

def speak(audio):
    engine.say(audio)
    engine.runAndWait()  # Without this command, speech will not be audible to us.

Creating our main function:

Now, we will create the main block of the program, and inside it, we will call our customized speak() function.

if __name__ == "__main__":
    speak('Hello Sir, I am Friday, your Artificial intelligence assistant. Please tell me how may I help you')

P.S. I have named my voice assistant Friday 😁

Whatever you pass to this speak() function will be converted into speech. Congratulations! With this, our voice assistant has its own voice, and it is ready to speak!

Coding the wishMe() function:

Now, we are going to code a wishMe() function, which makes our voice assistant greet us according to the time on the computer.

To provide the current time to our assistant, we need a module called datetime. Import this module into your program with the following statement:

import datetime

Now, let’s start coding our wishMe() function:

def wishMe():
    hour = int(datetime.datetime.now().hour)

Here, we have stored the current hour (an integer from 0 to 23) in a variable named hour. Now, we will use this hour value inside an if-elif-else chain.

def wishMe():
    hour = int(datetime.datetime.now().hour)
    if hour >= 0 and hour < 12:
        speak("Good Morning!")
    elif hour >= 12 and hour < 18:
        speak("Good Afternoon!")
    else:
        speak("Good Evening!")

    speak('Hello Sir, I am Friday, your Artificial intelligence assistant. Please tell me how may I help you')
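If you want to check the greeting logic without waiting for the clock to change, one option is to separate the decision from the speaking. This is only a sketch: greeting_for_hour is a helper name I made up, and it returns the text instead of calling speak().

```python
import datetime


def greeting_for_hour(hour):
    """Map an hour of the day (0-23) to the greeting used in wishMe()."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if hour < 12:
        return "Good Morning!"
    if hour < 18:
        return "Good Afternoon!"
    return "Good Evening!"


# Greeting for the current system time.
print(greeting_for_hour(datetime.datetime.now().hour))
```

Inside wishMe(), the whole if-elif-else chain could then become a single speak(greeting_for_hour(hour)) call.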

Defining takeCommand() function:

The next most important thing for our voice assistant is that it should be able to take commands through the microphone of our system. So, let’s begin to code our takeCommand() function.

The takeCommand() function listens to us through the microphone and returns what we said as a string.

But, before defining the takeCommand() function, we need to install a package called SpeechRecognition, and the command is as follows:

pip install SpeechRecognition

Reading from the microphone also requires the PyAudio package, so install it as well:

pip install PyAudio

After successfully installing these, import the module into the program. Note that the import name is speech_recognition, not the name used with pip:

import speech_recognition as sr

Let’s start coding our takeCommand() function:

def takeCommand():
    # It takes microphone input from the user and returns string output
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1  # seconds of silence before a phrase is considered complete
        audio = r.listen(source)

That is the listening half of takeCommand(). Now we are going to add a try and except block, inside the same function, to handle recognition errors.

    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-in')  # Using Google for voice recognition.
        print(f"User said: {query}\n")  # User query will be printed.
    except Exception as e:
        # print(e)  # use only if you want to print the error!
        print("Say that again please...")  # printed when the speech could not be recognized
        return "None"  # the string "None" (not the None object) is returned
    return query
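Because takeCommand() returns the string "None" when recognition fails, the caller has to decide what to do with it. One possible approach, sketched below, is a small retry wrapper. The name listen_with_retries and the limit of three attempts are my own choices, and the demo uses a scripted stand-in instead of a real microphone.

```python
def listen_with_retries(take_command, max_attempts=3):
    """Call take_command until it returns something other than "None".

    Returns the recognized text in lower case, or None if every attempt failed.
    """
    for _ in range(max_attempts):
        query = take_command()
        if query != "None":
            return query.lower()
    return None


# A scripted stand-in for takeCommand(): it fails once, then succeeds.
replies = iter(["None", "Open YouTube"])
print(listen_with_retries(lambda: next(replies)))  # open youtube
```

In the assistant itself you would call listen_with_retries(takeCommand) and skip the task logic whenever it returns None.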

Now we can finally start defining our tasks and get the desired output from them.

Task 1: To search something on Wikipedia:

To do Wikipedia searches, we need to install and import the wikipedia module into our program.

The command for installing the wikipedia module is as follows:

pip install wikipedia

After successfully installing the wikipedia module, import it into the program:

import wikipedia

if __name__ == "__main__":
    wishMe()
    while True:
        query = takeCommand().lower()  # Converting user query into lower case
        # Logic for executing tasks based on query
        if 'wikipedia' in query:  # if wikipedia found in the query then this block will be executed
            speak('Searching Wikipedia...')
            query = query.replace("wikipedia", "")
            results = wikipedia.summary(query, sentences=5)
            speak("According to Wikipedia")
            print(results)
            speak(results)

In the above code, we have used an if statement to check whether the word wikipedia is in the user’s query. If it is, the first five sentences of the summary of the matching Wikipedia page are converted to speech with the help of the speak function. You can change the sentences argument from 5 to any number you want.
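A search can also fail, for example when the topic is ambiguous or no page exists. The following sketch shows one way to handle that. The function names extract_topic and summarize are mine, and the sketch assumes the wikipedia package exposes DisambiguationError (with an options list) and PageError under wikipedia.exceptions.

```python
def extract_topic(query):
    """Drop the trigger word and tidy the whitespace, leaving the search topic."""
    topic = query.lower().replace("wikipedia", " ")
    return " ".join(topic.split())


def summarize(topic, sentences=2):
    """Return a short summary, or a spoken-friendly message when the lookup fails."""
    import wikipedia  # third-party; imported here so extract_topic works without it

    try:
        return wikipedia.summary(topic, sentences=sentences)
    except wikipedia.exceptions.DisambiguationError as error:
        suggestions = ", ".join(error.options[:3])
        return f"{topic} is ambiguous. Did you mean one of: {suggestions}?"
    except wikipedia.exceptions.PageError:
        return f"I could not find a Wikipedia page for {topic}."


print(extract_topic("Wikipedia Alan Turing"))  # alan turing
```

In the main loop, speak(summarize(extract_topic(query))) would then cover both the happy path and the failures.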

Task 2: To open YouTube in web-browser:

To open any website, we first need to import a module called webbrowser.

It is a built-in module, so we do not need to install it with pip; we can import it directly:

import webbrowser

Code:

        elif 'open youtube' in query:
            webbrowser.open("https://youtube.com")  # a full URL with https:// opens reliably across platforms

Here, we are using the elif statement to check whether “open youtube” is in the user’s query. Let’s suppose the user gives the command “Please, open YouTube.”

After lower-casing, “open youtube” will be in the user’s query, so the elif condition will be true and the code will be executed.

Task 3: To open Google search engine in a web-browser:

        elif 'open google' in query:
            webbrowser.open("https://google.com")

We are opening Google in a web browser by applying the same logic that we used for opening YouTube.

Task 4: To play music:

To play music, we need to import a module called os. It is built in, so import it directly:

import os

        elif 'play music' in query:
            music_dir = 'music_dir_of_the_user'  # replace with the path to your music folder
            songs = os.listdir(music_dir)
            print(songs)
            os.startfile(os.path.join(music_dir, songs[0]))  # opens the first file in the list

In the above code, we first list all the files in the user’s music directory with os.listdir, and then open one of them with os.startfile, which hands the file to the default media player. Note that os.startfile is only available on Windows.

As written, the code always plays the first song in the list (songs[0]); change the index to play a different one. You can also pick a random song with the help of the random module, so that every time you give the command to play music, the assistant plays a random song from the directory.
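Here is a rough sketch of the random-song idea. The function name pick_random_song and the list of audio extensions are assumptions of mine; the extension filter is there so that cover images or playlists in the folder are not "played". The demo creates a temporary folder so that it runs anywhere.

```python
import random
import tempfile
from pathlib import Path

AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a"}  # assumed list; adjust to your library


def pick_random_song(music_dir, rng=random):
    """Return the path of a random audio file in music_dir, or None if there is none."""
    songs = sorted(
        path
        for path in Path(music_dir).iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
    if not songs:
        return None
    return rng.choice(songs)


# Demo with a throwaway folder containing one song and one non-audio file.
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "song.mp3").touch()
    (Path(folder) / "cover.jpg").touch()
    print(pick_random_song(folder).name)  # song.mp3
```

The returned path can be passed to os.startfile in place of os.path.join(music_dir, songs[0]).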

Task 5: To know the current time:

        elif 'the time' in query:
            strTime = datetime.datetime.now().strftime("%H:%M:%S")
            speak(f"Sir, the time is {strTime}")

In the above code, we are calling datetime.datetime.now() to get the current system time, formatting it as hours, minutes and seconds with strftime, and storing the result in a variable called strTime.

After storing the time in strTime, we pass it to the speak function inside an f-string, and hence the time string is converted into speech.
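A 24-hour string with seconds, such as 14:05:09, can sound a bit robotic when read aloud. As an optional variation, here is a sketch that builds a 12-hour phrase instead. The helper name spoken_time is my own.

```python
import datetime


def spoken_time(moment):
    """Format a datetime as a 12-hour clock phrase such as '2:05 PM'."""
    hour12 = moment.hour % 12 or 12  # 0 and 12 o'clock are both spoken as 12
    suffix = "AM" if moment.hour < 12 else "PM"
    return f"{hour12}:{moment.minute:02d} {suffix}"


print(spoken_time(datetime.datetime(2020, 5, 12, 14, 5)))  # 2:05 PM
```

In the assistant, this would be used as speak(f"Sir, the time is {spoken_time(datetime.datetime.now())}").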

Task 6: To open Stack Overflow:

        elif 'open stack overflow' in query:
            webbrowser.open('https://stackoverflow.com')

We used the same logic as we used for Google search engine and YouTube.

Task 7: To open freeCodeCamp:

        elif 'open free code camp' in query:
            webbrowser.open('https://freecodecamp.org')

Again, this follows the same logic as Google search engine and YouTube.
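Since all four website tasks follow the same pattern, one possible refactoring is to keep the trigger phrases and URLs in a dictionary and look them up in a loop. This is just a sketch; SITES, find_site and open_site are names I picked for illustration.

```python
import webbrowser

# Trigger phrase (lower case) -> URL to open.
SITES = {
    "open youtube": "https://youtube.com",
    "open google": "https://google.com",
    "open stack overflow": "https://stackoverflow.com",
    "open free code camp": "https://freecodecamp.org",
}


def find_site(query):
    """Return the URL whose trigger phrase appears in the query, or None."""
    query = query.lower()
    for phrase, url in SITES.items():
        if phrase in query:
            return url
    return None


def open_site(query):
    """Open the matching site in the default browser; return the URL or None."""
    url = find_site(query)
    if url is not None:
        webbrowser.open(url)
    return url


print(find_site("Please open YouTube"))  # https://youtube.com
```

Adding a new website then only takes one more line in the dictionary instead of another elif branch.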

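Task 8: To open PyCharm (or any IDE):

        elif 'open code' in query:
            codePath = "/Applications/PyCharm CE.app"  # path to the application on your machine
            os.startfile(codePath)

To open PyCharm or any other application, we need the path of the application on our machine. Keep in mind that os.startfile exists only on Windows, whereas the path shown here is a macOS-style path, so adjust both to your operating system.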
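For readers who are not on Windows, a cross-platform launcher could look roughly like this. The helper names are mine, and the sketch assumes the open command on macOS and the xdg-open command on Linux desktops, which may not be present on every system.

```python
import os
import subprocess
import sys


def launcher_command(path, platform=sys.platform):
    """Return the command that opens path, or None where os.startfile should be used."""
    if platform.startswith("win"):
        return None  # Windows has os.startfile, so no external command is needed
    if platform == "darwin":
        return ["open", path]  # macOS
    return ["xdg-open", path]  # assumption: a Linux desktop with xdg-utils installed


def open_with_default_app(path):
    """Open a file or application with whatever the operating system provides."""
    command = launcher_command(path)
    if command is None:
        os.startfile(path)  # only exists on Windows
    else:
        subprocess.run(command, check=False)


print(launcher_command("/Applications/PyCharm CE.app", platform="darwin"))
# ['open', '/Applications/PyCharm CE.app']
```

The same open_with_default_app helper would also work for the music task, since that code relies on os.startfile too.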

Task 9: To send email:

To send an email, we need to import a module called smtplib.

What is smtplib?

  • smtplib is a built-in Python module that lets our program talk to a mail server using SMTP.
  • Simple Mail Transfer Protocol (SMTP) is a protocol that allows us to send emails and to route emails between different mail servers.

The smtplib.SMTP class has an instance method called sendmail. This instance method allows us to send an email.

It takes 3 parameters:

  • The sender: Email address of the sender.
  • The receiver: Email address of the receiver (or a list of addresses).
  • The message: A string message which needs to be sent to one or more recipients.

Now, we will create a sendEmail() function, which will help us to send emails to one or more recipients.

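def sendEmail(to, content):
    server = smtplib.SMTP('smtp.gmail.com', 587)  # Gmail's SMTP server and its STARTTLS port
    server.ehlo()
    server.starttls()  # switch the connection to an encrypted one before logging in
    server.login('youremail@gmail.com', 'your-password')
    server.sendmail('youremail@gmail.com', to, content)
    server.close()

Note: When this tutorial was written, Gmail only accepted this kind of login if the "less secure apps" setting was enabled on the account. Google has since phased that setting out for most accounts, so you will likely need an app password (with 2-Step Verification turned on) instead. Without one of the two, the sendEmail function won’t work.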
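Hard-coding a password in the script is risky if you ever share the code. As a hedged alternative, the sketch below reads the credentials from environment variables and builds a proper message with a subject line. The variable names ASSISTANT_EMAIL and ASSISTANT_EMAIL_PASSWORD, the subject text, and the function names are all placeholders of my own.

```python
import os
import smtplib
from email.message import EmailMessage


def build_message(sender, to, subject, content):
    """Create an email with headers, so it does not arrive without a subject."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = to
    message["Subject"] = subject
    message.set_content(content)
    return message


def send_email(to, content):
    """Send content to the given address using credentials from the environment."""
    sender = os.environ["ASSISTANT_EMAIL"]  # hypothetical variable name
    password = os.environ["ASSISTANT_EMAIL_PASSWORD"]  # hypothetical variable name
    message = build_message(sender, to, "Message from your assistant", content)
    with smtplib.SMTP("smtp.gmail.com", 587) as server:  # connection is closed automatically
        server.starttls()
        server.login(sender, password)
        server.send_message(message)


# Only builds the message; nothing is sent here.
print(build_message("me@example.com", "you@example.com", "Hi", "Hello there")["To"])
# you@example.com
```

Set the two environment variables before starting the assistant, and call send_email(to, content) wherever the tutorial calls sendEmail(to, content).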

Calling sendEmail() function inside the main block:

        elif "email to receiver's name" in query:  # placeholder: use the name you will actually say
            try:
                speak("What should I say?")
                content = takeCommand()
                to = "receiver's email id"  # placeholder: the receiver's email address
                sendEmail(to, content)
                speak("Email has been sent!")
            except Exception as e:
                print(e)
                speak("Sorry sir. I am not able to send this email")

We are using the try and except block to handle any possible error that can occur while sending emails.
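The branch above handles a single, hard-coded receiver. If you want to email different people by name, one simple idea is a small address book. Everything in this sketch is illustrative: CONTACTS, its entries, and find_recipient are made-up names, and the parsing only looks at the first word after "email to".

```python
# Hypothetical address book: spoken name (lower case) -> email address.
CONTACTS = {
    "john": "john@example.com",
    "priya": "priya@example.com",
}


def find_recipient(query, contacts=CONTACTS):
    """Return the address for a command like 'send an email to priya', or None."""
    marker = "email to "
    query = query.lower()
    if marker not in query:
        return None
    words = query.split(marker, 1)[1].split()
    if not words:
        return None
    return contacts.get(words[0])


print(find_recipient("Send an email to Priya"))  # priya@example.com
```

The elif condition could then become a check on 'email to' in query, with to = find_recipient(query) and a spoken apology when it returns None.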

Now, it’s time to look back at what we have learnt so far!

  • Firstly, we created a speak() function that gives our assistant a voice, and a wishMe() function that greets the user according to the system time.
  • After the wishMe() function, we created a takeCommand() function, which lets our A.I. voice assistant take commands from the user and act accordingly. This function is also responsible for returning the user’s query in a string format.
  • We developed the code logic for searching Wikipedia and for opening different websites like Google search engine, YouTube, Stack Overflow, freeCodeCamp, etc.
  • We also developed code logic for playing music, telling the time, and opening the PyCharm IDE (the same approach works for other applications).
  • At last we worked on the functionality of sending emails (and that too without typing a word, isn’t that amazing! 😵)

So now comes the most controversial question: IS THIS REALLY AN A.I.?

Well, technically it isn’t, because this voice assistant is just the result of a bunch of if-else statements. But if we dive deep into the purpose of Artificial Intelligence, its main aim is to reduce human effort and perform tasks with the same efficiency as a human (or even better).

And our voice assistant serves this purpose to quite an extent.

So the final verdict is, it is an A.I. 😛!

THE END !!!

Congratulations folks! 🎉

We have successfully created our personal A.I. virtual voice assistant, and successfully taken a step forward to fuel our laziness!! 😂

I hope you all liked this blog!

You can also have a look at my GitHub repository to understand the code much better.

Also do not forget to give some claps (you can give as many as you wish😉).

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

Saurav Pawar

Written by

Hello there 👋🏼, I am currently a Sophomore 👨🏼‍🎓 pursuing my bachelor’s degree. You can either see me code or grab a coffee !!
