Automate the Google search using Python

Soumyabrata Roy
Analytics Vidhya

Whenever we need help, we turn to Google. With 5.4 billion searches every day, Google is the biggest search engine on the planet right now. But searching manually can be tedious. What if we could automate the whole process using Python? In this story, I will help you with that. After finishing it, you will be able to automate Google searches using Python.

For this, we will use a Python library called googlesearch, which is distributed on PyPI under the package name google. It is free, simple and pretty straightforward to use. To install the library, go to your command prompt or Anaconda prompt and type:

pip install google

After installing the package, import the search function from the googlesearch module:

from googlesearch import search

search takes your query, runs it against Google, finds the relevant URLs and yields them one at a time. It is a generator, so to get the results you iterate over it (or wrap it in list() if you need a Python list).

# as an example:
query = "iPhone"
for i in search(query, tld='com', lang='en', tbs='0', safe='off', num=2, start=0, stop=2, domains=None, pause=2.0, tpe='', country='', extra_params=None, user_agent=None):
    print(i)
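
Because the results arrive one at a time, it can be handy to collect a fixed number of them into a list. The sketch below assumes only that the search function returns an iterable of URL strings. The names collect_urls and fake_search are hypothetical, and fake_search is an offline stand-in that yields made-up URLs so the example runs without network access.

```python
from itertools import islice
from typing import Callable, Iterable, List


def collect_urls(search_fn: Callable[..., Iterable[str]], query: str,
                 limit: int = 5, **kwargs) -> List[str]:
    """Return at most `limit` URLs produced by `search_fn(query, **kwargs)`."""
    # islice stops pulling from the iterable once `limit` items are taken,
    # so a lazy search function is not asked for more results than needed.
    return list(islice(search_fn(query, **kwargs), limit))


def fake_search(query: str, **kwargs):
    """Offline stand-in (hypothetical): yields made-up URLs, never calls Google."""
    for n in range(1, 11):
        yield f"https://example.com/{query.replace(' ', '-')}/{n}"


urls = collect_urls(fake_search, "iPhone", limit=3)
print(urls)
```

With the library installed, passing the real search function in place of fake_search, for example collect_urls(search, "iPhone", limit=3, stop=3, pause=2.0), should collect the live results the same way.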

Here tld selects which Google domain to query, e.g. google.com or google.in (pass just the suffix, such as 'com'); lang is the search language; num is the number of results per page; start is the first result to retrieve; stop is the last result to retrieve; and so on. In a Jupyter notebook, pressing Shift+Tab with the cursor inside the function call and then clicking the plus sign shows the full documentation for the function.
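
Outside Jupyter, the standard inspect module can list a function's parameters and defaults. The following is a minimal sketch: parameter_defaults is a hypothetical helper, and search_stub is a stand-in with a few of the parameters discussed above (its defaults are illustrative, not taken from the library), so the example runs even without the package installed.

```python
import inspect


def parameter_defaults(func):
    """Map each parameter name of `func` to its default value."""
    defaults = {}
    for name, param in inspect.signature(func).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        # Parameters without a default are marked so they stand out.
        defaults[name] = param.default if has_default else "<required>"
    return defaults


# Stand-in only: not the real function, defaults chosen for illustration.
def search_stub(query, tld='com', lang='en', num=10, start=0, stop=None, pause=2.0):
    return iter(())


for name, default in parameter_defaults(search_stub).items():
    print(f"{name} = {default!r}")
```

Calling parameter_defaults(search) on the imported function, or simply help(search), would show what your installed version actually accepts.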

Let’s create a simple search script for stock market company analysis, which will give you a quick overall glimpse of a company's current condition:

# importing the module
from googlesearch import search
# store the queries in a list
query_list = ["News", "Share price forecast", "Technical Analysis"]
# save the company name in a variable
company_name = input("Please provide the stock name:")
# iterate through the keywords, search and print each result
for j in query_list:
    # the space keeps the company name and the keyword as separate words
    for i in search(company_name + " " + j, tld='com', lang='en', num=1,
                    start=0, stop=1, pause=2.0):
        print(i)

Here the search string is built from the company name followed by the query keyword. The loop takes each keyword in turn, appends it to the company name and searches Google for the latest results. When a search completes, it prints the URLs.
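
The way the search strings are put together can be sketched on its own, without touching Google. build_queries is a hypothetical helper name, and "Tesla" is just a placeholder company. Joining with a space matters, since otherwise the two parts run together into a single word.

```python
def build_queries(company_name, keywords):
    """Combine a company name with each keyword into one search string."""
    company = company_name.strip()
    # Joining with a space keeps the words separate, e.g. "Tesla News"
    # rather than "TeslaNews".
    return [f"{company} {keyword.strip()}" for keyword in keywords]


query_list = ["News", "Share price forecast", "Technical Analysis"]
for q in build_queries(" Tesla ", query_list):
    print(q)
```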

For simplicity, I have taken only the first URL from each search. If you like, you can take more URLs by increasing num and stop. Please run it using Colab: https://colab.research.google.com/drive/17ZOtFIBoPoJYxantOFEWUCUHtpP3O4ay

The code so far only searches in the default mode. If you would like to search by category, such as News, Images or Videos, you can specify it with the tpe parameter. Just choose the correct category and you are good to go. Below I am running the same code but with Google News searches only:

# only added tpe="nws" for news only
for j in query_list:
    for i in search(company_name + " " + j, tld='com', lang='en', num=1,
                    start=0, stop=1, pause=2.0, tpe="nws"):
        print(i)

Similarly, for videos add tpe="vid"; for images, tpe="isch"; for books, tpe="bks"; and so on.
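
To avoid memorising the short codes, they can be kept in a small lookup table. This is only a sketch: TPE_CODES and tpe_for are hypothetical names, and the codes are assumed to mirror Google's own search-type codes, so it is worth checking them against the documentation of the library version you have installed.

```python
# Category name -> value for the tpe parameter.
# Assumption: these mirror Google's search-type codes; verify them against
# the documentation of your installed library version.
TPE_CODES = {
    "news": "nws",
    "videos": "vid",
    "images": "isch",
    "books": "bks",
}


def tpe_for(category):
    """Return the tpe code for a category, or '' for the default web search."""
    key = category.strip().lower()
    if key in ("", "web", "all"):
        return ""
    if key not in TPE_CODES:
        raise ValueError(f"Unknown search category: {category!r}")
    return TPE_CODES[key]


print(tpe_for("News"))
```

The result can then be passed straight through, as in search(query, tpe=tpe_for("news"), ...).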

If you would like to see a broader example of what you can do with Google search, here is one:

from googlesearch import search
import requests
from lxml.html import fromstring

# Retrieve the page title for a URL using requests and fromstring
def Link_title(URL):
    x = requests.get(URL)
    tree = fromstring(x.content)
    # findtext returns None when the page has no <title>, so fall back to a label
    return tree.findtext('.//title') or "(no title)"

company_name = input("Please provide the company name:")
query = int(input("Please give the appropriate value. 1 for Fundamental Analysis, 2 for News, 3 for Technical Analysis & 4 for Share Price Forecast:"))

if query == 1:
    print(company_name + " " + "Fundamental Analysis:")
    print(" ")
    # restrict the search to a single site with the domains parameter
    for i in search(company_name, tld='com', lang='en', num=1, start=0, stop=1, domains=['https://www.tickertape.in/'], pause=2.0):
        print("\t" + i)
elif query == 2:
    print(company_name + " " + "News:")
    print(" ")
    for i in search(company_name + ' News', tld='com', lang='en', num=3, start=0, stop=3, pause=2.0, tpe='nws'):
        print("\t" + "#" + " " + Link_title(i))
        print("\t" + i)
        print(" ")
elif query == 3:
    print(company_name + " " + "Technical Analysis:")
    print(" ")
    for i in search(company_name + ' Technical Analysis', tld='com', lang='en', num=3, start=0, stop=3, pause=2.0):
        print("\t" + "#" + " " + Link_title(i))
        print("\t" + i)
        print(" ")
else:
    print(company_name + " " + "Share Price Forecast:")
    print(" ")
    for i in search(company_name + ' share price forecast', tld='com', lang='en', num=3, start=0, stop=3, pause=2.0):
        print("\t" + "#" + " " + Link_title(i))
        print("\t" + i)
        print(" ")

Here I have used the same logic as in the examples above. I have added two things: a function, Link_title(), that shows the title of each link returned by the search, and, for fundamental analysis, a restriction to the relevant site using the domains parameter.
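
Title extraction is easy to try offline on a plain HTML string. The sketch below deliberately swaps lxml for the standard library's html.parser, so it is an alternative for experimenting rather than the code used in this story. extract_title is a hypothetical name. It also covers pages that have no title element, a case where findtext would return None.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self.parts.append(data)


def extract_title(html_text, fallback="(no title)"):
    """Return the page title from an HTML string, or `fallback` if absent."""
    parser = _TitleParser()
    parser.feed(html_text)
    parser.close()
    # Collapse line breaks and repeated spaces inside the title.
    title = " ".join("".join(parser.parts).split())
    return title or fallback


page = "<html><head><title> Apple \n Inc. News </title></head><body></body></html>"
print(extract_title(page))
```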

In this example, to analyze a stock, I have considered four categories: fundamental analysis, news, technical analysis and share price forecast. With that in mind, I have used an if/elif/else chain to show the URLs for the chosen category. Whenever you run the code, it will ask you for the stock name and the search category, and then show you the matching URLs along with their titles. Please run the code using Colab below:

https://colab.research.google.com/drive/1w6dXfUlEahI7M7HG2zSlW7QfkL-xA6vO
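
As the number of categories grows, the branching can also be expressed as a lookup table. This is an illustrative alternative, not the code from the notebook: CATEGORIES and plan_search are hypothetical names, and the entries simply mirror the four categories described above. Any choice outside 1 to 3 falls through to the share price forecast, as in the else branch.

```python
# Menu choice -> (heading label, words appended to the company name, search options).
# Illustrative layout only; the values mirror the four categories in this story.
CATEGORIES = {
    1: ("Fundamental Analysis", "",
        {"num": 1, "stop": 1, "domains": ["https://www.tickertape.in/"]}),
    2: ("News", "News", {"num": 3, "stop": 3, "tpe": "nws"}),
    3: ("Technical Analysis", "Technical Analysis", {"num": 3, "stop": 3}),
    4: ("Share Price Forecast", "share price forecast", {"num": 3, "stop": 3}),
}


def plan_search(company_name, choice):
    """Return (heading, query, options) for a menu choice."""
    # Unknown choices fall back to category 4, like the final else branch.
    label, suffix, options = CATEGORIES.get(choice, CATEGORIES[4])
    query = f"{company_name} {suffix}".strip()
    heading = f"{company_name} {label}:"
    return heading, query, dict(options)


heading, query, options = plan_search("Tesla", 2)
print(heading, query, options)
```

The returned options could then be unpacked into the call, for example search(query, tld='com', lang='en', start=0, pause=2.0, **options).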

Stock analysis is just one of many applications you could use this for. It could be anything you can search for on Google. I hope you have liked my story. You can access the code on GitHub here: https://github.com/soumyabrataroy/Google-stock-search-using-Python.git

Please let me know your thoughts in the comments below. I can be reached on Twitter: @sam1ceagain

Originally Published on LinkedIn: https://www.linkedin.com/pulse/automate-google-search-using-python-soumyabrata-roy/

Soumyabrata Roy
Analytics Vidhya

Data Scientist Cognizant | Answering what, why, and how of different business scenarios through machine learning and deep learning.