How to make a Google search bot. GUI automation series I

William W
Published in CodeX
9 min read · Feb 19, 2021

Today we’ll introduce a series of tutorials. This first one is beginner-friendly, and anyone with a computer and an internet connection should be able to get through it. Let me know if the pacing is too slow or too fast, or if there is too much or too little information; I’ll take any feedback into consideration for the next lessons. If you already have Python and PyAutoGUI installed and just want the code, you can get the file on my GitHub.

Our first project of the series is interesting, if not a little silly: Something that takes control of your mouse, starts up Google Chrome, opens a new tab, types in a random search, then saves a screenshot.

Original pointer off Wikipedia user “Lordalpha1” copied over 7 times in Google Slides

To start, you will need to download Python if you haven’t already. If you have, skip to the section titled “Installing PyAutoGUI.”

You can download Python directly from the official website. The setup wizard should install everything correctly, but to avoid problems with things like opening IDLE or running pip install commands, I suggest you follow a YouTube tutorial from a search like “how to install Python on Mac OS” or “how to install Python on Windows 7.” On Windows, the installer usually puts Python either in your user folder (under “AppData\Local\Programs\Python”) or, if you chose to install for all users, in the “Program Files” directory. If you see Python files in one of those places, you’ve likely installed correctly. You can also check directly through the command prompt by typing “python”; it should print the version and start a shell. You can then get back to the regular cmd prompt by typing “quit()”.

Type “python” and if you get something like this, you’ve installed correctly

As for error messages, Google any that come up. Usually someone on Stack Overflow has it covered. This site is also good for troubleshooting on Windows. For this tutorial, I’m on Windows, but I can also make a Mac OS tutorial if the differences are significant enough and a lot of people ask.

Common problems with Python install:

  1. PATH variable is not set. The Windows installer has a checkbox for adding Python to PATH, and it is unchecked by default, so Python will not be on your PATH unless you tick it during setup. To set PATH variables afterwards, check out a tutorial like this one or this one.

Installing PyAutoGUI

Okay, now that Python is installed, we need to install PyAutoGUI. Go to the package website and follow their instructions for your OS. Installing a Python package is usually done by typing “pip install [package name].” What this does is install files either from a local place (like your computer or network) or the internet (through a URL). An explanation of that is here on pip’s official site if you’re curious.

Common errors:

  1. Pip is not a recognized command. This means pip is not installed, or you are running the terminal window from a directory that does not have access to the location of pip in your files. You may need to manually install pip here.
  2. “NameError: name ‘pip’ is not defined.” You’re typing the command in the Python shell or IDLE instead of your computer’s terminal. Make sure you’re in cmd and running as administrator on Windows. Open cmd by clicking on your “start” icon and typing “cmd.” When it pops up, right click the command prompt and select “run as administrator.” Now type in your commands to check for the pip install and version.
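
Once the install finishes, one quick way to confirm that Python can actually see the package is a short check like the one below. This is only a sketch: the helper name `is_installed` is made up for illustration, and it uses the standard library's `importlib.util.find_spec` to look for a package without importing it.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can find the named package, without importing it."""
    return importlib.util.find_spec(package_name) is not None


# Prints True if the pip install worked for the Python you are running.
print("pyautogui installed:", is_installed("pyautogui"))
```

If this prints False even though pip reported success, pip may have installed the package for a different Python version than the one you are running.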

Now that PyAutoGUI is installed, create a new folder. Remember the location of this folder. Call this folder “guiTutorials”

Within the new folder, create a new txt file (it may open in notepad by default). Before writing anything in this new txt file, save it as “autoSearcher.py” — it should now appear as a “Python File” if you sort files by Type within the guiTutorials folder. If it doesn’t, it’s likely called “autoSearcher.py.txt”

To fix this, go to the “View” tab in your file explorer and make sure to check the box “File name extensions”:

Now delete the “.txt” and it should appear as a Python file like this:

Note, if the “Type” tab isn’t available, just right click on the “Name” tab and select “Type” from the dropdown menu.

Now right click on the file name, and select “Edit with IDLE” then “Edit with IDLE [version]” where [version] matches the version of Python you have installed. An empty editing window should show up:

Don’t get discouraged if just setting up all the software took a while. That’s perfectly normal.

Writing the code

Starting with our high level idea, mentioned earlier: “Something that takes control of your mouse, starts up Google Chrome, opens a new tab, types in a random search, then saves a screenshot”, we can break that description into steps easily:

  1. Take control of mouse
  2. Start Google Chrome
  3. Open a new tab
  4. Type a random search
  5. Take a screenshot

Now let’s think about what needs to actually happen for each step. Maybe we’ll find “hidden” steps or requirements we didn’t think of. Maybe we’ll find out we don’t need some things.

Taking control of the mouse/touchpad: The mouse driver is low-level software that sits between the hardware and the operating system. On Windows, other code can move and click the pointer through the win32 API.¹ We can call win32 ourselves or use PyAutoGUI’s built-in methods. In future tutorials, we’ll see an example of both.

A side function for taking control of the mouse and clicking using win32api. I initially believed we needed this. We didn’t end up using this for now, but a future tutorial might!
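
A side function of this kind could look roughly like the sketch below. It is not the original code. It assumes Windows and the pywin32 package, which provides the `win32api` and `win32con` modules; the name `click_at` and the idea of passing the two modules in as arguments are illustrative choices, made so the function can be tried with stand-ins before it touches the real mouse.

```python
def click_at(x, y, api, con):
    """Move the cursor to (x, y) and send one left click through the Win32 API.

    api and con are expected to be the win32api and win32con modules.
    """
    api.SetCursorPos((x, y))                               # jump to the target
    api.mouse_event(con.MOUSEEVENTF_LEFTDOWN, x, y, 0, 0)  # press left button
    api.mouse_event(con.MOUSEEVENTF_LEFTUP, x, y, 0, 0)    # release left button


# Real use on Windows (this moves and clicks your actual mouse):
#   import win32api, win32con
#   click_at(200, 300, win32api, win32con)
```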

Define the main body: This is simple enough. We want to make a try/except block (like a try/catch block in Java). Our code will run what is within the “try” section unless a KeyboardInterrupt is raised, at which point the code within the “except” block runs. We’ll put a simple print and sys.exit() there to stop the program if the user presses the “interrupt” key (by default, this is ctrl+c or the delete key).

Our main code goes in the “try” block
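
A minimal sketch of that skeleton is shown here. The wrapper name `run` and the `task` argument are placeholders for illustration; in the actual script the steps can simply sit directly inside the “try” block.

```python
import sys


def run(task):
    """Run task(), and stop cleanly if the user presses the interrupt key."""
    try:
        task()  # the main steps of the bot go here
    except KeyboardInterrupt:
        print("Interrupted by user. Exiting.")
        sys.exit()


# A stand-in task, just to show the normal (uninterrupted) path.
run(lambda: print("Main code goes here"))
```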

Make our topic list: Making our topic list is the easiest part. Just define a new list (Python’s version of an array) called “topicList” then put in whatever strings we want. Here’s mine as an example:

Type as many strings or phrases as you want
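
In code, the list might look like this. The topics themselves are made-up examples, not the ones from the original file.

```python
# Hypothetical example topics; replace them with anything you like.
topicList = [
    "python tutorials",
    "how do touchpads work",
    "weather tomorrow",
    "best pizza near me",
]

print(len(topicList), "topics loaded")
```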

Starting Google Chrome: Now this is a bit more interesting. We can’t just declare somewhere in our code “start Google Chrome” or “click on Chrome.”² We need to either locate Chrome in the system and command it to start, or mimic the same actions a user would take. Let’s mimic the user’s actions. Since we have PyAutoGUI at our disposal, there are two ways we could do this: scan the screen for a Google Chrome icon, or access Chrome through the “Start” menu or “Windows” key. Since the icon may look different on every desktop (or not even exist on screen at a given time), the less error-prone method is to use the “Windows” key, since that is always going to be available to any Windows user. Okay, so this step is actually “press the Windows key and type the phrase ‘Google Chrome’, then hit the ‘enter’ key.” ³

The three substeps, each followed by a print statement for debugging and to confirm that the line was reached. I added delays to allow the system to catch up. Notice the 0.25 second delay in the write() function. Sometimes the computer types so quickly that characters are missed. Plus, this gives a visibly cool typing effect.
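
Those three substeps can be sketched as follows. The function name `open_chrome`, the `pause` length, and passing the pyautogui module in as the `gui` argument are assumptions made for this example; only the 0.25 second typing interval comes from the description above. The calls themselves (`press` and `write` with an `interval`) are regular PyAutoGUI functions.

```python
import time


def open_chrome(gui, pause=1.0):
    """Open Chrome through the Start menu, the way a user would.

    gui is expected to be the pyautogui module.
    """
    gui.press("win")                           # open the Start menu
    print("Pressed the Windows key")
    time.sleep(pause)                          # let the menu appear

    gui.write("Google Chrome", interval=0.25)  # 0.25 s between characters
    print("Typed 'Google Chrome'")
    time.sleep(pause)                          # let the search results settle

    gui.press("enter")                         # launch the highlighted result
    print("Pressed enter")
    time.sleep(pause)                          # give Chrome time to open


# Real use (this takes over your keyboard):
#   import pyautogui
#   open_chrome(pyautogui)
```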

Now that Chrome is open, our next task is to open a new tab. It’s also possible that we navigated to an already opened Chrome window or reopened a saved session of tabs. We’re opening a new tab to ensure we don’t “overwrite” and search in a tab that’s already open. On Chrome, this is a simple “ctrl + T” shortcut. PyAutoGUI is pretty cool because we can even program these shortcut combos with the hotkey() function. However, we need to put in a delay, because the computer hits the two keys so quickly that they don’t register as separate presses. Use a 0.5 second interval.

Open a new tab slowly with the shortcut
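
As a sketch, the shortcut with its 0.5 second interval could be written like this. The function name `open_new_tab` and the `gui` argument (standing in for the pyautogui module) are illustrative; `hotkey` with an `interval` keyword is the PyAutoGUI call described above.

```python
def open_new_tab(gui):
    """Press ctrl+T, with 0.5 s between key events so both presses register.

    gui is expected to be the pyautogui module.
    """
    gui.hotkey("ctrl", "t", interval=0.5)
    print("Opened a new tab")


# Real use (Chrome must be the active window):
#   import pyautogui
#   open_new_tab(pyautogui)
```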

Typing the query: Since Chrome defaults to Google Search already, all that’s left is typing in a random search. We can pick one of the strings in our list at random and pass it to pyautogui.write(), then hit “enter” to conduct the search. Getting a random entry of the list is done by taking a random integer from 0 to the last index of the list (length of the list - 1) and using it as the index:

Randomly pick a topic from the list, then use PyAutoGUI to search it.
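
Here is one way that could look. The helper names `pick_topic` and `type_query`, the 0.1 second typing interval, and the `gui` argument (standing in for the pyautogui module) are assumptions for this sketch. Note that `random.randint` includes both endpoints, which is why the upper bound is the length minus one.

```python
import random


def pick_topic(topics):
    """Return a random entry from the list of topics."""
    index = random.randint(0, len(topics) - 1)  # randint includes both ends
    return topics[index]


def type_query(gui, topics, interval=0.1):
    """Type a randomly chosen topic into the active window and return it."""
    topic = pick_topic(topics)
    gui.write(topic, interval=interval)
    return topic


# Real use (the new Chrome tab must be active):
#   import pyautogui
#   topic = type_query(pyautogui, ["python tutorials", "weather tomorrow"])
```

Returning the chosen topic is handy later, since it can become part of the screenshot’s file name.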

Now we just need to hit enter, then take a screenshot. Since we’re conducting an actual internet search, we will want to put a delay in to wait for the results to load before we take our screenshot. I chose 4 seconds for this. I also gave 2 seconds to type the query in the Chrome search bar.⁴

Hit enter, then save the screenshot
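
A sketch of this step, using the 2 and 4 second delays mentioned above. The function name `search_and_capture` and the `gui` argument (standing in for the pyautogui module) are illustrative; `pyautogui.screenshot()` returns an image object that can be saved afterwards.

```python
import time


def search_and_capture(gui, type_wait=2.0, load_wait=4.0):
    """Submit the typed query, wait for the results, and return a screenshot.

    gui is expected to be the pyautogui module.
    """
    time.sleep(type_wait)    # give the typed query time to land in the bar
    gui.press("enter")       # run the search
    time.sleep(load_wait)    # wait for the results page to load
    return gui.screenshot()  # capture the whole screen


# Real use:
#   import pyautogui
#   im1 = search_and_capture(pyautogui)
```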

We store the screenshot in a variable called “im1.” We can then use the “save” function. However, we would like to make unique file names instead of making a bunch of im1 (1), im1 (2), im1 (3) files. One aspect that is most likely unique to each search run on a single computer is the date and time. We also want to include the topic we searched. We can do both these things by retrieving the time with the datetime library, and by concatenating (combining) the strings in the “save” function:

We call the datetime function, then we save the current time in a string. We combine the topic and time to make a unique file name.
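
The naming idea can be sketched like this. The helper name `make_filename`, the exact timestamp format, and the “.png” extension are assumptions, not the original code. The format avoids colons on purpose, because Windows does not allow them in file names.

```python
from datetime import datetime


def make_filename(topic, now=None):
    """Combine the topic and a timestamp into a unique image file name."""
    if now is None:
        now = datetime.now()
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")  # no colons: invalid on Windows
    return topic.replace(" ", "_") + "_" + stamp + ".png"


print(make_filename("python tutorials", datetime(2021, 2, 19, 9, 30, 5)))
# prints: python_tutorials_2021-02-19_09-30-05.png
```

With a screenshot stored in “im1”, saving it would then be `im1.save(make_filename(topic))`.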

We take our screenshot and save it to the same folder our Python code file is located in, then quit the program. Now we’re done! To run your program, hit “F5” or click the “Run” tab then the “Run Module” option.

Some search result images saved automatically with unique names

As with any code, there’s always room for improvement. I even suggested a couple of ways here and in the comments on GitHub. I hope to make this series with PyAutoGUI interesting. I might also highlight some other Python libraries to spice things up. Let me know if you want to go over another Python package or programming topic and I might consider making a new tutorial for it.

  1. If you were curious, a touchpad works because any conductive materials that touch it change its capacitance (capacitance is a physical property like mass) at certain points. These changes in capacitance are mapped to movement commands. You could also use a series of flex resistors. But making a touchpad is complex enough for another tutorial series.
  2. Technically, we could. We would need to write the code that does these things, then parse a string input by the user. This is how command lines work and we could do a user input/output terminal tutorial in the future if there’s a lot of demand for one, like one of those text adventure minigames.
  3. This is actually subject to error as well. If the Google Chrome browser does not come up first in the search results, then we will not open the correct program (for example, if a text file called “Google Chrome” happens to be highlighted first).
  4. Random delays aren’t the best way to do this, but it’s convenient for now. A cleaner way would be to track events or tasks on the computer and only run the next step if an event condition triggers.
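
As a rough illustration of footnote 4, a fixed delay can be swapped for a helper that keeps checking a condition until it is true or a timeout runs out. This is simple polling rather than true event tracking, and the name `wait_until` and its parameters are made up for this sketch. The `condition` is any function you supply that returns True once the computer is ready for the next step.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.25):
    """Call condition() repeatedly until it returns True or time runs out.

    Returns True if the condition was met, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll)  # short nap between checks
    return False


# Example: wait up to 2 seconds for something that becomes ready after 0.5 s.
start = time.monotonic()
print(wait_until(lambda: time.monotonic() - start > 0.5, timeout=2.0))
```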

William W
CodeX

Electrical Engineering. Software developer and cofounder of a startup.