How to use web scraping (BeautifulSoup) and GUI development (Tkinter) to create an interactive application with Python

Saurabh Ghosh
Published in Predict
5 min read · Oct 18, 2022


Let’s learn Python programming concepts by developing a GUI application with data scraped from a webpage.

Photo by chan lee on Unsplash

What to expect in this blog?

You’ll be exploring two primary areas in this blog -

  • Web scraping — You’ll scrape a web page which shows the list of buses plying in London and their stop names. You’ll use the BeautifulSoup library of Python.
  • GUI application development — You’ll use the data scraped from the webpage to create a small GUI application with the Tkinter library.
Application screen

Some key points you’ll be exploring with this code -

  1. Execute script alone or as a module
  2. Python class, methods and variables
  3. Using Pickle library methods to store and retrieve objects as serialized files
  4. Using BeautifulSoup to parse and read a webpage
  5. Using dictionary data types
  6. Using the Tkinter library and components
  7. Using Combobox control
  8. List sorting

Let’s plan for the work

Before starting the design and coding, let’s understand the requirement clearly.

User interaction involved

  1. Users should be able to see the list of bus stops and choose one bus stop as the starting bus stop.
  2. Users should be able to see the list of bus stops and choose one bus stop as the finishing bus stop.
  3. Users should be able to submit the selected bus stops to search for a direct bus between the bus stops.
  4. Users should see the result after a search or an appropriate message if no bus is found for the selected bus stops.

Essential processing needed to support the user interaction

  1. Check if the serialized files already exist with the list of buses and the stops.
  2. If the serialized files are not present, proceed to retrieve the data from the web by scraping the page.
  3. Store the data as serialized files.
  4. Render the screen components with the data retrieved.
  5. Perform a search for a matching bus with the input from the user.

High-level design thinking

Now you can plan the required methods and attributes.

  1. scrape_bus_routes()— This method will perform the below actions -
    — Try to read the list of Bus objects from the serialized file
    — Try to read the total list of stops from the serialized file
    — If the serialized files do not exist, retrieve the bus routes from the webpage
    — Store the retrieved data as serialized files
    — Store the retrieved data into a variable for the rendering and other functions to access
  2. Bus class — This class will contain the bus-related details and the stops the bus travels through. A list of Bus objects will be made available for the application.
  3. get_bus_numbers() — This method will read the selected starting and finishing stops from the screen fields and will retrieve the first instance of matching bus detail where the bus runs between the selected stops.
  4. show_components() — This method will render the screen elements. It’ll use the data retrieved with the scrape_bus_routes() method. The scrape_bus_routes() method can be invoked from a constructor or as a method prior to rendering the screen elements. There will be a button to invoke the get_bus_numbers() method.
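Putting the plan together, the overall skeleton of the module could look roughly like this (the class and method names follow the plan above; the bodies get filled in as you work through the next section):

```python
class Scraper:
    def scrape_bus_routes(self, url):
        """Load the bus data from serialized files, or scrape the webpage and serialize it."""
        ...


class Bus:
    """Holds one bus number and the stops the bus travels through."""


class LondonBuses:
    def __init__(self):
        """Scrape (or load) the bus data before rendering the screen."""
        ...

    def get_bus_numbers(self):
        """Find the first bus that runs between the selected stops."""
        ...

    def show_components(self):
        """Render the Tkinter screen elements."""
        ...
```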

Let’s code

Now that you know the main methods and their purpose, let's get started.

You can follow the comments and docstrings within the code snippets.

Import and class definition
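Based on the libraries used in this walkthrough, the imports would look roughly like this (assuming requests is used to fetch the page; urllib would work equally well):

```python
import os
import pickle
import re
import tkinter as tk
from tkinter import ttk

import requests
from bs4 import BeautifulSoup
```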

scrape_bus_routes()

This method is kept in a class called Scraper. The idea is that further methods can be added to this class if you want to scrape a different webpage.

Key points —
— Initializing a dictionary variable
— Using the pickle.load() method to retrieve serialized object
— Using the pickle.dump() method to store an object as a serialized file
— Using BeautifulSoup class to scrape data from a webpage
— Splitting a text by delimiter
— Reducing multiple spaces within the text to one space
— Using set operation to compare between two lists and add only unique items
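Here is a minimal sketch of how such a method could be written. The URL parameter, the pickle file names, the HTML tag and the “:” and “,” delimiters are assumptions for illustration; the real page structure will dictate the parsing logic:

```python
class Scraper:
    """Groups the scraping logic; further methods can be added for other webpages."""

    def scrape_bus_routes(self, url, buses_file='buses.pkl', stops_file='stops.pkl'):
        # Dictionary that will hold the scraped data for the caller
        bus_data = {'buses': [], 'stops': []}

        # If the serialized files already exist, load and return them
        if os.path.exists(buses_file) and os.path.exists(stops_file):
            with open(buses_file, 'rb') as f:
                bus_data['buses'] = pickle.load(f)
            with open(stops_file, 'rb') as f:
                bus_data['stops'] = pickle.load(f)
            return bus_data

        # Otherwise scrape the webpage
        soup = BeautifulSoup(requests.get(url).text, 'html.parser')
        all_stops = []
        for row in soup.find_all('li'):                          # hypothetical tag
            text = re.sub(r'\s+', ' ', row.get_text()).strip()   # collapse multiple spaces
            parts = text.split(':')                              # bus number : stop list
            if len(parts) < 2:
                continue
            stops = [s.strip() for s in parts[1].split(',')]
            bus_data['buses'].append(Bus(parts[0].strip(), stops))
            # Set difference keeps only the stops not collected yet
            all_stops.extend(set(stops) - set(all_stops))

        bus_data['stops'] = sorted(all_stops)

        # Store the retrieved data as serialized files for the next run
        with open(buses_file, 'wb') as f:
            pickle.dump(bus_data['buses'], f)
        with open(stops_file, 'wb') as f:
            pickle.dump(bus_data['stops'], f)
        return bus_data
```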

“Bus” class
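A small data-holding class is enough here; the attribute names are assumptions:

```python
class Bus:
    """Holds one bus number and the list of stops the bus travels through."""

    def __init__(self, number, stops):
        self.number = number
        self.stops = stops
```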

“LondonBuses” class and constructor

Key points —
— Retrieving data from a dictionary object with key
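A sketch of the class and its constructor; the URL is a placeholder, and the dictionary keys match the scraper sketch above:

```python
class LondonBuses:
    """Tkinter application that searches for a direct bus between two stops."""

    def __init__(self):
        # Scrape the webpage (or load the serialized files) before rendering the screen
        bus_data = Scraper().scrape_bus_routes('https://example.com/london-bus-routes')
        # Retrieve the lists from the dictionary object by key
        self.buses = bus_data['buses']
        self.stops = bus_data['stops']
```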

get_bus_numbers() — This is part of the “LondonBuses” class

Key points —
— Retrieving the selected option from a Tkinter Combobox element
— Setting value dynamically into a Tkinter StringVar element
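A sketch of the method; the widget attribute names (start_combo, finish_combo, result_text) are assumptions and are created in show_components() below:

```python
    # Part of the LondonBuses class
    def get_bus_numbers(self):
        """Find the first bus that runs between the selected stops."""
        # Read the selected options from the two Combobox widgets
        start = self.start_combo.get()
        finish = self.finish_combo.get()

        for bus in self.buses:
            if start in bus.stops and finish in bus.stops:
                # Set the result dynamically into the StringVar bound to the result Label
                self.result_text.set(f'Take bus {bus.number} from {start} to {finish}.')
                return
        self.result_text.set(f'No direct bus found between {start} and {finish}.')
```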

show_components() — This is part of the “LondonBuses” class

Key points —
— Using Tkinter to create GUI elements
— Using Combobox component
— Using Label component with StringVar
— Creating Button and associating it with a method to invoke when clicked
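A sketch of the rendering method, using the same assumed attribute names as above:

```python
    # Part of the LondonBuses class
    def show_components(self):
        """Render the window, the two stop selectors, the Submit button and the result label."""
        self.window = tk.Tk()
        self.window.title('London Buses')

        tk.Label(self.window, text='Starting stop').grid(row=0, column=0, padx=5, pady=5)
        self.start_combo = ttk.Combobox(self.window, values=self.stops, state='readonly')
        self.start_combo.grid(row=0, column=1, padx=5, pady=5)

        tk.Label(self.window, text='Finishing stop').grid(row=1, column=0, padx=5, pady=5)
        self.finish_combo = ttk.Combobox(self.window, values=self.stops, state='readonly')
        self.finish_combo.grid(row=1, column=1, padx=5, pady=5)

        # The Button invokes get_bus_numbers() when clicked
        tk.Button(self.window, text='Submit', command=self.get_bus_numbers).grid(
            row=2, column=0, columnspan=2, pady=5)

        # The Label is bound to a StringVar so the result can be updated dynamically
        self.result_text = tk.StringVar()
        tk.Label(self.window, textvariable=self.result_text).grid(
            row=3, column=0, columnspan=2, padx=5, pady=5)

        self.window.mainloop()
```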

Starting/Invoking the application

This is written within the module (the “scrapeit.py” file), but outside the class.
Notice the check for “if __name__ == ‘__main__’:”. It lets the module execute only when the file is run directly with the “python scrapeit.py” command.
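With the class above, that check is just a few lines:

```python
if __name__ == '__main__':
    app = LondonBuses()
    app.show_components()
```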

If you want to make a package for the application, the above code will go into the “__main__.py” file in the package.

That’s all the coding. Now it’s time to start the application!

Run the application

The first screen after “python scrapeit.py” -

Choose a start location and an end location.

Click on the Submit button. The bus information will be shown below the button.

If there is no direct bus available between the stops, a message will be shown.

Close the window to exit the application.

Happy coding!!

Download

GitHub — https://github.com/SaurabhGhosh/scrapeit.git

Conclusion

In this blog, I hope you got some ideas about the following -

  1. Execute script alone or as a module
  2. Python class, methods and variables
  3. Using Pickle library methods to store and retrieve objects as serialized files
  4. Using BeautifulSoup to parse and read a webpage
  5. Using dictionary data types
  6. Using the Tkinter library and components
  7. Using Combobox control
  8. List sorting

In my next blog, I’ll explore another program and learn more concepts.

If you have any questions related to this program, please feel free to post your comments.

Please like, comment and follow me! Keep Learning!

Saurabh Ghosh · Predict
Business Analyst, Machine Learning Enthusiast, Blogger