Automating Data Extraction and Visualization of the Saudi Pro League Points Table
In this article, I’ll walk you through a Python project that extracts the latest Saudi Pro League points table from the web, processes the data into a pandas DataFrame, creates a visualization using Plotly, and posts a tweet with the table image. This project showcases web scraping, data visualization, and automation using Python.
Prerequisites
To follow along, you’ll need:
- Basic knowledge of Python programming
- The requests, BeautifulSoup, pandas, plotly, and kaleido libraries installed
- A function to post tweets (we’ll use a placeholder function post_tweet here)
Install the required libraries:
pip install requests beautifulsoup4 pandas plotly kaleido
Step-by-Step Guide
1. Importing Required Libraries
We start by importing the necessary libraries.
import requests
from bs4 import BeautifulSoup
import pandas as pd
import plotly.graph_objects as go
from post_tweet_v2 import post_tweet
2. Sending a GET Request
We send a GET request to the Saudi Pro League points table URL.
# URL of the league table
url = 'https://www.spl.com.sa/en/table'
# Send a GET request to the URL
response = requests.get(url)
3. Parsing the HTML Content
If the request is successful, we parse the HTML content using BeautifulSoup.
# Check if the request was successful
if response.status_code == 200:
    # Parse the HTML content of the page
    soup = BeautifulSoup(response.content, 'html.parser')
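To see how BeautifulSoup turns markup into a navigable tree, here is a minimal offline sketch; the HTML snippet is invented for illustration and is much simpler than the real league page:

```python
from bs4 import BeautifulSoup

# A tiny stand-in for the league page markup (invented for illustration).
html = """
<table>
  <tr><th>Pos</th><th>Team</th><th>Pts</th></tr>
  <tr><td>1</td><td>Al Hilal</td><td>96</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Header cells become a plain list of strings.
headers = [th.get_text(strip=True) for th in soup.find_all("th")]
print(headers)  # ['Pos', 'Team', 'Pts']
```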
4. Extracting Table Data
We locate the table, extract headers, and clean them.
    # Find the table
    table = soup.find('table')
    # Extract and clean headers: keep only the first word of each <th>
    # (split() already discards surrounding whitespace)
    headers = [header.text.split()[0] for header in table.find_all('th')]
Next, we extract rows of data and clean the columns.
    # Extract rows, skipping the header row
    rows = []
    for row in table.find_all('tr')[1:]:
        cols = row.find_all('td')
        cols = [col.text.strip() for col in cols]
        # Join multi-word values in the last column with hyphens
        cols[-1] = '-'.join(cols[-1].split())
        rows.append(cols)
    # Create a DataFrame
    df = pd.DataFrame(rows, columns=headers)
    # Print the DataFrame
    print(df)
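Everything scraped from HTML arrives as strings, so sorting or arithmetic on the points column gives text ordering unless you convert it first. Here is a small sketch using pd.to_numeric; the column names and values are invented for illustration, since the real headers depend on what the site serves:

```python
import pandas as pd

# Example rows (invented); real column names come from the scraped headers.
df = pd.DataFrame({"Team": ["Al Nassr", "Al Hilal"], "Pts": ["89", "96"]})

# Scraped cells are strings; convert before sorting or arithmetic.
df["Pts"] = pd.to_numeric(df["Pts"])
df = df.sort_values("Pts", ascending=False)
print(df["Pts"].sum())  # 185
```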
5. Creating a Visualization
We create a Plotly table to visualize the data and save it as an image.
    # Create the table
    fig = go.Figure(data=[go.Table(
        header=dict(values=list(df.columns),
                    fill_color='paleturquoise',
                    align='left'),
        cells=dict(values=[df[col] for col in df.columns],
                   fill_color='lavender',
                   align='left'))
    ])
    # Adjust layout
    fig.update_layout(
        height=450,   # Adjust height
        width=1000,   # Adjust width
        margin=dict(l=10, r=10, t=10, b=10)  # Reduce white space
    )
    # Save the table as an image (requires kaleido)
    fig.write_image("df_image.png", engine="kaleido")
6. Posting the Tweet
Finally, we compose a tweet with the table image and post it using the placeholder function post_tweet.
twitter_post = """
🏆 Saudi Pro League Points Table 🏆
#SaudiProLeague #Football #Soccer #SaudiArabia #SPL #saudileague
"""
print(twitter_post)
post_tweet(twitter_post, 'df_image.png')
# Show the table
fig.show()
else:
print(f"Failed to retrieve the page. Status code: {response.status_code}")
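The post_tweet helper is imported from post_tweet_v2, whose implementation isn’t shown here. A minimal stand-in, which only validates its inputs rather than calling the Twitter API (real posting needs API credentials, e.g. via a library like tweepy), might look like:

```python
import os

def post_tweet(text: str, image_path: str) -> str:
    """Stand-in for the real helper: validates inputs and returns a log line
    instead of actually posting to Twitter."""
    if not text.strip():
        raise ValueError("tweet text is empty")
    if not os.path.exists(image_path):
        raise FileNotFoundError(image_path)
    return f"Would tweet {len(text)} characters with image {image_path}"
```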
Below, we can see the script’s output alongside the data on the website.
Data Posted on Twitter:
Please find the code on github:
Conclusion
This project demonstrates how to scrape data from a website, process it with pandas, visualize it using Plotly, and automate posting to Twitter. Such automation can save time and ensure up-to-date information sharing on social media.
Connect with me on LinkedIn and Twitter for freelance work opportunities in web scraping and Python automation. I’m open to collaborations and new projects!
If you found this article helpful and would like to support my work, consider making a donation via PayPal.
Thank you for reading!