Track Trending Games w/ Steam API & Python

Published in CodeX · 6 min read · Jun 30, 2024

Abstract

This article presents an alternative approach to tracking games in various categories, including “New & Trending”, “Top Sellers”, “Global Top Sellers”, “Popular Upcoming”, and “Specials”, using Steam’s search API without any HTML processing. It also explores the filter values of the API and lists the parameters available for customizing search queries.

Foreword

In my previous articles, we delved into the Steam API and explored how to build custom game and review databases. These techniques are instrumental in gathering data for research projects, such as topic modeling over game reviews. While researching, I stumbled on a Medium article describing how to build a daily database of the promotional tabs on the front page of Steam, such as “New & Trending”, “Top Sellers”, “Popular Upcoming” and “Specials”, using web scraping and HTML processing.

The tab section on the front page of Steam.

However, I was convinced that there should be an API-only method that does not require HTML processing, so I began a small investigation. After some digging, I found a Steam API that supports searching these categories and returns easy-to-read JSON data.

https://store.steampowered.com/search/results

This project uses Python with the requests package, which can be installed through pip or conda.

# pip
pip install requests

# conda
conda install anaconda::requests

Body

The /search/results API is a GET endpoint that can be thought of as an alias of the /search API, which normally leads to Steam’s search page (returning an HTML webpage). The major difference between the two is that /search/results supports returning structured JSON, which is more suitable for data scraping.

The search page of Steam, with URL https://store.steampowered.com/search/?term=

According to the unofficial internal API documentation, the endpoint returns a list of apps, each with a name and a logo (a URL to the image). A sample result of the API is shown in the screenshot below.

An example of the API result, viewed in Postman.

The API offers many parameters for customizing a search. They are almost identical to the criteria shown on the right side of Steam’s search page, such as hiding free-to-play items, tags, types (called category in the API), OS, language, result ordering and more.

Initiating a request with the /search/results API is simple in Python. Extra error handling is included for good measure.

import requests

params = {
    "filter": "topsellers",
    "hidef2p": 1,             # hide free-to-play items
    "page": 1,                # selects the page of results, similar to what "cursor" does when scraping a game's reviews
    "json": 1                 # ask for JSON
}

def get_search_results(params):
    req_sr = requests.get(
        "https://store.steampowered.com/search/results/",
        params=params)

    # any non-200 status is treated as an empty result
    if req_sr.status_code != 200:
        print(f"Failed to get search results: {req_sr.status_code}")
        return {"items": []}

    # the body may not be valid JSON; requests raises a ValueError subclass in that case
    try:
        search_results = req_sr.json()
    except ValueError as e:
        print(f"Failed to parse search results: {e}")
        return {"items": []}

    return search_results

search_results = get_search_results(params)
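
As a small illustration of what can be done with the returned JSON, the sketch below assumes that each entry under "items" carries a "name" and a "logo" field, as described above. The sample payload is made up for demonstration and summarize_items is a hypothetical helper, not part of the API.

```python
def summarize_items(search_results):
    """Return (name, logo) pairs from a /search/results-style payload."""
    pairs = []
    for item in search_results.get("items", []):
        # .get() guards against entries that lack one of the fields
        pairs.append((item.get("name"), item.get("logo")))
    return pairs


# Made-up payload mirroring the shape described in this article
sample = {
    "items": [
        {"name": "Example Game A", "logo": "https://example.com/steam/apps/10/capsule.jpg"},
        {"name": "Example Bundle B", "logo": "https://example.com/steam/bundles/20/capsule.jpg"},
    ]
}

for name, logo in summarize_items(sample):
    print(f"{name}: {logo}")
```

Because the request function falls back to {"items": []} on failure, the same helper simply returns an empty list when a request fails.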

Finding all the “filters”

The unofficial documentation does not provide a list of options for the “filter” field. Suspecting that the /search/results API works in a similar way to the /search page, I opened the search page from each of the four tabs, “New & Trending”, “Top Sellers”, “Popular Upcoming” and “Specials”, mainly by clicking the buttons next to the “See more” phrase.

The resulting filter values are listed in the table below. Note that the “Specials” tab is not a “filter” value; it is triggered with the separate “specials” field of the API.

+---------------------+-----------------------------------------------------------------------------+
| “filter” value      | Description                                                                 |
+---------------------+-----------------------------------------------------------------------------+
| “” (empty)          | Default value; lists all games on Steam. Usually combined with              |
|                     | “sort_by”=“Released_DESC” to list games from newest to oldest release date. |
| “popularnew”        | Corresponds to the “New & Trending” tab                                     |
| “topsellers”        | Corresponds to the “Top Sellers” tab                                        |
| “globaltopsellers”  | Corresponds to the “Global Top Sellers” button next to “See more:” at the   |
|                     | bottom of the “Top Sellers” tab list                                        |
| “popularcomingsoon” | Corresponds to the “Popular Upcoming” tab                                   |
+---------------------+-----------------------------------------------------------------------------+
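
To make the table easier to reuse, here is a minimal sketch that maps each front-page tab to its query fields. TAB_PARAMS, build_params and pages_needed are hypothetical names introduced for illustration; the page size of 25 is the value noted in the scraper's comments later in this article, and it is assumed here rather than documented by Steam.

```python
import math

RESULTS_PER_PAGE = 25  # page size noted in this article (an assumption, not an official figure)

# Tab name -> query fields, taken from the table above
TAB_PARAMS = {
    "New & Trending": {"filter": "popularnew"},
    "Top Sellers": {"filter": "topsellers"},
    "Global Top Sellers": {"filter": "globaltopsellers"},
    "Popular Upcoming": {"filter": "popularcomingsoon"},
    "Specials": {"filter": "", "specials": 1},  # "Specials" uses its own field
}


def build_params(tab, page=1, hide_f2p=True):
    """Build the query parameters for one page of one tab."""
    params = {"json": 1, "page": page}
    if hide_f2p:
        params["hidef2p"] = 1
    params.update(TAB_PARAMS[tab])
    return params


def pages_needed(top_n):
    """Number of pages required to cover the first top_n results."""
    return math.ceil(top_n / RESULTS_PER_PAGE)


print(build_params("Popular Upcoming", page=2))
print(pages_needed(100))
```

Keeping the mapping in one dictionary means a misspelled filter value only has to be fixed in a single place.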

The Scraper

With these fields in mind, we can build a basic scraper that searches Steam daily under different filters and records the recommended games with their corresponding appid. We will scrape the top 100 games in each of the five “categories”: “New & Trending”, “Top Sellers”, “Global Top Sellers”, “Popular Upcoming” and “Specials”. To make the data more useful, the details of each game are also fetched with the /api/appdetails API. Finally, one pickle file is saved for each of the five “categories”, named with the date on which the scraping was performed.
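
One step of the scraper is worth isolating: the search results do not expose the appid directly, so the script below derives it from each item's logo URL. The following sketch shows that extraction on its own, using the same regular expression as the scraper. The URLs are made-up examples on a placeholder domain, and extract_appid is a hypothetical helper name.

```python
import re

# Same pattern as in the scraper: "steam/<kind>/<digits>"
APPID_PATTERN = re.compile(r"steam/\w+/(\d+)")


def extract_appid(logo_url):
    """Return the numeric id found in a logo URL as a string, or None."""
    match = APPID_PATTERN.search(logo_url or "")
    return match.group(1) if match else None


# Made-up URLs that follow the steam/apps/{id} and steam/bundles/{id} shapes
examples = [
    "https://example.com/steam/apps/730/capsule_sm_120.jpg",
    "https://example.com/steam/bundles/232/capsule_sm_120.jpg",
    "https://example.com/images/no_id_here.jpg",
]

for url in examples:
    print(url, "->", extract_appid(url))
```

Returning None instead of raising keeps the caller free of try/except blocks, at the cost of having to check for None before requesting app details.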

# Imports and Helper functions

from datetime import datetime
import time
import requests
import pickle
from pathlib import Path
import re

def print_log(*args):
    print(f"[{str(datetime.now())[:-3]}] ", end="")
    print(*args)
    
def get_search_results(params):
    req_sr = requests.get(
        "https://store.steampowered.com/search/results/",
        params=params)
    
    if req_sr.status_code != 200:
        print_log(f"Failed to get search results: {req_sr.status_code}")
        return {"items": []}
    
    try:
        search_results = req_sr.json()
    except Exception as e:
        print_log(f"Failed to parse search results: {e}")
        return {"items": []}
    
    return search_results
    
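def get_app_details(appid):
    # nothing to request if the appid could not be extracted
    if appid is None:
        print_log("App Id is None.")
        return {}

    # retry until the request succeeds or fails with an unexpected status
    while True: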

        appdetails_req = requests.get(
            "https://store.steampowered.com/api/appdetails/",
            params={"appids": appid, "cc": "hk", "l": "english"})        # change the countrycode to the region you are staying with
        
        if appdetails_req.status_code == 200:
            appdetails = appdetails_req.json()
            appdetails = appdetails[str(appid)]
            print_log(f"App Id: {appid} - {appdetails['success']}")
            break

        elif appdetails_req.status_code == 429:
            print_log("Too many requests. Sleep for 10 sec.")
            time.sleep(10)
            continue

        elif appdetails_req.status_code == 403:
            print_log("Forbidden to access. Sleep for 5 min.")
            time.sleep(5 * 60)
            continue

        else:
            print_log("ERROR: status code:", appdetails_req.status_code)
            print_log(f"Error in App Id: {appid}.")
            appdetails = {}
            break

    return appdetails

# Main code

execute_datetime = datetime.now()

search_result_folder_path = Path(f"search_results_{execute_datetime.strftime('%Y%m%d')}")
search_result_folder_path.mkdir(exist_ok=True)
    
# a list of filters
params_list = [
    {"filter": "topsellers"},
    {"filter": "globaltopsellers"},
    {"filter": "popularnew"},
    {"filter": "popularcomingsoon"},
    {"filter": "", "specials": 1}       # "Specials" uses the "specials" field with an empty filter
]
page_list = list(range(1, 5))

params_sr_default = {
    "filter": "topsellers",
    "hidef2p": 1,
    "page": 1,            # page is used to go through different parts of the ranking. Each page contains 25 results
    "json": 1
}

for update_param in params_list:

    items_all = []
    if update_param["filter"]:
        filename = f"{update_param['filter']}_{execute_datetime.strftime('%Y%m%d')}.pkl"
    else:
        filename = f"specials_{execute_datetime.strftime('%Y%m%d')}.pkl"

    if (search_result_folder_path / filename).exists():
        print_log(f"File {filename} exists. Skip.")
        continue

    for page_no in page_list:
        param = params_sr_default.copy()
        param.update(update_param)
        param["page"] = page_no

        search_results = get_search_results(param)
        print_log(search_results)

        if not search_results:
            continue

        items = search_results.get("items", [])

        # preprocess the search results to retrieve the appid of each game
        for item in items:
            try:
                item["appid"] = re.search(r"steam/\w+/(\d+)", item["logo"]).group(1)      # the URL can be steam/bundles/{appid} or steam/apps/{appid}
            except Exception as e:
                print_log(f"Failed to extract appid: {e}")
                item["appid"] = None

        # request for game information using appid
        for item in items:
            appid = item["appid"]
            appdetails = get_app_details(appid)
            item["appdetail"] = appdetails

        items_all.extend(items)

    # save the search results
    with open(search_result_folder_path / filename, "wb") as f:
        pickle.dump(items_all, f)
    print_log(f"Saved {filename}")

The full script and detailed usage can be found on my GitHub.
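
Once a pickle has been saved, it can be read back for analysis. The sketch below assumes the item layout produced by the scraper above ("name", "appid", and an "appdetail" dictionary with a "success" flag) and treats list position as the rank, since items are appended page by page. load_items and to_rows are hypothetical helpers, and the demo data is made up and written to a temporary folder.

```python
import pickle
import tempfile
from pathlib import Path


def load_items(path):
    """Load the list of items saved by the scraper."""
    with open(path, "rb") as f:
        return pickle.load(f)


def to_rows(items):
    """Flatten scraped items into simple rows; rank follows list order."""
    rows = []
    for rank, item in enumerate(items, start=1):
        detail = item.get("appdetail") or {}
        rows.append({
            "rank": rank,
            "appid": item.get("appid"),
            "name": item.get("name"),
            "details_ok": bool(detail.get("success")),
        })
    return rows


# Made-up items in the same layout as the scraper's output
sample_items = [
    {"name": "Example Game A", "appid": "10", "appdetail": {"success": True, "data": {}}},
    {"name": "Example Game B", "appid": None, "appdetail": {}},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "topsellers_20240630.pkl"
    with open(path, "wb") as f:
        pickle.dump(sample_items, f)
    rows = to_rows(load_items(path))

for row in rows:
    print(row)
```

Only unpickle files you created yourself, because loading a pickle from an untrusted source can execute arbitrary code.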

Conclusion

In this short article, we explored the /search/results API and its potential for creating custom databases to track games in different “categories”. We also delved into the filter values of the API and recorded them for future reference.

This is Part 3 of 3 in the series “Scraping Steam: A Comprehensive Guide (2024 ver)”.