Paginating dataframes with Streamlit

Carlos D Serrano
Published in Streamlit
3 min read · Mar 3, 2023

We often have dataframes that are too long to display in an app. That's when pagination comes in handy. Streamlit's custom components, like streamlit-aggrid, can handle pagination and other features, but they're not always available or usable.

In this article, I'll walk you through some simple code to paginate a dataframe using only Streamlit's native controllers.

Libraries

For this example, we'll use only the Streamlit and Pandas libraries:

import streamlit as st
import pandas as pd

Loading and processing data

To keep it as simple as possible, we'll use an st.file_uploader component and pandas read_csv method to load our dataframe. The logic of this app will only run when a CSV file is loaded into the controller.

There are two main app functions:

  1. To load data into a pandas DataFrame
  2. To split the dataframe into chunks of a specified number of rows

I'm using the st.cache_data decorator to avoid unnecessary calls to these functions. Note that the split_frame function uses a list comprehension to build a list of dataframe chunks, each holding the number of rows specified by the rows argument:

@st.cache_data(show_spinner=False)
def load_data(file_path):
    dataset = pd.read_csv(file_path)
    return dataset


@st.cache_data(show_spinner=False)
def split_frame(input_df, rows):
    df = [input_df.loc[i : i + rows - 1, :] for i in range(0, len(input_df), rows)]
    return df


file_path = st.file_uploader("Select CSV file to upload", type=["csv"])
if file_path:
    dataset = load_data(file_path)
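
To make the chunking concrete, here's a quick illustration of what the list comprehension in split_frame produces, using a made-up five-row dataframe and a chunk size of 2. Note that the .loc slicing relies on the default integer index that read_csv assigns:

import pandas as pd

# Toy dataframe for illustration only; mirrors the comprehension inside split_frame
toy = pd.DataFrame({"value": [10, 20, 30, 40, 50]})
rows = 2
chunks = [toy.loc[i : i + rows - 1, :] for i in range(0, len(toy), rows)]

print(len(chunks))                   # 3 chunks
print(chunks[0]["value"].tolist())   # [10, 20]
print(chunks[-1]["value"].tolist())  # [50], the final partial chunk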

Pagination interface

The user interface of the app is quite simple. It provides an option to sort the data before pagination and uses the selectbox, radio, markdown, and number_input controllers. With these, we can create a simple yet effective pagination system for our large dataframes. This pagination is for display only; the full dataset stays available at the cache level. If you want to load millions of records, you'll probably need a lazy-loading approach:


top_menu = st.columns(3)
with top_menu[0]:
    sort = st.radio("Sort Data", options=["Yes", "No"], horizontal=True, index=1)
if sort == "Yes":
    with top_menu[1]:
        sort_field = st.selectbox("Sort By", options=dataset.columns)
    with top_menu[2]:
        sort_direction = st.radio(
            "Direction", options=["⬆️", "⬇️"], horizontal=True
        )
    dataset = dataset.sort_values(
        by=sort_field, ascending=sort_direction == "⬆️", ignore_index=True
    )
pagination = st.container()

bottom_menu = st.columns((4, 1, 1))
with bottom_menu[2]:
    batch_size = st.selectbox("Page Size", options=[25, 50, 100])
with bottom_menu[1]:
    # Ceiling division so the final, partial page is still reachable
    total_pages = max((len(dataset) + batch_size - 1) // batch_size, 1)
    current_page = st.number_input(
        "Page", min_value=1, max_value=total_pages, step=1
    )
with bottom_menu[0]:
    st.markdown(f"Page **{current_page}** of **{total_pages}** ")
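
A quick sanity check on the page math, using hypothetical numbers: with ceiling division, a 101-row file at 25 rows per page yields 5 pages, and the 1-based current_page maps onto the 0-based list of chunks that split_frame returns:

rows_total = 101  # hypothetical file length
batch_size = 25

# Ceiling division: pages 1-4 hold 25 rows each, page 5 holds the last row
total_pages = max((rows_total + batch_size - 1) // batch_size, 1)
print(total_pages)  # 5

current_page = 5
print(current_page - 1)  # 4, the index of the last chunk in the pages list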

Full code

import streamlit as st
import pandas as pd

st.set_page_config(layout="centered")


@st.cache_data(show_spinner=False)
def load_data(file_path):
    dataset = pd.read_csv(file_path)
    return dataset


@st.cache_data(show_spinner=False)
def split_frame(input_df, rows):
    df = [input_df.loc[i : i + rows - 1, :] for i in range(0, len(input_df), rows)]
    return df


file_path = st.file_uploader("Select CSV file to upload", type=["csv"])
if file_path:
    dataset = load_data(file_path)
    top_menu = st.columns(3)
    with top_menu[0]:
        sort = st.radio("Sort Data", options=["Yes", "No"], horizontal=True, index=1)
    if sort == "Yes":
        with top_menu[1]:
            sort_field = st.selectbox("Sort By", options=dataset.columns)
        with top_menu[2]:
            sort_direction = st.radio(
                "Direction", options=["⬆️", "⬇️"], horizontal=True
            )
        dataset = dataset.sort_values(
            by=sort_field, ascending=sort_direction == "⬆️", ignore_index=True
        )
    pagination = st.container()

    bottom_menu = st.columns((4, 1, 1))
    with bottom_menu[2]:
        batch_size = st.selectbox("Page Size", options=[25, 50, 100])
    with bottom_menu[1]:
        # Ceiling division so the final, partial page is still reachable
        total_pages = max((len(dataset) + batch_size - 1) // batch_size, 1)
        current_page = st.number_input(
            "Page", min_value=1, max_value=total_pages, step=1
        )
    with bottom_menu[0]:
        st.markdown(f"Page **{current_page}** of **{total_pages}** ")

    pages = split_frame(dataset, batch_size)
    pagination.dataframe(data=pages[current_page - 1], use_container_width=True)
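
To try it out, save the full code above as a single script (for example, app.py; the name is up to you), launch it with streamlit run app.py, and upload any CSV file.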

That's it! We now have dataframe pagination built from native Streamlit components, with a simple yet helpful approach. Enjoy!


Carlos D Serrano
Streamlit

Sr. Solution Innovation Architect @ Snowflake • Streamlit • DataOps • Hispanic Data Community Leader • 🇵🇷 ▶️ 🇺🇸