Unleash the power of Streamlit

Jeremy Sapienza

Published in Data Reply IT | DataTech

8 min read · May 27, 2024

A faster way to build and share data apps

Fig 1.0 — The interaction with Python language

Have you ever spent far too much time learning a markup language just to create fresh and innovative web pages?

Have you ever wasted time building complex components for your web page?

Do you know Python and want something that keeps all its features and, with a few simple commands, lets you create a web page from scratch in minutes?

Streamlit could be the solution for you!

Streamlit in simple terms

Streamlit is a free, open-source, all-Python framework that lets you quickly build interactive dashboards and machine learning web applications with no front-end web development experience required. With a basic understanding of Python, you can create and share web apps in hours, not weeks or even months.

In this tutorial, after seeing how to install Streamlit on our machine, we will build an interactive dashboard that displays the predictions of a churn model. Each prediction is explained with an xAI method (SHAP values).

Why Streamlit?

There are various tools for data analysis and visualization, but some of them offer only limited functionality for free and require you to pay to keep using them.

Streamlit, by contrast, lets you create dashboards and data applications in a few lines of code, without any knowledge of front-end web development… and it is FREE!

If you are a data person and need a simple web app to explore your data interactively, I suggest you use Streamlit!

PS: you can also use it to deploy apps in your production environment.

Essential features in Streamlit

Streamlit helps you work more efficiently through a few specific features, such as:

1. Caching

Caching helps maintain your app’s performance when dealing with web data retrieval, large dataset manipulation, or resource-intensive computations. The core concept of caching is to store the results of time-consuming function calls and return these cached results when the same inputs are encountered again.

This eliminates the need to repeatedly execute the function with identical input values.

In Streamlit, you can cache a function by applying a caching decorator. There are two options available:

st.cache_data: Recommended for caching computations that produce data.
st.cache_resource: Recommended for caching global resources such as machine learning models or database connections.

Fig 2.0 — Streamlit’s two caching decorators and their use cases.

2. Session State

Session State in Streamlit offers a dictionary-like interface to save information across script reruns. You can store and access values using st.session_state with key or attribute notation, like st.session_state["my_key"] or st.session_state.my_key. Widgets manage their own state, so using Session State isn't always necessary.

A session refers to a single instance of viewing an app.

Each browser tab has its own session, and Streamlit maintains this session as users interact with the app. If users refresh the page or reload the app URL, their Session State resets, starting a new session.

3. Connections

Streamlit provides an easy way to handle popular connections, such as SQL, with st.connection. It automatically manages caching, allowing you to write less code.

…There are also many other features, such as Theming and Pages, that I suggest you check out and study in the Streamlit documentation.

How to Install Streamlit

Now that we have a rough idea of what Streamlit is and why it is powerful, let’s see how to install it!

… well, it is simple and fast! You just need to install the framework in your environment:

pip install streamlit

To check that it is installed correctly, run:

streamlit hello

Prerequisites

Before we jump into coding and creating our app, we need to do a bit of preparation: downloading the dataset needed for this tutorial, choosing a text editor for coding, and structuring the project as a Streamlit-style folder.

You can download the dataset from Kaggle from this url. You can import the structure of the streamlit folder from this template:

cookiecutter https://github.com/andymcdgeo/cookiecutter-streamlit.git

You will then be presented with a series of prompts to choose which components to include in your application.

At the end, your Streamlit project folder will be structured in this way:

streamlit_app/
├── assets/
│   ├── css/
│   │   └── custom_style.css
│   └── images/
│       ├── logo.png
│       └── header.png
├── data/
│   ├── data1.csv
│   └── random_data_file.csv
├── output/
│   └── output.csv
├── pages/
│   ├── Page_1.py
│   └── Page_2.py
├── src/
│   ├── components/
│   │   ├── sidebar.py
│   │   └── special_graph_widget.py
│   └── calculations/
│       ├── simple_maths.py
│       └── trig_functions.py
├── tests/
│   ├── test_simple_maths.py
│   └── test_trig_functions
└── app.py

To run your Streamlit app, move into the ./streamlit_app folder and run:

streamlit run app.py

PS: if you want to see the complete tutorial, check out my GitHub repository! The repo contains an example of integrating a data science project into a basic Streamlit app.

Dataset

The bank customer churn dataset is commonly used for predicting customer churn in the banking industry. It contains information on bank customers who either left the bank or continue to be customers. The dataset includes the following attributes:

Customer ID: A unique identifier for each customer
Surname: The customer’s surname or last name
Credit Score: A numerical value representing the customer’s credit score
Geography: The country where the customer resides (France, Spain or Germany)
Gender: The customer’s gender (Male or Female)
Age: The customer’s age
Tenure: The number of years the customer has been with the bank
Balance: The customer’s account balance
NumOfProducts: The number of bank products the customer uses (e.g., savings account, credit card)
HasCrCard: Whether the customer has a credit card (1 = yes, 0 = no)
IsActiveMember: Whether the customer is an active member (1 = yes, 0 = no)
EstimatedSalary: The estimated salary of the customer
Exited: Whether the customer has churned (1 = yes, 0 = no)
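To get a feel for these attributes, the sketch below reads the data with pandas. It uses two made-up rows in place of the real CSV. The exact header names (for example CustomerId) are an assumption based on the list above, and dropping the identifier columns is just one reasonable first step, not necessarily what the project does.

```python
import io

import pandas as pd

# Two invented rows in the dataset's column layout, standing in for the real file
SAMPLE_CSV = """CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
1,Rossi,619,France,Female,42,2,0.0,1,1,1,101348.88,1
2,Bianchi,608,Spain,Female,41,1,83807.86,1,0,1,112542.58,0
"""


def load_churn_data(source) -> pd.DataFrame:
    """Read the CSV and drop columns that identify a customer rather than describe one."""
    data = pd.read_csv(source)
    return data.drop(columns=["CustomerId", "Surname"], errors="ignore")


churn_df = load_churn_data(io.StringIO(SAMPLE_CSV))
print(churn_df.dtypes)
print("Churn rate:", churn_df["Exited"].mean())
```

With the real file, you would pass its path (for example a CSV inside the data folder) instead of the in-memory sample.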

Data preparation

For data preparation, it is important to separate the web app layout (streamlit_app) from the backend operations, such as creating the ML pipeline, loading the template from external sources, creating the instance of your DB, etc.

For this part, I created a separate folder called “utils”, outside streamlit_app, which stores the basic operations mentioned above. The folder structure is extended in this way:

├── streamlit_app/
│   ├── assets/
│   │   └── ...
│   ├── data/
│   │   └── ...
│   └── ...
└── utils/
    ├── model_artifact/
    │   ├── churn_model_v1.pkl
    │   └── ...
    ├── load_template.py
    └── model.py

model.py defines a class named ChurnModel and a function called preprocess_data, which collects the basic operations needed to parse our dataframe.
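The following is a minimal sketch of how such a module could be organised. It is not the code from the repository. The names ChurnModel, preprocess_data and get_numerical_features appear in this article, but the scaler plus logistic regression, the feature list and the synthetic training data are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Assumed feature list, matching the sidebar inputs used later in the app
NUMERICAL_COLS = ["CreditScore", "Age", "Tenure", "Balance",
                  "NumOfProducts", "HasCrCard", "IsActiveMember", "EstimatedSalary"]


def preprocess_data(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the numerical features plus the target, drop incomplete rows, cast features to float."""
    out = df[NUMERICAL_COLS + ["Exited"]].dropna().copy()
    out[NUMERICAL_COLS] = out[NUMERICAL_COLS].astype("float")
    return out


class ChurnModel:
    """Hypothetical stand-in: a standard scaler followed by a logistic regression."""

    def __init__(self):
        self.scaler = StandardScaler()
        self.model = LogisticRegression(max_iter=1000)

    def get_numerical_features(self):
        return list(NUMERICAL_COLS)

    def fit(self, df: pd.DataFrame) -> "ChurnModel":
        data = preprocess_data(df)
        X = self.scaler.fit_transform(data[NUMERICAL_COLS])
        self.model.fit(X, data["Exited"])
        return self

    def predict_proba(self, features: pd.DataFrame) -> np.ndarray:
        X = self.scaler.transform(features[NUMERICAL_COLS])
        return self.model.predict_proba(X)


# Synthetic data, only to show the class running end to end
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(200, len(NUMERICAL_COLS))), columns=NUMERICAL_COLS)
demo["Exited"] = (demo["Age"] + demo["Balance"] > 0).astype(int)

churn_model = ChurnModel().fit(demo)
print(churn_model.predict_proba(demo.head(3)))
```

Keeping this logic outside streamlit_app means the web app only has to import the trained object and call it.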

To train and run the model pipeline, you need to be at the level of the utils folder (that is, in the folder that contains it) and execute the Python code:

python ./utils/model.py

A new model version is created inside the ./model_artifact folder!

Interact with your data within Streamlit

Streamlit handles data entered through the web app seamlessly and intuitively. You can write the page content in markdown style, and when users enter data through widgets (such as text inputs, sliders, or file uploads), Streamlit captures it and makes it immediately available in your script.

For example, in my project I added a sidebar where the user can change the input record to be predicted by the churn model:

import pandas as pd
import streamlit as st

# Assumes df (the churn dataframe), churn_model and model_trained
# have been created earlier in the script.

def parse_number(text):
    # Take what follows the last ":" so that both the placeholder
    # ("Insert a number - max: 12.0") and a plain typed value ("5") parse
    return float(text.split(":")[-1].strip())

# collect one customer record from the sidebar widgets
def user_input_features(numerical_cols):
    # st.slider needs min, max and default value of the same type, hence the float() casts
    CreditScore = st.sidebar.slider('CreditScore', float(df.CreditScore.min()), float(df.CreditScore.max()), float(df.CreditScore.mean()))
    Age = st.sidebar.slider('Age', float(df.Age.min()), float(df.Age.max()), float(df.Age.mean()))
    Tenure = st.sidebar.text_input('Tenure', f"Insert a number - max: {round(df.Tenure.max()+df.Tenure.std(), 0)}")
    Balance = st.sidebar.slider('Balance', float(df.Balance.min()), float(df.Balance.max()), float(df.Balance.mean()))
    NumOfProducts = st.sidebar.text_input('NumOfProducts', f"Insert a number - max: {round(df.NumOfProducts.max()+df.NumOfProducts.std(), 0)}")
    HasCrCard = st.sidebar.radio('HasCrCard', [0, 1], index=0)
    IsActiveMember = st.sidebar.radio('IsActiveMember', [0, 1], index=1)
    EstimatedSalary = st.sidebar.slider('EstimatedSalary', float(df.EstimatedSalary.min()), float(df.EstimatedSalary.max()), float(df.EstimatedSalary.mean()))

    data = {'CreditScore': [CreditScore],
            'Age': [round(Age, 0)],
            'Tenure': [parse_number(Tenure)],
            'Balance': [Balance],
            'NumOfProducts': [parse_number(NumOfProducts)],
            'HasCrCard': [HasCrCard],
            'IsActiveMember': [IsActiveMember],
            'EstimatedSalary': [EstimatedSalary]
    }
    features = pd.DataFrame(data)
    features[numerical_cols] = features[numerical_cols].astype("float")
    return features

# apply model to make prediction
single_employer = user_input_features(churn_model.get_numerical_features())
predict_churn = model_trained.predict(single_employer)
st.write(predict_churn)

The result through the browser is:

Fig 4.0 — Snapshot of Demo Streamlit App

To make the interactive prediction from the left-hand sidebar explainable, I integrated the SHAP library and displayed its plots on the right side of the web app:

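import matplotlib.pyplot as plt
import pandas as pd
import shap
import streamlit as st

# Assumes X (background data), sscaler (the fitted scaler), df, model_trained
# and single_employer come from the previous steps.
explainer = shap.Explainer(model_trained.predict_proba, X)
# scale the sidebar record the same way as the training data
single_employer_processed = pd.DataFrame(sscaler.transform(single_employer), columns=df.drop(columns=["Exited"]).columns, dtype=float)

shap_values = explainer(single_employer_processed)

# explain the individual prediction: class 1 = the customer churns
st.header('Probability of Churning')
shap.plots.waterfall(shap_values[0, :, 1], max_display=10, show=False)  # show=False keeps the figure open for Streamlit
st.pyplot(plt.gcf(), bbox_inches='tight')  # pass the figure explicitly
plt.clf()
st.write('---')

# class 0 = the customer stays
st.header('Probability of Not Churning')
shap.plots.waterfall(shap_values[0, :, 0], max_display=10, show=False)
st.pyplot(plt.gcf(), bbox_inches='tight')
plt.clf()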

The result of this integration is the classic waterfall plots that explain the prediction:

Fig 4.1 — Snapshot of Demo Streamlit App

…and that’s it: with a few simple lines of code you can build an entire web app from scratch in a few minutes!

For more code, you can look at my GitHub repository!

Streamlit Community

The Streamlit community is vibrant and active, composed of developers, data scientists, and enthusiasts from around the world. Here are some key aspects of the Streamlit community:

Diverse and Global
Community Engagement
Learning and Sharing Resources
Events and Meetups
Official Support and Contributions

YouTube Channel

The Streamlit community owes much to the open tutorials made by community members and by the company’s own staff. You can follow its page on YouTube.

The channel is a great resource for learning how to use Streamlit effectively and keeping up with the latest developments in the platform.

Fig 5.1 — Streamlit home from its YT Channel

Streamlit App Gallery

The Streamlit App Gallery is a collection of fascinating applications created by Streamlit users and hosted on the Streamlit Community Cloud. These apps showcase the versatility and capabilities of Streamlit. If you’re interested in building your own Streamlit app, these resources can serve as great inspiration!

LAST BUT NOT LEAST… it was acquired by Snowflake

Snowflake is a cloud-based data warehousing platform that helps customers store and manage large volumes of data without cloud vendor lock-in. Snowflake has integrated Streamlit into its cloud platform to enhance data application development.

Snowflake announced the acquisition of Streamlit in March 2022. The deal was reportedly valued at $800 million.

Conclusions

Wrapping things up, let me just say: Streamlit is a game-changer! It’s like having a super-smooth ride for your data projects — so simple, yet so powerful. Ready to dive in? With Streamlit, your ideas can take flight and your projects can reach new heights. Here’s to a future where creativity rules and Streamlit is our trusty sidekick along for the ride!

References

[1] Streamlit • A faster way to build and share data apps
[2] Streamlit: costruire una Web App in pochi minuti — Flowygo
[3] App Gallery • Streamlit
[4] Streamlit in Snowflake
[5] andymcdgeo/cookiecutter-streamlit: a cookiecutter template by Andy that creates the basic structure for an organised Streamlit app (github.com)
[6] Jeremy98-alt/demo-streamlit-xai: a simple demo to leverage the power of Streamlit plus the usage of xAI as a technique to explain the prediction of a model (github.com)