In this article, I explain how to build a pandas-profiling app with Python and Streamlit and deploy it on Heroku (dataprofile.herokuapp.com). The dashboard lets you perform exploratory data analysis with no code! Just drag and drop your CSV file onto the dashboard and let the magic happen!
Link to program: dataprofile.herokuapp.com
Getting Started — Generating app.py file
There are four libraries required to generate this app.
Pandas: required to work with tabular data.
Streamlit: the library used to build the dashboard. See the Streamlit documentation for more details.
pandas-profiling: a great tool that generates an exploratory analysis report from a pandas DataFrame. See the pandas-profiling documentation for more details.
streamlit-pandas-profiling: used to embed the report in the Streamlit dashboard.
Streamlit provides st.set_page_config() to set the page layout. I like the wide option, layout="wide", which lets the report use the full browser width. This is not mandatory.
Let’s create the function that loads the data. It uses pd.read_csv() to read the uploaded file into a pandas DataFrame.
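A minimal sketch of such a loader is shown here. The name load_data is an assumption made for illustration, and the in-memory CSV in the last lines only stands in for an uploaded file. Streamlit's uploader returns a file-like object, so it can be passed to the function in the same way.

```python
import io

import pandas as pd


def load_data(file):
    """Read a CSV from a path or file-like object into a DataFrame.

    `load_data` is a hypothetical name chosen for this sketch.
    """
    return pd.read_csv(file)


# Stand-in for an uploaded file: read_csv accepts any file-like object.
sample = io.StringIO("city,population\nAnkara,5\nIzmir,4\n")
df = load_data(sample)
print(df.shape)
```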
Let’s create a sidebar titled “Upload data”. It contains a file uploader that loads the data into the app.
The sidebar also contains an option selector (st.selectbox) for choosing the profiling mode used by pandas-profiling.
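The sidebar described above could be sketched roughly as follows. The function name upload_panel, the widget labels, and the two mode names are assumptions made for illustration. The function takes the container it draws into as an argument, which in app.py would be st.sidebar.

```python
def upload_panel(container, modes=("Minimal", "Complete")):
    """Render the upload widgets into a Streamlit container.

    `container` is expected to be something like `st.sidebar`. The function
    name, labels, and mode names are assumptions made for this sketch.
    Returns the uploaded file (or None) and the selected profiling mode.
    """
    container.title("Upload data")
    # Restrict the uploader to CSV files, since the data is read with read_csv.
    uploaded_file = container.file_uploader("Upload a CSV file", type=["csv"])
    mode = container.selectbox("Profiling mode", list(modes))
    return uploaded_file, mode
```

In app.py, after import streamlit as st, this would be called as uploaded_file, mode = upload_panel(st.sidebar). Passing the container in also makes the function easy to exercise with a small stub object.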
If no file has been uploaded yet, the app shows a reminder asking the user to load the data from the sidebar panel.
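One way to express this check is sketched here. The name upload_ready and the message text are assumptions. The container argument would be the streamlit module itself (st) in app.py, so that container.info() maps to st.info().

```python
def upload_ready(uploaded_file, container):
    """Return True if a file was uploaded; otherwise show a reminder.

    `upload_ready` and the message wording are hypothetical choices.
    """
    if uploaded_file is None:
        # Nothing uploaded yet: show an informational box instead of a report.
        container.info("Awaiting a CSV file. Please upload one from the sidebar.")
        return False
    return True
```

In app.py the rest of the script would then sit under if upload_ready(uploaded_file, st): so that profiling only starts once there is data to profile.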
The last part of the code runs once the data is uploaded to the dashboard. First, the data is read into a pandas DataFrame. Then, depending on the selected option, the ProfileReport is created and saved as pr.
Once the profile report is ready, it is rendered in the dashboard using st_profile_report(pr).
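The mode-dependent step can be sketched as a small lookup followed by the two library calls. The mode names and the helper names profile_options and render_report are assumptions. The minimal and explorative flags are options that ProfileReport accepts, but which flags the deployed app uses is not stated here. pandas-profiling has since been renamed ydata-profiling, so a newer install would import ProfileReport from ydata_profiling instead.

```python
# Map each selectable mode (hypothetical labels) to ProfileReport options.
PROFILE_MODES = {
    "Minimal": {"minimal": True},       # faster, skips expensive computations
    "Complete": {"explorative": True},  # deeper analysis, slower on big data
}


def profile_options(mode):
    """Return a copy of the ProfileReport keyword arguments for a mode."""
    try:
        return dict(PROFILE_MODES[mode])
    except KeyError:
        raise ValueError(f"Unknown profiling mode: {mode!r}") from None


def render_report(df, mode):
    """Build the report for the selected mode and embed it in the page."""
    # Imported here so profile_options() can be used without these packages
    # installed; module-level imports work just as well in app.py.
    from pandas_profiling import ProfileReport
    from streamlit_pandas_profiling import st_profile_report

    pr = ProfileReport(df, **profile_options(mode))
    st_profile_report(pr)
    return pr
```

Keeping the mode table separate from the rendering call means a new mode can be added by editing one dictionary entry.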
The full code is available in the source repository linked at the end of this article.
Deploying to Heroku
I deployed the Streamlit app to Heroku using the same method described in my previous article (How to Deploy a Streamlit App with Heroku). Below is a snapshot of the process, which takes about 3–4 minutes.
In conclusion, there is now an app in the cloud (dataprofile.herokuapp.com) that can be used to perform no-code exploratory data analysis.
Special thanks to the developers of Streamlit, pandas, and pandas-profiling.
Access to the dashboard: dataprofile.herokuapp.com
Source code: https://github.com/sercangul/dataprofile
Follow me on GitHub: https://github.com/sercangul
Follow me for more information on Python, statistics, and machine learning!