Give life to your Data Science Apps using Streamlit
Add widgets to your data science applications and host them on the web using Streamlit
“Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.” — This is the Wikipedia definition of data science. Put simply, we generate an enormous amount of data in our day-to-day lives, and data science lets us structure and process that data so that useful insights and information can be extracted from it, benefiting individuals.
How can it benefit us? Imagine you log on to YouTube to binge-watch your favorite videos, but the homepage shows you totally irrelevant videos that you have never watched and would never choose to watch; you would be irritated. Instead, the moment you open YouTube, the videos you were about to search for pop up on the screen, the world feels understanding, and the time you would have spent searching is saved. Data science helps us this way: it reduces the time spent on repetitive tasks through predictive analysis and automation.
Data Science is a vast domain consisting of subdomains like:
- Machine Learning
- Artificial Intelligence
- Natural Language Processing
- Image Processing
- Audio Modelling
- Feature Engineering
- Data visualization
- Data Analytics
- Deep Learning
and more…
Now, from a developer's point of view, we data science enthusiasts love to explore these areas by putting our coding knowledge into practice with datasets available on Kaggle, GitHub, or other data resources.
Real-world data science problems are usually solved in Python, since Python has an excellent set of built-in libraries and packages, and new packages supporting data science are added all the time. But despite Python being a great language for the backend, it is not a very UI-friendly language.
This is why data science enthusiasts often have no easy way to exhibit the wonders they have done with data. We can share our repositories on GitHub, but that is not very appealing, and most people won't take the trouble to try them out.
Imagine you cook delicious food, but you can't serve it in an appealing way and there's no one to comment on or appreciate it. That's the sad part!
But the sad story ends here, with Streamlit.
Streamlit is an open-source Python library that makes it easy to build beautiful custom web apps for machine learning and data science. It provides the essential widgets an application needs to feed data and parameters into your data science code. Not only that, the whole application, along with its necessary packages, can be turned into a web app and run using Streamlit.
First, install Streamlit from your JupyterLab terminal, Windows PowerShell, or command prompt using the command:
pip install streamlit
Install all the necessary libraries and packages for your application. Then write your code in a Python script (a .py file) and import Streamlit in your code.
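A minimal sketch of such a script might look like this (the file name, title, and text below are just placeholder examples, not from the original project):

# app.py -- a minimal Streamlit script
import streamlit as st

st.title("My First Streamlit App")
st.write("Hello! This page is rendered entirely from a Python script.")

Save this as, say, app.py and you already have a working web page.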
If you need any kind of heading, for example a page title, you can add it with:
st.title("WELCOME TO THE WORLD OF MUSIC!")
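Besides st.title, Streamlit also offers smaller heading levels; a quick sketch (the strings are just placeholders):

st.header("Artists")
st.subheader("Top tracks this week")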
If you want to add an image to make the page look good, or an explanatory video, you can do it by adding these lines:
vid = open("example.mp4", "rb")
st.video(vid)
st.markdown('<span style="background-color:#121922">', unsafe_allow_html=True)
where example.mp4 is a file already stored in your working directory.
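An image can be displayed in the same way with st.image; a small sketch, assuming a file named cover.jpg exists in your working directory:

st.image("cover.jpg", caption="Album cover")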
To upload audio, video, or any other type of file into the program for prediction, add these lines to your code:
file_to_be_uploaded = st.file_uploader("Choose an audio...", type="wav")
The type parameter depends on what you want to accept:
- for audio: wav or mp3
- for video: mp4
- for images: jpg, jpeg, png
- for files: pdf, pptx, docx, etc.
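st.file_uploader returns the uploaded file object (or None until the user picks a file), so a common pattern is to check for it before using it; a minimal sketch that plays an uploaded audio file back on the page:

if file_to_be_uploaded is not None:
    st.audio(file_to_be_uploaded, format="audio/wav")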
You can also customize this: instead of uploading an audio file, you can record one and send it as input to your program by first inserting a button on the page:
if st.button("Start Recording"):
    with st.spinner("Recording..."):
        record(duration)
The record(duration) function handles recording the audio and saving it to a file; it uses the pyaudio and wave libraries, so those imports are included below:
import pyaudio
import wave

def record(duration):
    # record `duration` seconds of microphone audio and save it to recorded.wav
    filename = "recorded.wav"
    chunk = 1024                  # frames read per buffer
    FORMAT = pyaudio.paInt16      # 16-bit samples
    channels = 1                  # mono
    sample_rate = 44100           # samples per second
    record_seconds = duration
    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT,
                    channels=channels,
                    rate=sample_rate,
                    input=True,
                    output=True,
                    frames_per_buffer=chunk)
    # read the stream chunk by chunk for the requested duration
    frames = []
    for i in range(int(sample_rate / chunk * record_seconds)):
        data = stream.read(chunk)
        frames.append(data)
    stream.stop_stream()
    stream.close()
    p.terminate()
    # write the captured frames to a .wav file
    wf = wave.open(filename, "wb")
    wf.setnchannels(channels)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(sample_rate)
    wf.writeframes(b"".join(frames))
    wf.close()

audio = "recorded.wav"
Now you can use the 'recorded.wav' file in the succeeding lines of your code.
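For example, here is a small sketch that plays the recording back on the page; the predict_genre() call is only a hypothetical placeholder for your own model code:

audio = "recorded.wav"
st.audio(audio, format="audio/wav")   # play the recording back in the browser
# result = predict_genre(audio)       # hypothetical: pass the file to your own prediction code
# st.write(result)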
The duration parameter passed to the record() function can be set with a slider in the sidebar:
st.sidebar.title("Duration")
duration = st.sidebar.slider("Recording duration", 0.0, 3600.0, 3.0)
This will create a sidebar containing a slider that can be adjusted as required (here between 0 and 3600 seconds, with a default of 3).
Now execute the program with a simple command:
streamlit run project.py
(assuming your Python script is named project.py)
On entering this command, wait until the code finishes executing; this may take some time depending on the size of your code and the libraries it imports.
Finally, on successful execution, you can see your data science web app running with the desired UI and widgets on your localhost, at a URL something like this:
http://localhost:8501/
If you want to modify and re-execute the code, just edit your Python script and click 'Rerun' in the app.
One thing to note about Streamlit: it doesn't use the print statement to display output; it uses st.write() instead. If you use print statements in your code, those lines are written to the console, not to the web page.
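A quick illustration of the difference (the strings are just placeholders):

st.write("This line appears on the web page")    # rendered in the browser
print("This line only appears in the terminal")  # written to the console running streamlit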
Now, to showcase your app with a public URL, you can use platforms such as Heroku or AWS.
With Heroku, you upload your files to a GitHub repository and connect that repo to your Heroku app. Apart from your Python script, you need to add a requirements.txt file, a Procfile, and a setup.sh file to your GitHub repository.
The requirements.txt file can be generated automatically by simply installing pipreqs,
pip install pipreqs
and running it with the path to your project directory:
pipreqs /<your_project_path>/
e.g.
pipreqs /c:/users/home/proj_directory/
The folder proj_directory should contain the Python script for which the requirements.txt file is to be created.
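The generated requirements.txt simply lists the packages your script needs, one per line (pipreqs also pins the exact versions it finds); the packages below are only illustrative:

streamlit
pyaudio
numpy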
The Procfile should be created with the following contents:
web: sh setup.sh && streamlit run proj1.py
The Procfile should not have any file extension; otherwise Heroku may run into errors.
The setup.sh file contents are:
mkdir -p ~/.streamlit/
echo "\
[general]\n\
email = \"example@gmail.com\"\n\
" > ~/.streamlit/credentials.toml
echo "\
[server]\n\
headless = true\n\
enableCORS=false\n\
port = $PORT\n\
" > ~/.streamlit/config.toml
For reference, check out my GitHub repository:
https://github.com/PradeepaK1/proj1
Yeah!!! With this, you can give life to your data science apps and unlock them from your JupyterLab or Jupyter Notebook.