Python and LLM for Market Analysis Part V — Streamlit & CSS for news aggregation portal

Arjun
11 min read · Jan 23, 2024


In our previous articles, we delved into the world of news aggregation APIs and harnessed the power of Large Language Models (LLMs) to decipher sentiment. Combining these insights with technical data, we crafted personalized recommendation lists. Now, in this installment, we embark on a thrilling journey to create something truly useful: a dashboard dedicated to finance and stock-related news, complete with insightful technical charts.

In my pursuit of staying on top of the latest in finance, I’ve invested in multiple premium subscriptions over the years. However, there’s always been a lingering desire for a one-stop solution — a comprehensive news aggregation platform seamlessly intertwined with technical analysis and AI-driven sentiment insights. While existing platforms serve their purpose well, our mission here is to roll up our sleeves and craft our own. Let’s dive into the realm of development and bring this vision to life.

In my early development days, my focus leaned heavily towards crafting frontends using HTML, CSS, and JS. Backends and servers were not initially on my radar. As I transitioned into a Python Developer role, I found myself dedicating less time to frontend development, and I grew curious about frameworks that would let me handle frontend tasks while sticking to Python's coding style.

I initially experimented with WTForms and Jinja templating, but the quest for dynamic charts and compelling visuals led me to explore further. That’s when I stumbled upon Streamlit — an absolute revelation. Streamlit’s lightweight nature allowed me to stick to the Python coding style I was comfortable with, making project development remarkably straightforward.

For all my subsequent projects and demos, Streamlit became my tool of choice due to its simplicity. Although I ended up learning ReactJS later for my work, being able to write a frontend in Python is a ❤.

Story Over….

Note: Please read the previous articles Part-III and Part-IV before reading further for smooth continuity.

What's this UI portal all about?

  1. Date Field: A straightforward date picker allowing users to choose a date, revealing the day’s news.
  2. Three-Columnar Card Layouts: Displaying news articles, images, links, AI sentiment, and recommendations in an organized format.
  3. Download Button: Initiates the download of data from the news aggregator, coupled with sentiment analysis using the previously developed LLM.
  4. Progress Bar: Offers an updated, real-time status of ongoing processes.

We’re set to develop two essential files:

  • Home.py: A Streamlit file dedicated to the frontend.
  • styles.css: A simple CSS file for frontend styling and enhancing card layouts.

Additionally, we’ll enhance our existing implementation in news_tech_trader.py to provide real-time status updates on the UI. Let's dive into the development process!

#Home.py
import streamlit as st
from datetime import datetime, date
import pandas as pd
import os
from news_tech_trader import nt
import time

st.set_page_config(layout="wide")
bgs = [
    "#8B0000", "#B22222", "#DC143C", "#FF4500", "#FF6347",
    "#FF7F50", "#FFA07A", "#FFD700", "#ADFF2F", "#008000"
]

Home.py is a Streamlit server file dedicated to serving content for the UI. In this file, the recommendation class from the previous article (located in news_tech_trader.py) is imported using the line from news_tech_trader import nt. The page layout is configured to be wide, ensuring a full-page app experience.

Additionally, a palette of 10 colors, running from deep red through amber to green as sentiment improves, has been selected and stored in a list (bgs). These colors will be used to visually represent AI sentiment in the displayed cards.

#Home.py - continues
def local_css(file_name):
    with open(file_name) as f:
        st.markdown(f'<style>{f.read()}</style>', unsafe_allow_html=True)


def get_data_frame(file_name):
    df = pd.read_csv(file_name)
    if 'ImageURL' not in df.columns:
        df['ImageURL'] = None

    if 'recommendation' not in df.columns:
        df['recommendation'] = None

    if 'PRICE_AT_TIME' not in df.columns:
        df['PRICE_AT_TIME'] = None
    else:
        df = df[df['PRICE_AT_TIME'].notna()]
    df = df.sort_values(by='impact', ascending=False)
    return df

In this segment of Home.py, two methods are defined. The local_css function loads the CSS file using st.markdown, allowing for custom styling. The get_data_frame function is responsible for obtaining the necessary data frame from a CSV file. The method includes data cleanup steps such as checking for null values, ensuring column existence, and sorting the data by the 'impact' column in descending order for better presentation.
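For reference, here is a minimal sketch of what a row in one of these CSV files looks like. The column names are the ones the code above and the card layout below rely on; the values are made up purely for illustration:

#illustrative sample only - not part of the project code
import pandas as pd

sample = pd.DataFrame([{
    'Title': 'Example Corp beats earnings estimates',
    'Description': 'Quarterly results came in above expectations...',
    'URL': 'https://example.com/news/1',
    'ImageURL': 'https://example.com/img/1.jpg',
    'name': 'Example Corp',
    'symbol': 'EXMP',
    'impact': 0.6,              # AI sentiment, -1.0 to +1.0
    'recommendation': 'Buy',    # AI recommendation from the previous article
    'PRICE_AT_TIME': 123.45,
}])
# in the real pipeline a file like this lives at ./data/news/<date>.csv
print(sample.head())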

Next comes the biggest chunk of this implementation. So gear up..

#Home.py - continues
local_css('./styles.css')

def display_news_cards(selected_date):
    col1, col2, col3 = st.columns(3)
    cols = [col1, col2, col3]
    file_name = f'./data/news/{selected_date}.csv'
    if os.path.exists(file_name):
        df = get_data_frame(file_name)
        chunk_size = 3
        for i in range(0, len(df), chunk_size):
            chunk = df.iloc[i:i+chunk_size].reset_index(drop=True)
            for idx, row in chunk.iterrows():
                with cols[idx]:
                    new_value = int((row['impact'] + 1) * 50)
                    disp = new_value - (new_value % 10)
                    # clamp so an impact of exactly 1.0 stays inside the 10-colour palette
                    bg = bgs[min(disp // 10, len(bgs) - 1)]
                    container = st.container()
                    # keep the HTML flush left inside the f-string so st.markdown treats it
                    # as raw HTML (deeply indented lines can be parsed as a markdown code block)
                    container.markdown(
                        f"""
<div class="card">
<img src="{row['ImageURL']}" alt="News Image">
<div class="content">
<h2><a href="{row['URL']}" target="_blank">{row['Title']}</a></h2>
<p class="organization">{row['name']} - {row['symbol']}</p>
<p class="sentiment">AI Sentiment (ranges from -1 to 1): {row['impact']}</p>
<p class="sentiment">AI Recommendation: {row['recommendation']}</p>
</div>
<div class="sentiment-container">
<!-- Sentiment Fill -->
<div class="sentiment-slider" style="width: {disp}%; background-color:{bg};"></div>
</div>
</div>""", unsafe_allow_html=True)
    else:
        st.text_area(label="Empty data display", value="No Data Available.", label_visibility="collapsed")

Note: It might look like a lot of code at first, but give it a look and a little patience; it's really simple.

Here is the explanation of the code:

  • This code segment begins by invoking the method responsible for loading the CSS file. Within the display_news_cards function, the initial step involves attempting to read a file based on the date entered by the user. As a reminder from our previous article, we adopted a naming convention for CSV files—<date>.csv—where the date corresponds to the publication date of news articles, AI sentiment, and technical data.
  • If the file exists, its contents are presented on the UI as cards. In the absence of the file, a text box is displayed with a message indicating ‘No data available.’ To accommodate the layout, we utilize Streamlit’s st.columns function to define three columns.
  • When the file exists, the dataframe is grouped into multiple sets, each containing three items. This grouping ensures that each item is displayed within its designated column. The chunking mechanism, facilitated by setting chunk_size to 3, contributes to an organized presentation of news articles on the UI.

Note the line chunk = df.iloc[i:i+chunk_size].reset_index(drop=True) very carefully. We reset the index so that each chunk always has indices 0, 1, 2. This lets us place each article (row in the df) in its corresponding column of our 3-column layout.
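Here is a tiny standalone sketch (not part of Home.py, with a made-up DataFrame) showing why the reset matters:

#illustrative sketch only
import pandas as pd

df = pd.DataFrame({'Title': ['A', 'B', 'C', 'D', 'E']})
chunk_size = 3
for i in range(0, len(df), chunk_size):
    chunk = df.iloc[i:i+chunk_size].reset_index(drop=True)
    # the index is always 0, 1, 2 (or fewer for the last chunk),
    # so each row maps cleanly onto column 0, 1 or 2 of the layout
    print(list(chunk.index))
# prints [0, 1, 2] and then [0, 1]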

  • We implement a rounding-off mechanism for the predicted sentiment values, originally ranging from -1.0 to +1.0, mapping them into the range 0 to 100 and rounding down to the nearest 10. This rounded value drives both the width of the sentiment bar and its color (a worked example follows right after this list).
new_value = int((row['impact'] + 1) * 50)
disp = new_value - (new_value % 10)
bg = bgs[min(disp // 10, len(bgs) - 1)]  # clamp so an impact of exactly 1.0 stays inside the palette
  • An exciting aspect of utilizing Streamlit is the ability to seamlessly combine Python code with HTML and CSS to create visually appealing UIs. In this implementation, a container is crafted, representing a card layout for displaying news articles dynamically.
<div class="card">
<img src="{row['ImageURL']}" alt="News Image">
<div class="content">
<h2><a href="{row['URL']}" target="_blank">{row['Title']}</a></h2>
<p class="organization">{row['name']} - {row['symbol']}</p>
<p class="sentiment">AI Sentiment (ranges from -1 to 1): {row['impact']}</p>
<p class="sentiment">AI Recommendation: {row['recommendation']}</p>
</div>
<div class="sentiment-container">
<!-- Sentiment Fill -->
<div class="sentiment-slider" style="width: {disp}%; background-color:{bg};"></div>
</div>
</div>
  • The layout is defined using a <div> with the class card. It includes an image section at the top, populated from the stored image link. Another <div> with the class content follows, where textual information such as the news title, company name, AI sentiment, and recommendation is displayed.
  • Towards the bottom of the card there's a sentiment-container. Inside it, the sentiment-slider's width and background color (bg) are set dynamically from the calculated disp value; bg selects one of the colors defined in the list variable bgs, providing a visual representation of the sentiment. This flexible combination of Python and HTML/CSS in Streamlit makes it easy to build aesthetically pleasing, interactive user interfaces.
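Here is a quick worked example of that mapping (a standalone sketch, not part of Home.py):

#illustrative sketch only
bgs = [
    "#8B0000", "#B22222", "#DC143C", "#FF4500", "#FF6347",
    "#FF7F50", "#FFA07A", "#FFD700", "#ADFF2F", "#008000"
]

for impact in (-0.8, 0.0, 0.48, 1.0):
    new_value = int((impact + 1) * 50)       # -1..1  ->  0..100
    disp = new_value - (new_value % 10)      # round down to the nearest 10
    bg = bgs[min(disp // 10, len(bgs) - 1)]  # clamp the index for impact == 1.0
    print(impact, disp, bg)
# -0.8 -> 10% wide,  #B22222 (firebrick red)
#  0.0 -> 50% wide,  #FF7F50 (coral)
#  0.48 -> 70% wide, #FFD700 (gold)
#  1.0 -> 100% wide, #008000 (green)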

It had been a very long time since I last wrote any CSS, so I used a bit of ChatGPT to code it up for me. Let's dive into our card layout CSS.

/* styles.css */
.card {
    display: flex;
    flex-direction: column;
    position: relative;
    width: 100%;
    height: 500px;
    border: 1px solid #ccc;
    border-radius: 8px;
    overflow: hidden;
    margin: 10px;
}

/* Image Styling */
.card img {
    width: 100%;
    height: 50%;
    object-fit: cover;
}

/* Content Styling */
.card .content {
    flex-grow: 1;
    padding: 15px;
}

.card h2 {
    font-size: 1.2em;
    margin-bottom: 10px;
}

.card p {
    font-size: 0.9em;
    color: #21bdf5;
}

.card .organization {
    font-weight: bold;
}

/* Sentiment Styling */
.card .sentiment-container {
    display: flex;
    position: absolute;
    bottom: 0;
    left: 0;
    width: 100%;
    height: 25px;
    background-color: #f2f2f2;
}

.card .sentiment-slider {
    flex-grow: 1;
    background-color: #4CAF50;
}

.card .sentiment-label {
    font-size: 25px;
    color: #44e6f8;
}

/* Link Styling */
.card a {
    text-decoration: none;
    color: #007BFF;
}

/* Hover Effect */
.card:hover {
    box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
    transform: translateY(-2px);
    transition: box-shadow 0.3s, transform 0.3s;
}

Let's ask the owner of the code for a bit of an explanation. I believe ChatGPT can explain the above CSS better than I ever could, so here is the response: https://chat.openai.com/share/77b72f06-dd10-4abc-b5f5-bf5f2d02af66

Next, let's build a progress bar that displays progress while the download, scraping, and AI sentiment collection happen in the backend.

#Home.py - continues
def progress_bar():
    progress_text = "Operation in progress. Please wait."
    my_bar = st.progress(0, text=progress_text)

    for percent_complete, progress_text in nt.run(date.today()):
        time.sleep(0.01)
        my_bar.progress(percent_complete, text=progress_text)
    time.sleep(1)
    my_bar.empty()

As we can see, my_bar.progress(percent_complete, text=progress_text) updates the displayed progress. Hence we will need to update part of our code from the previous article to return status updates regularly.

We’ve successfully implemented the progress bar, but how do we obtain regular updates when our code, as discussed in the previous article, only returns data at the end? Let’s delve into a Python concept known as generators, utilizing the power of the yield keyword.

What is yield in Python?

yield in Python is a key player in the world of generators. Unlike the conventional return keyword, which concludes the function and delivers a single result, yield allows the creation of generators, optimizing memory usage. This is particularly valuable when dealing with extensive lists, enabling the return of results incrementally instead of all at once.

By employing yield, a function becomes a generator. When the generator function is invoked, it doesn't execute immediately; instead, it returns a generator object. The function starts executing only when the next() method is called on the generator object. At each encounter with a yield statement during execution, the function produces a value, pauses, and retains its state. Subsequent calls to next() resume the function from where it left off, facilitating a step-by-step generation of results.

Generators, being memory-efficient, are ideal for scenarios where results need to be processed iteratively rather than in a single batch.
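As a minimal illustration of the idea (not from the project code), a function that yields (percentage, message) pairs can be consumed step by step, which is exactly the shape our progress bar expects:

#illustrative sketch only
def long_task(steps):
    for i in range(1, steps + 1):
        # ... one unit of real work would happen here ...
        yield int(i / steps * 100), f"Finished step {i} of {steps}"

for percent, message in long_task(4):
    print(percent, message)
# 25 Finished step 1 of 4
# 50 Finished step 2 of 4
# 75 Finished step 3 of 4
# 100 Finished step 4 of 4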

More about generators here — https://www.geeksforgeeks.org/python-yield-keyword/

In the news_tech_trader.py file, which we developed in the previous article, let's focus on the utilization of the yield keyword within the run method.

#news_tech_trader.py (developed in the last article)
def run(self, day):
    results_list = []
    day = date.today() if not day else day
    extracted_news_file = f'./data/news/{day}.csv'
    df = pd.DataFrame()
    if not os.path.exists(extracted_news_file):
        extracted_news_file = news.extract_news()
    df = pd.read_csv(extracted_news_file)
    yield 5, "Download Completed.."
    df_len = len(df)
    i = 0
    if 'symbol' not in df:
        for index, row in df.iterrows():
            text = factory.create_and_scrape(row['URL'])
            if text is None or len(text) < 10:
                text = str(row['Title']) + ' ' + str(row['Description'])
            text = palm_interface.summarize(text)
            data = palm_interface.prompt(text)
            symbol = yfi.get_symbol_from_name(data['name'], data['symbol']) if data else ""
            results_list.append({
                'symbol': symbol if symbol else "",
                'name': data['name'] if data else "",
                'impact': data['impact'] if data else 0.0
            })
            i += 1
            percentage = i / df_len * 100
            yield int(percentage), "Scraping and AI analysis in progress..."
        df2 = pd.DataFrame(results_list)
        df = pd.concat([df, df2], axis=1)

    yield 100, "Generating Recommendations"

This code employs the yield keyword to provide regular progress updates during the execution of the run method. The yielded values include the completion percentage and informative messages for the progress bar. If you need a detailed explanation, you can refer to the previous article where the code is thoroughly explained. The focus here is on the yield part, which ensures that progress is communicated back to the parent code for display in the progress bar.
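If you want to sanity-check the generator on its own, a quick sketch like this should print the same updates the progress bar consumes (assuming nt is the instance imported from news_tech_trader):

#illustrative sketch only
from datetime import date
from news_tech_trader import nt

for percent, status in nt.run(date.today()):
    print(f"{percent:3d}% {status}")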

OK, finally, back to our Home.py, where we put all of this together.

#Home.py - continues
button_click = st.button(f'Download Latest {date.today()}')
if button_click:
    d = None
    progress_bar()
    d = date.today()
    button_click = False

d = st.date_input(label=":blue[Select a date]", format="YYYY-MM-DD")
if d:
    with st.expander("News"):
        display_news_cards(d)
  • In this section of Home.py, we've introduced a button field labeled Download Latest with the current date. Clicking the button triggers the progress_bar() function, initiating the actual execution of news download, scraping, and AI sentiment analysis.
  • Following that, a date picker field is provided using st.date_input. Upon selecting a date, an expander section is displayed, revealing the news cards for the chosen date using the display_news_cards function we developed earlier.
  • This interactive setup ensures a seamless user experience, allowing users to download the latest news, track the progress through a dynamic progress bar, and explore news cards based on their preferred date.

We are done writing all the code… uff…
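To try it out locally, the standard Streamlit command should do the trick (assuming Streamlit is installed and Home.py, styles.css, and the data folder sit together in the project root):

streamlit run Home.py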

The complete code is available on GitHub: https://github.com/vishyarjun/news_based_stock_analyzer

Output of the implementation

  1. Cards and AI sentiment view

  2. Download and live progress update

Some Final Thoughts…

Having recently completed the development of my personal trading portal, which encompasses market analysis, backtesting capabilities, and now, the seamlessly integrated news aggregation portal, I must say it stands out as one of the most valuable additions. The portal also boasts exciting features such as integration with the Zerodha platform for both manual and automated buy/sell actions, collaboration with Finetuned Finance LLMs, and up-to-date macroeconomic indicators and graphs.

After immersing myself in Streamlit for the past six months, navigating its capabilities alongside the features mentioned above, I've come to regard Streamlit as one of my favorite frameworks in recent times. At work, where we cater to a user base of several thousand users, Streamlit is primarily employed for demos; we have not started using Streamlit in our production systems yet, although we are carefully experimenting with its possibilities.

In my forthcoming articles, I’m excited to share a separate series detailing my experiences in building features like Single Sign-On (SSO) and attempting to create a production-like system. I’ll delve into scaling considerations and explore performance testing, all accomplished using Streamlit. Stay tuned for insights into leveraging Streamlit beyond just demos, aiming for robust and scalable applications.

Thanks for reading this far. If you find this post useful, please leave a clap or two, and if you have suggestions or feedback, please feel free to comment; it would mean a lot to me!

In case of queries or for more details, please feel free to connect with me on LinkedIn or X (formerly Twitter).
