Leveraging ChatGPT to Build a Personal Financial Analyst with Python: A Step-by-Step Guide

Peter Wei
9 min read · Dec 12, 2023

Staying updated with financial news and understanding its impact on stock prices is a complex task for investors and finance professionals. To address this, I’ve developed a web application using ChatGPT, which you can explore here. This tool allows you to input a stock ticker or company name and a date range, and it fetches news and generates analysis for the most critical events concerning the company during that period. Here’s a demo to showcase its functionality.

In this tutorial, I document the development process of this app, aiming to facilitate understanding and usage of OpenAI’s API. You’ll learn how to:

  1. Direct ChatGPT to deliver responses in a specific format through the OpenAI API and prompt engineering.
  2. Scrape news within a historical date range for any given company/stock, build a web app using Flask, and deploy it on PythonAnywhere.

Understanding OpenAI’s API

Integrating OpenAI’s API into your application is straightforward. A prime example of its utility is the client.chat.completions.create() method. Let's look at its practical use by creating a tool that identifies stock tickers from user queries, a fundamental feature for applications like customer support and data analysis.

Before diving into the API, it’s necessary to configure your environment correctly. This means setting up the OpenAI API key as a system environment variable. You can find a detailed guide on this setup at Immersive Limit’s tutorial, or simply ask ChatGPT to walk you through the process.

Here’s how you can utilize it:

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": "You are an AI trained to figure out stock market ticker and return it.",
        },
        {
            "role": "user",
            "content": "Return only the stock ticker symbol for the company commonly referred to as apple.",
        },
        {
            "role": "assistant",
            "content": "AAPL",
        },
        {
            "role": "user",
            "content": "Return only the stock ticker symbol for the company commonly referred to as tesla.",
        },
    ],
)

A pivotal element in leveraging OpenAI’s API is the messages parameter. This parameter expects an array of message objects, with each object playing a distinct role—either "system", "user", or "assistant"—accompanied by its content. A typical conversation flow starts with a system message, creating a foundation, followed by an interplay of user and assistant messages. This architecture is not just about simulating a dialogue; it's about steering the AI's behavior in a targeted manner.
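
To make this structure concrete, here is a minimal sketch of a helper that assembles such a list from a system prompt, a few example exchanges, and the final user question. The name build_messages and its parameters are my own illustration, not part of the OpenAI library; the list it returns is what you would pass as the messages argument.

```python
def build_messages(system_prompt, examples, question):
    """Assemble a chat `messages` list: system message, example turns, then the real question.

    `examples` is a list of (user_text, assistant_text) pairs.
    NOTE: build_messages is a hypothetical helper, not an OpenAI API function.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages


template = "Return only the stock ticker symbol for the company commonly referred to as {}."
messages = build_messages(
    "You are an AI trained to figure out stock market ticker and return it.",
    [(template.format("apple"), "AAPL")],
    template.format("tesla"),
)
for m in messages:
    print(m["role"], "->", m["content"])
```

Keeping the assembly in one function makes it easy to add or swap example exchanges without retyping the whole list.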

Prompt Engineering: A Key Technique

To understand why the messages above are written the way they are, we need to talk about prompt engineering. Effective prompt engineering involves crafting content that clearly defines the task, sets a desired output format, and provides any necessary context or reference material. I will use the example above to illustrate a few prompt engineering techniques.

  1. Setting a persona: The system message can be used to sculpt the assistant’s persona, almost like setting the stage for a character in a play. In my application, the system message is “You are an AI trained to figure out stock market ticker and return it.” This simple sentence gears the AI towards a specific task, improving the accuracy and relevance of its responses.
  2. Providing examples so the model adopts a particular style of answer: Assistant messages are used to store previous responses from the assistant. They can also be pre-populated by you to demonstrate desired behavior or outcomes, a technique known as few-shot prompting. In my application, I start with a straightforward example: the user asks for the ticker symbol of “Apple,” and the assistant responds with “AAPL.” This initial exchange sets a precedent for the assistant’s subsequent interactions.
  3. Specifying the steps required to complete a task: The prompt can also include specific steps for the AI to follow, such as analyzing the overall effect of all the news for a company first and then summarizing the most important points from individual news items. As you will see later, I use this method to direct the model’s approach to news analysis. This structured approach is crucial because it turns a complex task into a sequence of clear steps, making it easier for the AI to respond in the way you want.
  4. Use delimiters to clearly indicate distinct parts of the input: Last but certainly not least, employing delimiters such as triple quotation marks, XML tags, or section titles is highly effective in segmenting text for distinct treatment. For example, you might instruct, ‘Rewrite the text within the brackets in a more formal tone: [insert text here].’ This approach becomes increasingly vital for more complex tasks, as it aids in clearly defining the specifics of the request, ensuring the AI understands precisely what is being asked.
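
Techniques 3 and 4 can be combined in a single prompt. The following is a minimal sketch, assuming triple quotation marks as the delimiter; the function name build_analysis_prompt and the sample news text are illustrative only and are not taken from my application.

```python
def build_analysis_prompt(ticker, news_text, steps):
    """Compose a user prompt with numbered steps and a clearly delimited news section.

    NOTE: hypothetical helper for illustration; the triple-quote delimiter is an assumption,
    XML tags or section titles would work in the same way.
    """
    numbered = "\n".join(f"{i}) {step}" for i, step in enumerate(steps, start=1))
    return (
        f"Follow these steps for {ticker}:\n"
        f"{numbered}\n"
        "The news is delimited by triple quotes.\n"
        f'"""{news_text}"""'
    )


prompt = build_analysis_prompt(
    "TSLA",
    "Example headline: the company reported quarterly deliveries.",  # made-up sample text
    [
        "Analyze the overall effect of all the news on the stock price.",
        "Summarize the key points of the most important news.",
    ],
)
print(prompt)
```

Because the news sits inside its own delimited block, the model is less likely to confuse the article text with the instructions that precede it.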

By now, you should have a solid grasp of the tool’s functionality and the rationale behind its message structure. The real test comes at the end, when the user asks for the ticker symbol of “Tesla.” Here’s the response the API returns:

ChatCompletion(
    id='chatcmpl-8UkbB5dXOdBKzUAfoYY1QJVSAHEFh',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            message=ChatCompletionMessage(
                content='TSLA',
                role='assistant',
                function_call=None,
                tool_calls=None
            )
        )
    ],
    created=1702339657,
    model='gpt-3.5-turbo-0613',
    object='chat.completion',
    system_fingerprint=None,
    usage=CompletionUsage(
        completion_tokens=3,
        prompt_tokens=67,
        total_tokens=70
    )
)

The tool accurately identifies and returns “TSLA.” To extract the assistant’s reply, one can simply use response.choices[0].message.content. This level of precision mirrors the experience you would get by posing the same query directly to ChatGPT on OpenAI's platform.
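
Because the model returns free text, it can help to normalize the reply before treating it as a ticker. Below is a minimal sketch under my own assumptions: extract_ticker is a hypothetical helper, the regular expression is a loose sanity check rather than an official ticker format, and a SimpleNamespace object stands in for a real API response so the snippet runs without a network call.

```python
import re
from types import SimpleNamespace


def extract_ticker(response):
    """Pull the assistant's reply out of a chat completion and sanity-check it.

    NOTE: hypothetical helper; the pattern is an assumed, loose check
    (1-5 letters, optional .X or -X suffix), not an exchange rule.
    """
    content = response.choices[0].message.content
    ticker = content.strip().upper()
    if not re.fullmatch(r"[A-Z]{1,5}([.-][A-Z]{1,2})?", ticker):
        raise ValueError(f"Unexpected reply from model: {content!r}")
    return ticker


# Stand-in object mimicking the attribute layout of the response shown above.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content=" tsla\n"))]
)
print(extract_ticker(fake_response))
```

A check like this catches replies such as a full sentence instead of a bare symbol, which would otherwise end up in the news URL.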

Having explored the nuances of using the OpenAI API, let’s take a look at how we can apply this powerful tool to the realm of financial news analysis.

Analyzing Financial News Using OpenAI

The code structure of my project is straightforward, composed of two Python files: financial_analyzer.py for news extraction and analysis, and app.py for the Flask web app. The entire codebase is available on my GitHub page.

The procedure involves:

  1. Validating the Stock Ticker: In the is_valid_ticker function, we ascertain whether the user’s input is a correct stock ticker. If not, we employ our earlier-mentioned ticker-generation tool in function get_corrected_ticker to deduce the accurate ticker based on the user’s input.
  2. URL Generation for News Query: Our primary news source is Seeking Alpha. In function generate_seeking_alpha_url, we create a URL specific to Seeking Alpha’s format, including the stock ticker and date range.
  3. HTML Parsing for News Link Extraction: After accessing the URL, in function extract_news_url, we parse the HTML to extract individual news article links.
  4. Text Extraction from News Articles: In function get_text_from_url, we fetch the full text of each article.
  5. Concatenate the news text with the prompt and feed it into the OpenAI API, as in function analyze_financial_news.
  6. Publish the app on the web using Flask and PythonAnywhere: Our app.py file uses Flask to serve the app, but the details of Flask are beyond the scope of this guide, as we're focusing on the OpenAI API. The deployment process is straightforward: we link our GitHub repository with PythonAnywhere and configure the appropriate paths and environment in PythonAnywhere. Once these steps are complete, our Flask web application becomes accessible to anyone.

# Filename: financial_analyzer.py

import re
from datetime import datetime, timedelta
import requests
from bs4 import BeautifulSoup
from openai import OpenAI
import yfinance as yf


def generate_seeking_alpha_url(ticker, start_date, end_date):
    start_date_formatted = start_date.strftime("%Y-%m-%dT%H:%M:%S.000Z")
    # Adjust end_date to the end of the next day and format
    end_date_adjusted = end_date + timedelta(days=1)
    end_date_formatted = end_date_adjusted.strftime("%Y-%m-%dT23:59:59.999Z")

    # Construct the URL
    url = f"https://seekingalpha.com/symbol/{ticker}/news?from={start_date_formatted}&to={end_date_formatted}"
    return url


def extract_news_url(ticker, start_date, end_date):
    url = generate_seeking_alpha_url(ticker, start_date, end_date)
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")

    # Regular expression pattern to match the desired structure and capture the id, publishOn, and title
    pattern = r'"id":"(\d+?)","type":"news","attributes":{"publishOn":"(.*?)",".*?"title":"(.*?)"'

    # Find all matches using the regular expression
    matches = re.findall(pattern, str(soup))

    # Filter matches by the publishOn date and collect news titles
    filtered_matches = []
    for match in matches:
        news_id, publish_on_str, title = match
        publish_on_date = datetime.strptime(publish_on_str.split("T")[0], "%Y-%m-%d")

        if start_date <= publish_on_date <= end_date:
            filtered_matches.append((news_id, title))

    # Create a list of URLs with titles
    return [(f"https://seekingalpha.com/news/{x[0]}", x[1]) for x in filtered_matches]


def get_text_from_url(url):
    # Send a request to the URL
    response = requests.get(url)
    # Parse the content of the request with BeautifulSoup
    soup = BeautifulSoup(response.content, "html.parser")

    # Extract text from the BeautifulSoup object
    text = " ".join(map(lambda p: p.text, soup.find_all("p")))
    return text


steps = """1) Analyze the overall effect of all the news on company stock price (positive, negative, neutral). 2)
Summarize key points and assess how each news item impacts the company's future prospects and stock price. Keep in mind the
max length of your response is 200 words in total, so prioritize the news that is most important and relevant to the
company's prospects."""
client = OpenAI()


def analyze_financial_news(ticker, start_date, end_date):
    # Validate Ticker
    if not is_valid_ticker(ticker):
        # Use OpenAI API to correct the ticker
        ticker = get_corrected_ticker(ticker)

    news = extract_news_url(ticker, start_date, end_date)
    news_urls = [x[0] for x in news]
    # Fetch every article and strip the site's boilerplate footer sentence
    news_text = "\n".join([get_text_from_url(url) for url in news_urls]).replace(
        "Have a tip? Submit confidentially to our News team. Found a factual error? Report here.",
        "",
    )
    # Keep only the first 1500 words so the prompt stays within the model's context window
    user_msg = f"analyze the effect of these news for {ticker}: {' '.join(news_text.split()[:1500])} "
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": f"You are a professional financial analyst, you generate insights that are both practical "
                f"and analytical, potentially useful for investment or trading decisions. You don't include "
                f"cliche warnings like 'It is recommended to conduct further research and analysis or "
                f"consult with a financial advisor before making an investment decision'. You're succinct "
                f"and are able to present all your analysis and news summarization in no more than 200 "
                f"words. To do this, you prioritize important news, skip not-that-important news, "
                f"and get rid of repetitive information. Use the following step-by-step instructions to "
                f"respond to user inputs (which contains ticker, and news for you to analyze). {steps}",
            },
            {"role": "user", "content": user_msg},
        ],
        max_tokens=300,  # Estimated tokens for a 200-word response
    )

    return news, response.choices[0].message.content


def is_valid_ticker(ticker):
    try:
        info = yf.Ticker(ticker).info
        return 'symbol' in info and info['symbol'] is not None
    except Exception as e:
        print(f"Error checking ticker: {e}")
        return False


def get_corrected_ticker(input_ticker):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "system",
                "content": "You are an AI trained to figure out stock market ticker and return it.",
            },
            {
                "role": "user",
                "content": "Return only the stock ticker symbol for the company commonly referred to as apple.",
            },
            {
                "role": "assistant",
                "content": "AAPL",
            },
            {
                "role": "user",
                "content": f"Return only the stock ticker symbol for the company commonly referred to as {input_ticker}.",
            },
        ],
    )

    corrected_ticker = response.choices[0].message.content
    return corrected_ticker


# Filename: app.py

from datetime import datetime
from flask import Flask, request, render_template_string
from financial_analyzer import analyze_financial_news

app = Flask(__name__)


@app.route('/', methods=['GET', 'POST'])
def form():
    result = []
    analysis = ''
    ticker = ''
    start_date = ''
    end_date = ''
    if request.method == 'POST':
        ticker = request.form['ticker']
        start_date = request.form['start_date']
        end_date = request.form['end_date']
        start_date_obj = datetime.strptime(start_date, '%Y-%m-%d')
        end_date_obj = datetime.strptime(end_date, '%Y-%m-%d')
        result, analysis = analyze_financial_news(ticker, start_date_obj, end_date_obj)

    return render_template_string('''
        <html>
            <body>
                <form method="post">
                    Ticker: <input type="text" name="ticker" value="{{ ticker }}"><br>
                    Start Date (yyyy-mm-dd): <input type="date" name="start_date" value="{{ start_date }}"><br>
                    End Date (yyyy-mm-dd): <input type="date" name="end_date" value="{{ end_date }}"><br>
                    <input type="submit" value="Analyze">
                </form>
                {% if result %}
                    <h3>News</h3>
                    <div style="font-size:17px; margin-top:20px;">
                        {% for url, title in result %}
                            <div><a href="{{ url }}" target="_blank">{{ title }}</a></div>
                        {% endfor %}
                    </div>
                {% endif %}
                {% if analysis %}
                    <h3>Analysis</h3>
                    <div style="font-size:17px; margin-top:20px; white-space: pre-wrap;">{{ analysis }}</div>
                {% endif %}
            </body>
        </html>
    ''', ticker=ticker, start_date=start_date, end_date=end_date, result=result, analysis=analysis)


if __name__ == '__main__':
    app.run(debug=True)
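
To see how the regular expression and date filter in extract_news_url behave, here is a small offline sketch run against a made-up snippet shaped like the embedded JSON that the pattern expects. The snippet, IDs, and headlines are invented for illustration, and filter_news is a hypothetical helper; the real page markup may differ or change over time.

```python
import re
from datetime import datetime

# Same pattern as in extract_news_url.
PATTERN = r'"id":"(\d+?)","type":"news","attributes":{"publishOn":"(.*?)",".*?"title":"(.*?)"'

# Invented sample data, NOT real page content.
SAMPLE = (
    '[{"id":"1000001","type":"news","attributes":{"publishOn":"2023-12-05T09:30:00-05:00",'
    '"isLockedPro":false,"title":"Example headline one"}},'
    '{"id":"1000002","type":"news","attributes":{"publishOn":"2023-11-28T16:05:00-05:00",'
    '"isLockedPro":false,"title":"Example headline two"}}]'
)


def filter_news(text, start_date, end_date):
    """Return (url, title) pairs for news items published within the date range."""
    results = []
    for news_id, publish_on, title in re.findall(PATTERN, text):
        # Only the calendar date (before the "T") is compared, as in the analyzer.
        published = datetime.strptime(publish_on.split("T")[0], "%Y-%m-%d")
        if start_date <= published <= end_date:
            results.append((f"https://seekingalpha.com/news/{news_id}", title))
    return results


links = filter_news(SAMPLE, datetime(2023, 12, 1), datetime(2023, 12, 10))
print(links)
```

Only the first item falls inside the requested range, so the second one is dropped even though the pattern matches it.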

Prospects and Potentials

OpenAI’s API is not just versatile; it’s also beginner-friendly, as we’ve seen in this tutorial. We’ve handled free-form user inputs, extracted tickers, and tasked the model with analyzing news, summarizing both the overall impact and the individual articles based on their importance. Imagine, if given more time, we could teach the model to think like Warren Buffett, perhaps by supplying his books or reports for it to draw on. Furthermore, OpenAI’s models aren’t limited to text; they can also work with audio and images, opening doors to a multitude of possibilities.

If this tutorial has aided your understanding or sparked your interest, I’d appreciate your support through likes and shares! Your feedback is invaluable, so please feel free to leave comments and suggestions.

I’m Zirui (Peter) Wei, an avid enthusiast in finance and AI, and I am currently open to collaborative opportunities in these fields. If you wish to connect or discuss potential projects, don’t hesitate to contact me.

Thank you for reading, and I look forward to engaging with you and the community!
