How to Make Smart Investment Decisions: Stock Analysis and Value-Weighted Index

Pedro Flores
Machine Learning Reply DACH
10 min read · Apr 25, 2023
Photo by Nicholas Cappello on Unsplash

Introduction

The stock market is a fascinating and complex system where the share prices of companies change constantly, influenced by various factors, such as company performance, market sentiment, and global economic conditions. In this article, the stock data of various companies will be explored and analyzed, and some visualizations will be done for better data understanding. Furthermore, it will be explained how to create a Value-Weighted Index using data from companies with substantial stock market presence that can be used as a reference for a personal portfolio.

The approach consists of these four steps:

  • Stock Data
  • Perform Exploratory Data Analysis
  • Compute Technical Indicators
  • Create a Value-Weighted Index

Stock Data

The stock data analyzed comes from companies in four sectors: Apple Inc. (AAPL) and Reply S.p.A. (REY.MI) in Technology, Tesla, Inc. (TSLA) in Consumer Cyclical, Pfizer Inc. (PFE) in Healthcare, and JPMorgan Chase & Co. (JPM) in Financial.

To fetch the data, yfinance will be used, a popular Python library that provides access to financial data from Yahoo Finance. For example, Apple’s stock data can be downloaded with the function yf.download(). It takes the ticker symbol (AAPL for Apple Inc.), the start and end dates for the data, and the interval at which the data is sampled (hourly, daily, etc.).

Here’s an example that downloads daily stock data for Apple Inc. from January 1, 2018, up to (but not including) April 1, 2023:

import yfinance as yf

ticker = "AAPL"
start_date = "2018-01-01"
end_date = "2023-04-01"  # the end date is exclusive
interval = "1d"  # one row per trading day
aapl_data = yf.download(ticker, start=start_date, end=end_date, interval=interval)

The yf.download() function returns a pandas DataFrame with columns for Open, High, Low, Close, Adjusted Close, and Volume:

Stock Data DataFrame

Perform Exploratory Data Analysis

Exploratory Data Analysis (EDA) is a critical step in the data analysis process, allowing us to understand, summarize, and visualize the main characteristics of a dataset. When working with stock data, EDA can help identify trends, patterns, relationships, and potential outliers.

Check Missing Values

First, let’s make sure that the data used has no missing values. Identifying and handling these values is an essential step because they can have a significant impact on the quality of the analysis.

Count of missing values in the stock data downloaded
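The missing-value check can be sketched with pandas as follows. The small DataFrame is a made-up stand-in for the downloaded closing prices, and forward filling is only one possible way to handle gaps, not necessarily the one used for this article:

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices for five business days, with one gap in AAPL.
dates = pd.bdate_range("2023-01-02", periods=5)
close = pd.DataFrame(
    {
        "AAPL": [125.0, 126.4, np.nan, 129.6, 130.2],
        "PFE": [51.3, 51.0, 50.1, 49.8, 50.4],
    },
    index=dates,
)

# Count the missing values in each column.
missing_per_column = close.isna().sum()
print(missing_per_column)

# One possible fix (an assumption of this sketch): carry the last known price forward.
close_filled = close.ffill()
```

If the counts are all zero, as reported for the data in this article, no further handling is needed.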

Visualize Stock Prices

Visualizing stock prices is important for several reasons, including better understanding, identifying trends and patterns, and detecting anomalies and outliers.

Closing Prices of stock data

For example, Pfizer Inc. stock appears more stable than Tesla, Inc. stock. Also, at the beginning of 2020 there is a noticeable drop in all prices, matching the start of the global pandemic.

Visualize Daily Returns

A daily return is the percentage change in a stock’s closing price from one trading day to the next. Daily returns can be positive (gains) or negative (losses) and are influenced by various factors, including market conditions, company performance, and investor sentiment. Investors use daily returns to assess their investment strategies, evaluate the risk-reward profile of their portfolios, and make informed decisions about buying, holding, or selling stocks.

Daily Returns of stock data

For the studied companies, the plot shows, for example, that Pfizer Inc. returns (in green) were more pronounced toward the end of 2021, that Tesla, Inc. (in orange) is volatile throughout the period, and that all companies saw large gains and losses at the beginning of 2020.
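As a minimal sketch, daily returns can be computed from closing prices with pandas `pct_change()`. The prices below are invented for illustration and are not taken from the downloaded data:

```python
import pandas as pd

# Hypothetical closing prices for four business days.
dates = pd.bdate_range("2023-01-02", periods=4)
close = pd.DataFrame({"AAPL": [100.0, 102.0, 99.96, 104.958]}, index=dates)

# Daily return = (price today - price yesterday) / price yesterday.
# The first row has no previous day, so it is dropped.
daily_returns = close.pct_change().dropna()
print(daily_returns)
```

With these prices the three returns come out as a 2% gain, a 2% loss, and a 5% gain.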

Visualize the Distribution of Stock Returns

This step is important for several reasons:

· Identifying the shape of the distribution, whether it is symmetric or skewed. This information is useful for understanding the general behavior of the stock.

· Assessing the historical performance, such as its average return and the range of returns it has experienced. This information can be used to gauge the stock’s potential future performance and to make informed investment decisions.

· Identifying outliers or extreme returns that may have occurred. This can help understand if a stock has been subject to sudden, large price movements or if it has generally been stable over time.

· Assessing volatility, which is a key measure of risk. A wider distribution of returns indicates higher volatility and, consequently, higher risk, while a narrower distribution suggests lower volatility and risk.

All this can be observed in the following graphics. Each stock has been represented in a subplot for a better visualization.

Distribution of Stock Returns

In general, all the distributions are roughly symmetric. Regarding outliers, JPMorgan Chase & Co. has several extreme negative daily returns, while Reply S.p.A. has an unusual spike at a positive daily return of around 0.04.

In terms of volatility, Tesla, Inc. seems to be the riskiest stock, with daily returns ranging between -0.2 and 0.2. Pfizer Inc., on the other hand, is more stable, with a range between -0.075 and 0.1.
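The same properties can also be read from summary statistics instead of a plot. The sketch below uses randomly generated returns for two hypothetical stocks (named STABLE and VOLATILE here), so the numbers do not correspond to any company in this article:

```python
import numpy as np
import pandas as pd

# Synthetic daily returns: one calm stock and one volatile stock.
rng = np.random.default_rng(seed=0)
returns = pd.DataFrame(
    {
        "STABLE": rng.normal(0.0, 0.01, size=1000),
        "VOLATILE": rng.normal(0.0, 0.04, size=1000),
    }
)

# Mean and range describe historical performance, the standard deviation
# measures volatility, and the skewness indicates asymmetry.
summary = pd.DataFrame(
    {
        "mean": returns.mean(),
        "std": returns.std(),
        "skew": returns.skew(),
        "min": returns.min(),
        "max": returns.max(),
    }
)
print(summary)
```

A larger standard deviation and a wider min-max range correspond to the wider histograms described above.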

Visualize the Correlation Between Stock Returns

The correlation between stock returns measures the degree to which the returns of two stocks move together. It is generally more useful than the correlation between stock prices, mainly for the following reasons:

· Stationarity: Stock prices tend to trend over time, so two price series can appear highly correlated simply because both trend in the same direction. Stock returns, representing percentage changes, are more stationary and provide a better measure of the relationship between stocks.

· Diversification: When constructing a portfolio, investors often seek to diversify their holdings to minimize risk. The correlation between stock returns helps to identify stocks that move independently or even in opposite directions, which can be beneficial for reducing the overall risk of the portfolio.

This can be seen in the following heatmap:

Correlation Between Stock Returns

For example, if a portfolio already holds Apple Inc. shares and the goal is to diversify it, two good options would be Reply S.p.A. (REY.MI) or Pfizer Inc. (PFE), since their returns are less correlated with Apple’s.
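A sketch of how such a correlation matrix can be computed and used to pick a diversification candidate. The three tickers (AAA, BBB, CCC) and their returns are synthetic: AAA and BBB share a common component, while CCC is independent of both:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
common = rng.normal(0.0, 0.01, size=500)  # shared market component

# Hypothetical daily returns for three stocks.
returns = pd.DataFrame(
    {
        "AAA": common + rng.normal(0.0, 0.002, size=500),
        "BBB": common + rng.normal(0.0, 0.002, size=500),
        "CCC": rng.normal(0.0, 0.01, size=500),
    }
)

# Pairwise Pearson correlation of the returns (the matrix behind a heatmap).
corr = returns.corr()
print(corr.round(2))

# For a portfolio that already holds AAA, pick the stock least correlated with it.
least_correlated = corr["AAA"].drop("AAA").idxmin()
print(least_correlated)
```

The resulting matrix could be passed to a plotting function such as seaborn's `heatmap` to obtain a figure like the one above.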

Compute Technical Indicators

Technical indicators are essential tools for stock analysis because they help traders and investors make informed decisions by analyzing historical price and volume data. They can provide insights into market trends, momentum, volatility, and other aspects.

Simple Moving Average (SMA) and Exponential Moving Average (EMA)

The SMA is calculated by averaging the stock prices over a specific window of time. It helps identify the overall trend by smoothing out price fluctuations. If the shorter-term SMA (here the 50-day SMA, SMA50) is above the longer-term SMA (the 200-day SMA, SMA200), this is usually read as an uptrend, suggesting that the price may continue rising.

SMA and EMA for Apple Inc.

A closer look at Apple Inc. between 2020 and 2022 shows the SMA50 (orange) above the SMA200 (green) for the whole period, which is consistent with an uptrend.

The EMA places more weight on recent data, making it more responsive to short-term price changes. Like the SMA, when the short-term EMA (EMA50) is above the long-term EMA (EMA200), it indicates an uptrend. The same behavior of SMA for Apple Inc. can be seen between EMA50 (red) and EMA200 (purple).
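Both averages can be sketched with pandas rolling and exponentially weighted windows. The price series is a synthetic, steadily rising line rather than Apple data, and the `span` and `adjust=False` settings for the EMA are assumptions of this sketch:

```python
import numpy as np
import pandas as pd

# Synthetic closing prices rising steadily from 100 to 200 over 300 business days.
dates = pd.bdate_range("2020-01-01", periods=300)
close = pd.Series(np.linspace(100.0, 200.0, 300), index=dates, name="Close")

# Simple moving averages: plain means over the last 50 and 200 observations.
sma50 = close.rolling(window=50).mean()
sma200 = close.rolling(window=200).mean()

# Exponential moving averages: recent prices receive more weight.
ema50 = close.ewm(span=50, adjust=False).mean()
ema200 = close.ewm(span=200, adjust=False).mean()

# Uptrend signal: short-term average above long-term average.
# Rows where an SMA is still undefined (NaN) evaluate to False.
uptrend = sma50 > sma200
```

Because the SMA200 needs 200 observations, the signal is only meaningful from the 200th row onward.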

Relative Strength Index (RSI)

The RSI is a momentum indicator that measures the speed and magnitude of price movements. It ranges from 0 to 100. Values above 70 are typically considered overbought (potentially overvalued), suggesting that it might be a good time to sell. Conversely, values below 30 are considered oversold (potentially undervalued), suggesting that it might be a good time to buy. However, in a strong trend, the RSI may remain in overbought or oversold territory for an extended period.

Relative Strength Index for JPMorgan Chase & Co

The graphic above depicts the RSI for JPMorgan Chase & Co. The RSI generally stays between 30 and 70. Nevertheless, when the trend is strong, it remains under 30 or above 70 for some time.
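One common way to compute the RSI is sketched below. It assumes a 14-day period and Wilder-style exponential smoothing of gains and losses; other variants (for example, simple rolling means) exist, and the article does not state which one was used. The helper name `compute_rsi` and the random-walk prices are hypothetical:

```python
import numpy as np
import pandas as pd


def compute_rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """RSI with Wilder-style smoothing (one common variant)."""
    delta = close.diff()
    gains = delta.clip(lower=0)        # positive price changes, else 0
    losses = (-delta).clip(lower=0)    # magnitude of negative changes, else 0
    avg_gain = gains.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    avg_loss = losses.ewm(alpha=1 / period, adjust=False, min_periods=period).mean()
    rs = avg_gain / avg_loss           # relative strength
    return 100 - 100 / (1 + rs)


# Synthetic random-walk prices, for illustration only.
rng = np.random.default_rng(seed=2)
close = pd.Series(100 + rng.normal(0, 1, 250).cumsum())

rsi = compute_rsi(close)
overbought = rsi > 70   # potentially overvalued
oversold = rsi < 30     # potentially undervalued
```

A price series that only rises yields an RSI of 100, and one that only falls yields 0, which matches the bounds described above.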

Bollinger Bands

Bollinger Bands are an analysis tool developed by John Bollinger in the 1980s. They consist of a moving average (typically the 20-day moving average) and two bands placed two standard deviations above and below it. These bands expand and contract based on the stock’s volatility. When the bands are tight, it suggests low volatility, and when they are wide, it indicates high volatility. Prices tend to revert to the mean after touching the upper or lower bands, so traders often use these bands to identify potential entry and exit points.

Bollinger Bands for Tesla, Inc.

The Bollinger Bands for Tesla, Inc. suggest low volatility until the middle of 2020. After that, the bands widen, which can be read as a period of high volatility. It is also noticeable how the Close price (blue) tends to revert to the mean after touching the upper or lower band.
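A minimal sketch of the band calculation, assuming the typical 20-day window and a width of two standard deviations. The prices are a synthetic random walk, not Tesla data. Note that pandas `rolling().std()` uses the sample standard deviation (`ddof=1`) by default, while some implementations use the population version (`ddof=0`):

```python
import numpy as np
import pandas as pd

# Synthetic closing prices, for illustration only.
rng = np.random.default_rng(seed=3)
close = pd.Series(100 + rng.normal(0, 1, 120).cumsum(), name="Close")

window = 20   # length of the moving average
num_std = 2   # band distance in standard deviations

middle = close.rolling(window).mean()   # middle band: moving average
std = close.rolling(window).std()       # rolling volatility
upper = middle + num_std * std
lower = middle - num_std * std

# Band width: small values suggest low volatility, large values high volatility.
bandwidth = upper - lower
```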

Create a Value-Weighted Index

Building a Value-Weighted Index involves selecting a group of stocks that represent a specific market segment or industry, and then weighting them based on their market capitalization. To create a well-diversified value-weighted index using Python, companies from different industries are going to be considered:

· Technology: Apple Inc. (AAPL) and Microsoft Corporation (MSFT)

· Healthcare: Johnson & Johnson (JNJ) and Pfizer Inc. (PFE)

· Financials: JPMorgan Chase & Co. (JPM) and Bank of America Corporation (BAC)

· Consumer: Coca-Cola (KO) and PepsiCo, Inc. (PEP)

The steps followed to create a Value-Weighted Index are:

1. Calculate the number of shares for the selected companies by dividing Market Capitalization by Last Sale (the most recent share price). The number of shares will be considered constant.

num_shares[ticker] = market_caps[ticker] / data[ticker].iloc[-1]

2. Calculate the market capitalization on each day by multiplying the number of shares by the close stock price

daily_market_caps = data * num_shares

3. Sum the Market Capitalization of each day to have the total Market Capitalization per day

global_market_cap = daily_market_caps.sum(axis=1)

4. Calculate the company weight in the new index by dividing the Market Capitalization of each company by total Market Capitalization per day

weights = daily_market_caps.div(global_market_cap, axis=0)

5. Calculate the company weighted returns by multiplying the company weight in the new index by the stock return

weighted_returns = weights.mul(index_return)
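The five steps above can be put together in one small, runnable sketch. The two tickers (AAA, BBB), their prices, and their market capitalizations are invented, and the variable `stock_returns` is an assumption of this sketch standing in for the stock return used in step 5:

```python
import pandas as pd

# Hypothetical closing prices for two companies over three business days.
dates = pd.bdate_range("2023-01-02", periods=3)
data = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.0], "BBB": [20.0, 19.0, 20.0]}, index=dates
)
# Hypothetical latest market capitalization per company.
market_caps = pd.Series({"AAA": 1200.0, "BBB": 400.0})

# 1. Number of shares = latest market cap / last price (kept constant).
num_shares = market_caps / data.iloc[-1]

# 2. Daily market cap per company (the Series aligns with the columns).
daily_market_caps = data * num_shares

# 3. Total market cap per day.
global_market_cap = daily_market_caps.sum(axis=1)

# 4. Company weights per day; axis=0 divides each row by that day's total.
weights = daily_market_caps.div(global_market_cap, axis=0)

# 5. Weighted returns = company weight * daily stock return.
stock_returns = data.pct_change()
weighted_returns = weights.mul(stock_returns)

print(weights)
```

In this toy example AAA ends with a weight of 0.75 and BBB with 0.25. If the weights were lagged by one day (`weights.shift(1)`), the weighted returns of each day would sum exactly to the daily return of the total market capitalization.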

Once this is done, it is possible to visualize the Aggregate Market Capitalization and Weighted Returns and to compare the Value-Weighted Index with a reference stock market index such as the S&P 500, which tracks the stock performance of 500 of the largest companies listed on stock exchanges in the United States. Finally, a Correlation Clustermap can be plotted.

Visualize the Aggregate Market Cap and Weighted Returns

Aggregate Market Cap and Weighted Returns

The left plot shows the total Market Capitalization per day for the chosen companies, with a sharp decline at the beginning of 2020. In the horizontal histogram on the right, the weighted returns show that Apple Inc. and Microsoft Corporation have a bigger weight in the created index than the other companies.

Compare the Created Index with the S&P 500

Value-Weighted Index created vs S&P 500

The picture above compares the normalized market capitalization of the custom index with the S&P 500. If €100 had been invested in the S&P 500 in 2018, it would have grown by 25% by 2020 and by 75% by 2022. If the same €100 had been invested in the new Value-Weighted Index, the amount would have been 50% higher in 2020 and more than 150% higher in 2022. Over this period, the companies selected for the Value-Weighted Index performed better than the S&P 500 as a whole.
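The normalization behind such a comparison can be sketched as follows: each series is divided by its first value and multiplied by 100, so both start at 100 and can be read as the value of a €100 investment. The levels below are made up to mirror the approximate growth figures quoted above and are not actual index values:

```python
import pandas as pd

dates = pd.to_datetime(["2018-01-02", "2020-01-02", "2022-01-03"])

# Made-up levels: aggregate market cap of the custom index and a benchmark index.
custom_index = pd.Series([1400.0, 2100.0, 3500.0], index=dates)
benchmark = pd.Series([2700.0, 3375.0, 4725.0], index=dates)

# Rebase both series so that the first observation equals 100.
normalized = pd.DataFrame(
    {
        "Custom index": custom_index / custom_index.iloc[0] * 100,
        "Benchmark": benchmark / benchmark.iloc[0] * 100,
    }
)
print(normalized)
```

Here the benchmark column reads 100, 125, 175 and the custom index column reads 100, 150, 250, so the two series can be plotted on the same axis.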

Correlation Clustermap

Correlation Clustermap

The Clustermap above represents the correlation between stock returns, with lighter colors indicating a stronger correlation and darker colors indicating a weaker correlation. For instance, JPMorgan Chase & Co. (JPM) and Bank of America Corporation (BAC) exhibit a return correlation of 0.92, implying that their returns have tended to move together in most cases. In contrast, Apple Inc. (AAPL) and Pfizer Inc. (PFE) have a stock return correlation of only 0.35, indicating that their stock movements are only weakly related.

Conclusion

This analysis has explored the intricate world of the stock market. Different trends, patterns, relationships, and potential outliers have been identified in the data through Exploratory Data Analysis and computing essential technical indicators. Creating a Value-Weighted Index and comparing it with a reference index, such as the S&P 500, has showcased the performance of the selected companies.

Overall, this article demonstrates the importance of a thorough analysis of stock data and highlights how it can help investors make informed decisions in the complex world of stock markets.

At Machine Learning Reply, we guide and support all our customers in the development of their IT capabilities towards Machine Learning, data, or cloud Use Cases, regardless of their current phase.
