How I code a Python Stock Screener & A.I. Sentiment Analysis to pick stocks.

Dec 22, 2023

Finding stocks to invest in can be a long and tedious process. What if we used both A.I. and Python to create a program that can speed this up? In this blog post, I will explore how the finvizfinance Python library can be used to find “undervalued” stocks. Then I will present a method of sentiment analysis using FinBERT, a pre-trained NLP model, which allows for a further analysis of these “undervalued” stocks.

Getting Started…

Firstly, you will need to install the required libraries and import them. finviz.com is a website that offers various tools for stock analysis, such as a free stock screener. Here, import finvizfinance’s Overview screener class, which returns the ‘Overview’ section of the screener’s results as a Pandas data frame.

from finvizfinance.screener.overview import Overview

Then, import the Pandas, csv and os libraries. These are mainly used to manipulate csv files.

import pandas as pd
import csv
import os

Screening for Potential Stocks…

Now, following a value investing approach, the aim is to collate a list of possibly undervalued stocks to look into. To do this, create a function which uses the finvizfinance library to make a request to the online stock screener. Here is a function which will perform just that:

def get_undervalued_stocks():
    """
    Returns a list of tickers with:
    
    - Positive Operating Margin
    - Debt-to-Equity ratio under 1
    - Low P/B (under 1)
    - Low P/E ratio (under 15)
    - Low PEG ratio (under 1)
    - Positive Insider Transactions
    """
    foverview = Overview()

As a value investor, I am looking for stocks with a low Price-to-Book (P/B) ratio. A P/B below 1 indicates that the stock price is lower than the book value of the company’s net assets per share, which suggests that the stock is undervalued, assuming the company is not in financial hardship. I am also seeking a positive Operating Margin, which is a good indicator of how well a company is being managed and how efficient it is at generating profits from sales. Furthermore, I want the Debt-to-Equity ratio to be below 1, which narrows the results down to companies carrying less debt than equity and therefore lower financial risk.

Another good indicator is the most recent Price-to-Earnings (P/E) ratio being below its average. Thus, I need both a low P/E ratio and a low Price/Earnings-to-Growth (PEG) ratio. The P/E ratio relates a company’s share price to its earnings per share, so a high P/E ratio could mean that a company is “overvalued.” The PEG ratio divides the P/E ratio by the expected earnings growth rate, so a PEG below 1 suggests that the price is low relative to the growth expected.
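To make these ratios concrete, here is a minimal sketch with made-up numbers. The company figures and the helper names (pe_ratio, peg_ratio, pb_ratio) are hypothetical and not part of finvizfinance, and the sketch assumes the common definition of PEG as the P/E ratio divided by the expected annual EPS growth rate expressed in percent.

```python
def pe_ratio(price, eps):
    """Price-to-Earnings: share price divided by earnings per share."""
    return price / eps


def peg_ratio(pe, eps_growth_pct):
    """PEG: P/E divided by expected annual EPS growth, in percent."""
    return pe / eps_growth_pct


def pb_ratio(price, book_value_per_share):
    """Price-to-Book: share price divided by book value per share."""
    return price / book_value_per_share


# Hypothetical company (illustrative numbers only)
price = 18.0                 # share price
eps = 2.0                    # earnings per share
eps_growth_pct = 12.0        # expected annual EPS growth, in percent
book_value_per_share = 24.0

pe = pe_ratio(price, eps)
peg = peg_ratio(pe, eps_growth_pct)
pb = pb_ratio(price, book_value_per_share)

# The same thresholds as the screener filters used in this post
passes = pe < 15 and peg < 1 and pb < 1
print(pe, peg, pb, passes)
```

With these numbers the hypothetical company clears all three valuation thresholds; the real screener applies the margin, debt and insider filters on top.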

Finally, I look for positive insider transactions. Positive insider transactions are when insiders buy their own company’s shares. Managers and directors have unique knowledge about the companies they run, so if they are purchasing such shares, it’s reasonable to assume that prospects look favourable for the company.

    filters_dict = {'Debt/Equity':'Under 1', 
                    'PEG':'Low (<1)', 
                    'Operating Margin':'Positive (>0%)', 
                    'P/B':'Low (<1)',
                    'P/E':'Low (<15)',
                    'InsiderTransactions':'Positive (>0%)'}

Remember that these parameters are not comprehensive; suitable thresholds can be sector and industry specific, and the ones used here could be too restrictive or, in some cases, too broad. Understanding the business you want to invest in is also important for a value investor, so you might want to define your own stock picking strategy and look into more personalised parameters. Below is a list of all possible parameters for FINVIZ.com’s free online stock screener:

parameters = ['Exchange', 'Index', 'Sector', 'Industry', 'Country', 'Market Cap.',
        'P/E', 'Forward P/E', 'PEG', 'P/S', 'P/B', 'Price/Cash', 'Price/Free Cash Flow',
        'EPS growththis year', 'EPS growthnext year', 'EPS growthpast 5 years', 'EPS growthnext 5 years',
        'Sales growthpast 5 years', 'EPS growthqtr over qtr', 'Sales growthqtr over qtr',
        'Dividend Yield', 'Return on Assets', 'Return on Equity', 'Return on Investment',
        'Current Ratio', 'Quick Ratio', 'LT Debt/Equity', 'Debt/Equity', 'Gross Margin',
        'Operating Margin', 'Net Profit Margin', 'Payout Ratio', 'InsiderOwnership', 'InsiderTransactions',
        'InstitutionalOwnership', 'InstitutionalTransactions', 'Float Short', 'Analyst Recom.',
        'Option/Short', 'Earnings Date', 'Performance', 'Performance 2', 'Volatility', 'RSI (14)',
        'Gap', '20-Day Simple Moving Average', '50-Day Simple Moving Average',
        '200-Day Simple Moving Average', 'Change', 'Change from Open', '20-Day High/Low',
        '50-Day High/Low', '52-Week High/Low', 'Pattern', 'Candlestick', 'Beta',
        'Average True Range', 'Average Volume', 'Relative Volume', 'Current Volume',
        'Price', 'Target Price', 'IPO Date', 'Shares Outstanding', 'Float']
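Some of these names have unusual spacing (for example 'EPS growththis year'), so a mistyped key is easy to miss when you build your own filters. The snippet below is a minimal sketch of a sanity check; unknown_filter_keys is a hypothetical helper of my own, not a finvizfinance function, and known holds only a small subset of the list above.

```python
def unknown_filter_keys(filters, valid_names):
    """Return the filter keys that are not in the list of valid names."""
    valid = set(valid_names)
    return [key for key in filters if key not in valid]


# A subset of the parameter names listed above
known = ['P/E', 'PEG', 'P/B', 'Debt/Equity',
         'Operating Margin', 'InsiderTransactions']

# 'Insider Transactions' (with a space) is a deliberate typo
my_filters = {'P/E': 'Low (<15)',
              'Insider Transactions': 'Positive (>0%)'}

print(unknown_filter_keys(my_filters, known))
```

Passing the full parameters list instead of known lets you check any dictionary before handing it to set_filter.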

Now that the screener’s filters are set, it is possible to connect to the finviz screener API and collect the desired data.

    foverview.set_filter(filters_dict=filters_dict)
    df_overview = foverview.screener_view()
    os.makedirs('out', exist_ok=True)  # ensures you have an 'out' folder ready
    df_overview.to_csv('out/Overview.csv', index=False)
    tickers = df_overview['Ticker'].to_list()
    return tickers

#print(get_undervalued_stocks())

Running the above program (with the last line uncommented) outputs a list of tickers which can help you form a watchlist, provided that you analyse them further. Before rushing to conclusions, spend some time manually looking into these companies, focusing on, for example, their income statements and balance sheets.

['AMPY', 'BOOM', 'BWB', 'CAAS', 'CNX', 'HAFC', 'HTLF', 'SASR', 'SPNT', 'TCBX']

You will also find inside of your ‘out’ folder an overview of each of these stocks inside of a csv file.

Screenshot of my Overview.csv file after running the program on the 21/12/2023.
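As a rough sketch of how that file could be put to use, the snippet below groups tickers by sector with the csv module. The 'Ticker' column is the one the function above reads; the 'Sector' column name, the helper tickers_by_sector and the sample rows are assumptions for illustration, so check the header of your own file first.

```python
import csv
import io
from collections import defaultdict


def tickers_by_sector(csv_file):
    """Group the 'Ticker' values of an open CSV file by their 'Sector' value."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row['Sector']].append(row['Ticker'])
    return dict(groups)


# Made-up rows standing in for out/Overview.csv
sample = io.StringIO(
    "Ticker,Sector\n"
    "AAA,Energy\n"
    "BBB,Financial\n"
    "CCC,Financial\n"
)

groups = tickers_by_sector(sample)
print(groups)
```

To run it on the real file, replace sample with open('out/Overview.csv', newline='').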

Using Financial BERT to perform Sentiment Analysis…

Now, you should have a list of “undervalued” stocks to look into. However, it is also important to know the media’s sentiment on each company and, most importantly, on its stock. Whilst a good value investor should not be a part of the herd, it is wise to have an idea of how others view a stock, as this can help you assess why it is priced the way it is and whether it is over or undervalued.

Thus, you will need to code a program able to obtain a list of recent news articles that relate to each stock given by the stock screener. This program will output the articles’ headlines, publishing dates, and the overall sentiment conveyed within each article, allowing for an easy overview of the general sentiment towards a stock.

Getting started…

First, it is necessary to install and import the required dependencies.

The transformers library provides thousands of pre-trained models to perform tasks on different modalities such as text, vision, and audio. To perform sentiment analysis of financial news articles I will use ProsusAI’s pre-trained model, FinBERT. This model is built by further training Google’s language model BERT in the finance domain.

from transformers import pipeline

To download news on market data from the Yahoo! Finance API I will be using the yfinance Python library.

import yfinance as yf

Next, I import Goose (an HTML content and article extractor for web scraping in Python 3), as well as the get method from the requests library to send HTTP requests and get the financial news articles’ data.

from goose3 import Goose
from requests import get

Getting Financial News Article Data…

Let’s define a function which takes as input a ticker and returns a Pandas data frame containing 3 columns: Publishing date, Article title, Article sentiment.

The Ticker class will allow you to get a list of recent financial news on a given ticker. Next, instantiate the Goose article extractor and a pipeline for the pre-trained NLP model.

def get_ticker_news_sentiment(ticker):
    """
    Returns a Pandas dataframe of the given ticker's most recent news article headlines,
    with the overall sentiment of each article.

    Args:
        ticker (string)

    Returns:
        pd.DataFrame: {'Date', 'Article title', 'Article sentiment'}
    """
    ticker_news = yf.Ticker(ticker)
    news_list = ticker_news.get_news()
    extractor = Goose()
    pipe = pipeline("text-classification", model="ProsusAI/finbert")

Calling the get_news() method returns a list of dictionaries. Each of these contains information on a news article such as the link to it and its title. For each dictionary (article) in the list, send a request to the article’s link before extracting its text and date using the Goose extractor.
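The keys of these dictionaries are not guaranteed to stay the same across yfinance releases. The code in this post reads 'title' and 'link' at the top level; as far as I know, some newer releases nest the fields under a 'content' key instead. The helper below is a hypothetical sketch that tries both shapes. The nested key names ('content', 'canonicalUrl', 'url') are assumptions that you should verify by printing one item returned by get_news().

```python
def title_and_link(item):
    """Return (title, link) from a news dictionary, or (None, None) if absent."""
    # Flat shape used in this post: {'title': ..., 'link': ...}
    if 'title' in item and 'link' in item:
        return item['title'], item['link']
    # Assumed nested shape:
    # {'content': {'title': ..., 'canonicalUrl': {'url': ...}}}
    content = item.get('content') or {}
    url_info = content.get('canonicalUrl') or {}
    return content.get('title'), url_info.get('url')


# Made-up items in each shape
flat = {'title': 'Example headline', 'link': 'https://example.com/a'}
nested = {'content': {'title': 'Example headline',
                      'canonicalUrl': {'url': 'https://example.com/a'}}}

print(title_and_link(flat))
print(title_and_link(nested))
```

An item for which the helper returns (None, None) can simply be skipped in the loop.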

FinBERT, like BERT, accepts at most 512 tokens per input (tokens are word pieces, not whole words). If an article’s text produces a longer sequence, running it through the model will result in indexing errors. To stay well under that limit in practice, the code below uses a crude proxy: it skips any article whose text is longer than 512 characters and labels it ‘NaN too long’. This is conservative, so many articles that would actually fit are skipped as well. Although there is a way to perform sentiment analysis on texts longer than 512 tokens, it is outside the focus of this piece. If you are interested in this, please read Priyatosh Anand’s work.

By passing a text through the pipeline, the model will give softmax outputs for three labels: positive, negative or neutral. The ‘label’ with the strongest probability is the one returned. This will describe the overall dominant sentiment of a given text.
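To illustrate what “softmax outputs” and “strongest probability” mean, here is a small sketch in plain Python. The raw scores are made up and the label order is an assumption for the example; the real values and order come from the FinBERT model itself.

```python
import math


def softmax(scores):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ['positive', 'negative', 'neutral']  # assumed order, for illustration
logits = [2.0, 0.5, 1.0]                      # made-up raw scores

probs = softmax(logits)
# Pick the index of the largest probability
best = max(range(len(labels)), key=lambda i: probs[i])
print(labels[best], round(probs[best], 3))
```

The pipeline does the equivalent internally and returns only the winning label together with its score.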

Finally, this function converts the ‘data’ list into a Pandas data frame and returns it:

    data = []
    for dic in news_list:
        title = dic['title']
        # Download the article page; the timeout stops a slow site from hanging the loop
        response = get(dic['link'], timeout=10)
        article = extractor.extract(raw_html=response.content)
        text = article.cleaned_text
        date = article.publish_date
        # len(text) counts characters, a rough stand-in for FinBERT's 512-token limit
        if len(text) > 512:
            data.append({'Date': str(date),
                         'Article title': str(title),
                         'Article sentiment': 'NaN too long'})
        else:
            # The pipeline returns a list such as [{'label': ..., 'score': ...}]
            results = pipe(text)
            data.append({'Date': str(date),
                         'Article title': str(title),
                         'Article sentiment': results[0]['label']})
    df = pd.DataFrame(data)
    return df

Now, let’s use this new function on the tickers obtained earlier. Firstly, you will need a small helper function which converts a ticker’s news sentiment Pandas DataFrame into a csv file stored inside of the out directory.

def generate_csv(ticker):
    get_ticker_news_sentiment(ticker).to_csv(f'out/{ticker}.csv', index=False)

Finally, we call this function on each ticker obtained earlier:

undervalued = get_undervalued_stocks()
for ticker in undervalued:
    generate_csv(ticker)

This creates or updates your out directory with an Overview file containing general data on all of the “undervalued” stocks, and a csv file for each ticker with a simple overview of the recent sentiment towards a stock.

Screenshot of the csv file containing general financial news article sentiment on a specific stock

For example, just by looking at the recent financial article titles on Amplify Energy (AMPY), it seems that others also agree that this stock is “undervalued.”
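Reading the titles works for a handful of articles, but tallying the labels gives a quicker summary. The snippet below is a minimal sketch: sentiment_counts is a hypothetical helper, and the rows are made-up examples in the same shape as the ‘data’ list built by get_ticker_news_sentiment.

```python
from collections import Counter


def sentiment_counts(rows):
    """Count how many articles carry each sentiment label."""
    return Counter(row['Article sentiment'] for row in rows)


# Made-up rows in the same shape as the 'data' list
rows = [
    {'Date': '2023-12-19', 'Article title': 'Example A',
     'Article sentiment': 'positive'},
    {'Date': '2023-12-20', 'Article title': 'Example B',
     'Article sentiment': 'neutral'},
    {'Date': '2023-12-21', 'Article title': 'Example C',
     'Article sentiment': 'positive'},
    {'Date': '2023-12-21', 'Article title': 'Example D',
     'Article sentiment': 'NaN too long'},
]

counts = sentiment_counts(rows)
print(counts.most_common())
```

If you are working with the returned data frame directly, df['Article sentiment'].value_counts() gives the same tally.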

Screenshot of the out folder after running the program on the 21/12/2023

With this, I have completed my goals. By running this program I have obtained a list of “undervalued” stocks, an overview of these stocks, and a list of recent financial news articles with their overall sentiment. You can find this article’s code on my GitHub.

Thank you for taking the time to read the article, and to explore this subject with me,
Chedy Smaoui
(21/12/2023)

Disclaimer: I am not a professional financial advisor (yet). However, I am a university student with a passion for this, so if you have any questions, concerns, or suggestions to expand this subject further, please do not hesitate to reach out. My LinkedIn messages are checked daily.