Thomas Rochefort-Beaudoin
Nov 1 · 1 min read

If you look at the algorithm, you will see that the training dataset ends one month before the testing month.
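To make that split concrete, here is a minimal sketch, not the original algorithm. It assumes "ends one month before" means the last training month is the calendar month immediately preceding the test month. The names `month_index` and `split_for_test_month` and the toy observations are hypothetical, made up for illustration.

```python
from datetime import date


def month_index(d: date) -> int:
    """Map a date to a running month counter so months can be compared."""
    return d.year * 12 + (d.month - 1)


def split_for_test_month(rows, test_year, test_month):
    """Split (date, payload) rows into training and testing sets.

    Assumption: training uses every row dated strictly before the test
    month, and testing uses only the rows that fall in the test month.
    """
    test_idx = test_year * 12 + (test_month - 1)
    train = [r for r in rows if month_index(r[0]) < test_idx]
    test = [r for r in rows if month_index(r[0]) == test_idx]
    return train, test


# Toy data: one observation per month, January to June 2018.
rows = [(date(2018, m, 28), f"obs-{m}") for m in range(1, 7)]

# Testing on April 2018 trains on January through March only.
train, test = split_for_test_month(rows, 2018, 4)
```

Rows dated after the test month are simply ignored, so no future information leaks into training.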

The database consists of fundamentals for the companies in the S&P 1500 index. These were obtained through the Standard & Poor’s database.

The monthly returns of every company and of the S&P 500 index were obtained through a Bloomberg terminal. These could also be obtained free of charge by scraping Yahoo Finance with a Python package.
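Whatever the price source, turning closing prices into monthly returns is a short computation. The sketch below is one way to do it with the standard library only. It assumes you already have daily closes as `(date, close)` pairs (ideally adjusted for dividends and splits); the function name `monthly_returns` and the sample prices are hypothetical, not real market data.

```python
from datetime import date


def monthly_returns(daily_closes):
    """Compute simple month-over-month returns from daily closes.

    daily_closes: iterable of (date, close) pairs, in any order.
    Returns a dict mapping (year, month) to the simple return relative
    to the previous month present in the data.
    """
    month_end = {}
    for d, close in sorted(daily_closes):
        # Later dates overwrite earlier ones, leaving the last close of each month.
        month_end[(d.year, d.month)] = close

    months = sorted(month_end)
    return {
        cur: month_end[cur] / month_end[prev] - 1.0
        for prev, cur in zip(months, months[1:])
    }


# Made-up prices for illustration only.
prices = [
    (date(2018, 1, 30), 98.0),
    (date(2018, 1, 31), 100.0),
    (date(2018, 2, 27), 105.0),
    (date(2018, 2, 28), 110.0),
    (date(2018, 3, 29), 99.0),
]
returns = monthly_returns(prices)
```

The first month has no return because there is no earlier month-end price to compare against. If a month is missing from the input, the return is computed against the last month that is present, so gaps should be checked separately.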

The list of companies included in the S&P 1500 over the years was also obtained through Standard & Poor’s.

I was fortunate enough to obtain access to these proprietary databases through academic subscriptions at my school.

Thank you for your time!

    Written by

    Aerospace engineering student @polymtl
