Nov 1 · 1 min read
If you look at the algorithm, you will see that the training dataset ends one month before the testing month.
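To make the split concrete, here is a minimal sketch of how such a cut-off could be written. It is not the code from the algorithm itself: the function name `train_months_for`, the `gap` parameter, and the `"YYYY-MM"` month labels are assumptions made for illustration, and the sketch reads "ends one month before" as "the last training month is the month immediately preceding the testing month".

```python
def train_months_for(months, test_month, gap=1):
    """Return the training months for a given testing month.

    `months` is a chronologically sorted list of "YYYY-MM" labels.
    `gap` is the distance, in months, between the last training month
    and the testing month (assumed to be 1 here).
    """
    test_idx = months.index(test_month)
    last_train_idx = test_idx - gap
    if last_train_idx < 0:
        raise ValueError("not enough history before " + test_month)
    # Keep everything up to and including the last training month.
    return months[: last_train_idx + 1]


months = ["2019-01", "2019-02", "2019-03", "2019-04", "2019-05", "2019-06"]
train = train_months_for(months, "2019-05")
print(train)  # training months stop at 2019-04, so no test-month data leaks in
```

Cutting the training data off before the testing month keeps information from the test period out of the model, which is the point of ending the training set early.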
The database consists of the fundamentals of the companies in the S&P 1500 index. These were obtained through the Standard & Poor’s database.
The monthly returns of every company and of the S&P 500 index were obtained through a Bloomberg terminal. These could also be obtained free of charge by scraping Yahoo Finance, and a Python package can do that for you.
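If you go the free route, you will typically end up with month-end prices and not ready-made returns. The sketch below shows one way to turn them into simple monthly returns. It assumes a chronologically sorted list of `(month, adjusted_close)` pairs; the function name `monthly_returns` and the sample prices are made up for illustration and are not data from this project.

```python
def monthly_returns(prices):
    """Compute simple monthly returns from month-end prices.

    `prices` is a chronologically sorted list of (month, adjusted_close)
    tuples. The first month has no previous price, so it gets no return.
    """
    returns = {}
    for (_, previous), (month, current) in zip(prices, prices[1:]):
        returns[month] = current / previous - 1.0
    return returns


# Hypothetical month-end adjusted closing prices for one company.
sample_prices = [("2020-01", 100.0), ("2020-02", 110.0), ("2020-03", 99.0)]
for month, ret in monthly_returns(sample_prices).items():
    print(month, round(ret, 4))
```

The same function can be applied to the index prices, so that company and index returns are computed in exactly the same way.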
The list of the companies in the S&P 1500 over the years was also obtained through Standard & Poor’s.
I was fortunate enough to obtain access to these proprietary databases through academic subscriptions at my school.
Thank you for your time!
