Hi Sean, thanks for the awesome, well-thought-out response! I will address your points one at a time.
- I am not sure I understand what you mean by making the observations stationary. However, I do understand the benefits of normalizing the data, so I will definitely add normalization in the next article using
- I don’t like my reward function either, but I just hadn’t put in the time to come up with a better one yet. I really like the strategy you’ve mentioned: rewarding only on a profitable sell seems like it could have profound benefits compared to my current rewards.
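To make sure I'm reading the reward idea correctly, here is a rough sketch of how I picture it. Everything in it is my own assumption rather than something from your comment or from my current environment: the function name `profitable_sell_reward`, the string actions, and the choice to give zero reward (rather than a penalty) for a losing sell are all hypothetical.

```python
def profitable_sell_reward(action: str, price: float,
                           entry_price: float, shares: float) -> float:
    """Hypothetical reward: pay out only when a sell realizes a profit.

    Assumptions (not from the original environment):
    - action is one of "buy", "hold", "sell"
    - entry_price is the average price paid for the shares being sold
    - a losing sell earns 0 instead of a negative reward
    """
    # Buying, holding, or selling with no position earns nothing.
    if action != "sell" or shares <= 0:
        return 0.0

    # Realized profit on the shares being sold at the current price.
    profit = (price - entry_price) * shares

    # Reward only the profitable case.
    return profit if profit > 0 else 0.0
```

For example, selling 2 shares at 110 that were bought at 100 would give a reward of 20, while holding or selling at a loss would give 0. Whether losing sells should be penalized instead is a design choice I'd still want to experiment with.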
If your agent works unrealistically well on unseen data, it usually means there is something unrealistic within your environment. Make sure your calculations are correct, and if so, test it on live data and see how it would do if you were to actually use it. Let me know how it goes!