Reinforcement Learning from Human Feedback, InstructGPT, and ChatGPT
Note: some parts of this blog post are generated by ChatGPT! :)
Welcome to my blog post on ChatGPT! In this post, we will dive into how ChatGPT works and how it is trained. Before getting into the specifics, though, it’s important to review some relevant prior work and concepts. With that foundation in place, we can explore ChatGPT in depth.