LLaMA 2 Long is a series of long-context LLMs built through continual pretraining from LLaMA 2 with…
PaLM 2 is the successor of PaLM. It’s more compute efficient and is pre-trained on a more multilingual &…
As larger models require pretraining on trillions of tokens, it is unclear how scalable the curation of…
Instruction backtranslation is a scalable method to build a high-quality instruction-following language…