Llama3 is Here – Key Takeaways

Kunal Sawarkar
Towards Generative AI
2 min read · Apr 18, 2024


What to know about the latest and greatest Open Foundational Model

So the most anticipated model of the year is out: #llama3 from AI at Meta. Was it worth the wait?

Key Takeaways:

Meta unveils Meta Llama 3, the latest in its line of open-source large language models, available in 8B and 70B parameter versions.

  • New Tokenizer: Llama 3 uses a tokenizer with a 128K-token vocabulary that encodes language much more efficiently, yielding up to 15% fewer tokens than Llama 2 (a token-count sketch follows this list).
  • Grouped Query Attention: Now used across all model sizes, making the smaller models more capable; in Llama 2 it was applied only to the largest model (a toy implementation follows this list).
  • Pre-trained on 15T tokens, roughly 95% of which are English.
  • Trained on 16K GPUs simultaneously, with new tools developed to keep GPU uptime high. Hopefully Meta releases those tools, since GPU utilization is the biggest challenge I have seen in fine-tuning as well.
  • Interesting Use of Llama 2: It was used to help clean the dataset for tuning, a notable use case for LLMs in the data-quality domain (sketched below).
  • New Fine-Tuning Approach: Combines reasoning traces with preference ranking in the instruction data, aimed at reducing hallucinations and error rates, similar in spirit to what OpenAI tried for step-by-step reasoning (a generic preference-loss sketch follows this list).
  • New Library: TorchTune, a PyTorch-native library for authoring, fine-tuning, and experimenting with LLMs, providing memory-efficient and hackable training recipes.
  • Responsibility: Meta emphasizes responsible AI development, offering trust and safety tools such as Llama Guard 2 and Code Shield.
  • Performance: Llama 3 sets a new standard, with improved reasoning and state-of-the-art results on industry benchmarks. Note that the detailed benchmarks I have seen are against Claude, not GPT-4.
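
To make the tokenizer point concrete, here is a minimal sketch that counts tokens for the same string under both vocabularies. It assumes you have accepted Meta's license for the gated meta-llama checkpoints on Hugging Face and are logged in; the exact saving varies with the text.

```python
# Sketch: compare token counts under the Llama 2 and Llama 3 tokenizers.
# Both model repos are gated; this assumes Meta's license has been accepted.
from transformers import AutoTokenizer

llama2_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
llama3_tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

text = "Grouped query attention makes smaller models punch above their weight."

n2 = len(llama2_tok.encode(text))
n3 = len(llama3_tok.encode(text))
print(f"Llama 2: {n2} tokens, Llama 3: {n3} tokens")
print(f"Reduction: {100 * (n2 - n3) / n2:.1f}%")
```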
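
And here is what grouped query attention looks like as a toy PyTorch module. This is a sketch of the idea, not Meta's implementation: several query heads share each key/value head, which shrinks the KV projections and the KV cache.

```python
# Minimal, illustrative grouped-query attention in PyTorch.
# Each KV head serves n_heads // n_kv_heads query heads, so the KV
# projections (and the KV cache at inference) shrink by that factor.
import torch
import torch.nn.functional as F
from torch import nn

class GroupedQueryAttention(nn.Module):
    def __init__(self, dim: int, n_heads: int, n_kv_heads: int):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = dim // n_heads
        self.wq = nn.Linear(dim, n_heads * self.head_dim, bias=False)
        self.wk = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.wv = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.wo = nn.Linear(n_heads * self.head_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.wq(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.wk(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.wv(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # Duplicate each KV head across its group of query heads.
        groups = self.n_heads // self.n_kv_heads
        k = k.repeat_interleave(groups, dim=1)
        v = v.repeat_interleave(groups, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.wo(out.transpose(1, 2).reshape(b, t, -1))

x = torch.randn(1, 16, 512)
attn = GroupedQueryAttention(dim=512, n_heads=8, n_kv_heads=2)
print(attn(x).shape)  # torch.Size([1, 16, 512])
```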
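
The data-cleaning idea is easy to picture as an LLM-as-judge filter. The model id, prompt, and keep/drop rule below are illustrative assumptions, not Meta's actual pipeline:

```python
# Illustrative sketch of an LLM-based data-quality filter; the prompt and
# decision rule are assumptions for demonstration, not Meta's pipeline.
from transformers import pipeline

judge = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")

def looks_high_quality(sample: str) -> bool:
    prompt = (
        "Rate the following text for training-data quality. "
        "Answer with exactly one word, GOOD or BAD.\n\n"
        f"Text: {sample}\nAnswer:"
    )
    out = judge(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"]
    return "GOOD" in out[len(prompt):].upper()

corpus = ["The mitochondria is the powerhouse of the cell.",
          "click here buy now cheap cheap cheap!!!"]
kept = [doc for doc in corpus if looks_high_quality(doc)]
print(kept)
```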
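
Meta's announcement mentions PPO and DPO in post-training. As one concrete way preference ranking can enter the objective, here is a generic DPO-style loss, an illustration rather than Meta's training code:

```python
# Generic direct-preference-optimization (DPO) style loss: the policy is
# pushed to prefer the chosen response over the rejected one, relative to
# a frozen reference model. Illustrative, not Meta's training code.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Each argument is the summed log-prob of a full response under the
    # policy or the reference model, one entry per preference pair.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy example with made-up log-probs for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -9.0]),
                torch.tensor([-12.5, -9.8]), torch.tensor([-13.5, -9.2]))
print(loss)
```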

While Meta provides neither a detailed comparison to GPT-4 nor a link to a research paper, it hints at something more coming soon, possibly the 400B parameter model. Early checkpoint results from that 400B model look like the next seismic wave in GenAI.

The coolest thing about #llama remains that it is actually open and available on open platforms like #huggingface and #watsonx.
