Weekly AI News — July 1st 2024

Google releases Gemma 2, Claude introduces Projects, and major labels sue Suno and Udio over copyright

Fabio Chiusano
NLPlanet
5 min read · Jul 1, 2024


Solar punk city (made with DALL·E 3)

Here are your weekly articles, guides, and news about NLP and AI chosen for you by NLPlanet!

😎 News From The Web

  • Gemini 1.5 Pro 2M context window, code execution capabilities, and Gemma 2 are available today. Gemini 1.5 Pro has been updated with a larger 2M-token context window and new built-in code execution to improve performance on complex, computational tasks. Gemma 2 is now open for testing in Google AI Studio, and Gemini 1.5 Flash tuning has been released. A usage sketch of the code execution API follows this list.
  • Collaborate with Claude on Projects. Claude.ai introduces the Projects feature for Pro and Team users, leveraging Claude 3.5 Sonnet’s 200K context window to enhance collaborative work through organized chats, document integration, and tailored assistance. The addition of Artifacts and a shared activity feed fosters co-creation and inspiration within the platform.
  • Apple is the first company charged with violating the EU’s DMA rules. The EU has charged Apple with violating the Digital Markets Act over restrictive App Store policies, and has opened a new investigation into how Apple manages alternative app stores and the fees attached to them. Apple could face fines of up to 10% of its global revenue and says it will cooperate with EU regulators.
  • Major Labels Sue AI Firms Suno and Udio for Alleged Copyright Infringement. Major music labels have sued AI music firms Suno and Udio for copyright infringement, alleging unlicensed use of copyrighted songs to train their AIs, which can produce tracks resembling popular artists. Suno and Udio claim their work is transformative and qualifies for fair use.
  • Apple won’t roll out AI tech in EU market over regulatory concerns. Apple has postponed the EU launch of its new AI features, citing the compliance requirements of the Digital Markets Act, which bars designated “gatekeepers” like Apple from favoring their own products or misusing consumer data. The delay affects Apple Intelligence, iPhone Mirroring, and SharePlay screen sharing.
  • Stability.ai gets new CEO and investment dream team to start rescue mission. Prem Akkaraju, former CEO of Weta Digital, has been named CEO of Stability AI, the company behind Stable Diffusion, alongside fresh investment from a group of high-profile backers. His appointment is seen as central to the company’s turnaround effort.
  • YouTube reportedly wants to pay record labels to use their songs for AI training. YouTube is reportedly seeking licensing agreements with the major labels Sony, Universal, and Warner to use their catalogs for AI training and head off copyright disputes, but faces pushback from artists. Meanwhile, the labels are suing AI music platforms Suno and Udio for copyright infringement.
  • Meet Figma AI: Empowering designers with intelligent tools. Figma has launched Figma AI, a set of AI features for its design platform that includes AI-driven search, generative text and image tools, and advanced prototyping. The features are in beta and free through the end of 2024, though usage limits may be introduced depending on the cost of running the tools.
  • Snapchat AI turns prompts into new lens. Snapchat has launched a feature enabling users to create custom AI-driven lenses using textual prompts, leveraging user interaction data and online activity to tailor experiences.
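
Below is a minimal sketch of Gemini’s new code execution capability via the google-generativeai Python SDK (assumes a valid API key in the environment; the prompt is illustrative):

```python
# A minimal sketch of Gemini's built-in code execution tool, assuming the
# google-generativeai SDK: the model writes and runs Python to answer.
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

model = genai.GenerativeModel(
    model_name="gemini-1.5-pro",
    tools="code_execution",  # enable the sandboxed code execution tool
)

response = model.generate_content(
    "What is the sum of the first 50 prime numbers? "
    "Generate and run code for the calculation."
)
print(response.text)  # includes the generated code and its output
```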

📚 Guides From The Web

  • What is an agent? In the context of LLM systems, “agent” refers to varying degrees of autonomy, ranging from basic task routing to fully autonomous operation. The article examines the development, orchestration, and monitoring work that accompanies each increase in system autonomy. A minimal agent-loop sketch appears after this list.
  • Welcome Gemma 2 — Google’s new open LLM. Google has released Gemma 2, a family of open models in 9B and 27B parameter sizes, each in base and instruction-tuned variants. It incorporates techniques such as sliding window attention, logit soft-capping, knowledge distillation, and model merging, and is available on the Hugging Face Hub; a loading sketch appears after this list.
  • Top AI Tools for Research: Evaluating ChatGPT, Gemini, Claude, and Perplexity. The article provides a comparative analysis of four AI research tools — ChatGPT, Gemini, Claude, and Perplexity — examining their response quality, access to real-time data, referencing abilities, document analysis, and subscription options to enhance productivity in academic and business research settings.
  • Building a personalized code assistant with open-source LLMs using RAG Fine-tuning. Research shows that fine-tuning LLMs with Retrieval-Augmented Generation (RAG) can enhance code generation performance by reducing errors such as hallucinations and outdated information. Tests on the Together AI Platform reveal that models fine-tuned with RAG, specifically using Mistral 7B Instruct v0.2, surpass competitors like Claude 3 Opus and GPT-4o in terms of accuracy, efficiency, and cost.
  • Fine-tuning Florence-2 — Microsoft’s Cutting-edge Vision Language Models. Microsoft’s Florence-2 is a vision-language model that excels at OCR and object detection. It integrates a DaViT vision encoder with BERT text embeddings and improves markedly when fine-tuned on the DocVQA dataset, reaching a 57.0 similarity score, an advance attributed to its pre-training on the large-scale FLD-5B dataset. An inference sketch appears after this list.
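
To make the “degrees of autonomy” idea from the agents article concrete, here is a self-contained toy agent loop. The model call is stubbed out and all names (plan_next_action, the tools) are hypothetical; in a real system an LLM would choose the next action:

```python
# Toy agent loop: a stubbed "planner" stands in for an LLM that either
# routes to a tool or finishes. Purely illustrative.

def calculator(expression: str) -> str:
    return str(eval(expression))  # toy tool; never eval untrusted input

def search(query: str) -> str:
    return f"(stub) top result for {query!r}"

TOOLS = {"calculator": calculator, "search": search}

def plan_next_action(goal: str, history: list) -> tuple:
    """Stand-in for an LLM call that picks a tool or finishes."""
    if not history:
        return ("calculator", "6 * 7")
    return ("finish", f"The answer to {goal!r} is {history[-1][1]}.")

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        action, arg = plan_next_action(goal, history)
        if action == "finish":
            return arg  # more autonomous systems add reflection and monitoring here
        history.append((action, TOOLS[action](arg)))
    return "Step budget exhausted."

print(run_agent("what is 6 times 7?"))
```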
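For Gemma 2, a minimal loading sketch with Hugging Face transformers (assumes a recent transformers release with Gemma 2 support and that you have accepted the model license on the Hub):

```python
# Minimal Gemma 2 inference sketch via the transformers pipeline.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/gemma-2-9b-it",  # 27B variant: google/gemma-2-27b-it
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain sliding window attention briefly."}]
out = pipe(messages, max_new_tokens=128)
print(out[0]["generated_text"][-1]["content"])  # the assistant's reply
```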
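And for Florence-2, an inference sketch following the model card (the model ships custom code, hence trust_remote_code=True; the image URL is a placeholder):

```python
# Florence-2 OCR sketch; "<OCR>" is one of the model's task prompts.
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

url = "https://example.com/receipt.png"  # placeholder image URL
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text="<OCR>", images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=256,
)
raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(processor.post_process_generation(raw, task="<OCR>", image_size=image.size))
```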

🔬 Interesting Papers and Repositories

  • Judging the Judges: Evaluating Alignment and Vulnerabilities in LLMs-as-Judges. The study investigates how effectively LLMs can evaluate other LLMs, using the TriviaQA dataset and human annotations as benchmarks. It reveals inconsistencies in the models’ assessments and shows that raw agreement rates between judges do not always reflect true alignment, as evidenced by the variance in their scores. See the agreement-metric sketch after this list.
  • Adam-mini: Use Fewer Learning Rates To Gain More. The Adam-mini optimizer performs on par with or better than AdamW while using 45–50% less memory, thanks to assigning a single learning-rate scalar to each block of parameters rather than one per coordinate. It also increases throughput by up to 49.6% and reduces computational overhead. A toy sketch of the idea follows this list.
  • Evidence of a log scaling law for political persuasion with large language models. A study investigating the influence of language model size on persuasive abilities across political issues found that larger models exhibit diminishing returns in persuasiveness, with small models nearly as effective as larger ones. The minor superiority of bigger models is attributed to enhanced coherence and topical focus, implying negligible benefits from scaling up language models further.
  • Meta Large Language Model Compiler: Foundation Models of Compiler Optimization. Meta has released LLM Compiler, a family of models built on Code Llama for code optimization. Trained on extensive datasets of compiler intermediate representations and assembly code, the models come in 7B and 13B parameter sizes, and their fine-tuned variants notably improve code-size optimization and disassembly for x86_64 and ARM architectures.
  • LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs. LongRAG is a new Retrieval-Augmented Generation framework that enlarges retrieval units to roughly 4K tokens each and pairs them with a long-context language model, letting it extract answers without extra training while attaining Exact Match scores comparable to the state of the art. A grouping sketch appears below.
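
On the judging paper’s point that agreement rates can mislead, here is a toy, self-contained illustration (not the paper’s data) of how percent agreement can look high while a chance-corrected metric like Cohen’s kappa is zero:

```python
# Percent agreement vs. Cohen's kappa between a human rater and an LLM
# judge on binary labels. Toy data, purely illustrative.
from collections import Counter

def percent_agreement(a, b):
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    po = percent_agreement(a, b)
    n, ca, cb = len(a), Counter(a), Counter(b)
    pe = sum((ca[l] / n) * (cb[l] / n) for l in set(a) | set(b))
    return (po - pe) / (1 - pe) if pe < 1 else 0.0

# A judge that always says "correct" agrees 90% of the time here,
# yet kappa is 0: the agreement is exactly what chance predicts.
human = [1] * 90 + [0] * 10
judge = [1] * 100
print(percent_agreement(human, judge))  # 0.9
print(cohens_kappa(human, judge))       # 0.0
```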
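The core idea behind Adam-mini can be sketched in a few lines (a toy, not the official implementation: the paper partitions parameters by the model’s Hessian block structure, e.g. per attention head, while this sketch simply uses one scalar per tensor):

```python
# Toy Adam-mini sketch: per-coordinate first moment as in Adam, but a
# single second-moment scalar per parameter block (here: per tensor).
import torch

class AdamMiniSketch:
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
        self.params = list(params)
        self.lr, self.eps = lr, eps
        self.b1, self.b2 = betas
        self.m = [torch.zeros_like(p) for p in self.params]
        self.v = [0.0] * len(self.params)  # one scalar per block, not per weight
        self.t = 0

    @torch.no_grad()
    def step(self):
        self.t += 1
        for i, p in enumerate(self.params):
            g = p.grad
            self.m[i].mul_(self.b1).add_(g, alpha=1 - self.b1)
            self.v[i] = self.b2 * self.v[i] + (1 - self.b2) * g.pow(2).mean().item()
            m_hat = self.m[i] / (1 - self.b1 ** self.t)
            v_hat = self.v[i] / (1 - self.b2 ** self.t)
            p.add_(m_hat / (v_hat ** 0.5 + self.eps), alpha=-self.lr)
```

Storing one float per block instead of a full tensor of second moments is where the 45–50% memory saving comes from.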
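And the LongRAG retrieval-unit idea, sketched under simple assumptions (whitespace token counting; names invented for illustration, not the authors’ code):

```python
# LongRAG-style retrieval units: pack consecutive passages into ~4K-token
# units, retrieve whole units, and pass the few top units to a
# long-context reader LLM.

def group_into_units(passages, max_tokens=4096,
                     count_tokens=lambda s: len(s.split())):
    units, current, size = [], [], 0
    for p in passages:
        n = count_tokens(p)
        if current and size + n > max_tokens:
            units.append(" ".join(current))
            current, size = [], 0
        current.append(p)
        size += n
    if current:
        units.append(" ".join(current))
    return units

passages = [f"passage {i} with some text" for i in range(8)]
units = group_into_units(passages, max_tokens=10)
print(len(units), "units")  # fewer, larger retrieval units than passage-level RAG
```

Retrieval itself stays standard: embed the units, take the top few by similarity, and concatenate them into the reader’s prompt.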

Thank you for reading! If you want to learn more about NLP, remember to follow NLPlanet. You can find us on LinkedIn, Twitter, Medium, and our Discord server!


Fabio Chiusano, NLPlanet
Freelance data scientist and top Medium writer in Artificial Intelligence