AI Updates: The Auto-GPT rocket, LLaMA’s cheap children, and more
This AI wave is nowhere near cresting
Hello and welcome back to AI Updates, where we cover the latest developments in the hottest space in tech. It’s been over a month since the last installment, and so much has happened. I’ve been reading every day and I still don’t feel caught up! Let’s dive in, hug some llamas, and learn what the AI community has in store for us today.
As usual, this article mentions Microsoft, my employer. I wrote this article in my free time, and all opinions are my own.
New to AI? I’ve got you covered! Start here:
AI agents and Auto-GPT
ChatGPT reads your messages and writes responses without any access to the Internet. The new Bing does the same, but searches online for an answer first. But what if we asked AI to do more? What if we wanted something complex, something that couldn’t be done in a single prompt?
Enter AI agents: AI products that give models like ChatGPT access to tools like Internet search, document storage, and computational engines like WolframAlpha. When prompted, agents set a goal and plan a series of tasks to accomplish that goal. Agents “talk to themselves” as they go, reasoning through a complex process and using their tools to make progress. Artificial agency of this kind has been imagined for centuries, but wasn’t feasible until ChatGPT.
For example, an agent prompted to “find the square root of the age of the founder of IBM” might first use a search to identify the founder of IBM, search again to find his birth date, use a date tool to learn the current date, use a math tool to get a final answer, and then synthesize that info into a ChatGPT-like response. All of this from a single prompt.
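To make this concrete, here's a toy sketch of the plan-act-observe loop that agents run. This is not Auto-GPT's actual code: the tool set is stubbed out, and the `llm` callback stands in for a real call to GPT-4 that returns the next action as structured output.

```python
import math
from datetime import date

# Stand-in tools; a real agent would wire these to a search API,
# a calendar, WolframAlpha, and so on.
TOOLS = {
    "search": lambda query: f"(stubbed search results for: {query})",
    "today": lambda _: date.today().isoformat(),
    "sqrt": lambda x: str(math.sqrt(float(x))),
}

def run_agent(goal, llm, max_steps=10):
    """llm(history) must return {"tool": name, "input": text} or {"answer": text}."""
    history = [f"GOAL: {goal}"]
    for _ in range(max_steps):
        step = llm(history)                               # plan: the model picks the next action
        if "answer" in step:                              # the model declares the goal met
            return step["answer"]
        observation = TOOLS[step["tool"]](step["input"])  # act: run the chosen tool
        history.append(f"{step['tool']}({step['input']}) -> {observation}")  # observe
    return "Stopped: too many steps."
```

Note that each loop iteration is another round trip to the model, so a single prompt can fan out into many model calls behind the scenes.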
On March 30, Toran Bruce Richards published Auto-GPT, an AI agent powered by GPT-4. Anyone can use it without sending any data to Richards — that is, as long as they’re willing to pay OpenAI a few pennies per thousand words sent to and from GPT-4. The project has rocketed to become the 30th most-starred repo on GitHub.
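For a rough sense of that cost (assuming OpenAI's March 2023 list prices for the 8K-context GPT-4, $0.03 per 1,000 prompt tokens and $0.06 per 1,000 completion tokens, and roughly 1.3 tokens per English word): sending a thousand words costs about 4 cents and getting a thousand words back costs about 8 cents. Cheap per call, but an agent that loops through dozens of GPT-4 calls adds up.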
Many other AI agents now exist, including BabyAGI, browser-based AgentGPT, and Khan Academy’s Khanmigo (which blurs the line between agency and AI grounding). Expect more products that (discreetly?) use AI agency to be announced soon.
AI + friendship = HuggingGPT, aka Microsoft JARVIS
For a research-grade case study on the power and diverse applications of AI agents, look no further than the HuggingGPT paper, published by Microsoft Research Asia and Zhejiang University on March 30. In it, the authors study an agent powered by ChatGPT with access to specialized machine learning models from the popular Hugging Face registry.
Hugging Face is the leading platform for sharing machine learning models, including image labellers, video generators, text classifiers, audio understanders, and more. Many models, unlike ChatGPT, are highly specialized and only work on well-structured data (instead of plain English), so they're cheaper to run and give better results. Since HuggingGPT (also known as JARVIS/Jarvis) has access to any model on Hugging Face, it can work seamlessly with images, audio, text, video, and other file formats.
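For a flavor of how that routing works, here's a toy sketch (not the paper's actual code) that maps planned subtasks onto real Hugging Face `pipeline` tasks. In HuggingGPT the plan comes from ChatGPT; here it's hard-coded, and the file name is a placeholder.

```python
from transformers import pipeline

# Each subtask name maps to a real Hugging Face pipeline task.
# Default checkpoints are downloaded on first use.
EXPERT_TASKS = {
    "describe_image": "image-to-text",
    "transcribe_audio": "automatic-speech-recognition",
    "summarize_text": "summarization",
}

def run_plan(plan):
    """plan: list of (subtask_name, input) pairs, as a controller LLM might produce."""
    results = []
    for subtask, data in plan:
        expert = pipeline(EXPERT_TASKS[subtask])  # load a specialist model for this modality
        results.append(expert(data))              # run it on the subtask's input
    return results

# A hand-written "plan"; photo.jpg is a placeholder for a local image file.
outputs = run_plan([
    ("describe_image", "photo.jpg"),
    ("summarize_text", "LLaMA is a family of language models released by Meta ..."),
])
print(outputs)
```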
In the below figure from the paper, we see Jarvis’s explanation of how it generated a dubbed video from nothing but a text prompt.
HustleGPT
So far, we’ve given AI access to the Internet and friends from Hugging Face. But what if we gave it the most powerful resource of all? 💸
On March 15, designer Jackson Greathouse Fall gave GPT-4 a measly $100 and asked it to build its own business. He claims to have spent the money as his GPT boss instructed him to, and he went viral in the process. The business, a niche eco-living blog called Green Gadget Guru, hasn’t bloomed, as Fall has since prioritized Makeshift, the 3,000-member Discord community he started days after he found Internet fame. A spinoff group using the name HustleGPT on both Twitter and Discord recently passed 6,000 Discord members. (They also banned Fall.) As Yogi Berra said, “it’s tough to make predictions, especially about the future,” but the newest AI tools have certainly lowered the barrier to starting a business.
LLaMA’s many children
Last time on AI Updates, we covered the announcement of Meta’s LLaMA, a language model released to researchers (and leaked on March 3). Since then, folks have built more specialized models based on LLaMA, reaching near-ChatGPT performance in user-preference studies. LLaMA’s children include GPT4All, a free downloadable ChatGPT clone that runs without Internet, and Stanford’s Alpaca, an instruction-following model whose demo was shut down four days after its March 13 announcement due to safety concerns. Each model was made for less than $1,500 and maybe a person-month of work (GPT4All was made by 5 authors who worked “about four days”). The natural drawback here is lower-quality responses, but we don’t need a Ferrari to get to the grocery store, do we? (Thanks to Patrice Pelland, my triple-skip-level manager, for sharing that analogy.)
There are now countless other LLaMA-based projects out there, including ChatLLaMA (for training your own LLaMA-based model), the interactive Vicuna (trained for $300), and Berkeley’s Koala (trained for less than $100!). As costs to train models continue to decrease, expect to see more and more little LLaMAs coming to a website near you!
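If you want to poke at one of these little LLaMAs yourself, here's a minimal sketch using the Hugging Face transformers API. The model path is a placeholder for whichever converted checkpoint you have locally (the original LLaMA weights are gated and distributed separately), and the Alpaca-style prompt template only applies to instruction-tuned descendants.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "./my-local-llama-7b"  # placeholder: point this at your converted checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,  # halves memory; a 7B model fits in roughly 14 GB
    device_map="auto",          # spreads layers across available devices (needs accelerate)
)

# Alpaca-style instruction prompt; base LLaMA models expect plain text instead.
prompt = (
    "### Instruction:\n"
    "Explain in one sentence what an AI agent is.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```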
AI anthropomorphism concerns
Pretending all these programs are animals is all fun and games, right? Well, some researchers disagree, and anthropomorphism of AI (that is, attributing human qualities to AI) is a significant concern in some expert circles.
Below is a collection of well-cited conversations started by Ben Shneiderman and published by Chenhao Tan. It’s an engaging academic debate about reactions to an AI product that convincingly refers to itself as “I,” our tendency to grow attached to non-human objects, and the potential impact of such technologies becoming ubiquitous. It’s refreshing to see honest discourse about the dangers of these new tools.
And that’s a wrap for this edition! The AI community has taken center stage this month with countless new models, tools, and products for us to try out! Big Tech researchers continue to explore the potential of the models that already exist, and the barriers to entering the field have been lowered. Concerns remain, and always will, but we can educate ourselves by listening in on expert conversations and sharing our thoughts.
Thank you for reading. What would you like to learn next? How can I help? Let me know in the comments! 🤓
In case you missed it, here’s the first AI Updates entry: