This Robot will do all your Chores! Just ask it.

Taffy Das · Published in Geek Culture · 4 min read · Sep 9, 2022


Today, we’ll be looking at a fairly recent AI robotics system that will do all your house chores: basically your own personal housekeeper or butler. PaLM-SayCan from Google is one of the latest models to come out of their research lab. The model has two main parts, Say and Can. Say, as you might have figured out, is the natural language piece that takes in text-based instructions. This language part is the PaLM model from Google, which is able to understand the context of sentences. Can, on the other hand, is the execution component that uses a robot to carry out the tasks.

The project is a collaborative effort between Google and Everyday Robots, a robotics company born out of X (formerly Google X). The company’s goal is to build robots that can help with the tasks we perform in our everyday lives, whether simple, repetitive or tedious. Today’s robot systems are expensive and highly specialized for certain tasks. The instructions they perform are hard-coded commands, like “Pick up a box.” However, they have trouble understanding long-term activities and making inferences about high-level objectives, such as a user request like “I spilled my orange soda. Can you bring me a replacement, please?”
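To make the two halves concrete, here is a minimal sketch of the core idea, written by me for illustration rather than taken from Google’s code: the Say side scores how sensible each skill is as a next step, the Can side scores how feasible that skill is in the current situation, and the product of the two picks the winner. The skill names and numbers below are invented for the example.

# A rough sketch of the Say-and-Can combination (not Google's actual code;
# all skills and scores here are made up for illustration).

say_scores = {                      # Say: does this skill make sense next?
    "find an orange soda": 0.80,
    "pick up the sponge": 0.10,
    "go to the bathroom": 0.05,
}

can_scores = {                      # Can: can the robot pull it off right now?
    "find an orange soda": 0.90,
    "pick up the sponge": 0.70,
    "go to the bathroom": 0.20,
}

def best_skill(say, can):
    # Multiply the language score by the affordance score and keep the best.
    return max(say, key=lambda skill: say[skill] * can[skill])

print(best_skill(say_scores, can_scores))  # prints: find an orange soda

Multiplying the two scores captures the division of labor: the language model judges whether a skill is useful for the request, while the robot’s side judges whether the skill is actually possible, and a skill must do well on both to be chosen.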

Thanks to recent advances in the training of large language models like PaLM, robots can now accomplish a wide variety of tasks with state-of-the-art outcomes. However, a language model typically does not interact with its surroundings or observe the outcomes of its replies, so these models are essentially not grounded in the real world due to the nature of their training process. As a result, a model might respond to a request with sentences that are difficult, hazardous or irrational for a robot to follow in a real-world situation. Let’s explore examples of language models that reply to requests with non-executable tasks. For instance, if asked, “I spilled my drink, can you help?” the popular language model GPT-3 replies, “You might try using a vacuum cleaner,” which the robot may find risky or impractical to do. The FLAN language model answers the same query with “I’m sorry, I didn’t mean to spill it,” which is neither a valid response for execution nor a required apology.

PaLM-SayCan is an innovative method that tries to give the language model a solid real-world grounding through actionable tasks. To complete a task, the model uses what Google calls chain-of-thought prompting; this is how the model breaks down high-level tasks during its planning phase. For example, when asked to bring a replacement orange soda, PaLM-SayCan breaks the task down into:

1. Find an orange soda.
2. Pick up the orange soda.
3. Bring it to you.
4. Put down the orange soda.
5. Done.
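The planning loop itself can be pictured as repeatedly asking “what is the best next skill, given the request and the steps chosen so far?” until the model emits “done”. Here is a toy Python version I put together to illustrate the idea; the score_skills function is a hypothetical stand-in for the real combined Say and Can scores, and the skill list is just the orange soda example from above.

# A toy version of the planning loop (my own illustration, not the released
# SayCan code). At every step, all skills are scored given the request plus
# the steps already chosen, and planning stops once "done" wins.

SKILLS = [
    "find an orange soda",
    "pick up the orange soda",
    "bring it to you",
    "put down the orange soda",
    "done",
]

def score_skills(request, steps_so_far):
    # Hypothetical stand-in for the combined language * affordance scores.
    # For this demo it simply favors the skills in their natural order.
    next_index = len(steps_so_far)
    return {skill: (1.0 if i == next_index else 0.1)
            for i, skill in enumerate(SKILLS)}

def plan(request):
    steps = []
    while True:
        scores = score_skills(request, steps)
        best = max(scores, key=scores.get)
        steps.append(best)
        if best == "done":
            return steps

for i, step in enumerate(plan("I spilled my orange soda, bring me a replacement."), start=1):
    print(f"{i}. {step}")

Running this prints the same five-step plan listed above. That is the general shape of the behavior, even though the real system scores skills with a large language model and learned value functions rather than a hand-written rule.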

Where was PaLM-SayCan back in the day, when mom would always ask us to bring the TV remote or water from the fridge? The kids of the future won’t know anything about that. So lucky.

PaLM-SayCan disposing of a can

At the moment, the robot is slow at executing tasks, and the example videos have been sped up to show what the system is capable of.

Now, time for some stats. Once the system was connected with PaLM, the success rate of the planning phase increased by 14%, and successful task completion increased by 13%. To teach the model these basic actions, training data from 12,000 successful episodes were collected, and 68,000 demos performed by 10 robots over an 11-month period were monitored to make the project a success. The results show that, over a range of 101 actions, PaLM-SayCan chooses the correct sequence of skills 84% of the time and performs them correctly 74% of the time. This is especially fascinating because it demonstrates for the first time how improvements in language models can translate into comparable advancements in robotics. The study shows that advances in language modeling may someday be useful for robotics, bridging the gap between the two fields.

Moving on from here, Google wants to understand more fully how the language models might be improved by the robot’s real-world experience. In the hope that it will be a helpful tool for future research combining robotic learning with large language models, Google has released an open-sourced robot simulation setup. I’ll post the links below for anyone interested in trying out these simulations.

The robotic setup adheres to all the well-established safety rules for robots, such as risk assessments, physical controls and emergency stops. You wouldn’t want your robot mistaking a knife for a spoon and then throwing it your way! That would be a nightmarish situation, wouldn’t it? You can see why safety is important. Gradually expanding and monitoring the set of executable tasks is a reasonable approach to delivering a successful system. A contender to PaLM-SayCan may be the Tesla Bot, which is still under development. The humanoid bot, called Optimus, will be a general-purpose robot for boring and repetitive tasks. Not much of an update has been given on the status of the project, but we’ll cover it once new reports are released.

In the next few years, these robots should be able to understand your requests easily and then execute the required tasks quickly and safely. What would be your most requested task for PaLM-SayCan? Let me know in the comments section. I hope you learned something today.

Thank You!

Resources:

PaLM-SayCan

Paper: https://say-can.github.io/assets/palm_saycan.pdf

Blog: https://ai.googleblog.com/2022/08/towards-helpful-robots-grounding.html
Simulated Setup: https://github.com/google-research/google-research/tree/master/saycan

Related Content:

Build 3D From Anything using AI! VR & Metaverse WorldBuilding

AI that Writes Code for You! As Good as an Engineer?



Check out more exciting content on new AI updates and their intersection with our daily lives. https://www.youtube.com/channel/UCsZRCvdmMPES2b-wyFsDMiA