RT/ Giving robots social skills

Paradigm · Nov 17, 2021

Robotics biweekly vol.40, 3rd November — 17th November

TL;DR

  • MIT researchers have incorporated social interactions into a framework for robotics, enabling simulated machines to understand what it means to help or hinder one another, and to learn to perform these social behaviors on their own.
  • For the past seven years, an autonomous robotic rover, Benthic Rover II, has been continuously operational 225 kilometers off the coast of central California and 4,000 meters below the ocean’s surface. This innovative mobile laboratory has further revealed the role of the deep sea in cycling carbon. The data collected by this rover are fundamental to understanding the impacts of climate change on the ocean.
  • Engineering students have designed an autonomous robot that can find and open doors in 3D digital simulations. Now they’re building the hardware for an autonomous robot that not only can open its own doors but also can find the nearest electric wall outlet to recharge without human help.
  • Princeton researchers have invented bubble casting, a new way to make soft robots using “fancy balloons” that change shape in predictable ways when inflated with air.
  • Researchers at Hebei University of Technology and other institutes in China have developed an innovative system for controlling robotic arms that is based on augmented reality (AR) and a brain-computer interface. This system, presented in a paper published in the Journal of Neural Engineering, could enable the development of bionic or prosthetic arms that are easier for users to control.
  • Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), in the ever-present quest to get machines to replicate human abilities, created a framework that’s more scaled up: a system that can reorient over two thousand different objects, with the robotic hand facing both upwards and downwards. This ability to manipulate anything from a cup to a tuna can, and a Cheez-It box, could help the hand quickly pick-and-place objects in specific ways and locations — and even generalize to unseen objects.
  • Carnegie Mellon University and Meta AI (formerly Facebook AI) have developed ReSkin, a new tactile sensing skin they believe will improve the sense of touch in robotics, wearables, smart clothing and AI. The technology is affordable, durable and easy to use. It harnesses advances in machine learning, soft robotics and magnetic sensing to create a skin that is versatile and as easy to apply as a bandage.
  • Researchers at UT Austin and Facebook AI Research have recently developed a new framework that could shape the behavior of embodied agents more effectively, using ego-centric videos of humans completing everyday tasks.
  • Scientists at Franklin & Marshall College have recently been trying to develop computational tools that could enhance the performance of socially assistive robots, by allowing them to process social cues given by humans and respond accordingly. In a paper, they introduced a new technique that allows robots to autonomously detect when it is appropriate for them to step in and help users.
  • Researchers have created human-style eyes for robots, taking some inspiration from Jabba the Hutt.
  • November’s issue of Science Robotics is out.
  • And more!

Robotics market

The global market for robots is expected to grow at a compound annual growth rate (CAGR) of around 26 percent to reach just under 210 billion U.S. dollars by 2025.

Size of the global market for industrial and non-industrial robots between 2018 and 2025, in billion U.S. dollars. Source: Statista.
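As a quick arithmetic aside, here is a minimal Python sketch of what a roughly 26 percent CAGR implies, back-projecting the 2025 figure from the paragraph above to earlier years. The back-projected numbers are purely illustrative, not Statista's estimates.

```python
# Back-projecting the 2025 market-size figure under a constant ~26% CAGR.
# Only the 2025 value and the growth rate come from the text above; the
# earlier-year values are illustrative, not Statista's estimates.
END_2025_BN_USD = 210.0
CAGR = 0.26

def project_back(value_end, cagr, years_back):
    """Market size `years_back` years before the end year, assuming a constant CAGR."""
    return value_end / (1 + cagr) ** years_back

for years in range(5):
    print(2025 - years, round(project_back(END_2025_BN_USD, CAGR, years), 1), "bn USD")
```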

Latest News & Researches

Social Interactions as Recursive MDPs

by Ravi Tejwani, Yen-Ling Kuo, Tianmin Shu, Boris Katz, Andrei Barbu

Robots can deliver food on a college campus and hit a hole in one on the golf course, but even the most sophisticated robot can’t perform basic social interactions that are critical to everyday human life.

MIT researchers have now incorporated certain social interactions into a framework for robotics, enabling machines to understand what it means to help or hinder one another, and to learn to perform these social behaviors on their own. In a simulated environment, a robot watches its companion, guesses what task it wants to accomplish, and then helps or hinders this other robot based on its own goals.

The researchers also showed that their model creates realistic and predictable social interactions. When they showed videos of these simulated robots interacting with one another to humans, the human viewers mostly agreed with the model about what type of social behavior was occurring.

Enabling robots to exhibit social skills could lead to smoother and more positive human-robot interactions. For instance, a robot in an assisted living facility could use these capabilities to help create a more caring environment for elderly individuals. The new model may also enable scientists to measure social interactions quantitatively, which could help psychologists study autism or analyze the effects of antidepressants.

“Robots will live in our world soon enough and they really need to learn how to communicate with us on human terms. They need to understand when it is time for them to help and when it is time for them to see what they can do to prevent something from happening. This is very early work and we are barely scratching the surface, but I feel like this is the first very serious attempt for understanding what it means for humans and machines to interact socially,” says Boris Katz, principal research scientist and head of the InfoLab Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and a member of the Center for Brains, Minds, and Machines (CBMM).

Joining Katz on the paper are co-lead author Ravi Tejwani, a research assistant at CSAIL; co-lead author Yen-Ling Kuo, a CSAIL PhD student; Tianmin Shu, a postdoc in the Department of Brain and Cognitive Sciences; and senior author Andrei Barbu, a research scientist at CSAIL and CBMM. The research will be presented at the Conference on Robot Learning in November.

To study social interactions, the researchers created a simulated environment where robots pursue physical and social goals as they move around a two-dimensional grid.

A physical goal relates to the environment. For example, a robot’s physical goal might be to navigate to a tree at a certain point on the grid. A social goal involves guessing what another robot is trying to do and then acting based on that estimation, like helping another robot water the tree.

The researchers use their model to specify what a robot’s physical goals are, what its social goals are, and how much emphasis it should place on one over the other. The robot is rewarded for actions it takes that get it closer to accomplishing its goals. If a robot is trying to help its companion, it adjusts its reward to match that of the other robot; if it is trying to hinder, it adjusts its reward to be the opposite. The planner, an algorithm that decides which actions the robot should take, uses this continually updating reward to guide the robot to carry out a blend of physical and social goals.
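To make that reward mechanism concrete, here is a minimal Python sketch of how a reward blending physical and social goals might look. It is not the authors' formulation: the grid encoding, the 0.5 weighting, and the function names are assumptions for illustration.

```python
# A minimal sketch (not the authors' formulation) of a reward that blends a
# robot's own physical goal with a social goal inferred for its companion.
# The grid encoding, 0.5 weighting, and function names are assumptions.

def physical_reward(agent_state, goal):
    """Progress toward a physical goal: negative Manhattan distance on the grid."""
    return -(abs(agent_state["x"] - goal["x"]) + abs(agent_state["y"] - goal["y"]))

def social_reward(other_state, estimated_other_goal, mode):
    """Reward derived from the other robot's estimated goal:
    helping copies its reward, hindering negates it."""
    other = physical_reward(other_state, estimated_other_goal)
    if mode == "help":
        return other
    if mode == "hinder":
        return -other
    return 0.0

def total_reward(own_state, other_state, own_goal, estimated_other_goal,
                 mode, social_weight=0.5):
    """The continually updated reward the planner tries to maximize."""
    return ((1 - social_weight) * physical_reward(own_state, own_goal)
            + social_weight * social_reward(other_state, estimated_other_goal, mode))

# Example: helping a companion that appears to be heading for the tree at (4, 2).
me, companion = {"x": 0, "y": 0}, {"x": 3, "y": 2}
print(total_reward(me, companion, own_goal={"x": 1, "y": 0},
                   estimated_other_goal={"x": 4, "y": 2}, mode="help"))  # -1.0
```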

“We have opened a new mathematical framework for how you model social interaction between two agents. If you are a robot, and you want to go to location X, and I am another robot and I see that you are trying to go to location X, I can cooperate by helping you get to location X faster. That might mean moving X closer to you, finding another better X, or taking whatever action you had to take at X. Our formulation allows the plan to discover the ‘how’; we specify the ‘what’ in terms of what social interactions mean mathematically,” says Tejwani.

Blending a robot’s physical and social goals is important to create realistic interactions, since humans who help one another have limits to how far they will go. For instance, a rational person likely wouldn’t just hand a stranger their wallet, Barbu says.

The researchers used this mathematical framework to define three types of robots. A level 0 robot has only physical goals and cannot reason socially. A level 1 robot has physical and social goals but assumes all other robots only have physical goals. Level 1 robots can take actions based on the physical goals of other robots, like helping and hindering. A level 2 robot assumes other robots have social and physical goals; these robots can take more sophisticated actions like joining in to help together.

To see how their model compared to human perspectives about social interactions, they created 98 different scenarios with robots at levels 0, 1, and 2. Twelve humans watched 196 video clips of the robots interacting, and then were asked to estimate the physical and social goals of those robots.

In most instances, their model agreed with what the humans thought about the social interactions that were occurring in each frame.

“We have this long-term interest, both to build computational models for robots, but also to dig deeper into the human aspects of this. We want to find out what features from these videos humans are using to understand social interactions. Can we make an objective test for your ability to recognize social interactions? Maybe there is a way to teach people to recognize these social interactions and improve their abilities. We are a long way from this, but even just being able to measure social interactions effectively is a big step forward,” Barbu says.

The researchers are working on developing a system with 3D agents in an environment that allows many more types of interactions, such as the manipulation of household objects. They are also planning to modify their model to include environments where actions can fail.

The researchers also want to incorporate a neural network-based robot planner into the model, which learns from experience and performs faster. Finally, they hope to run an experiment to collect data about the features humans use to determine if two robots are engaging in a social interaction.

“Hopefully, we will have a benchmark that allows all researchers to work on these social interactions and inspire the kinds of science and engineering advances we’ve seen in other areas such as object and action recognition,” Barbu says.

Abyssal Benthic Rover, an autonomous vehicle for long-term monitoring of deep-ocean processes

by K. L. Smith, A. D. Sherman, P. R. McGill, R. G. Henthorn, J. Ferreira, T. P. Connolly, C. L. Huffard in Science Robotics

The sheer expanse of the deep sea and the technological challenges of working in an extreme environment make these depths difficult to access and study. Scientists know more about the surface of the moon than the deep seafloor. MBARI is leveraging advancements in robotic technologies to address this disparity.

An autonomous robotic rover, Benthic Rover II, has provided new insight into life on the abyssal seafloor, 4,000 meters (13,100 feet) beneath the surface of the ocean. A study published in Science Robotics details the development and proven long-term operation of this rover. This innovative mobile laboratory has further revealed the role of the deep sea in cycling carbon. The data collected by this rover are fundamental to understanding the impacts of climate change on the ocean.

“The success of this abyssal rover now permits long-term monitoring of the coupling between the water column and seafloor. Understanding these connected processes is critical to predicting the health and productivity of our planet engulfed in a changing climate,” said MBARI Senior Scientist Ken Smith.

Despite its distance from the sunlit shallows, the deep seafloor is connected to the waters above and is vital for carbon cycling and sequestration. Bits of organic matter — including dead plants and animals, mucus, and excreted waste — slowly sink through the water column to the seafloor. The community of animals and microbes on and in the mud digests some of this carbon while the rest might get locked in deep-sea sediments for up to thousands of years.

The deep sea plays an important role in Earth’s carbon cycle and climate, yet we still know little about processes happening thousands of meters below the surface. Engineering obstacles like extreme pressure and the corrosive nature of seawater make it difficult to send equipment to the abyssal seafloor to study and monitor the ebb and flow of carbon.

In the past, Smith and other scientists relied on stationary instruments to study carbon consumption by deep seafloor communities. They could only deploy these instruments for a few days at a time. By building on 25 years of engineering innovation, MBARI has developed a long-term solution for monitoring the abyssal seafloor.

“Exciting events in the deep sea generally occur both briefly and at unpredictable intervals, that’s why having continuous monitoring with Benthic Rover II is so crucial,” explained Electrical Engineering Group Lead Alana Sherman. “If you’re not watching all the time, you’re likely to miss the main action.”

Benthic Rover II is the result of the hard work of a collaborative team of MBARI engineers and scientists, led by Smith and Sherman.

Abyssal BR-II working on the seafloor at 4,000 m depth. (A) BR-II from the port-side perspective. The principal components are the current meter (a), syntactic foam floatation modules (b), strobe (c), transit camera (d), acoustic modem (e), instrument track assembly (f), high-density polyethylene platform (g), respirometer chambers (h), titanium frame (i), and tread (j). (B) BR-II from the starboard-side perspective showing one of the titanium battery spheres (k) and the titanium controller sphere (l).

Engineers at MBARI designed Benthic Rover II to handle the cold, corrosive, and high-pressure conditions of the deep sea. Constructed from corrosion-resistant titanium, plastic, and pressure-resistant syntactic foam, this rover can withstand deployments up to 6,000 meters (about 19,700 feet) deep.

“In addition to the physical challenges of operating in these extreme conditions, we also had to design a computer control system and software reliable enough to run for a year without crashing — nobody is there to press a reset button,” explained MBARI Electrical Engineer Paul McGill. “The electronics also have to consume very little power so that we can carry enough batteries to last for a year. Despite all it does, the rover consumes an average of only two watts — about the same as an iPhone.”

Benthic Rover II is about the size of a small car — 2.6 meters (8.5 feet) long, 1.7 meters (5.6 feet) wide, and 1.5 meters (4.9 feet) high — and treads gently over the muddy bottom on a pair of wide, rubber tracks.

Researchers deploy Benthic Rover II from MBARI’s vessel, the R/V Western Flyer. The ship’s crew gingerly lowers the rover into the water and releases it to free-fall to the ocean floor. It takes the rover about two hours to reach the bottom. Once it lands on the seafloor, the rover can begin its mission.

First, sensors check the currents flowing along the seafloor. When they detect favorable currents, the rover moves up-current or across the current to reach an undisturbed site and begin collecting data.

Cameras on the front of the rover photograph the seafloor and measure fluorescence. This distinctive glow of chlorophyll under blue light reveals how much “fresh” phytoplankton and other plant debris has landed on the seafloor. Sensors log the temperature and oxygen concentration of the waters just above the bottom.

Next, the rover lowers a pair of transparent respirometer chambers that measure the oxygen consumption of the community of life in the mud for 48 hours. As animals and microbes digest organic matter, they use oxygen and release carbon dioxide in a specific ratio. Knowing how much oxygen those animals and microbes use is crucial for understanding carbon remineralization — the breakdown of organic matter into simpler components, including carbon dioxide.
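For a rough sense of how such chamber measurements translate into carbon figures, here is an illustrative back-of-the-envelope calculation in Python. The chamber dimensions, oxygen values, and respiratory quotient are assumptions for illustration, not numbers from the paper.

```python
# Back-of-the-envelope: converting a chamber's oxygen drawdown into a carbon
# remineralization estimate. Chamber dimensions, oxygen values, and the
# respiratory quotient are illustrative assumptions, not figures from the paper.

CHAMBER_AREA_M2 = 0.08        # assumed seafloor area enclosed by one chamber
CHAMBER_VOLUME_L = 20.0       # assumed water volume inside the chamber
RESPIRATORY_QUOTIENT = 0.85   # assumed mol CO2 released per mol O2 consumed

def carbon_remineralization(o2_start_umol_l, o2_end_umol_l, hours):
    """Estimated carbon remineralized (mg C per m^2 per day) from the oxygen
    drawdown measured over one chamber deployment."""
    o2_consumed_umol = (o2_start_umol_l - o2_end_umol_l) * CHAMBER_VOLUME_L
    o2_per_m2_per_day = o2_consumed_umol / CHAMBER_AREA_M2 / (hours / 24.0)
    c_umol = o2_per_m2_per_day * RESPIRATORY_QUOTIENT  # mol C tracks mol CO2
    return c_umol * 12.01 / 1000.0                      # umol C -> mg C

# Example: a 48-hour incubation with a 15 umol/L drop in dissolved oxygen.
print(round(carbon_remineralization(240.0, 225.0, 48.0), 1), "mg C m^-2 day^-1")
```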

After 48 hours, the rover raises the respirometer chambers and moves 10 meters (32 feet) forward, careful not to cross its previous path, and selects another site to sample. It repeats this sampling pattern over and over for the duration of deployment, typically a full year.

At the end of each deployment, the R/V Western Flyer returns to recover the rover, download its data, swap out its battery, and return it to the deep seafloor for another year. Within each year-long deployment, the MBARI team launches another autonomous robot — the Wave Glider — from shore to return quarterly to check on Benthic Rover II’s progress.

“The rover can’t communicate with us directly to tell us its location or condition, so we send a robot to find our robot,” explained McGill. An acoustic transmitter on the Wave Glider pings the rover on the seafloor below.

The rover then sends status updates and sample data to the glider overhead. The glider then transmits that information to researchers on shore via satellite.

“Data from the Benthic Rover II have helped us quantify when, how much, and what sources of carbon might be sequestered, or stored, in the abyssal seafloor,” said MBARI Senior Research Specialist Crissy Huffard.

For the past seven years, Benthic Rover II has been continuously operational at Station M, an MBARI research site located 225 kilometers (140 miles) off the coast of central California. Station M lies 4,000 meters (13,100 feet) below the ocean’s surface — as deep as the average depth of the ocean — making it a good model system for studying abyssal ecosystems.

Over the past 32 years, Smith and his team have constructed a unique underwater observatory at Station M. Benthic Rover II and a suite of other instruments operate there 24 hours a day, seven days a week, for a full year without servicing.

“The rover’s reliable performance over seven years, spending 99 percent of its life on the seafloor, is a result of many years of testing, troubleshooting, and developing the best techniques to maintain the vehicle,” said Sherman. “It’s a great example of what’s possible when applying technology to challenging problems in science.”

Data collected at Station M show that the deep sea is far from static. Physical, chemical, and biological conditions can change dramatically over timescales ranging from hours to decades.

The surface waters of the California Current over Station M teem with phytoplankton in the spring and summer. These seasonal pulses in productivity cascade from the water column to the seafloor. Much of this sinking organic matter — known as “marine snow” — originated as carbon dioxide in the atmosphere.

Over the past decade, MBARI researchers have observed a dramatic increase in large pulses of marine snow falling to the seafloor at Station M. These episodic events account for an increasing fraction of the yearly food supply at this site. In seven years of operation at Station M, Benthic Rover II recorded significant weekly, seasonal, annual, and episodic events — all providing data that help MBARI researchers understand the deep-sea carbon cycle.

Between November 2015 and November 2020, Benthic Rover II recorded a substantial increase in the rain of dead phytoplankton and other plant-rich debris (phytodetritus) landing on the abyssal seafloor from the waters overhead. A decrease in the concentration of dissolved oxygen in the waters just above the deep seafloor accompanied this windfall of organic matter.

Traditional short-term monitoring tools would not have detected the fluctuations that drive long-term changes and trends. Benthic Rover II has revealed a more complete picture of how carbon moves from the surface to the seafloor.

“Benthic Rover II has alerted us to important short- and long-term changes in the deep sea that are being missed in global models,” underscored Huffard.

The success of Benthic Rover II and MBARI’s ongoing work at Station M highlight how persistent platforms and long-term observations can further our understanding of the largest living space on Earth. With more companies looking to extract mineral resources from the deep seafloor, these data also give valuable insights into the baseline conditions in areas under consideration for industrial development or deep-sea mining.

The ocean is also a crucial component in Earth’s carbon cycle and climate. The ocean and its biological communities are a sink for carbon dioxide. Burning fossil fuels, raising livestock, and clearing forests release billions of tons of carbon dioxide into our atmosphere every year. The ocean has buffered us from the worst impacts by absorbing more than 25 percent of this excess carbon dioxide. Facing a changing climate, understanding how carbon flows between the ocean’s sunlit surface and its dark depths is more important than ever.

Force-Vision Sensor Fusion Improves Learning-Based Approach for Self-Closing Door Pulling

by Yufeng Sun, Lin Zhang, Ou Ma in IEEE Access

One flaw in the notion that robots will take over the world is that the world is full of doors.

And doors are kryptonite to robots, said Ou Ma, an aerospace engineering professor at the University of Cincinnati.

“Robots can do many things, but if you want one to open a door by itself and go through the doorway, that’s a tremendous challenge,” Ma said.

Self-Closing Door Pulling by a Mobile Vehicle using Deep Reinforcement Learning based Force-Vision Sensor Fusion Method.

Students in UC’s Intelligent Robotics and Autonomous Systems Laboratory have solved this complex problem in three-dimensional digital simulations. Now they’re building an autonomous robot that not only can open its own doors but also can find the nearest electric wall outlet to recharge without human assistance.

This simple advance in independence represents a huge leap forward for helper robots that vacuum and disinfect office buildings, airports and hospitals. Helper robots are part of a $27 billion robotics industry, which includes manufacturing and automation.

UC College of Engineering and Applied Science doctoral student Yufeng Sun, the study’s lead author, said some researchers have addressed the problem by scanning an entire room to create a 3D digital model so the robot can locate a door. But that is a time-consuming custom solution that works only for the particular room that is scanned.

Sun said developing an autonomous robot to open a door for itself poses several challenges.

Doors come in different colors and sizes with different handles that might be slightly higher or lower. Robots have to know how much force to use to open doors to overcome resistance. Most public doors are self-closing, which means if the robot loses its grip, it has to start over.

Since UC students are using machine learning, the robot has to “teach” itself how to open a door, essentially through trial and error. This can be time-consuming initially, but the robot corrects its mistakes as it goes. Simulations help the robot prepare for the actual task, Sun said.
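The section title mentions force-vision sensor fusion; as a hedged illustration of that idea, and not the UC team's actual architecture, here is a minimal PyTorch sketch of a policy network that fuses a camera image with a force/torque reading, the kind of observation a deep-RL door-pulling agent might use. Layer sizes and the action count are assumptions.

```python
# A minimal PyTorch sketch (not the UC team's architecture) of force-vision
# fusion: a policy network that combines a camera image with a 6-axis
# force/torque reading. Layer sizes and the action count are assumptions.
import torch
import torch.nn as nn

class ForceVisionPolicy(nn.Module):
    def __init__(self, n_actions=6):
        super().__init__()
        # Vision branch: a small CNN over an RGB camera frame.
        self.vision = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Force branch: a small MLP over the force/torque sensor reading.
        self.force = nn.Sequential(nn.Linear(6, 32), nn.ReLU())
        # Fused head producing action values for the mobile base and gripper.
        self.head = nn.Sequential(nn.Linear(32 + 32, 64), nn.ReLU(),
                                  nn.Linear(64, n_actions))

    def forward(self, image, wrench):
        fused = torch.cat([self.vision(image), self.force(wrench)], dim=-1)
        return self.head(fused)

# Example forward pass with a dummy 64x64 frame and a zero force/torque vector.
policy = ForceVisionPolicy()
print(policy(torch.zeros(1, 3, 64, 64), torch.zeros(1, 6)).shape)  # torch.Size([1, 6])
```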

“The robot needs sufficient data or ‘experiences’ to help train it,” Sun said. “This is a big challenge for other robotic applications using AI-based approaches for accomplishing real-world tasks.”

Now, Sun and UC master’s student Sam King are converting Sun’s successful simulation study into a real robot.

“The challenge is how to transfer this learned control policy from simulation to reality, often referred to as a ‘Sim2Real’ problem,” Sun said.

Digital simulations typically are only 60% to 70% successful in initial real-world applications, Sun said. He expects to spend a year or more bridging the gap to perfect his new autonomous robotics system.

So there’s plenty of time to invest in robot-proof door locks.

Bubble casting soft robotics

by Trevor J. Jones, Etienne Jambon-Puillet, Pierre-Thomas Brun et al. in Nature

Princeton researchers have invented bubble casting, a new way to make soft robots using “fancy balloons” that change shape in predictable ways when inflated with air.

The new system involves injecting bubbles into a liquid polymer, letting the material solidify and inflating the resulting device to make it bend and move. The researchers used this approach to design and create hands that grip, a fishtail that flaps and slinky-like coils that retrieve a ball. They hope that their simple and versatile method, published Nov. 10 in the journal Nature, will accelerate the development of new types of soft robots.

Traditional rigid robots have multiple uses, such as in manufacturing cars.

“But they will not be able to hold your hands and allow you to move somewhere without breaking your wrist,” said Pierre-Thomas Brun, an assistant professor of chemical and biological engineering and the lead researcher on the study. “They’re not naturally geared to interact with the soft stuff, like humans or tomatoes.”

Soft robots use squishier, flexible materials, making them desirable for applications that need a gentle touch. They may one day be used to harvest produce, grab delicate items off a conveyor belt or provide personal care. They may also be useful in health care, such as in wearable exosuits for rehabilitation or implantable devices that wrap around the heart to help it beat.

One challenge in designing soft robots is controlling how they stretch and deform, which dictates how they move. All robots have components that cause movement, called actuators. Unlike rigid robots that move in fixed ways depending on their joints, the materials in soft robots have the potential to move and expand in an infinite number of ways.

Bubble casting offers a simple, flexible way to create actuators for soft robots using basic rules of fluid mechanics — the physics of fluids. The method uses a liquid polymer called elastomer, which cures to become a rubbery, elastic material. It is injected into a mold as simple as a drinking straw or a more complex shape, like a spiral or flipper. Next, the researchers inject air into the liquid elastomer to create a long bubble throughout the length of the mold. Thanks to gravity, the bubble slowly rises to the top as the elastomer drains to the bottom. Once the elastomer hardens, it can be removed from the mold and inflated with air, which causes the thin side with the bubble to stretch and curl in on the thicker base.

By controlling a handful of factors — the thickness of the elastomer coating the mold, how quickly the elastomer settles to the bottom and how long it takes to cure — the researchers can dictate how the resulting actuator will move. In other words, “fluid mechanics is doing the work,” Brun said.

“If it’s allowed more time to drain before curing, the film at the top will be thinner. And the thinner the film, the more it will stretch when you inflate it and cause greater overall bending,” explained first author Trevor Jones, a graduate student in chemical and biological engineering.
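To illustrate the trend Jones describes, the sketch below applies a classical gravity-drainage scaling for a viscous film, h ≈ √(μL/ρgt): the longer the elastomer drains before curing, the thinner the top film. This is not the paper's model of bubble casting, and the material constants are rough assumptions for an uncured elastomer.

```python
# Illustrative only: a classical gravity-drainage scaling for a viscous film,
# h ~ sqrt(mu * L / (rho * g * t)), used to show the trend described above
# (longer drainage before curing gives a thinner, more stretchable top film).
# This is not the paper's model, and the constants are rough assumptions
# for an uncured elastomer.
import math

MU = 3.5      # assumed dynamic viscosity of uncured elastomer, Pa*s
RHO = 1030.0  # assumed density, kg/m^3
G = 9.81      # gravitational acceleration, m/s^2
L = 0.01      # assumed characteristic drainage length, m

def film_thickness_mm(drain_time_s):
    """Approximate top-film thickness after draining for `drain_time_s` seconds."""
    return 1000.0 * math.sqrt(MU * L / (RHO * G * drain_time_s))

for t in (60, 300, 900, 1800):  # drainage times before curing, in seconds
    print(f"{t:5d} s -> film ~ {film_thickness_mm(t):.2f} mm")
```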

The researchers successfully cast star-shaped “hands” that gently grip a blackberry, a coil that contracts like a muscle and even a set of “fingers” that curl up one by one as the entire system is inflated, as if playing the piano.

The actuators in this paper deform when inflated with air, but other soft robotics systems employ magnetic fields, electric fields, or changes in temperature or humidity.

A large part of the work was figuring out how the robots would behave once inflated so that the researchers could design soft actuators with specific movements. Co-author Etienne Jambon-Puillet, a postdoctoral researcher in Brun’s group, worked with Jones to develop a computer simulation of the system.

“We can predict what will happen using a simple equation that anyone can use,” said Jambon-Puillet. “We understand quite well now what happens when we inflate these tube-like materials.”

A major advantage of bubble casting is that it does not require 3D printers, laser cutters or other expensive tools typically used in soft robotics. The system is also scalable. It has the potential to yield actuators several meters long with features as thin as 100 microns — roughly the width of a human hair.

“What’s really smart is this idea to shape the structure just by natural fluid motion,” said François Gallaire, a professor of fluid dynamics at the EPFL in Lausanne, Switzerland, who was not involved in the research. “These processes are going to work at many different scales, including for very tiny things. That’s exciting because casting these tubes with typical fabrication methods could be really difficult, so there’s the potential to make very small tubes.”

Despite its flexibility, bubble casting does have its limits. So far, researchers have succeeded in forcing a bubble through only a few meters of elastomer-filled tubing. Also, overinflation can cause the balloons to pop.

“Failure is fairly catastrophic,” Jones said.

Next, the group will use their system to create more complex actuators and explore new applications. They are interested in designing actuators that move together in sequential waves, like the rippling feet of a walking millipede. Another possibility is creating actuators with chambers that alternately contract and relax using a single pressure source to inflate them, mimicking the beating of the human heart.

“We understand this problem at a physics level pretty strongly,” said Jones, “so now robotics can really be explored.”

A System for General In-Hand Object Re-Orientation

by Tao Chen, Jie Xu, Pulkit Agrawal

Scientists from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), in the ever-present quest to get machines to replicate human abilities, created a framework that’s more scaled up: a system that can reorient over two thousand different objects, with the robotic hand facing both upwards and downwards. This ability to manipulate anything from a cup to a tuna can, and a Cheez-It box, could help the hand quickly pick-and-place objects in specific ways and locations — and even generalize to unseen objects.

At just one year old, a baby is more dexterous than a robot. Sure, machines can do more than just pick up and put down objects, but we’re not quite there as far as replicating a natural pull towards exploratory or sophisticated dexterous manipulation goes.

OpenAI gave it a try with “Dactyl” (meaning “finger” from the Greek word daktylos), using their humanoid robot hand to solve a Rubik’s cube with software that’s a step towards more general AI, and a step away from the common single-task mentality. DeepMind created “RGB-Stacking,” a vision-based system that challenges a robot to learn how to grab items and stack them.

This deft “handiwork” — which is usually limited by single tasks and upright positions — could be an asset in speeding up logistics and manufacturing, helping with common demands such as packing objects into slots for kitting, or dexterously manipulating a wider range of tools. The team used a simulated, anthropomorphic hand with 24 degrees of freedom, and showed evidence that the system could be transferred to a real robotic system in the future.

“In industry, a parallel-jaw gripper is most commonly used, partially due to its simplicity in control, but it’s physically unable to handle many tools we see in daily life,” says MIT CSAIL Ph.D. student Tao Chen, member of the Improbable AI Lab and the lead researcher on the project. “Even using a plier is difficult because it can’t dexterously move one handle back and forth. Our system will allow a multi-fingered hand to dexterously manipulate such tools, which opens up a new area for robotics applications.”

This type of “in-hand” object reorientation has been a challenging problem in robotics, due to the large number of motors to be controlled and the frequent change in contact state between the fingers and the objects. And with over two thousand objects, the model had a lot to learn.

The problem becomes even trickier when the hand faces downwards: the robot must not only manipulate the object but also counteract gravity so that it doesn’t drop it.

The team found that a simple approach could solve complex problems. They used a model-free reinforcement learning algorithm (meaning the system has to figure out value functions from interactions with the environment) with deep learning, and something called a “teacher-student” training method.

For this to work, the “teacher” network is trained on information about the object and robot that’s easily available in simulation, but not in the real world, such as the location of fingertips or object velocity. To ensure that the robots can work outside of the simulation, the knowledge of the “teacher” is distilled into observations that can be acquired in the real world, such as depth images captured by cameras, object pose, and the robot’s joint positions. They also used a “gravity curriculum,” where the robot first learns the skill in a zero-gravity environment, and then slowly adapts the controller to the normal gravity condition, which, when taking things at this pace, really improved the overall performance.
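As a hedged sketch of the gravity-curriculum idea, and not the authors' training code, the snippet below ramps gravity from zero toward its normal value as the policy's success rate improves; the thresholds, step size, and training hooks are assumptions.

```python
# A sketch of a "gravity curriculum": start training in zero gravity and ramp
# toward Earth gravity as the policy's success rate improves. The thresholds,
# step size, and the train/evaluate hooks are assumptions, not the paper's settings.

def gravity_curriculum(train_epoch, evaluate, epochs=1000,
                       g_start=0.0, g_end=-9.81, g_step=-0.5,
                       success_threshold=0.8):
    gravity = g_start
    for _ in range(epochs):
        train_epoch(gravity)                         # one RL update at the current gravity
        if gravity > g_end and evaluate(gravity) >= success_threshold:
            gravity = max(g_end, gravity + g_step)   # make the task a bit harder
    return gravity

# Toy usage with stand-in hooks; a real run would plug in the simulator and policy.
final_g = gravity_curriculum(train_epoch=lambda g: None,
                             evaluate=lambda g: 0.9, epochs=50)
print(final_g)  # reaches -9.81 once the success threshold keeps being met
```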

Perhaps counterintuitively, a single controller (the “brain” of the robot, so to speak) could reorient a large number of objects it had never seen before, with no knowledge of their shape.

“We initially thought that visual perception algorithms for inferring shape while the robot manipulates the object was going to be the primary challenge,” says MIT professor Pulkit Agrawal, an author on the paper about the research. “To the contrary, our results show that one can learn robust control strategies that are shape agnostic. This suggests that visual perception may be far less important for manipulation than what we are used to thinking, and simpler perceptual processing strategies might suffice.”

Many small, round objects (apples, tennis balls, marbles) had success rates close to one hundred percent when reoriented with the hand facing either up or down, while the lowest success rates, unsurprisingly, came with more complex objects such as a spoon, a screwdriver, or scissors, at closer to thirty percent.

Beyond bringing the system out into the wild, the team notes that, because success rates varied with object shape, training the model on information about object shapes could improve performance in the future.

Adaptive asynchronous control system of robotic arm based on augmented reality-assisted brain-computer interface

by Lingling Chen et al in Journal of Neural Engineering

Researchers at Hebei University of Technology and other institutes in China have developed an innovative system for controlling robotic arms that is based on augmented reality (AR) and a brain-computer interface. This system, presented in a paper published in the Journal of Neural Engineering, could enable the development of bionic or prosthetic arms that are easier for users to control.

For people with motor impairments or physical disabilities, completing daily tasks and house chores can be incredibly challenging. Recent advancements in robotics, such as brain-controlled robotic limbs, have the potential to significantly improve their quality of life.

“In recent years, with the development of robotic arms, brain science and information decoding technology, brain-controlled robotic arms have attained increasing achievements,” Zhiguo Luo, one of the researchers who carried out the study, told TechXplore. “However, disadvantages like poor flexibility restrict their widespread application. We aim to promote the lightweight and practicality of brain-controlled robotic arms.”

The system developed by Luo and his colleagues integrates AR technology, which allows users to view an enhanced version of their surroundings that includes digital elements, and a brain-controlled interface, with a conventional method for controlling robotic limbs known as asynchronous control. This ultimately allows users to achieve greater control over robotic arms, enhancing the accuracy and efficiency of the resulting movements.

Asynchronous control methods are inspired by the way in which the human brain operates. More specifically, they try to replicate the brain’s ability to alternate between working and idle states.

“The key point of asynchronous control is to distinguish the idle state and the working state of the robotic system,” Luo explained. “After a user starts operating our robotic arm system, the system is initialized to the idle state. When the control command comes to the subject’s mind, the subject can switch the system to the working state via the state switching interface.”

After the system created by the researchers is switched into the working state, users can simply select the control commands for the movements they wish to perform and the system transmits them to the robotic arm they are wearing. When the robotic arm receives these commands, it simply performs the desired movements or task. Once the task is completed, the system automatically goes back into an idle state.
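Here is a minimal sketch of that asynchronous idle/working loop, assuming illustrative hooks for intent detection and SSVEP command decoding rather than the team's actual interfaces.

```python
# A minimal sketch of the asynchronous idle/working control loop described above.
# The hooks standing in for AR-based intent detection and SSVEP command decoding
# are illustrative assumptions, not the team's interfaces.
import random
from enum import Enum

class State(Enum):
    IDLE = "idle"
    WORKING = "working"

def control_loop(detect_switch_intent, decode_command, send_to_arm, steps=20):
    """Stay idle until the user switches state; then decode one command, send it
    to the robotic arm, and drop back to idle once the task is done."""
    state = State.IDLE
    for _ in range(steps):
        if state is State.IDLE:
            if detect_switch_intent():      # user switches via the state-switching interface
                state = State.WORKING
        else:
            command = decode_command()      # e.g. the SSVEP target the user attends to
            send_to_arm(command)            # arm executes the selected movement
            state = State.IDLE              # task complete: return to idle automatically
    return state

# Toy demo with stand-in hooks.
control_loop(detect_switch_intent=lambda: random.random() < 0.2,
             decode_command=lambda: "grasp_cup",
             send_to_arm=lambda cmd: print("arm executes:", cmd))
```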

“A unique feature of our system is the successful integration of AR-BCI, asynchronous control, and an adaptive stimulus time adjustment method for data processing,” Luo said. “Compared to conventional BCI systems, our system is also more flexible and easier to control.”

The adaptive nature of the system created by Luo and his colleagues allows it to flexibly adjust the duration of the AR content presented to users based on the user’s state while they are using the robotic arm. This can significantly reduce fatigue caused by looking at a screen or digital content. Moreover, compared to conventional brain-computer interfaces, the team’s AR-enhanced system reduces constraints on the physical activity of users, allowing them to operate robotic arms with greater ease.

“Ultimately, we were able to successfully integrate AR, brain-computer interfaces, adaptive asynchronous control and a new spatial filtering algorithm to classify the SSVEP signals, which provides new ideas for the development of a brain-controlled robotic arm,” Luo said. “Our approach helps to improve the practicality of brain-controlled robotic arm and accelerate the application of this technology in real life.”

The researchers evaluated their system in a series of experiments and attained highly promising results. Most notably, they found that their system allows users to perform the movements they wanted using a robotic arm with an accuracy of 94.97%. In addition, the ten users who tested their system were able to select single commands for robotic arms within an average time of 2.04 seconds. Overall, these findings suggest that their system improves the efficiency with which users can control robotic arms, while also reducing their visual fatigue.

In the future, the approach proposed by this team of researchers could help to enhance the performance of both existing and newly developed robotic arms. This could facilitate the implementation of these systems both in healthcare settings and elderly care facilities, allowing patients and guests to engage in some of their daily activities independently and thus enhancing their quality of life.

So far, Luo and his colleagues only tested their system on users with no motor impairments or disabilities. However, they soon hope to also evaluate it in collaboration with elderly users or users with physical disabilities, to explore its potential and applicability further.

“We now plan to work on the following aspects to improve the system’s reliability and practicability for social life,” Luo added. “First, in terms of asynchronous control strategy, EOG and other physiological signals can be used to improve the asynchronous control process. Second, EEG decoding, transfer learning, and other methods can improve the model training process even further. Furthermore, in terms of the dynamic window, we could use other prediction methods to modify the system threshold in real-time.”

Shaping embodied agent behavior with activity-context priors from egocentric video

by Tushar Nagarajan, Kristen Grauman

Researchers at UT Austin and Facebook AI Research have recently developed a new framework that could shape the behavior of embodied agents more effectively, using ego-centric videos of humans completing everyday tasks. Their paper, pre-published on arXiv and set to be presented at the Neural Information Processing Systems (NeurIPS) Conference in December, introduces a more efficient approach for training robots to complete household chores and other interaction-heavy tasks.

Over the past decade or so, many roboticists and computer scientists have been trying to develop robots that can complete tasks in spaces populated by humans; for instance, helping users to cook, clean and tidy up. To tackle household chores and other manual tasks, robots should be able to solve complex planning tasks that involve navigating environments and interacting with objects following specific sequences.

While some techniques for solving these complex planning tasks have achieved promising results, most are not yet fully up to the job. As a result, robots cannot yet complete these tasks as well as human agents can.

“The overarching goal of this project was to build embodied robotic agents that can learn by watching people interact with their surroundings,” Tushar Nagarajan, one of the researchers who carried out the study, told TechXplore. “Reinforcement learning (RL) approaches require millions of attempts to learn intelligent behavior as agents begin by randomly attempting actions, while imitation learning (IL) approaches require experts to control and demonstrate ideal agent behavior, which is costly to collect and requires extra hardware.”

The main idea behind the researchers’ paper. Left and middle panel: The team discovered activity-contexts for objects directly from egocentric video of human activity. A given object’s activity-context goes beyond “what objects are found together” to capture the likelihood that each other object in the environment participates in activities involving it (i.e., “what objects together enable action”). Right panel: The team’s approach guides agents to bring compatible objects — objects with high likelihood — together to enable activities. For example, bringing a pan to the sink increases the value of faucet interactions, but bringing it to the table has little effect on interactions with a book. Credit: Nagarajan & Grauman.

In contrast with robotic systems, when entering a new environment, humans can effortlessly complete tasks that involve different objects. Nagarajan and his colleague Kristen Grauman thus set out to investigate whether embodied agents could learn to complete tasks in similar environments simply by observing how humans behave.

Rather than training agents using video demonstrations labeled by humans, which are often expensive to collect, the researchers wanted to leverage egocentric (first-person) video footage showing people performing everyday activities, such as cooking a meal or washing dishes. These videos are easier to collect and more readily accessible than annotated demonstrations.

“Our work is the first to use free-form human-generated video captured in the real world to learn priors for object interactions,” Nagarajan said. “Our approach converts egocentric video of humans interacting with their surroundings into ‘activity-context’ priors, which capture what objects, when brought together, enable activities. For example, watching humans do the dishes suggests that utensils, dish soap and a sponge are good objects to have before turning on the faucet at the sink.”

To acquire these ‘priors’ (e.g., useful information about what objects to gather before completing a task), the model created by Nagarajan and Grauman accumulates statistics about pairs of objects that humans tend to use during specific activities. Their model directly detected these objects in ego-centric videos from the large dataset used by the researchers.

Subsequently, the model encoded the priors it acquired as a reward in a reinforcement learning framework. Essentially, this means that an agent is rewarded based on what objects it selected for completing a given task.

“For example, turning-on the faucet is given a high reward when a pan is brought near the sink (and a low reward if, say, a book is brought near it),” Nagarajan explained. “As a consequence, an agent must intelligently bring the right set of objects to the right locations before attempting interactions with objects, in order to maximize their reward. This helps them reach states that lead to activities, which speeds up learning.”
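Here is a small, self-contained sketch of that idea: build pairwise "activity-context" statistics from objects detected together in egocentric clips, then reward interactions more when compatible objects are nearby. The data structures and scoring are illustrative assumptions, not the authors' exact formulation.

```python
# A self-contained sketch of the "activity-context" prior described above:
# count which objects humans use together in egocentric clips, then reward an
# interaction more when compatible objects have been brought nearby. The data
# structures and scoring are assumptions, not the authors' exact formulation.
from collections import defaultdict

def build_activity_context(video_detections):
    """video_detections: one set of detected object names per clip,
    e.g. {"pan", "faucet", "sponge"} for a dish-washing clip."""
    counts = defaultdict(lambda: defaultdict(int))
    for objects in video_detections:
        for a in objects:
            for b in objects:
                if a != b:
                    counts[a][b] += 1
    # Normalize: likelihood that object b participates in activities involving a.
    return {a: {b: c / sum(nbrs.values()) for b, c in nbrs.items()}
            for a, nbrs in counts.items()}

def auxiliary_reward(prior, interacted_object, nearby_objects):
    """Reward an interaction more when a highly compatible object is nearby."""
    scores = [prior.get(interacted_object, {}).get(o, 0.0) for o in nearby_objects]
    return max(scores, default=0.0)

clips = [{"pan", "faucet", "sponge"}, {"pan", "stove"}, {"book", "couch"}]
prior = build_activity_context(clips)
print(auxiliary_reward(prior, "faucet", {"pan"}))   # 0.5: pan near the sink helps
print(auxiliary_reward(prior, "faucet", {"book"}))  # 0.0: a book is irrelevant
```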

Previous studies have tried to accelerate robot policy learning using similar reward functions. However, typically these are exploration rewards that encourage agents to explore new locations or perform new interactions, without specifically considering the human tasks they are learning to complete.

“Our formulation improves on these previous approaches by aligning the rewards with human activities, helping agents explore more relevant object interactions,” Nagarajan said. “Our work is also unique in that it learns priors about object interactions from free-form video, rather than video tied to specific goals (as in behavior cloning). The result is a general-purpose auxiliary reward to encourage efficient RL.”

In contrast with priors considered by previously developed approaches, the priors considered by the researchers’ model also capture how objects are related in the context of actions that the robot is learning to perform, rather than merely their physical co-occurrence (e.g., spoons can be found near knives) or semantic similarity (e.g., potatoes and tomatoes are similar objects).

The researchers evaluated their model using a dataset of ego-centric videos showing humans as they complete everyday chores and tasks in the kitchen. Their results were promising, suggesting that their model could be used to train household robots more effectively than other previously developed techniques.

“Our work is the first to demonstrate that passive video of humans performing daily activities can be used to learn embodied interaction policies,” Nagarajan said. “This is a significant achievement, as egocentric video is readily available in large amounts from recent datasets. Our work is a first step towards enabling applications that can learn about how humans perform activities (without the need for costly demonstrations) and then offer assistance in the home-robotics setting.”

In the future, the new framework developed by this team of researchers could be used to train a variety of physical robots to complete simple everyday tasks. In addition, it could be used to train augmented reality (AR) assistants, which could, for instance, observe how a human cooks a specific dish and then teach new users to prepare it.

“Our research is an important step towards learning by watching humans, as it captures simple, yet powerful priors about objects involved in activities,” Nagarajan added. “However, there are other meaningful things to learn such as: What parts of the environment support activities (scene affordances)? How should objects be manipulated or grasped to use them? Are there important sequences of actions (routines) that can be learned and leveraged by embodied agents? Finally, an important future research direction to pursue is how to take policies learned in simulated environments and deploy them onto mobile robot platforms or AR glasses, in order to build agents that can cooperate with humans in the real world.”

ReSkin: versatile, replaceable, lasting tactile skins

by Raunaq Bhirangi, Tess Hellebrekers, Carmel Majidi et al.

Picking up a blueberry or grape without squishing it isn’t hard, but try teaching it to a robot. The same goes for walking on ice, turning a key to unlock a door or cooking a favorite dish. When it comes to the senses, touch remains a challenge for artificial intelligence and robotics researchers. Carnegie Mellon University and Meta AI (formerly Facebook AI) hope to change that through a new tactile sensing skin they believe will increase the sense of touch in robotics, wearables, smart clothing and AI. Called ReSkin, the technology is affordable, durable and easy-to-use. It harnesses advances in machine learning, soft robotics and magnetic sensing to create a skin that is versatile and as easy to apply as a bandage.

“I want this sensing skin to be so robust and simple that anyone could set it up and start gathering touch data within a day,” said Tess Hellebrekers, a postdoctoral researcher at Meta AI who earned her Ph.D. from CMU’s Robotics Institute in 2020.

Abhinav Gupta, an associate professor in the RI and the research manager at Meta AI who worked with Hellebrekers on ReSkin, said the technology could lead to an explosion of possible applications for tactile sensing. It will provide rich contact data for touch-based tasks like determining what something is, sensing movement and grasping an object. This could help in health care or in areas where dexterity to maneuver small, soft or sensitive objects is critical.

ReSkin costs less than $600 for 100 units and even less for larger quantities. It is 2 to 3 millimeters thick and can last for more than 50,000 interactions. Its sensors provide high resolution results and can detect slipping, throwing, catching and clapping. All this makes ReSkin ideal for use on robotic hands, tactile gloves, arm sleeves and even dog shoes, and helps researchers collect data that previously would have been difficult or impossible to gather.

“When it wears out, it can be easily peeled off and replaced with a new one,” Gupta said.

ReSkin can easily be replaced because there are no wires between the skin and the sensing board. The sensing board only needs to be nearby. AI methods developed by Raunaq Bhirangi, a Ph.D. student in robotics at CMU and part of the ReSkin team, contribute to its ease of use. These methods enable the sensors to autocalibrate and allow the data to remain consistent even when the skin is replaced. ReSkin requires no hardware or software changes between sensor skins.

“Repeatability and replaceability are two of the biggest bottlenecks hindering widespread use of soft sensors in robotics,” Bhirangi said. “With ReSkin, we use simple machine learning techniques to solve these problems and open a scalable and inexpensive tactile sensing module that can be used for a diverse set of applications.”
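As a toy illustration of the learning idea, and not Meta's or CMU's code, the sketch below fits a linear map from simulated magnetometer readings to a contact-force estimate; ReSkin's actual models also self-calibrate across replaced skins, and all data here are synthetic assumptions.

```python
# A toy illustration (not Meta/CMU code) of the learning idea behind ReSkin:
# deforming the magnetized skin changes nearby magnetometer readings, and a
# small learned model maps those readings to contact properties. Here a linear
# least-squares fit maps synthetic 5-magnetometer x 3-axis readings to a
# normal-force estimate; all data and dimensions are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_channels = 2000, 15                 # 5 magnetometers x 3 axes
true_weights = rng.normal(size=n_channels)

readings = rng.normal(size=(n_samples, n_channels))                   # raw magnetic flux
force = readings @ true_weights + 0.05 * rng.normal(size=n_samples)   # noisy label

# Fit the mapping from magnetic readings to force. (ReSkin's models additionally
# self-calibrate so data stay consistent when a worn skin is swapped out.)
weights, *_ = np.linalg.lstsq(readings, force, rcond=None)

test = rng.normal(size=(5, n_channels))
print(np.allclose(test @ weights, test @ true_weights, atol=0.1))  # close fit: True
```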

Improvements in computer vision have helped robots and AI see and perceive the world as humans do. Natural language processing has given computers the ability to listen and speak. Touch is another step toward infusing AI and robotics with a sense of what it is like to be human.

“We want AI to understand the richness of the world,” Gupta said. “Touch really helps you sense the world in a true form.”

ReSkin is part of a broader research initiative by Meta AI into touch and tactile sensing that includes high-resolution tactile hardware, simulators, benchmarks and data sets. Meta AI hopes improvements in touch will unlock possibilities in augmented and virtual reality and lead to innovations in industrial, medical and agricultural robotics.

Enabling a social robot to process social cues to detect when to help a user

by Jason R. Wilson, Phyo Thuta Aung, Isabelle Boucher

Researchers at Franklin & Marshall College have recently been trying to develop computational tools that could enhance the performance of socially assistive robots, by allowing them to process social cues given by humans and respond accordingly. In a paper pre-published on arXiv and presented at the AI-HRI symposium 2021 last week, they introduced a new technique that allows robots to autonomously detect when it is appropriate for them to step in and help users.

As robots are introduced in an increasing number of real-world settings, it is important for them to be able to effectively cooperate with human users. In addition to communicating with humans and assisting them in everyday tasks, it might thus be useful for robots to autonomously determine whether their help is needed or not.

“I am interested in designing robots that help people with everyday tasks, such as cooking dinner, learning math, or assembling Ikea furniture,” Jason R. Wilson, one of the researchers who carried out the study, told TechXplore. “I’m not looking to replace people that help with these tasks. Instead, I want robots to be able to supplement human assistance, especially in cases where we do not have enough people to help.”

Wilson believes that when a robot helps humans to complete a given task, it should do so in a ‘dignified’ way. In other words, he thinks that robots should ideally be sensitive to their users’ humanity, respecting their dignity and autonomy.

There are several ways in which roboticists can consider the dignity and autonomy of users in their designs. In their recent work, Wilson and his students Phyo Thuta Aung and Isabelle Boucher specifically focused on preserving a user’s autonomy.

“One way for a robot to support autonomy is to ensure that the robot finds a balance between helping too much and too little,” Wilson explained. “My prior work has looked at algorithms for adjusting the robot’s amount of assistance based on how much help the user needs. Our recent study focused on estimating how much help the user needs.”

When humans need help with a given task, they can explicitly ask for assistance or convey that they are struggling in implicit ways. For example, they could make comments such as “hmm, I am not sure,” or express their frustration through their facial expressions or body language. Other implicit strategies used by humans to communicate that they need help involve the use of their eye gaze.

“For example, a person may look at the task they are working on, then look at a person that can help them and then look back at the task,” Wilson said. “This gaze pattern, called confirmatory gaze, is used to request that the other person look at what they are looking at, perhaps because they are unsure if it is correct.”

The key objective of the recent study carried out by Wilson, Aung and Boucher was to allow robots to automatically process eye-gaze-related cues in useful ways. The technique they created can analyze different types of cues, including a user’s speech and eye gaze patterns.

“The architecture we are developing automatically recognizes the user’s speech and analyzes it to determine if they are expressing that they want or need help,” Wilson explained. “At the same time, the system also detects users’ eye gaze patterns, determining if they are exhibiting a gaze pattern associated with needing help.”
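Here is a minimal sketch of that two-cue decision, with keyword matching standing in for the speech analysis and a task-robot-task pattern standing in for confirmatory-gaze detection; the phrases, gaze encoding, and decision rule are all illustrative assumptions rather than the researchers' architecture.

```python
# A minimal sketch of the two-cue decision described above: keyword matching
# stands in for the speech analysis, and a task -> robot -> task sequence stands
# in for confirmatory-gaze detection. Phrases, gaze encoding, and the decision
# rule are illustrative assumptions, not the researchers' architecture.

HELP_PHRASES = ("not sure", "help", "stuck", "hmm", "how do i")

def speech_suggests_help(transcript):
    text = transcript.lower()
    return any(phrase in text for phrase in HELP_PHRASES)

def confirmatory_gaze(gaze_targets):
    """True if the recent gaze sequence contains task -> robot -> task."""
    for i in range(len(gaze_targets) - 2):
        if gaze_targets[i:i + 3] == ["task", "robot", "task"]:
            return True
    return False

def should_offer_help(transcript, gaze_targets):
    return speech_suggests_help(transcript) or confirmatory_gaze(gaze_targets)

print(should_offer_help("hmm, I am not sure", ["task", "task"]))        # True
print(should_offer_help("this looks fine", ["task", "robot", "task"]))  # True
print(should_offer_help("this looks fine", ["task", "task", "shelf"]))  # False
```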

In contrast with other techniques to enhance human-robot interactions, the approach does not require information about the specific task that users are completing. This means that it could be easily applied to robots operating in various real-world contexts and trained to tackle different tasks.

While the model created by Wilson and his colleagues can enhance user experiences without the need for task-specific details, developers can still provide these details to enhance its accuracy and performance. In initial tests, the framework achieved highly promising results, so it could soon be used to improve the performance of both existing and newly developed social robots.

“We are now continuing to explore what social cues would best allow a robot to determine when a user needs help and how much help they want,” Wilson said. “One important form of nonverbal communication that we are not using yet is emotional expression. More specifically, we are looking at analyzing facial expressions to see when a user feels frustrated, bored, engaged or challenged.”

MISC

  • November’s issue of Science Robotics is out.
  • Researchers have created human-style eyes for robots, taking some inspiration from Jabba the Hutt.

Subscribe to Paradigm!

Medium. Twitter. Telegram. Telegram Chat. Reddit. LinkedIn.

Main sources

Research articles

Science Robotics

Science Daily

IEEE Spectrum
