RT/ A precision arm for miniature robots

Published in Paradigm · 23 min read · Jan 19, 2023

Robotics biweekly vol.66, 30th December — 19th January

TL;DR

  • Until now, microscopic robotic systems have had to make do without arms. Now researchers have developed an ultrasonically actuated glass needle that can be attached to a robotic arm. This lets them pump and mix minuscule amounts of liquid and trap particles.
  • Birds fly more efficiently by folding their wings during the upstroke, according to a recent study. The results could mean that wing-folding is the next step in increasing the propulsive and aerodynamic efficiency of flapping drones.
  • Using artificial intelligence, engineers have simplified and reinforced models that accurately calculate the fine particulate matter (PM2.5) — the soot, dust and exhaust emitted by trucks and cars that get into human lungs — contained in urban air pollution.
  • The research team integrated deep-learning techniques with the use of drones to automatically detect defects on the reflector surface. Specifically, they began by manually controlling a drone equipped with a high-resolution RGB camera to fly over the surface along a predetermined route.
  • Researchers have developed a new reconfigurable workspace soft (RWS) robotic gripper that can scoop, pick and grasp a wide range of consumer items. The RWS gripper’s comprehensive and adaptive capabilities make it particularly useful in the logistics and food industries, which depend on robotic automation to meet increasing demands in efficiently picking and packing items.
  • An unmanned semi-submersible vehicle recently developed by researchers may prove that the best way to travel in water undetected and efficiently is not on top, or below, but in-between.
  • The research team established a multi-domain framework for switched electromechanical dynamics. Until now, researchers seeking to predict the energy usage of robotic systems were forced to rely on a piecemeal method providing only rough approximations under limited conditions.
  • Could the new chatbot ChatGPT convincingly produce fake abstracts that fool scientists into thinking those studies are the real thing? Yes, scientists can be fooled, the new study reports. Blinded human reviewers — when given a mix of real and falsely generated abstracts — could only spot ChatGPT-generated abstracts 68% of the time.
  • Researchers have recently created a new neuromorphic computing system supporting deep belief neural networks (DBNs), a generative and graphical class of deep learning models. This system is based on silicon-based memristors, energy-efficient devices that can both store and process information.
  • A team of researchers at Microsoft has demonstrated a new AI system that is capable of mimicking a person’s voice after training with a recording just three seconds long.
  • Robotics upcoming events. And more!

Robotics market

The global market for robots is expected to grow at a compound annual growth rate (CAGR) of around 26 percent to reach just under 210 billion U.S. dollars by 2025.

Size of the global market for industrial and non-industrial robots between 2018 and 2025 (in billion U.S. dollars). Source: Statista
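As a quick sanity check on these figures, a constant 26 percent CAGR back-solved from the 2025 projection implies a 2018 baseline of roughly 42 billion dollars. A minimal sketch of the arithmetic (the 2018 figure is derived here, not quoted from Statista):

```python
# Back-of-the-envelope check of the cited market projection.
# The 2025 figure and the 26% CAGR come from the text; the 2018
# baseline is back-solved here, not quoted from Statista.
cagr = 0.26
market_2025 = 210.0  # billion USD, projected

# Seven compounding periods separate 2018 and 2025.
market_2018 = market_2025 / (1 + cagr) ** 7

# Year-by-year trajectory implied by a constant CAGR.
for year in range(2018, 2026):
    size = market_2018 * (1 + cagr) ** (year - 2018)
    print(year, f"{size:.1f}B USD")  # 2018: ~41.7B ... 2025: 210.0B
```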

Latest News & Research

A robot-assisted acoustofluidic end effector

by Jan Durrer, Prajwal Agrawal, Ali Ozgul, Stephan C. F. Neuhauss, Nitesh Nama, Daniel Ahmed in Nature Communications

We are all familiar with robots equipped with moving arms. They stand in factory halls, perform mechanical work and can be programmed. A single robot can be used to carry out a variety of tasks. Until now, miniature systems that transport minuscule amounts of liquid through fine capillaries have had little association with such robots. Developed by researchers as an aid for laboratory analysis, such systems are known as microfluidics or lab-on-a-chip and generally make use of external pumps to move the liquid through the chips. To date, such systems have been difficult to automate, and the chips have had to be custom-designed and manufactured for each specific application.

Scientists led by ETH Professor Daniel Ahmed are now combining conventional robotics and microfluidics. They have developed a device that uses ultrasound and can be attached to a robotic arm. It is suitable for performing a wide range of tasks in microrobotic and microfluidic applications and can also be used to automate such applications.

Experimental set-up and working principle of the robot-assisted acoustofluidic end effector (RAEE) device.

The device comprises a thin, pointed glass needle and a piezoelectric transducer that causes the needle to oscillate. Similar transducers are used in loudspeakers, ultrasound imaging and professional dental cleaning equipment. The ETH researchers can vary the oscillation frequency of their glass needle. By dipping the needle into a liquid, they create a three-dimensional pattern composed of multiple vortices. Since this pattern depends on the oscillation frequency, it can be controlled accordingly.

The researchers were able to use this to demonstrate several applications. First, they were able to mix tiny droplets of highly viscous liquids. “The more viscous liquids are, the more difficult it is to mix them,” Professor Ahmed explains. “However, our method succeeds in doing this because it allows us to not only create a single vortex, but to also efficiently mix the liquids using a complex three-dimensional pattern composed of multiple strong vortices.”

Characterization of the circular streaming in an acoustofluidic device.

Second, the scientists were able to pump fluids through a mini-channel system by creating a specific pattern of vortices and placing the oscillating glass needle close to the channel wall. Third, they succeeded in using their robot-assisted acoustic device to trap fine particles present in the fluid. This works because a particle’s size determines its reaction to the sound waves. Relatively large particles move towards the oscillating glass needle, where they accumulate. The researchers demonstrated how this method can capture not only inanimate particles but also fish embryos. They believe it should also be capable of capturing biological cells in the fluid.
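Because the task (mixing, pumping or trapping) is selected simply by changing the needle’s oscillation frequency, the control layer for such a device could reduce to a lookup from task to transducer frequency. The sketch below is purely illustrative: the frequencies, the FunctionGenerator class and its method are hypothetical stand-ins, not the ETH team’s actual interface.

```python
# Illustrative sketch only: the frequencies and the whole
# FunctionGenerator interface are hypothetical, not the real
# RAEE control stack.

class FunctionGenerator:
    """Hypothetical driver for the piezoelectric transducer."""
    def set_frequency(self, hz: float) -> None:
        print(f"driving transducer at {hz / 1e3:.0f} kHz")

# Hypothetical task-to-frequency map; in the real device, each
# frequency produces a different 3D vortex pattern in the fluid.
TASK_FREQUENCIES_HZ = {
    "mix": 95_000,    # multi-vortex pattern stirs viscous droplets
    "pump": 120_000,  # vortices near a channel wall drive flow
    "trap": 80_000,   # large particles collect at the needle tip
}

def run_task(gen: FunctionGenerator, task: str) -> None:
    gen.set_frequency(TASK_FREQUENCIES_HZ[task])

gen = FunctionGenerator()
for task in ("mix", "pump", "trap"):
    run_task(gen, task)
```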

“In the past, manipulating microscopic particles in three dimensions was always challenging. Our microrobotic arm makes it easy,” Ahmed says.

Pre-programmed high-throughput mixing of viscous fluids in a 96-well plate using the RAEE.

“Until now, advancements in large, conventional robotics and microfluidic applications have been made separately,” Ahmed says. “Our work helps to bring the two approaches together.”

As a result, future microfluidic systems could be designed similarly to today’s robotic systems. An appropriately programmed single device would be able to handle a variety of tasks. “Mixing and pumping liquids and trapping particles — we can do it all with one device,” Ahmed says. This means tomorrow’s microfluidic chips will no longer have to be custom-developed for each specific application. The researchers would next like to combine several glass needles to create even more complex vortex patterns in liquids.

In addition to laboratory analysis, Ahmed can envisage other applications for microrobotic arms, such as sorting tiny objects. The arms could conceivably also be used in biotechnology as a way of introducing DNA into individual cells. It should ultimately be possible to employ them in additive manufacturing and 3D printing.

Robotic Avian Wing Explains Aerodynamic Advantages of Wing Folding and Stroke Tilting in Flapping Flight

by Enrico Ajanic, Adrien Paolini, Charles Coster, Dario Floreano, Christoffer Johansson in Advanced Intelligent Systems

Birds fly more efficiently by folding their wings during the upstroke, according to a recent study led by Lund University in Sweden. The results could mean that wing-folding is the next step in increasing the propulsive and aerodynamic efficiency of flapping drones.

Even the precursors to birds — extinct bird-like dinosaurs — benefited from folding their wings during the upstroke, as they developed active flight. Among flying animals alive today, birds are the largest and most efficient. This makes them particularly interesting as inspiration for the development of drones. However, determining which flapping strategy is best requires aerodynamic studies of various ways of flapping the wings. Therefore, a Swedish-Swiss research team has constructed a robotic wing that can achieve just that — flapping like a bird, and beyond.

“We have built a robot wing that can flap more like a bird than previous robots, but also flap in ways that birds cannot. By measuring the performance of the wing in our wind tunnel, we have studied how different ways of achieving the wing upstroke affect force and energy in flight,” says Christoffer Johansson, biology researcher at Lund University.

Mechanical design and working principle of the biomimetic wing and the drive mechanism.

Previous studies have shown that birds flap their wings more horizontally when flying slowly. The new study shows that the birds probably do it, even though it requires more energy, because it is easier to create sufficiently large forces to stay aloft and propel themselves. This is something drones can emulate to increase the range of speeds they can fly at.

“The new robotic wing can be used to answer questions about bird flight that would be impossible simply by observing flying birds. Research into the flight ability of living birds is limited to the flapping movement that the bird actually uses,” explains Christoffer Johansson.

The research explains why birds flap the way they do, by finding out which movement patterns create the most force and are the most efficient. The results can also be used in other research areas, such as better understanding how the migration of birds is affected by climate change and access to food. There are also many potential uses for drones where these insights can be put to good use. One area might be using drones to deliver goods.

“Flapping drones could be used for deliveries, but they would need to be efficient enough and able to lift the extra weight this entails. How the wings move is of great importance for performance, so this is where our research could come in handy,” concludes Christoffer Johansson.

Developing Machine learning models for hyperlocal traffic related particulate matter concentration mapping

by Salil Desai, Mohammad Tayarani, H. Oliver Gao in Transportation Research Part D: Transport and Environment

Using artificial intelligence, Cornell University engineers have simplified and reinforced models that accurately calculate the fine particulate matter (PM2.5) — the soot, dust and exhaust emitted by trucks and cars that get into human lungs — contained in urban air pollution. Thanks to the new research, city planners and government health officials can obtain a more precise accounting of the well-being of urban dwellers and the air they breathe.

“Infrastructure determines our living environment, our exposure,” said senior author Oliver Gao, the Howard Simpson Professor of Civil and Environmental Engineering in the College of Engineering at Cornell University. “Air pollution impact due to transportation — put out as exhaust from the cars and trucks that drive on our streets — is very complicated. Our infrastructure, transportation and energy policies are going to impact air pollution and hence public health.”

Previous methods of gauging air pollution were cumbersome and reliant on extraordinary numbers of data points.

“Older models to calculate particulate matter were computationally and mechanically consuming and complex,” said Gao, a faculty fellow at the Cornell Atkinson Center for Sustainability. “But if you develop an easily accessible data model, with the help of artificial intelligence filling in some of the blanks, you can have an accurate model at a local scale.”

Ambient air pollution is a leading cause of premature death around the world. Globally, more than 4.2 million annual fatalities — in the form of cardiovascular disease, ischemic heart disease, stroke and lung cancer — were attributed to air pollution in 2015, according to a Lancet study cited in the Cornell research.

In this work, the group developed four machine learning models of traffic-related particulate matter concentrations, trained on data gathered in New York City’s five boroughs, which have a combined population of 8.2 million people and 55 million daily vehicle-miles traveled. The models use a few inputs, such as traffic data, topology and meteorology, in an AI algorithm that learns to simulate a wide range of traffic-related air-pollution concentration scenarios. Their best-performing model was the Convolutional Long Short-term Memory, or ConvLSTM, which trained the algorithm to predict many spatially correlated observations.
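For a concrete picture of that model family, here is a minimal Keras sketch of a ConvLSTM that maps a short history of gridded inputs to a PM2.5 surface. The grid size, channel count and filter sizes are illustrative assumptions, not the configuration used in the paper.

```python
# Minimal ConvLSTM sketch for gridded PM2.5 estimation.
# All architecture choices here (grid size, channels, filters)
# are illustrative assumptions, not the paper's configuration.
import tensorflow as tf

T, H, W, C = 12, 32, 32, 3  # timesteps; grid cells; input channels
                            # (e.g., traffic, topology, meteorology)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(T, H, W, C)),
    # A ConvLSTM learns spatially correlated temporal patterns.
    tf.keras.layers.ConvLSTM2D(32, kernel_size=3, padding="same",
                               return_sequences=False),
    # Map the final hidden state to one PM2.5 value per grid cell.
    tf.keras.layers.Conv2D(1, kernel_size=1, activation="relu"),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```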

“Our data-driven approach — mainly based on vehicle emission data — requires considerably fewer modeling steps,” Desai said. Instead of focusing on stationary locations, the method provides a high-resolution estimation of the city street pollution surface. Higher resolution can help transportation and epidemiology studies assess health, environmental justice and air quality impacts.

Automated optical inspection of FAST’s reflector surface using drones and computer vision

by Jianan Li et al in Light: Advanced Manufacturing

The Five-hundred-meter Aperture Spherical radio Telescope (FAST), also known as the China Sky Eye, is the world’s largest single-dish radio telescope. Its reflector is a partial sphere of radius R = 300 m. The planar partial spherical cap of the reflector has a diameter of 519.6 m, 1.7 times that of the previously largest radio telescope.
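The two quoted dimensions are mutually consistent (a check of ours, not a claim from the article): a cap on a sphere of radius $R = 300$ m whose rim diameter is 519.6 m subtends a half-angle of $60^{\circ}$, since

$$
d = 2R\sin\theta = 2 \times 300\,\mathrm{m} \times \sin 60^{\circ} = 300\sqrt{3}\,\mathrm{m} \approx 519.6\,\mathrm{m}.
$$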

The large reflecting surface makes FAST the world’s most sensitive radio telescope. It was used by astronomers to observe, for the first time, fast radio bursts in the Milky Way and to identify more than 500 new pulsars, four times the total number of pulsars identified by other telescopes worldwide. More interesting and exotic objects may yet be discovered using FAST.

However, a larger reflecting surface is more prone to external damage due to environmental factors. The FAST reflector comprises a total of 4,450 spliced trilateral panels, made of aluminum with uniform perforations to reduce weight and wind impact. Falling objects (e.g., during extreme events such as rockfalls, severe windstorms, and hailstorms) may cause severe dents and holes in the panels. Such defects adversely impact the study of small-wavelength radio waves, which demands a perfect dish surface. Any irregularity in the parabola scatters these small waves away from the focus, causing information loss.

(a) FAST’s optical geometry. (b) Automated optical inspection at FAST. (c) Illustration of surface defects (dent and hole). (d) Results of defect detection.

The rapid detection of surface defects for timely repair is hence critical for maintaining the normal operation of FAST. This is traditionally done by direct visual inspection. Skilled inspectors climb up the reflector and visually examine the entire surface, searching for and replacing any panels showing dents and holes. However, this procedure has several limitations. First, there is danger involved in accessing hard-to-reach places high above ground. Second, it is labor-intensive and time-consuming to scrutinize the thousands of panels. Third, the procedure relies heavily on the inspectors’ expertise and is prone to human-based errors and inconsistencies.

The remedy to the shortcomings of manual inspection at FAST is automated inspection. In a new paper, a team of scientists led by Professor Jianan Li and Tingfa Xu from Beijing Institute of Technology have taken the first step towards automating the inspection of FAST by integrating deep-learning techniques with drone technology to automatically detect defects on the reflector surface. They began by manually controlling a drone equipped with a high-resolution RGB camera to fly over the surface along a predetermined route. During the flight, the camera captured and recorded videos of surface conditions.

One benefit of the advanced flight stability of drones is that the recorded videos capture detailed information about the surface condition. Moreover, thanks to the GPS device and the RTK module onboard the drone platform, every video frame can be tagged with the corresponding drone location with centimeter-level accuracy. The physical locations of the panels that appear in each frame can thus be easily determined. To tackle the challenges of finding surface defects in drone imagery exhibiting large variations in scale and high inter-class similarity, the team introduced a simple yet effective cross-fusion operation for deep detectors, which aggregates multi-level features in a point-wise selective manner to help detect defects of various scales and types. The cross-fusion method is lightweight and computationally efficient, particularly valuable features for onboard drone applications.
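One plausible reading of that cross-fusion idea, sketched in PyTorch: features from several detector levels are resampled to a common resolution, and a 1 x 1 convolution predicts per-pixel weights that select how much each level contributes at each location. This illustrates the general technique, not the authors’ exact operation.

```python
# One plausible reading of point-wise selective multi-level
# fusion, for illustration; not the authors' exact operation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointwiseSelectiveFusion(nn.Module):
    def __init__(self, channels: int, num_levels: int):
        super().__init__()
        # A 1x1 conv predicts, at every pixel, one weight per level.
        self.gate = nn.Conv2d(channels * num_levels, num_levels, 1)

    def forward(self, feats):  # feats: list of (B, C, Hi, Wi) maps
        size = feats[0].shape[-2:]
        # Resample all levels to the finest resolution.
        feats = [F.interpolate(f, size=size, mode="bilinear",
                               align_corners=False) for f in feats]
        stacked = torch.stack(feats, dim=1)            # (B, L, C, H, W)
        weights = self.gate(torch.cat(feats, dim=1))   # (B, L, H, W)
        weights = weights.softmax(dim=1).unsqueeze(2)  # (B, L, 1, H, W)
        # Per-pixel weighted sum over levels: each location picks the
        # scale mix best suited to the defect size appearing there.
        return (weights * stacked).sum(dim=1)          # (B, C, H, W)

levels = [torch.randn(1, 64, s, s) for s in (64, 32, 16)]
fused = PointwiseSelectiveFusion(channels=64, num_levels=3)(levels)
print(fused.shape)  # torch.Size([1, 64, 64, 64])
```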

Future work will implement the algorithm on embedded hardware platforms to process captured videos onboard the drone, to make the inspection system more autonomous and more robust.

A Multimodal, Reconfigurable Workspace Soft Gripper for Advanced Grasping Tasks

by Snehal Jain et al in Soft Robotics

Researchers from the Singapore University of Technology and Design’s (SUTD) Bio-Inspired Robotics and Design Laboratory have developed a new reconfigurable workspace soft (RWS) robotic gripper that can scoop, pick and grasp a wide range of consumer items. The RWS gripper’s comprehensive and adaptive capabilities make it particularly useful in the logistics and food industries, which depend on robotic automation to meet increasing demands in efficiently picking and packing items.

The RWS gripper can reliably scoop grains of rice or couscous with radii as small as 1.5 millimeters, or pick items as thin as 300 microns, such as business cards or thin instruction manuals, from flat surfaces. It can also grasp large convex, nonconvex, and deformable items such as melons, cereal boxes, or detergent refill bags weighing as much as 1.4 kg. Compared to traditional rigid grippers, soft grippers use compliant soft actuators and functional hyperelastic materials, allowing them to grasp a wider range of geometries safely and reliably. In addition, soft grippers’ high degrees of freedom and compliance enable several grasp modes despite underactuation and oversimplified control strategies.

(A) Plot showing the workspace volumes of the RWS gripper in its four modes. The gripper volume in the unactuated state (“RWS”) is also plotted for reference. (B) Illustration showing the variation of gripper contact area for the various grasping modes, each contact area best suited to a payload type. (C) RWS gripper grasping (top row, from left to right) a cookie pack (power grasping mode), a yogurt container (wide grasping mode), and an egg and tofu stack (power grasping mode); (bottom row, from left to right) a cherry tomato, a crisp and a coin (pinch grasping mode), and chickpeas (scoop grasping mode). (D) The RWS gripper grasping payloads from the YCB benchmark object sets: (top row, from left to right) a mug, knife, tuna can, bolt, screwdriver; (middle row, from left to right) scissors, a can of spam, baseball, clamp, strawberry; (bottom row, from left to right) marker, screwdriver, nut, credit card.

While soft grippers hold these advantages over their rigid counterparts, capabilities such as contact effort are mostly a consequence of the gripper workspace, defined as the range of positions a robot can reach to interact with its physical environment. This, in turn, is largely constrained by the gripper design. Moreover, soft grippers designed for highly specific grasping tasks, such as scooping grains or holding wide payloads, are usually limited in grasping other payload types or in their manipulation versatility.

To overcome these limitations, the SUTD research team designed the RWS gripper using multimodal actuation, in which the grasping workspace of a soft gripper can be changed rapidly for payloads with different contact area requirements. The RWS gripper can modify and increase its grasping workspace volume by 397% using a combination of shape morphing fingers, retractable nails and an expandable palm, enabling the widest range of grasping capabilities to date achieved by a single soft gripper. The RWS gripper’s ability to quickly reconfigure its grasping workspace makes it an ideal candidate for challenging applications for which multiple task-specific grippers would otherwise be required.

The SUTD research team is taking steps to commercialize the RWS grippers in various high-mix automation applications.

“We are in discussions with various logistics companies, both in the food and packaging sectors, to set up proof of value studies. The team is excited to create market impact and provide new solutions for our industry partners,” shared Assistant Professor Pablo Valdivia y Alvarado, Principal Investigator and Team lead from SUTD.

Development and Testing of Unmanned Semi-Submersible Vehicle

by Pascal Spino et al in Unmanned Systems

An unmanned semi-submersible vehicle developed at Washington State University may prove that the best way to travel in water undetected and efficiently is not on top, or below, but in-between. The roughly 1.5-foot-long semi-sub prototype, built with off-the-shelf and 3D-printed parts, showed its seaworthiness in water tests, moving quickly with low drag and a low profile.

This vessel type isn’t new. Authorities have discovered crudely made semi-subs being used for illicit purposes in recent years, but the WSU project aims to demonstrate how engineered half-submerged vessels can efficiently serve military, commercial and research purposes.

“A semi-submersible vehicle is relatively inexpensive to build, difficult to detect, and it can go across oceans,” said Konstantin Matveev, the WSU engineering professor leading this work. “It’s not so susceptible to waves in comparison to surface ships since most of the body is underwater, so there are some economic advantages as well.”

Since the semi-sub sails mostly at the waterline, it does not need to be made of materials as strong as a submarine’s, which must withstand the pressure of being underwater for long periods of time. The semi-sub also has the advantage of keeping a small platform in contact with the atmosphere, making it easier to receive and transmit data.

An unmanned semi-submersible vehicle prototype developed at Washington State University.

For this study, Matveev and co-author Pascal Spino, a recent WSU graduate and former president of the WSU RoboSub club, piloted the semi-sub in Snake River’s Wawawai Bay in Washington state. They tested its stability and ability to maneuver. The semi-sub reached a max speed of 1.5 meters per second (roughly 3.4 miles an hour), but at higher speeds, it rises above the water creating more of a wake and expending more energy. At lower speeds, it is almost fully immersed and barely makes a ripple.

The researchers also outfitted the semi-sub with sonar and mapped the bottom of a reservoir near Pullman, Washington to test its ability to collect and transmit data. While not yet completely autonomous, the WSU semi-sub can be pre-programmed to behave in certain ways, such as running a certain route by itself or responding to particular objects by pursuing them or running away.

While the WSU semi-sub is relatively small at 450 mm long with a 100 mm diameter (about 1.5 feet long and 4 inches in diameter), Matveev said it is possible to build larger semi-subs that carry significant cargo. For instance, they could be used to help refuel ships or stations at sea. They could even be scaled up to rival container ships, and since they experience less drag in the water, they would use less fuel, creating both an environmental and an economic advantage.

For now, Matveev’s lab is continuing work on optimizing the shape of semi-submersible vehicle prototypes to fit specific purposes. He is currently collaborating with the U.S. Naval Academy in Annapolis, Maryland to work on the vehicles’ operational capabilities and compare numerical simulations with experimental results.

Switched electromechanical dynamics for transient phase control of brushed DC servomotor

by William Z. Peng et al in Chaos: An Interdisciplinary Journal of Nonlinear Science

In every sector, sustainability and energy efficiency have become pressing concerns, and robotics is no exception. Until now, however, researchers seeking to predict the energy usage of robotic systems were forced to rely on a piecemeal method providing only rough approximations under limited conditions. That’s because while an actuator — the device that converts electrical energy into mechanical force and thus movement — is made up of multiple parts, current approaches look only at the motor itself, rather than also considering the controllers that factor into overall consumption.

“Existing power consumption models of actuators typically omit consideration of the switching power converter circuits required for directional, speed, or torque control, so we established a multi-domain framework for switched electromechanical dynamics,” NYU Tandon Associate Professor of Mechanical and Aerospace Engineering Joo H. Kim says of the work, which was recently published.

“The switched electromechanical dynamics of a servomotor is derived from the individual models of the internal DC motor, gear train, and H-bridge circuit. The coupled models comprehensively integrate all possible distinct switching configurations of on-state, off-state, and dead time.”
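To make the idea concrete, here is a structural sketch of a switched model: textbook brushed-DC motor equations in which the H-bridge state sets the applied voltage, with electrical energy integrated over time. The parameters, the switching schedule and the crude off-state approximation are all illustrative assumptions, not the authors’ formulation.

```python
# Structural sketch of a switched brushed-DC model: textbook motor
# equations plus an H-bridge state that sets the applied voltage.
# Parameters, schedule and the off-state approximation are
# illustrative assumptions, not the paper's model.

R, L = 1.0, 0.5e-3       # winding resistance (ohm), inductance (H)
KE = KT = 0.02           # back-EMF and torque constants (SI units)
J, B = 1e-5, 1e-6        # rotor inertia, viscous friction
VSUP, DT = 12.0, 1e-5    # supply voltage (V), integration step (s)

def applied_voltage(state: str, i: float) -> float:
    if state == "on":
        return VSUP
    if state == "reverse":
        return -VSUP
    # "off"/dead time: crude freewheeling-diode approximation that
    # opposes whatever current is still flowing in the winding.
    return -0.7 if i > 0 else (0.7 if i < 0 else 0.0)

i = w = energy = 0.0     # current (A), speed (rad/s), energy (J)
schedule = ["on"] * 4000 + ["off"] * 1000 + ["on"] * 4000

for state in schedule:   # forward Euler integration
    v = applied_voltage(state, i)
    di = (v - R * i - KE * w) / L   # electrical dynamics
    dw = (KT * i - B * w) / J       # mechanical dynamics
    i += di * DT
    w += dw * DT
    energy += v * i * DT            # electrical energy drawn

print(f"final speed: {w:.0f} rad/s, energy used: {energy:.2f} J")
```

The point of the exercise: the predicted energy depends on the switching schedule, which is exactly the dependence the quoted framework captures and motor-only models miss.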

Measured electrical power for the proposed integrated model (orange) and an existing proxy model (blue).

As a result of the never-before-published model, roboticists will be able to accurately predict the power consumption of the actuator as a whole — and to effectively minimize that consumption. Additionally, as Kim explains, his team’s revolutionary approach could also be applicable to general electromechanical systems with switching-based control, such as those used in electric vehicles. The end goal of the research, which was supported in part by the NSF and the Mitsui U.S. Foundation, is improved energy efficiency in all high-mobility systems designed for complex tasks.

“It’s gratifying to be working on something that has never been done before and that holds the potential to vastly improve sustainability in so many sectors,” Kim says. “I consider it proof of the importance of collaborating across disciplines for a common goal.”

Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers

by Catherine A. Gao et al in bioRxiv

Could the new and wildly popular chatbot ChatGPT convincingly produce fake abstracts that fool scientists into thinking those studies are the real thing? That was the question worrying Northwestern Medicine physician-scientist Dr. Catherine Gao when she designed a study — collaborating with University of Chicago scientists — to test that theory.

Yes, scientists can be fooled, their new study reports. Blinded human reviewers — when given a mix of real and falsely generated abstracts — could only spot ChatGPT-generated abstracts 68% of the time. The reviewers also incorrectly identified 14% of real abstracts as being AI-generated.

“Our reviewers knew that some of the abstracts they were being given were fake, so they were very suspicious,” said corresponding author Gao, an instructor in pulmonary and critical care medicine at Northwestern University Feinberg School of Medicine. “This is not someone reading an abstract in the wild. The fact that our reviewers still missed the AI-generated ones 32% of the time means these abstracts are really good. I suspect that if someone just came across one of these generated abstracts, they wouldn’t necessarily be able to identify it as being written by AI.”

The hard-to-detect fake abstracts could undermine science, Gao said. “This is concerning because ChatGPT could be used by ‘paper mills’ to fabricate convincing scientific abstracts,” Gao said. “And if other people try to build their science off these incorrect studies, that can be really dangerous.”

Paper mills are illegal organizations that produce fabricated scientific work for profit. The ease with which ChatGPT produces realistic and convincing abstracts could increase production by paper mills and fake submissions to journals and scientific conferences, Gao worries. For the study, Gao and co-investigators took titles from recent papers in high-impact journals and asked ChatGPT to generate abstracts based on those prompts. They ran the generated abstracts and the original abstracts through a plagiarism detector and an AI output detector, and had blinded human reviewers try to differentiate between generated and original abstracts. Each reviewer was given 25 abstracts, a mixture of generated and original ones, and asked to give a binary score indicating whether they thought each abstract was real or generated.

“The ChatGPT-generated abstracts were very convincing,” Gao said, “because it even knows how large the patient cohort should be when it invents numbers.” For a study on hypertension, which is common, ChatGPT included tens of thousands of patients in the cohort, while a study on monkeypox, which is much rarer, had a far smaller number of participants.

“Our reviewers commented that it was surprisingly difficult to differentiate between the real and fake abstracts,” Gao said.

The study found that the fake abstracts did not set off alarms in traditional plagiarism-detection tools. However, AI output detectors such as the GPT-2 Output Detector, which is freely available online, could discriminate between real and fake abstracts.
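That detector is a RoBERTa classifier OpenAI released alongside GPT-2, now mirrored on the Hugging Face Hub. A minimal sketch of screening a single abstract with it (the abstract text is a placeholder; the labels follow the public model card):

```python
# Screen a (placeholder) abstract with the public GPT-2 Output
# Detector, a RoBERTa classifier fine-tuned to flag GPT-2 text.
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="openai-community/roberta-base-openai-detector",
)

abstract = "..."  # paste the abstract to be screened here
result = detector(abstract, truncation=True)[0]
print(result["label"], f"{result['score']:.2f}")  # e.g. "Fake 0.98"
```

One caveat worth keeping in mind: the detector was trained on GPT-2 output, so its scores on text from newer models such as ChatGPT are suggestive rather than definitive.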

“We found that an AI output detector was pretty good at detecting output from ChatGPT and suggest that it be included in the scientific editorial process as a screening process to protect from targeting by organizations such as paper mills that may try to submit purely generated data,” Gao said.

But ChatGPT can also be used for good, said senior study author Yuan Luo, director of the Institute for Augmented Intelligence in Medicine at Feinberg.

“AI language models such as ChatGPT have a potential to help automate the writing process, which is often the speed bottleneck in knowledge generation and dissemination,” Luo said. “The results from the paper showed this is likely doable for the field of medicine, but we need to bridge certain ethical and practical gaps.”

For example, is AI-assisted writing still considered original, Luo asked. Also, AI-generated text currently has difficulty in proper citation, which is a must for scientific writing, he noted.

“Generative text technology has a great potential for democratizing science, for example making it easier for non-English-speaking scientists to share their work with the broader community,” said senior author Dr. Alexander Pearson, director of data sciences and the Head/Neck Cancer Program in Hematology/Oncology at the University of Chicago. “At the same time, it’s imperative that we think carefully on best practices for use.”

A memristive deep belief neural network based on silicon synapses

by Wei Wang et al in Nature Electronics

While artificial intelligence (AI) models are becoming increasingly advanced, training and running these models on conventional computer hardware is very energy consuming. Engineers worldwide have thus been trying to create alternative, brain-inspired hardware that could better support the high computational load of AI systems.

Researchers at Technion–Israel Institute of Technology and the Peng Cheng Laboratory have recently created a new neuromorphic computing system supporting deep belief neural networks (DBNs), a generative and graphical class of deep learning models. This system is based on silicon-based memristors, energy-efficient devices that can both store and process information.

Memristors are electrical components that can switch or regulate the flow of electrical current in a circuit, while also remembering the charge that has passed through them. As their capabilities and structure resemble those of synapses in the human brain more closely than conventional memory and processing units do, they could be better suited for running AI models.

“We, as part of a large scientific community, have been working on neuromorphic computing for quite some time now,” said Shahar Kvatinsky, one of the researchers who carried out the study. “Usually, memristors are used to perform analog computations. It is known that there are two main limitations in the neuromorphic field — one is that memristive technology is still not widely available. The second is the high cost of the converters required to translate analog computations into digital data and vice versa.”

When developing their neuromorphic computing system, Kvatinsky and his colleagues set out to overcome these two crucial limitations of memristor-based systems. As memristors are not widely available, they decided to instead use a commercially available Flash technology developed by Tower Semiconductor, engineering it to behave like a memristor. In addition, they specifically tested their system with a newly designed DBN, as this particular model does not require data conversions (i.e., its input and output data are binary and inherently digital).

“DBNs are an old machine learning theoretical concept,” Kvatinsky explained. “Our idea was to use binary neurons, whose inputs and outputs take a value of either 0 or 1. There are several unique properties (compared to deep neural networks), including that the training of such a network relies on calculating the accumulated desired model update and applying it only when it reaches a certain threshold.”

The artificial synapses created by the researchers were fabricated using commercial complementary metal-oxide-semiconductor (CMOS) processes. These memristive, silicon-based synapses have numerous advantageous features, including analog tunability, high endurance, long retention time, predictable cycling degradation, and moderate variability across different devices. Kvatinsky and his colleagues demonstrated their system by training a type of DBN, known as a restricted Boltzmann machine, on a pattern recognition task. To train this model (a 19 × 8 memristive restricted Boltzmann machine), they used two 12 × 8 arrays of the memristors they engineered.
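A toy software analogue of that training scheme (ours, for illustration; the unit counts match the 19 × 8 machine described above, but everything else is made up): a binary restricted Boltzmann machine trained with one-step contrastive divergence, where weight updates accumulate and a synapse is only reprogrammed once its accumulated change crosses a threshold, mirroring how a memristive device would be pulsed sparingly.

```python
# Toy binary RBM with threshold-gated weight commits, mimicking
# the accumulate-then-program scheme described above. Sizes match
# the 19 x 8 machine; hyperparameters are illustrative, and biases
# are omitted to keep the sketch short.
import numpy as np

rng = np.random.default_rng(0)
NV, NH = 19, 8                    # visible and hidden binary units
W = rng.normal(0, 0.1, (NV, NH))  # synaptic weights
ACC = np.zeros_like(W)            # accumulated desired updates
LR, THRESH = 0.05, 0.1            # learning rate, commit threshold

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):                    # stochastic binary activation
    return (rng.random(p.shape) < p).astype(float)

def cd1_step(v0):
    """One contrastive-divergence step: the update is accumulated,
    and only synapses whose accumulated change crosses the
    threshold are actually (re)programmed."""
    h0 = sample(sigmoid(v0 @ W))
    v1 = sample(sigmoid(h0 @ W.T))
    h1 = sigmoid(v1 @ W)
    ACC[...] += LR * (np.outer(v0, h0) - np.outer(v1, h1))
    commit = np.abs(ACC) >= THRESH   # only these synapses get pulsed
    W[commit] += ACC[commit]
    ACC[commit] = 0.0

for _ in range(1000):                   # random binary data stands
    cd1_step(sample(np.full(NV, 0.5)))  # in for real patterns
print("mean |W| after training:", float(np.abs(W).mean()))
```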

“The simplicity of DBNs makes them attractive for hardware implementation,” Kvatinsky said. “We showed that even though DBNs are simple to implement (due to their binary nature), we can reach high accuracy (>97% accurate recognition of handwritten digits) when using Y-Flash-based memristors.”

The architecture introduced by this team of researchers offers a new viable solution for running restricted Boltzmann machines and other DBNs. In the future, it could inspire the development of similar neuromorphic systems, collectively helping to run AI systems more energy-efficiently.

“We now plan to scale up this architecture, explore additional memristive technologies and explore more neural network architectures,” Kvatinsky added.

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

by Chengyi Wang et al in arXiv

A team of researchers at Microsoft has demonstrated a new AI system that is capable of mimicking a person’s voice after training on a recording just three seconds long. The team describes the new system in a paper and has also posted a webpage demonstrating its capabilities.

Artificial intelligence applications require training on massive amounts of data. But in this new endeavor, the team at Microsoft has shown that this does not always have to be the case. The new system was built using Meta’s EnCodec audio compression technology, which was originally intended as a way to improve the quality of phone conversations. Subsequent work showed that it is capable of far more — not only can it mimic a voice, it can also simulate tone and even the acoustics of the environment in which the original recording was made.

The overview of VALL-E. Unlike the previous pipeline (e.g., phoneme → mel-spectrogram → waveform), the pipeline of VALL-E is phoneme → discrete code → waveform. VALL-E generates the discrete audio codec codes based on phoneme and acoustic code prompts, corresponding to the target content and the speaker’s voice. VALL-E directly enables various speech synthesis applications, such as zero-shot TTS, speech editing, and content creation combined with other generative AI models like GPT-3 [Brown et al., 2020].
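Read as pseudocode, the caption’s pipeline is short. Every name in the sketch below is a hypothetical placeholder standing in for the phonemizer, the neural codec (EnCodec) and VALL-E’s codec language model; it shows the data flow, not a real API.

```python
# Schematic of the pipeline from the caption:
# phoneme -> discrete code -> waveform.
# Every class and function here is a hypothetical placeholder.

def phonemize(text):           # placeholder G2P front end
    return list(text)          # stand-in: characters as "phonemes"

class Codec:                   # placeholder neural audio codec
    def encode(self, audio):   # waveform -> discrete code tokens
        return [0] * 75
    def decode(self, codes):   # discrete code tokens -> waveform
        return b"\x00" * len(codes)

class CodecLM:                 # placeholder codec language model
    def generate(self, phonemes, prompt_codes):
        # Conditioned on the target phonemes and a ~3-second
        # acoustic prompt, emit new codec tokens in that voice.
        return [0] * len(phonemes)

def synthesize(text, enrollment_audio, codec=Codec(), lm=CodecLM()):
    phonemes = phonemize(text)                     # text -> phonemes
    prompt_codes = codec.encode(enrollment_audio)  # prompt -> codes
    new_codes = lm.generate(phonemes, prompt_codes)
    return codec.decode(new_codes)                 # codes -> waveform

waveform = synthesize("hello world", enrollment_audio=b"")
```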

Microsoft did not do away with the need for a massive data set, of course; instead, the researchers shifted where it was used. The system was taught to “listen” to a string of words and then replicate their sound using Meta’s Libri-light dataset, which contains over 60,000 hours of recordings of 7,000 people speaking in English.

The examples Microsoft has provided demonstrate that the system works much better for some voices than others, and it has trouble with accents. But because the app is still in its early stages, it is likely its functionality will improve over time.

Microsoft has not made the source code for VALL-E public and likely will not do so, noting that it could be used in less than responsible ways — hoax recordings of politicians, for example. When combined with deepfake video, the results could take “fake news” to new heights. Microsoft’s example has shown what is possible; thus, it would seem likely that similar systems by others will appear soon.

Upcoming events

ICRA 2023: 29 May–2 June 2023, London, UK

RoboCup 2023: 4–10 July 2023, Bordeaux, France

RSS 2023: 10–14 July 2023, Daegu, Korea

IEEE RO-MAN 2023: 28–31 August 2023, Busan, Korea

MISC

Subscribe to Paradigm!

Medium. Twitter. Telegram. Telegram Chat. Reddit. LinkedIn.

Main sources

Research articles

Science Robotics

Science Daily

IEEE Spectrum
