Certain Doom | The Juice

Zumo Labs presents The Juice, a weekly newsletter focused on computer vision problems (and sometimes just regular problems). Get it while it’s fresh.

Maisie Sheidlower
Zumo Labs
5 min read · Jul 20, 2021


Week of July 12–16, 2021

____

We’re doomed. The ice caps are melting, the oceans are rising — a phenomenon that the moon’s wobbly orbit will soon exacerbate, according to NASA — and global temperatures are reaching record highs. Can wildfire season still be considered a “season” if it’s nine months long?

But fear not, for artificial intelligence will save us. Despite the fact that a recent study shows computer vision has not yet passed the “awareness phase,” with only 10% of companies adopting it so far, some are already turning to AI tech to delay dystopia. The Times reports on how vast quantities of improved weather and satellite data are helping AI systems better predict how wildfires will burn once they’ve started. And per this piece in Fortune, there’s hope that the cellphones in our pockets can be used to scalably inspect infrastructure to prevent further building collapses such as that in Surfside. No wonder Google’s Sundar Pichai recently called AI “the most profound technology that humanity will ever develop and work on.”

If we manage to avert catastrophe thanks to artificial intelligence, it would seem there’d only be one thing left to worry about. The AI. This week’s stories are all about teaching robots to walk around, aim guns, and use knives. Cool.

____

#CheatCode

How far would you go to improve your Call of Duty game? If you’ve got the cash for a top-of-the-line GPU and video capture card, this undetectable, computer-vision-powered auto-aim cheat may be for you. Just one thing: its creator has already shut it down, having caught the attention of (and, no doubt, a sternly worded letter from) Activision. I’d say cheaters never win, but after this there are probably plenty of folks with broken controllers who would say otherwise.

Cheat-maker brags of computer-vision auto-aim that works on “any game”, via Ars Technica.

#BarefootCNNtessa

Food experts have been telling us for a while now that it does matter how you slice it. Perhaps that’s why researchers from NVIDIA and USC have announced a differentiable simulator for robotic cutting — or DiSECt — which will use synthetic data to help robots learn just how to slice it. But cutting potatoes isn’t the end goal. Food prep can be indicative of a system’s ability to learn, adapt, and put into practice the skills needed to take over more meaningful tasks, like surgery.

Researchers Create Simulator to Help Robots Wield Knives, via The Spoon.

#BabySteps

Seeing parallels between robots and children isn’t hard. A Roomba’s notification that it “needs your attention” because it is “stuck near a cliff” pretty accurately conveys a toddler’s capacity for risk assessment. Facebook, IBM, and Google caught this pattern and decided to run with it quite literally, teaching a robot to maneuver and navigate (“walk”) the way a 9-to-16-month-old does. The researchers use simulated environments to teach the robot to respond to challenging conditions and even learn from its mistakes. The kids will be alright.

How do you teach robots to navigate new places? Study toddlers., via The Washington Post.

#SpeechAI

The only thing scarier than a robot that can walk, shoot, and stab might be one that can actually shout at you. Deep learning has paved the way for rapid advances in voice synthesis, and several startups, such as WellSaid Labs, Resemble.ai, and Sonantic, are building bots with convincingly emotive tones. After learning from a real voice actor, Sonantic’s voices can whisper, cry, yell, and even laugh. As for who’s not laughing: the voice actors in SAG-AFTRA who are afraid of being cut out of the deal.

AI voice actors sound more human than ever — and they’re ready to hire, via MIT Technology Review.

#Bravo

An experimental brain implant just read someone’s mind (we’ll elaborate), and for now, it’s excellent news. The study’s participant, “BRAVO1,” is in his late 30s. Fifteen years ago, a stroke left him almost completely paralyzed in the arms, legs, and muscles of his vocal tract, but not in the sections of his brain “that once issued speech commands.” So a team at UC San Francisco created an AI-powered device that decodes the brain signals that previously controlled his vocal tract, rather than requiring him to use the eye or head movements that drive similar products. The researchers implanted sensors on the surface of his brain and, over several months, had a computer study the patterns of electrical activity as he “spoke” 50 different words. BRAVO1 can currently communicate at 15 words per minute with a 50-word vocabulary.

Experimental Brain Implant Lets Man With Paralysis Turn His Thoughts Into Words, via NPR.

#JeNeSaisQuoi

A study from Nature Human Behaviour says that AIs know what type of art you like — yes, even you. The researchers collected data from over 1,000 volunteers who rated randomly selected artworks across several styles, based on how much they liked each piece and what characteristics they observed in it. A machine vision algorithm also evaluated the artwork, looking for “low-level” features like color or blurred edges that might inform participants’ “high-level” judgments. The combined data allowed a neural network to predict with high accuracy whether a participant would like a painting they hadn’t yet rated. This research has promising implications for how algorithms (and the researchers behind them) can model abstract human thinking beyond aesthetics, though we imagine advertisers will welcome it for its original use.

A Computer Can Predict if you Prefer Rothko or Monet. Here’s How., via Inverse.

____

📄 Paper of the Week

Per-Pixel Classification is Not All You Need for Semantic Segmentation

Currently ranked #1 on the “Panoptic Segmentation on COCO panoptic” benchmark, this paper is trailblazing semantic and panoptic segmentation. Most semantic segmentation models use a per-pixel classification loss, but this paper argues (and demonstrates) that a mask classification loss, predicting a set of binary masks each paired with a single class label, results in better performance. This unifies instance and semantic segmentation, since mask prediction was already the dominant approach in instance segmentation. There is one caveat: the proposed MaskFormer system makes use of a transformer decoder. So does the performance bump come from mask classification, or is this yet again transformers pushing the state of the art?
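To make the contrast concrete, here is a minimal PyTorch-style sketch of the two loss formulations. The shapes, names, and the plain BCE mask loss are illustrative assumptions, not the paper’s actual implementation (which, among other things, matches predicted masks to ground truth before computing its losses).

# A rough sketch of per-pixel vs. mask classification losses; all shapes
# and names below are illustrative, not MaskFormer's actual code.
import torch
import torch.nn.functional as F

B, C, H, W = 2, 21, 64, 64                 # batch, classes, height, width
N = 100                                    # number of mask queries (MaskFormer-style)

# Per-pixel classification: one class label predicted for every pixel.
pixel_logits = torch.randn(B, C, H, W)     # dense per-pixel class scores
target = torch.randint(0, C, (B, H, W))    # per-pixel ground-truth labels
per_pixel_loss = F.cross_entropy(pixel_logits, target)

# Mask classification: N (binary mask, class) pairs cover the image.
mask_logits = torch.randn(B, N, H, W)      # one binary mask per query
class_logits = torch.randn(B, N, C + 1)    # one class per query (+1 = "no object")
gt_mask = (target == 3).float()            # binary ground-truth mask for, say, class 3
# Once a query is matched to this ground-truth mask, its loss is a binary
# mask loss (plain BCE here) plus a classification loss on the query's class.
mask_loss = F.binary_cross_entropy_with_logits(mask_logits[:, 0], gt_mask)
class_loss = F.cross_entropy(class_logits[:, 0], torch.full((B,), 3, dtype=torch.long))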

____

Think The Juice was worth the squeeze? Sign up here to receive it weekly.
