Robots' Final Frontier

Tyler Schrenk, above, is a consultant and activist for assistive technologies.

Sidd Srinivasa has spent the last five years laser-focused on an ambitious goal: building a robot that can feed someone.

He’s been working on it ever since he encountered a paralyzed 11-year-old girl back in 2014, when he was visiting the Rehab Institute of Chicago. “I’m a roboticist,” he recalls saying. “What type of robot could I make for you?”

Her reply: “I want to be able to feed myself.”

Now that, he knew, would be a hard task. In our age of Covid-19, we’re getting a new glimpse of the limits of robots, because the pandemic has produced a new surge of interest in deploying them all over the place. Sometimes it’s to replace humans in situations where the jobs are now too risky; other times it’s to make the workplace safer for us fragile, disease-prone humans. Some of these labor substitutions have been relatively easy, as with using robots to disinfect hospitals. They’re good at rolling around and spraying disinfectant. But in other places, it’s clear the robots are vastly inferior to humans –– they’re nowhere near as deft at deboning beef in meat-processing plants, for example. These latter tasks require a mix of uniquely human abilities involving sight, touch, hearing and judgment that –– as of today –– still exists as a far-off dream for roboticists.

Few people are more intimately aware than Srinivasa of just how far robots are from matching human-level sensitivity, but also of how much remarkable progress roboticists have made in the last decade. A roboticist at the University of Washington, Srinivasa has spent his career focused on everyday manipulation –– trying to get robots to grab and wield everyday household objects with the goal of giving people with limited powers of movement “the ability to live independently on their own,” as he puts it. He’s built robots that can grab bottles and drop them in a recycling bin, that can open doors and fridges and –– in one delightful experiment –– twist open Oreo cookies.

In this video, edited for speed's sake, it takes 17 seconds for the robotic arm to deliver a slice of apple into Tyler Schrenk's mouth after he requests it. In reality, this process took a minute. Schrenk gives verbal commands through an Alexa device.

But when that young girl asked about a robot she could use to feed herself? That, he knew, would be a whole new leap forward in complexity.

It would require engineering robotic movement with a degree of subtlety and precision that had long been out of reach. To be sure, robotic hardware and software improved throughout the ’90s and the first decade of the 2000s, when Srinivasa was a student and then a young professor, so that robots became increasingly small, light and precise. Nothing, though, was quite capable of the kind of delicate movement and sensing you’d need in order to put a fork millimeters from a person’s face.

But the pace of innovation accelerated in the last decade. Thanks to advances in image recognition fueled by deep learning, robots could suddenly be endowed with greatly increased ability to recognize objects, including things like food and human mouths. They could also be taught to match the type of food (say, a chunk of banana) with the best fork position for skewering it. Additionally, 3-D cameras, which are necessary for capturing the images used for this advanced type of image recognition, had become affordable. So in 2016, when Srinivasa’s lab began working in earnest on a robot that could automatically feed someone, he and his team had more powerful computer vision and sensing than ever before.

Even robot arms had become better and cheaper. They decided to use an arm that was commercially available: the “Jaco” model made by the Canadian firm Kinova Robotics. The arm was already popular with many wheelchair users, who mount it on their chair and control it with a joystick (or puff stick) to pick up objects. In fact, as Srinivasa learned, some were already using the Jaco arm to feed themselves, though it was a painfully slow process. One Jaco user told Srinivasa that it took him 45 minutes to feed himself three bites.

If his lab could automate those movements with software, Srinivasa figured, they could seriously speed up that process.

Eating food is a surprisingly complex set of actions, which we mostly take for granted. When we eat, we visually scan the plate we’ll be eating from to recognize a discrete piece of food. Then we use our accumulated experience of eating to figure out the right angle to approach the piece of food with a fork. And finally we move it into the right position so it fits in our mouth.

LEFT: Schrenk can use a puff stick as well as a voice activated device to control the self-feeding arm. RIGHT: A plate of foods with varying shapes, colors and consistencies. The feeding system can recognize up to 16 types of foods so far.

These are actions that most humans do without thinking. But for Srinivasa and his team to create a machine that could do the same thing would end up requiring, when all was said and done, months spent capturing thousands of images of food being successfully skewered, the training of AI algorithms to associate a given piece of food with the right fork strategy and also to differentiate between mouths that are open and closed. Finally, the team would have to design planning routines that would get different foods into different mouths — a job that roped in dozens of graduate students and spanned work across two universities.

In 2017 Srinivasa and his team started the work in earnest. They decided that tackling all forms of food would be far too difficult. So they chose 16 that people who are paralyzed regularly eat, and which can also be stabbed fairly readily with a fork, such as pieces of apple, banana, broccoli, and individual cherry tomatoes and grapes.

Next, they trained a vision system to recognize those items of food so that the fork would know where to aim. For this they decided to use RetinaNet, an object detection system released publicly by Facebook’s AI lab earlier that year. Still, they ended up collecting 478 photos of the 16 foods in a variety of jumbled assortments. The way foods pile up can create tricky vision challenges, because often one piece of food lies partially on top of another. “Imagine a plate full of food items, there is no guarantee that the entire strawberry would be visible,” explained Tapomayukh Bhattacharjee, a scientist in Srinivasa’s lab. The photography alone took weeks. Since food can discolor if it sits at room temperature for too long, the team hired an artist to make hyper-realistic physical models of the fruits and vegetables so that they would retain their color during long photography sessions. “So we had gorgeous-looking three-dimensional models of food items that you can't eat,” Srinivasa laughs.

Finally, they had to train the system to correctly skewer the food. This required them to build an entirely new supervised machine-learning model, one trained on thousands of examples of food being successfully speared on forks. They built a motion-capture area in their lab, Bhattacharjee says, and “we just called people in and handed them a fork.” This led to some intriguing discoveries about the fork-skewering physics of those 16 pieces of food. To pick up a carrot, they realized, a vertical approach is the most effective. But slippery foods were harder: A slice of banana was best stabbed from an angle so that it wouldn’t slide off the fork.
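To make the idea of pairing a food with a fork strategy concrete, here is a minimal sketch in Python. It is not the lab's model, which learned these pairings from recorded examples. It is a hand-written lookup table that exposes the same kind of interface. Only the carrot and banana entries reflect findings described above. The names `SkewerStrategy` and `choose_strategy`, the 30-degree tilt and the vertical default are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkewerStrategy:
    approach: str        # "vertical" or "angled"
    tilt_degrees: float  # fork tilt away from vertical (illustrative value)


# Hypothetical table. The approach for carrot and banana follows the article;
# the tilt angle is an invented placeholder, not a measured result.
STRATEGIES = {
    "carrot": SkewerStrategy("vertical", 0.0),
    "banana": SkewerStrategy("angled", 30.0),
}

# Assumption: fall back to a straight-down approach for unlisted foods.
DEFAULT_STRATEGY = SkewerStrategy("vertical", 0.0)


def choose_strategy(food_label: str) -> SkewerStrategy:
    """Return the fork strategy for a detected food label."""
    return STRATEGIES.get(food_label.lower(), DEFAULT_STRATEGY)


if __name__ == "__main__":
    for label in ("carrot", "banana", "grape"):
        print(label, choose_strategy(label))
```

A trained network would replace the table with a prediction made from the image itself, but the output it hands to the arm would look much the same: an approach type and an angle.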

By mid-2018, they had recorded enough examples of people successfully spearing the 16 foods that they could train a neural network to associate a particular piece of food with the best possible spearing strategy.

That left the final stage: getting the food into the user’s mouth. It’s the phase that is, for the person doing the eating, perhaps the most dicey. A paralyzed person may have limited neck control, and may not be able to move out of harm’s way if a fork is extended a centimeter too far. “You’ve got a fork coming at your face,” Bhattacharjee notes. They used an off-the-shelf face-recognition system to recognize the person’s major features — like the eyes (to avoid going there) and the mouth (the proper place to aim). They programmed the feeding motion to stop just shy of the mouth: To eat the food, the user leans slightly forward and bites it off the fork.
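The stop-short behavior can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fork travels in a straight line, positions are 3-D points in meters, and the 3-centimeter standoff and 5-centimeter keep-out radius around the eyes are invented values. The function names are hypothetical and do not come from the lab's software.

```python
import math


def stop_point(start, mouth, standoff_m=0.03):
    """Point on the straight line from start to mouth, standoff_m short of the mouth."""
    delta = [m - s for s, m in zip(start, mouth)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist <= standoff_m:
        # Already inside the standoff distance: do not advance any farther.
        return tuple(start)
    scale = (dist - standoff_m) / dist
    return tuple(s + d * scale for s, d in zip(start, delta))


def too_close_to_eyes(point, eyes, keep_out_m=0.05):
    """True if the point lies inside the keep-out radius of any detected eye."""
    return any(math.dist(point, eye) < keep_out_m for eye in eyes)


if __name__ == "__main__":
    target = stop_point((0.0, 0.0, 0.0), (0.5, 0.0, 0.0))
    print(target, too_close_to_eyes(target, [(0.5, 0.03, 0.07)]))
```

The design choice worth noting is that the safe default is to do nothing: if the fork is already within the standoff distance, the sketch returns the starting point instead of moving closer.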

Early tests of the system on able-bodied subjects uncovered a problem: The robot was successfully skewering various pieces of food, but sometimes when it raised the food to the user’s mouth, it wasn’t oriented correctly and was un-biteable. This was particularly true with long sticks of food, like carrots and celery. The robot would stab the food in the center of the stick, which made it hard for the eater to get off the fork.

“Our users are like, I can't bite that. I don't know how to bite that,” said Bhattacharjee. So he and the team went back and retrained the system so it would skewer “long” foods at one end.
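The fix, skewering long foods at one end, can be illustrated with a small geometric rule. The team retrained its system to do this; the Python sketch below hard-codes the idea. It assumes the vision system supplies the two endpoints of the item's long axis and its width. The thresholds `long_ratio` and `end_fraction` are invented for illustration.

```python
import math


def skewer_point(end_a, end_b, width, long_ratio=3.0, end_fraction=0.2):
    """Pick where to stab an item lying between end_a and end_b.

    Items at least long_ratio times longer than they are wide are treated
    as "long" and skewered near end_a; everything else is skewered at the
    midpoint. Both thresholds are illustrative assumptions.
    """
    length = math.dist(end_a, end_b)
    is_long = width > 0 and length / width >= long_ratio
    t = end_fraction if is_long else 0.5
    return tuple(a + t * (b - a) for a, b in zip(end_a, end_b))


if __name__ == "__main__":
    print(skewer_point((0.0, 0.0), (10.0, 0.0), width=1.0))  # carrot stick
    print(skewer_point((0.0, 0.0), (2.0, 0.0), width=2.0))   # grape
```

With these values, a 10-centimeter carrot stick is stabbed 2 centimeters from one end, leaving most of the stick free for the eater to bite, while a round grape is still stabbed through its center.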

By the middle of 2019, the system worked well enough that the robot arm could reliably deliver a piece of food into a person’s mouth in roughly a minute per bite. That’s radically faster than the 15-minute-per-bite average of paralyzed users manipulating a Jaco arm by themselves. But it’s also significantly slower than most people are accustomed to eating, and slower than it would take a caregiver to help someone eat.
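The arithmetic behind that comparison is simple, and it can be checked in a few lines of Python. The inputs are the figures reported in this article: one Jaco user's 45 minutes for three bites, and roughly one minute per bite for the automated system. The function name is hypothetical.

```python
def minutes_per_bite(total_minutes: float, bites: int) -> float:
    """Average time per bite, in minutes."""
    return total_minutes / bites


manual = minutes_per_bite(45, 3)  # joystick-controlled Jaco arm
automated = 1.0                   # roughly one minute per bite
speedup = manual / automated

if __name__ == "__main__":
    print(f"manual: {manual} min/bite, speedup: {speedup}x")
```

On these figures the automated arm is about 15 times faster, though the manual number rests on a single user's report.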

Nonetheless, when Srinivasa’s team asked users who are paralyzed to test their robot for the first time last winter, many were impressed. One tester was Tyler Schrenk, a 34-year-old man paralyzed in 2012 by a spinal cord injury. He’s become an adroit user of assistive technology, with voice-controlled tech throughout his home that enables him to control his lights, thermostat, TV, doors and computer.

“It's off to a really good start,” he said of Srinivasa’s system. He pointed out that it’s got a limited menu: It can’t handle foods that make up a more traditional full meal, like rice. But, he added, “it would be awesome even to be able to feed myself snacks.” Schrenk does work consulting on assistive technology, and has talked to other paralyzed users of wheelchairs about the system and its potential. They agreed, he said, that anything offering more autonomy would be welcome. “It would be a big deal to them, even if they could just find a way to eat a chip when they wanted.”

In one sense, Srinivasa’s success illustrates the myriad advances that have propelled robotics in the last decade. In under two years, he and his team were able to make progress that would have taken far longer –– if it had been possible at all –– only a decade earlier. Yet it also shows the limits of today’s technology and how far it is from being able to bestow on a robot the kind of nuance and precision necessary to seamlessly feed someone.

Left: Schrenk’s hand rests next to two devices: the wheelchair control unit and his iPhone, which can be used to control the robotic arm through voice or text. Right: An Alexa device, which Schrenk uses most often to control the robotic arm. (The arm can also be used with other in-home voice-activated technology.)

Consider just the enormous challenge in extending Srinivasa’s system to foods that don’t have neatly defined borders, such as mashed potatoes or rice. His team is working now on training new visual AI to detect, say, the shape of a pile of mashed potatoes and segment it into smaller chunks a robot could scoop up. Harder yet will be tackling food like spaghetti, which needs twirling on a fork, or soup, which is hot and can spill out of the spoon.

“I'd be the first person to say we have a long way to go,” he says. And there are odd new problems they’ll need to tackle, as they branch out to new foods. For example, many meals change appearance as they’re eaten. “The taco that comes at the beginning looks like a beautiful taco,” Srinivasa notes, “but after a couple of bites,” it no longer does. So his team is going to have to train the vision system to recognize food even as it morphs. (This isn’t a problem that industrial robots have to worry about; a box of pencils in a shipping facility does not, on the fly, change its fundamental dimensions.)

Plus, Srinivasa’s team wants to improve the robot’s ability to sense things based on the force feedback detected in the motors that drive its limbs. Right now there are people whose paralysis prevents them from craning their necks forward to bite food off a fork. “What they really need,” Srinivasa said, “is for this robot to be able to move the food item actually into their mouth.” But doing so presents an ever more delicate safety issue: The robot needs to recognize the subtle types of resistance that come from a fork interacting with the soft tissues inside a mouth.

The good news is that Srinivasa thinks he can master all these various challenges with hardware that he and his lab already have: the robot arm, the cameras, the forks and spoons. The breakthroughs he still needs are in software: He needs to develop new forms of AI that can recognize complex objects like soup, or that can parse with greater nuance the force-feedback signals of the robot arm. These software challenges, though, are pretty significant.


“I don’t even know how to solve it,” he said. “I have to invent new algorithms to solve these problems.”

Building a truly robust feeding system that can tackle a truly wide array of food could well occupy much of the rest of Srinivasa’s career. “I'm going to spend the next 20 years doing it,” he says.

In the meantime, there are intermediary steps. Might the type of feeding system developed in Srinivasa’s lab be something a consumer could buy? If you’d be content with one that could tackle only a limited number of fairly discrete, easy-to-grab foods, like the ones the robot can already tackle, you could probably do that in only a year or so, Bhattacharjee surmises. For something that’s truly flexible for hundreds of foods, for any user, it’s harder to say; perhaps a decade or two. Srinivasa and his team aren’t in the business of making products. They — and the handful of other similar labs around the world experimenting with home-assistance robotics — are pursuing innovations that they describe and publish openly, so anyone in the commercial sector could, theoretically, absorb them and bring them to market.

There’s a school of thought that believes artificial general intelligence (or AGI) will be necessary in order for machines to flexibly adapt to the many new, unpredictable things that happen in daily life and truly operate alongside humans in our homes. But there is no sure-fire route to achieving AGI, if it’s even achievable. And for a scientist like Srinivasa, such a breakthrough is not something to plan on if he wants to achieve his goals.

“It's something that –– sure –– if it exists, it would make my job easier,” he says. He wants to solve problems that people are facing today, and he doesn’t want to wait. The way forward, for him, is solving one hard problem at a time.
