Most of the world’s robots are about as charismatic as a coffeemaker. Not so Cynthia Breazeal’s creations. There’s Huggable, a robot teddy bear, and Kismet, a babbler with big eyes, red lips, and pointy ears. Nexi is baby-faced and blue-eyed, and Leonardo has been described variously as a squirrel, a furry alien, and a giant Furby.
After years of making emotionally engaging machines with her students at the MIT Media Lab’s Personal Robots Group, Breazeal thinks the time has finally come for a personal robot to inhabit our homes and help us live our lives. To pursue that goal, she founded Jibo, a Boston startup that has raised US $38.6 million to produce a friendly robo-assistant for families. Equipped with cameras and microphones, the robot, also called Jibo, is a little taller than a toaster and shaped like a desk fan. It can recognize faces, understand what people say, and respond in an amiable voice. Jibo’s purpose is to help busy family members coordinate with one another and communicate with the outside world. In the morning, for example, the robot can remind parents and kids of important events and tasks for the day. You can tell Jibo what you need to accomplish today, and it will update your schedule or to-do list for you while you’re making breakfast. Jibo will also snap photos at parties, read interactive stories to kids, and help grandparents make video calls. First deliveries are scheduled for March and April, and the latest batch of Jibos offered for sale, at a prerelease price of $750, has been sold out since last August.
“We live in a time where so much of our technology is about data, data, data,” Breazeal says. “Social robotics is about saying, ‘Yes, we’ve got all that data. Now let’s focus on the experience and the human engagement.’ ” Indeed, although Jibo doesn’t have puppy eyes or soft fur, like some of Breazeal’s previous creations, it does seem to have a personality. It can stare at you with its oddly inquisitive one-eyed face, emit cute robotic giggles, and swivel its body animatedly. “It’s not just what it does, it’s how it does it,” she says.
To be sure, the dream of a truly useful, capable, and endearing personal robot is an old one (and no, Roomba doesn’t count, unless you talk to yours). For roughly 40 years, entrepreneurs have been introducing home robots. Nearly all of them disappeared quickly and without a trace. Even Sony’s iconic robot dog, Aibo, which debuted to much fanfare and attracted an avid following, failed commercially.
Now, though, the necessary technology is better and cheaper than ever before. Huge subindustries that feed the smartphone and video-game makers are also supplying components to a new generation of home robots whose main purpose is to entertain and inform their owners. Besides fast, energy-efficient microprocessors, the parts in these new bots include 3-D sensors that help them detect people and objects, accelerometers and gyroscopes that let them navigate better, and lightweight lithium batteries that give them more autonomy.
The companies behind this emerging category include both startups and established firms. In 2014, the Japanese telecom giant SoftBank began selling a humanoid called Pepper, which like Jibo is mostly designed to entertain. In July, a French startup called Blue Frog Robotics plans to start shipping its first robot, Buddy, which resembles a Jibo on wheels. And several robot makers in China, including Ainemo, in Beijing, and Ubtech, in Shenzhen, are working on similar products.
Will Breazeal’s robot stand out from the crowd? Frank Tobe, an analyst and publisher of The Robot Report website, thinks it will. He calls Jibo a “game changer in the new social robot marketplace,” noting that the company has assembled a talented team of experts not only in robotics but also in speech recognition, human-machine interaction, gaming, and animation. One other factor that helped sell Tobe on Jibo: He showed a promotional video of the robot to his wife, who afterward declared that “any device that can order Chinese food” (a scene shown in the video) “is a winner.”
Others are less enthusiastic. Media pundits have described Jibo as an “animated lampshade” and “an alarm clock on steroids.” A Time article says “it’s unclear why you’d actually need one,” adding that most of the things Jibo promises to do are things your smartphone already does. And responding to another scene in the promotional video, a writer for the tech news site GeekWire said: “No way am I going to leave an Internet-connected, motorized camera next to my daughter’s bed,” citing privacy and safety concerns.
Breazeal takes the criticism in stride. She acknowledges that Jibo will launch with only a small set of apps (she calls them “skills”) but adds that the robot is a platform open to developers to expand its capabilities. New apps will increase Jibo’s usefulness over time, she asserts, and allow it to do things undreamed of now. And her team is taking privacy and safety concerns seriously, she hastens to add.
Consumer-wise, so far, so good. As of December, Jibo had done $3.7 million in preorder business on the fund-raising website Indiegogo. It’s a staggering sum, especially in view of the fact that during the crowdfunding campaign, the robot was still just a crude prototype that couldn’t perform any of the tasks advertised. Having already delayed the original shipping date once, Breazeal knows she and her team need to deliver on their promises. “It’s heads down right now,” she told IEEE Spectrum in an interview in late 2015. “Heads down.”
The first prototypes of Jibo looked like soft-drink cans with smartphones glued on them. Some models had a cartoonish antenna sticking out of the top. Breazeal and her team worked with an industrial design firm in San Francisco to create the current version, which is sleek and minimal. Packed inside is an impressive amount of electronics: high-resolution stereo cameras, six microphones, a pair of speakers, an LCD touch screen, two cooling fans, Wi-Fi and Bluetooth modules, LED lights, touch sensors, and an ARM-based embedded processor running Linux.
But perhaps the most ingenious part of the design is Jibo’s body. It consists of three roughly cylindrical sections, one each for the base, torso, and head, which connect at an angle rather than horizontally. The result is that rotating these sections relative to one another causes the body to appear to bend into a variety of expressive poses. The rotation is accomplished by three DC motors with belt drives, which move smoothly and quietly.
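To see why angled joints make a stack of cylinders appear to bend, consider a rough geometric sketch. It assumes the base turns about a vertical axis and that the torso and head each turn about an axis tilted 15 degrees from the section below. That tilt value, the function names, and the joint model itself are illustrative assumptions, not Jibo's actual dimensions or control code.

```python
import math

def rot_z(t):
    """Rotation by t radians about a section's own (z) axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation by t radians about the x axis; used to tilt a joint."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def tilted_joint(tilt, angle):
    """Rotation by `angle` about an axis tilted by `tilt` from vertical.

    At angle 0 this is the identity, so the two sections line up straight.
    """
    return matmul(matmul(rot_x(tilt), rot_z(angle)), rot_x(-tilt))

def head_lean_degrees(base, torso, head, tilt=math.radians(15)):
    """Angle between the head's axis and vertical for three joint angles.

    The 15-degree default tilt is a hypothetical value for illustration.
    """
    pose = matmul(rot_z(base),
                  matmul(tilted_joint(tilt, torso), tilted_joint(tilt, head)))
    cos_lean = max(-1.0, min(1.0, pose[2][2]))  # z component of head axis
    return math.degrees(math.acos(cos_lean))

if __name__ == "__main__":
    for torso_deg in (0, 90, 180):
        lean = head_lean_degrees(0.0, math.radians(torso_deg), 0.0)
        print(f"torso turned {torso_deg:3d} deg -> head leans {lean:5.1f} deg")
```

In this model, turning the torso half a revolution tips the head by twice the joint tilt, while turning the base only changes which way the lean points. Pure rotations therefore read as bending, with no hinge anywhere in the body.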
Getting the movements right required substantial engineering. In early prototypes, the sections could turn in either direction by only a limited amount, which restricted the robot’s range of motion and made the movements seem unnatural. To allow each section to rotate continuously, Breazeal’s team rearranged the electrical and mechanical components more tightly. The team also ran all the wires through the center of the robot’s body so that they won’t get twisted when it spins around. According to Matt Berlin, who worked on the motion problem, Jibo now “feels much more fluid and loose, like it can just keep flowing from one pose to another without hitting any stops.”
Ultimately, Jibo will succeed only if it offers a mostly flawless user experience. And this experience will in turn depend greatly on the robot’s speech recognition and synthesis. That’s a big challenge: Give hearing and voice to a robot and people expect it to be intelligent.
“Human conversation isn’t like a Siri or Google Now request, where you form a well-structured phrase,” says Tandy Trower, founder of Hoaloha Robotics, a Seattle startup developing health-care robots that will also rely on voice-based interactions. “When we speak to each other, we use a tremendous amount of context to help us understand what someone is saying.” In other words, a robot like Jibo will have to take contextual details into account to be able to conduct open-ended dialogues.
It won’t be easy. Breazeal has said that the approach used by Siri and Google Now, which transmit speech to powerful cloud-based computers for processing, is poorly suited to a robot. A cloud-based speech engine would introduce delays. Worse, if Jibo loses Wi-Fi connectivity, it would become unresponsive. The alternative is to run voice recognition locally, on the robot. And that’s not such a great choice either, because it would likely overwhelm the robot’s CPU. Although Breazeal won’t say how her team is tackling this problem, it’s possible that Jibo will take a hybrid approach, using local voice processing for some basic functionality (when a user tells the robot to “wake up,” for instance) while relying on a cloud-based engine for more complex speech processing, such as contextual evaluation of statements.
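A hybrid scheme of this kind could be sketched as a simple routing rule. Everything here is hypothetical: the command list, the function name, and the assumption that a small on-robot keyword spotter reports either a matched phrase or nothing. It illustrates the trade-off described above and is not how Jibo's software works.

```python
from typing import Optional

# Hypothetical short phrases a small on-robot keyword spotter might handle.
LOCAL_COMMANDS = {"wake up", "stop", "go to sleep"}

def route_utterance(local_match: Optional[str], wifi_connected: bool) -> str:
    """Decide where an utterance should be processed.

    local_match is the phrase reported by the (assumed) on-robot keyword
    spotter, or None if it recognized nothing.
    """
    if local_match is not None and local_match.strip().lower() in LOCAL_COMMANDS:
        return "local"        # fast path; works without a network
    if wifi_connected:
        return "cloud"        # richer, context-aware processing, with delay
    return "unavailable"      # offline and too complex for the robot's CPU

if __name__ == "__main__":
    print(route_utterance("Wake up", wifi_connected=False))
    print(route_utterance(None, wifi_connected=True))
    print(route_utterance(None, wifi_connected=False))
```

The point of the split is that the robot stays responsive to a few basic commands even when the network drops, and only open-ended requests pay the round-trip cost.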
Breazeal did say that her speech engineers are building natural-language models to allow Jibo to respond in an engaging manner. They’ve also given Jibo a unique voice, courtesy of a voice actor who recorded some 14,000 phrases. From those, a text-to-speech engine can generate millions of utterances. But she adds that the actual words the robot will say are only part of its response: it will also use body language and alter the tone of its voice to suggest happiness, sadness, and surprise.