Aibo Hive Mind: Rise of the Robot Dogs

Interfrastic
6 min read · Feb 11, 2018


In the future, pets will use the internet to communicate, share experiences and learn from each other. That is not the plot of a Disney movie; it is the surreal future as Sony envisages it.

The first Aibo pet robots went on sale in 1999 and spawned a cult following in Japan and the US. In an effort to return to profitability, Sony scrapped the program in 2006 under the leadership of its first non-Japanese CEO, Howard Stringer. The inventor of Aibo and head of Sony’s Digital Creatures Lab at the time, Dr. Toshitada Doi, subsequently held a funeral for his pet robot.

The Aibo program was revived in 2016 under Sony’s current CEO, Kazuo Hirai, and the new model (ERS-1000) is now on sale in Japan. The focus of this refresh is unsurprising in 2018: artificial intelligence and cloud integration.

The all-new Aibo is bristling with features designed to provoke a human response. Large OLED puppy-dog eyes convey a wider range of emotions. Pressure-sensitive and capacitive sensing pads on its back, head and chin respond accurately to touch. Ultra-compact actuators provide more lifelike movements. Two fish-eye cameras (one in its nose and one on its back) help Aibo analyse its environment, and specialised time-of-flight depth sensors enable more advanced environmental sensing and modelling. Four separate microphones allow it not only to locate the sources of sounds but also to recognise 100 different voice commands. It is powered by a 64-bit quad-core processor and has both WiFi and 4G LTE connectivity. This stack of electronics is designed to make you fall in love. But make no mistake: this is not a dog for dog lovers; this is a pet for gadget freaks.

Sony’s Aibo press release talks of the company’s “well-cultivated deep learning technology”, and this is where things get interesting. There is no public information on the new Aibo’s machine learning architecture, so what follows is largely speculation.

Given the array of sensors on the new Aibo, there is no doubt that it can measure human emotional responses. After all, detecting emotions, such as a smile, from facial landmarks is old hat; the technique dates back to at least 2002.
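As a minimal illustration of just how old hat it is, here is a sketch of a smile detector built on the open-source dlib library and its standard 68-point facial landmark model. The width-ratio threshold is an illustrative assumption, not anything Sony has published:

```python
# Minimal smile-detection sketch using dlib's 68-point facial landmarks.
# The 0.5 width-ratio threshold is an illustrative assumption; a real
# system would learn a classifier over the landmark geometry.
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def is_smiling(gray_image) -> bool:
    for face in detector(gray_image):
        pts = predictor(gray_image, face)
        mouth_width = pts.part(54).x - pts.part(48).x  # mouth corners
        jaw_width = pts.part(14).x - pts.part(2).x     # cheek to cheek
        if jaw_width > 0 and mouth_width / jaw_width > 0.5:
            return True
    return False
```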

Similarly, detecting “oohs” and “ahhs” with its microphone arrays would not be much of a challenge for modern AI. Machine learning relies on collected data to model behaviour. Aibo could record input from all its sensors, as well as the output of its actuators, at the moment of, and leading up to, a detected positive engagement with a human. This data could serve as a positive training sample for desirable human interaction. Capturing negative examples may be harder: a negative example would involve the absence of positive feedback, or even sadness, but whether that was caused by Aibo or by some external factor in the environment may be difficult to determine. Regardless, Aibo could record such chunks of data (memories? experiences?) and upload them to Sony’s cloud servers for analysis.
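A rough sketch of how such memory capture might work (the pipeline and all names here are hypothetical; Sony has published nothing about its data collection): keep a rolling buffer of recent sensor and actuator frames, and snapshot it as a labelled sample whenever a positive engagement is detected:

```python
# Hypothetical sketch of capturing the sensor/actuator history leading
# up to a detected positive engagement, for upload as a training sample.
from collections import deque
import json, time

BUFFER_SECONDS = 10
FRAMES_PER_SECOND = 20

history = deque(maxlen=BUFFER_SECONDS * FRAMES_PER_SECOND)

def on_frame(sensor_readings: dict, actuator_commands: dict):
    """Called every control tick; records one frame of experience."""
    history.append({
        "t": time.time(),
        "sensors": sensor_readings,      # cameras, mics, touch pads, depth
        "actuators": actuator_commands,  # joint targets sent this tick
    })

def on_positive_engagement(confidence: float):
    """Called when a smile or an "ooh" is detected; snapshots the buffer."""
    sample = {
        "label": "positive",
        "confidence": confidence,
        "frames": list(history),
    }
    upload_to_cloud(json.dumps(sample))

def upload_to_cloud(payload: str):
    ...  # hypothetical: queue for transmission over WiFi / LTE
```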

From a machine learning perspective, it is interesting to consider how well positive interactions learned with one owner might translate to another. In other words, can the Aibo hive mind learn from collective experience? One would expect some Aibo behaviours to be universally appreciated, and given sufficient training data, learning them should be feasible. Demographics may well play a role in this transfer of experience: a young girl will likely appreciate different kinds of interaction than a young boy. Camera-based face analysis is mature enough to determine demographics such as age and gender, so Aibo could conceivably play to its audience.
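As a toy illustration of playing to the audience (the buckets, behaviours and estimator are all invented for this sketch):

```python
# Toy sketch of "playing to the audience": pick behaviours based on an
# estimated demographic bucket. The buckets, behaviours and estimator
# are all invented for illustration.
BEHAVIOUR_POLICIES = {
    "child": ["play_bow", "chase_tail", "bark_softly"],
    "adult": ["sit", "offer_paw", "head_tilt"],
}

def estimate_demographics(face_image) -> str:
    """Stand-in for an off-the-shelf age/gender estimator."""
    return "adult"  # a real model would infer this from the image

def choose_behaviours(face_image):
    bucket = estimate_demographics(face_image)
    return BEHAVIOUR_POLICIES.get(bucket, BEHAVIOUR_POLICIES["adult"])
```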

There are effectively three parts to a typical deep learning machine: the inference engine, which performs the core computation; the network, which defines the structure of the learning machine; and the weights, which control how data flows through the network. In a simplistic analogy to the brain, the inference engine encodes the physical properties of neurons, the network represents the structure of the brain, and the weights are the synapses that pass electrical signals between neurons.
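In, say, PyTorch terms this separation is easy to see; the example below is purely illustrative and implies nothing about what actually runs on Aibo:

```python
# Illustrative PyTorch example of the three parts:
#  - the PyTorch runtime is the inference engine,
#  - the Module definition below is the network structure,
#  - the state_dict holds the weights (the "synapses").
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    def forward(self, x):
        return self.layers(x)

net = TinyNet()                  # the network (structure)
weights = net.state_dict()       # the weights (learned parameters)
out = net(torch.randn(1, 16))    # the inference engine runs the data through
```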

It is likely that all Aibos will share the same inference engine and network. An interesting design choice is whether to sync the weights of all Aibos as well; in other words, whether to push identical updates to every dog so they all behave the same way. A single set of weights across all Aibos would have a number of practical advantages: technical support, for example.
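With a single shared set of weights, a fleet-wide update could be as simple as a broadcast. A hypothetical sketch (the URL and update scheme are invented):

```python
# Hypothetical sketch of a fleet-wide weight sync: every Aibo loads the
# same published state_dict, so all dogs behave identically.
import torch
import torch.nn as nn

def sync_global_weights(net: nn.Module,
                        url="https://aibo.example.com/weights/latest.pt"):
    state = torch.hub.load_state_dict_from_url(url)  # fetch published weights
    net.load_state_dict(state)                       # identical behaviour fleet-wide
```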

According to the feature list: “aibo keeps on growing and changing, constantly updating its data in the cloud. Over time, your approach to nurturing your aibo will gradually shape its personality … It’ll even learn new tricks through interactions with other aibo, experiences with changing seasons, and different events.”

But Sony is evidently allowing each Aibo to maintain a unique set of weights and characteristics. Potentially there could be a global network pre-trained on data (experiences?) from across the Aibo population, complemented by data sets from local interaction with the dog’s owner. The learning rate (the rate at which the weights in the network are updated) for a specific Aibo would have to be carefully managed so that local data biases the model towards its owner without overwhelming the shared behaviour: the dog should respond best to its owner while still interacting effectively with the other humans it comes into contact with.
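One plausible way to realise this, sketched below and not to be read as Sony’s actual method, is to start from the global weights and fine-tune on locally collected samples with a deliberately small learning rate, so the owner’s data biases the model without erasing the fleet-wide behaviour:

```python
# Sketch: fine-tune the globally pre-trained network on local owner data
# with a small learning rate, so local experience biases, but does not
# overwrite, the shared behaviour. All values are illustrative.
import torch
import torch.nn as nn

def personalise(net: nn.Module, local_loader, epochs=1, lr=1e-5):
    optimiser = torch.optim.SGD(net.parameters(), lr=lr)  # small lr = gentle bias
    loss_fn = nn.CrossEntropyLoss()
    net.train()
    for _ in range(epochs):
        for sensors, reaction in local_loader:  # owner-specific experiences
            optimiser.zero_grad()
            loss = loss_fn(net(sensors), reaction)
            loss.backward()
            optimiser.step()
```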

It is worth noting that once the architecture for a distributed network of Aibo intelligence is in place, it could effectively update itself. Data would stream into the cloud; servers would train improved networks and transmit them down to the dogs, which would adapt them to their local environments. Forget about the cute dogs for a moment: this is a network of artificially intelligent things that learn and evolve autonomously. There is nothing quite like this in AI right now*.

[* Edit: Ok, I was wrong about that: see Federated Learning.]
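For the curious, the core of a federated scheme is that each device trains on its own data and only the weight updates travel to the server, which averages them into a new global model. A minimal FedAvg-style sketch:

```python
# Minimal FedAvg-style aggregation sketch: the server averages the
# state_dicts returned by each dog and redistributes the result.
import torch

def federated_average(state_dicts):
    """Average a list of model state_dicts into one global state_dict."""
    avg = {}
    for key in state_dicts[0]:
        avg[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg
```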

Is Aibo going to be a commercial success this time round? It is hard to say, and I suspect the real benefit to the company lies in developing an AI that can interact with humans in a positive way by coordinating learning across a network of autonomous agents. It is not difficult to see how knowledge gained from the Aibo project could transfer to high-volume commercial applications such as robotic receptionists, checkout operators or, err, other professions.

It is a stereotype that the Japanese are fascinated with humanoid robots. Until recently, this obsession has seemed a little fanciful (and in stark contrast to the Americans’ focus on practical military robots). But with the rise of deep learning, Sony may well have opened up an important new frontier in human-robot relations.
