Trends in AI: Precursors to a “Common Sense” AI

Jim Burrows
Personified Systems
Apr 12, 2016

[Updated: Reposted to fix broken links. Sorry.]

Intelligent robots and smart phones, “mixed reality” goggles, and a hacker’s home-grown self-driving car are all trending towards a better breed of artificial intelligence.

This column is something of a complement to Earl Wajenberg’s regular “AI in the News” column. Whereas Earl generally deals with the latest AI stories in the science and popular press, I’ll be dealing with trends and developments that might not be quite as recent, but which I think are significant.

One of the concepts that I have been writing and talking about during the past year is what I have called “Common Sense AI”. (See, for instance, the latest version of my working document, “AI, a ‘Common Sense’ Approach”.) By “common sense” in this context I am referring to what Aristotle called the “common sense” and the Medievals, the “common wit”: the mental faculty that takes the data from all of our various physical senses and integrates them into a unified model or perception of the world. (See “Aristotelian common sense” in Wikipedia.) It is my contention that this ability to integrate sense data into the perception of a physical world comprising distinct objects is at the heart of natural intelligence, and thus of Artificial General Intelligence (AGI).

There are at least four very interesting developments that I think are worth mentioning in this area.

  1. Google’s Project Tango
  2. Microsoft’s HoloLens
  3. Autonomous’s Deep Learning Robot
  4. George Hotz’s self-driving car

These all represent efforts that go back a couple of years, and each represents an aspect of the trend towards being able to create autonomous systems that begin to demonstrate an integrated perception of the world, a model within which artificial intelligence can operate.

Google’s Tango

Project Tango Development Kit

Tango is an Android-based hardware and software project focusing on vision, spatial perception, and the realm of “the common sense” as I’ve been using the term. The tag line on its web site reads, “Project Tango’s Development Kit is equipped with technology that allows it to understand space and motion. Let’s build something amazing together.”

The project is a couple of years old, has already resulted in a number of hardware platforms, and is expected to yield a consumer product this year. The current development platform has a large collection of senses to integrate: a motion-tracking camera, a 3D depth-sensing camera, an accelerometer, an ambient light sensor, a barometer, a compass, GPS, and a gyroscope. The software concentrates on three main capabilities: Motion Tracking, Area Learning, and Depth Perception.
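Fusing such heterogeneous sensors is, in miniature, the “common sense” problem itself. As a hedged illustration (my own toy sketch, not Tango’s actual algorithm), a classic complementary filter blends a gyroscope’s smooth-but-drifting angle estimate with an accelerometer’s noisy-but-absolute one:

```python
import math

def complementary_filter(samples, alpha=0.98):
    """Fuse gyro rate and accelerometer readings into a pitch estimate.

    samples: iterable of (gyro_rate_deg_s, accel_x_g, accel_z_g, dt_s)
    alpha:   trust placed in the gyro's short-term integration (0..1)
    """
    pitch = 0.0
    estimates = []
    for gyro_rate, ax, az, dt in samples:
        # Gyro: integrate angular rate -- smooth, but drifts over time.
        gyro_pitch = pitch + gyro_rate * dt
        # Accelerometer: absolute tilt from gravity -- noisy, but drift-free.
        accel_pitch = math.degrees(math.atan2(ax, az))
        # Blend: high-pass the gyro, low-pass the accelerometer.
        pitch = alpha * gyro_pitch + (1 - alpha) * accel_pitch
        estimates.append(pitch)
    return estimates

# A device tilting at 10 deg/s for one second, sampled at 10 Hz:
readings = [(10.0, math.sin(math.radians(i)), math.cos(math.radians(i)), 0.1)
            for i in range(10)]
print(complementary_filter(readings)[-1])
```

Each individual sensor is weak on its own; the filter’s whole value lies in the combination, which is the point of an artificial common sense.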

So far, most of the technology shown in Tango has taken a more traditional graphics, 3D-mapping, and virtual-world approach, but earlier this year Google and Movidius, the manufacturer of Tango’s 3D vision chip, announced plans to bring Deep Learning (DL) technology to mobile devices (see this article from Yahoo Finance, for instance). Google has been a leader in practical Deep Learning technology, and integrating it into a chip that can be used in small devices, and into the Tango platform, will be a step toward a common sense implementation that parallels the one found in natural systems.

While Project Tango is aimed explicitly at hand-held devices, phones and tablets, they have been deployed mounted on a number of self-mobile devices. These include wheeled and treaded robots, quadrotor drones, and even NASA’s floating SPHERE “satellites” on board the International Space Station (seen here).

Microsoft’s HoloLens

Whereas Google is looking to integrate vision systems into mobile devices, phones, and tablets, Microsoft is looking to upgrade Virtual Reality (VR) and its specialized hardware, namely VR goggles, into what it calls “Mixed Reality”: a hybrid of creating convincing simulated worlds and “Augmented Reality”, displaying visual information integrated into views of the real world. Its premiere effort in this arena is the HoloLens device and the attendant (misnamed) “holographic technology” built into Windows 10.

In order to create and maintain a convincing mixed reality, the HoloLens system needs to build and continuously update an accurate model of the world, along with the device’s position and orientation in both that model and the real world. Once more, this is only the barest start on the kind of complex integration of multiple modes and sources of sensory data that I have in mind, but it is a start. Like Google, Microsoft has released open source machine learning APIs and libraries on the net.
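Keeping a hologram “pinned” to the real world comes down to maintaining that pose and applying its inverse every frame. A minimal two-dimensional sketch (my own hypothetical illustration, not the HoloLens API): given the device’s position and heading in world coordinates, a world-anchored point is mapped into device-relative coordinates by the inverse rigid transform:

```python
import math

def world_to_device(point, device_pos, device_yaw_deg):
    """Map a world-fixed 2-D point into device-relative coordinates.

    point, device_pos: (x, y) in world coordinates
    device_yaw_deg:    device heading, counter-clockwise from the +x axis
    """
    # Translate so the device sits at the origin...
    dx = point[0] - device_pos[0]
    dy = point[1] - device_pos[1]
    # ...then rotate by the inverse (negative) of the device's heading.
    yaw = math.radians(-device_yaw_deg)
    return (dx * math.cos(yaw) - dy * math.sin(yaw),
            dx * math.sin(yaw) + dy * math.cos(yaw))

# A hologram anchored at (5, 0); the device stands at the origin facing +x:
print(world_to_device((5, 0), (0, 0), 0))    # directly ahead of the wearer
# The wearer turns 90 degrees left; the anchor should now be off to the right:
print(world_to_device((5, 0), (0, 0), 90))
```

The real device does this in three dimensions, dozens of times a second, against a pose estimate that is itself fused from cameras and inertial sensors; any error shows up immediately as a hologram that swims instead of staying put.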

Microsoft has recently started selling and shipping a $3,000 developer model of the HoloLens (see “Inside Microsoft’s HoloLens” on The Verge). It remains to be seen how interested the world at large is in adopting goggles as massive as the HoloLens, or even lightweight devices like Google Glass. Viewed as one element of an ecosystem that includes Kinect and “Windows Hello”, Windows 10’s visual recognition feature (which like Tango uses Intel’s RealSense camera technology), however, the HoloLens shows that Microsoft is definitely moving in the direction of an artificial common sense technology.

Autonomous’s Deep Learning Robot

Whereas both Google and Microsoft are concentrating on devices carried or worn by humans, Autonomous has released a development platform that is specifically designed to integrate Deep Learning into self-mobile devices. The hardware of the Deep Learning Robot is similar in many ways to that of the Project Tango developer devices; the main processor in both is Nvidia’s 192-core Tegra K1 graphics processor. The DL Robot uses an Asus Xtion Pro 3D depth camera, and the pre-installed software includes Ubuntu, Google TensorFlow, Caffe, Torch, Theano, cuDNN v2, and CUDA 7.0.

It remains to be seen how this platform will be used, but at $1000, it would seem to be well suited to advancing the art of an artificial common sense.

George Hotz’s self-driving car

My final “trend” is a particular and interesting development within the overall movement toward autonomous and, eventually, self-driving cars. George Hotz, also known as “geohot”, is a young hacker who first caught the world’s attention as a teenager by unlocking the iPhone, and who went on to hack the Sony PlayStation and Samsung Galaxy S5. He has since worked at both Facebook and Google. His age and hacking background made him a rather surprising entrant in the nascent self-driving car business, but in late 2015 he created a company, Comma.ai, with a reported $3 million in backing, to build a self-driving upgrade kit for a number of late-model vehicles.

In January, Bloomberg ran a story entitled “The First Person to Hack the iPhone Built a Self-Driving Car. In His Garage”, interviewing him and showing a demo of his self-driving, hacked 2016 Acura ILX on a California highway (a demo that earned him a cease-and-desist order when the story was published). According to the interview, the car is the product of two coordinated efforts. First, he hacked into the computer network that controls the Acura and added his own joystick “drive by wire” controls, allowing him to control the car directly rather than through the steering wheel and pedals. Second, he added a Lidar unit and a number of cameras to the car and connected these sensors to a computer running a deep learning system.

Rather than trying to program driving skills and rules into the car, he used the DL system first to monitor him driving and then, when it was skilled enough, to take over and drive itself. Like the AlphaGo team, he found that teaching a skill to a well-constructed DL system went very quickly. He was extremely pleased to observe that the car had learned, when there is a bike lane with bikes on it, to pull a little to the left of the center of its own lane to give them room, a skill that he had not intentionally taught it.
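This “watch, then imitate” recipe is generally known as behavioral cloning: record (sensor reading, driver action) pairs, then fit a model that predicts the action from the reading. A toy sketch of the idea, with a one-feature linear model and invented data standing in for Hotz’s cameras, Lidar, and deep network:

```python
# Behavioral cloning in miniature: learn steering from recorded demonstrations.
# The "sensor" here is a single lane-offset feature; a real system would feed
# camera and Lidar data into a deep network instead of this linear model.

def train(demonstrations, lr=0.1, epochs=200):
    """Fit steering = w * offset + b by gradient descent on squared error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for offset, steering in demonstrations:
            err = (w * offset + b) - steering
            w -= lr * err * offset
            b -= lr * err
    return w, b

# Recorded pairs: when the car drifts right (positive offset), the human
# steers left (negative angle), and vice versa.
demos = [(0.0, 0.0), (0.5, -0.25), (-0.5, 0.25), (1.0, -0.5), (-1.0, 0.5)]
w, b = train(demos)

# The learned policy now "drives": a rightward drift yields a corrective
# leftward steering angle, without any hand-written driving rule.
print(w * 0.8 + b)
```

The bike-lane behavior Hotz observed is exactly what this approach promises: the model absorbs regularities in the demonstrations, including ones the demonstrator never articulated.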

While it is not clear that the Acura has much of a common sense, Hotz’s approach to self-driving, creating an autonomous machine-learning system with a complex array of senses and then teaching it a skill by example, certainly fits the mindset that is likely to lead to a common sense AI.

--

Jim Burrows
Personified Systems

On the ‘net (the ARPAnet) in ’74. 4 decades career doing hi-tech things I never did before. Researched Machine Ethics. Retired to create novels and comic books.