My Dream About AI

How Martin Peniak went from being a Slovakian builder to one of the world’s experts on GPU computing for cognitive robotics

Second Home
Work + Life
11 min read · Sep 28, 2016


Martin Peniak went from being a young man with no clear future in Slovakia to one of the pioneers in parallel processing. He has worked for the European Space Agency and at NVIDIA’s research centre in Silicon Valley. During his post-doctoral studies at Plymouth University, he trained the humanoid robot iCub and was the first to apply parallel programming in the field of cognitive robotics. Currently, Martin works for Cortexica at Imperial College in London, where he is creating a biologically inspired system for visual search.

“In 2004 I was still in Slovakia, working as a builder. I got fed up with it, so I travelled to Plymouth. I had very little money; I used to play African drums, so I brought a djembe with me in case I needed to play on the street or something.

I didn’t speak English and I was going to sleep in the bus station, but when I arrived somebody told me a woman had been killed there the day before, so I decided to migrate to a bench. In the morning I started applying for jobs at the Jobcentre.

I eventually found a job in a fish factory; it was a good place to be. One day when I was 20, exactly on my birthday, I met a guy who turned out to be a professor of biology from Plymouth University. He offered me help and told me, ‘Why don’t you go to uni?’ I had never wanted to go to uni, I never wanted to study anything, but then I decided to work at night and study during the day at Plymouth uni.

So I did my undergrad, and after three years I received some awards. After that they offered me a PhD scholarship, so I skipped the Master’s and started working with the iCub humanoid robot. I also worked for ISAM, and then I postponed the PhD for a while and worked for NVIDIA in Santa Clara. Then I moved to London to work at Cortexica.

At Cortexica we are developing a search engine — something like Google but not quite at that scale — an engine you can use to search by image. You can take a picture of anything and we will return the most visually similar items. The reason I work here is that I specialise in parallel computing, and we write a lot of code that uses GPUs and parallel computing because we need to deliver the results very fast.
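[As a rough illustration of the idea, and not Cortexica’s actual pipeline: image search of this kind can be reduced to extracting a feature vector from the query picture and ranking the catalogue by similarity. The descriptor type, the cosine metric and the function names below are assumptions for the sketch.]

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical descriptor: a fixed-length feature vector per image.
using Descriptor = std::vector<float>;

// Cosine similarity between two descriptors of equal length.
float cosine_similarity(const Descriptor& a, const Descriptor& b) {
    float dot = 0.f, na = 0.f, nb = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (std::sqrt(na) * std::sqrt(nb) + 1e-9f);
}

// Return the indices of the k catalogue items most similar to the query.
std::vector<std::size_t> top_k_similar(const Descriptor& query,
                                       const std::vector<Descriptor>& catalogue,
                                       std::size_t k) {
    std::vector<std::pair<float, std::size_t>> scored;
    for (std::size_t i = 0; i < catalogue.size(); ++i)
        scored.push_back({cosine_similarity(query, catalogue[i]), i});
    k = std::min(k, scored.size());
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end(),
                      std::greater<>());  // highest similarity first
    std::vector<std::size_t> result;
    for (std::size_t i = 0; i < k; ++i) result.push_back(scored[i].second);
    return result;
}
```

In practice a ranking like this would run over a very large catalogue, which is where the GPU and parallel-computing work he describes comes in.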

Other than this, I’m also quite interested in innovating and looking for new areas where we might be able to apply our visual search. We are also exploring the potential of future shopping — imagine that rather than just going to a website, you could go somewhere else entirely and see all the clothes in 3D and at real scale. Then if you were interested in something but didn’t want to buy it, just something similar, you could look at an item and we would bring you the most similar results.

I believe augmented reality is the next big thing. Virtual reality has been around for ages, but never quite at the level where we were able to make it a product and ship it. Now, because of the improvements in displays and processing power and everything, there are all these new headsets — Oculus CV1, HTC, all very impressive. So it’s just the beginning, but it’s very interesting.

Back in, I think, 2008/9 I started exploring the idea of using the GPU to accelerate artificial neural networks, which are computational models of real biological networks. I was doing that because I wanted to use these neural networks to allow the robot I was working with to learn actions and language. I wanted him to understand what I was telling him, and I wanted him to be able to execute the actions I was trying to teach him.

I will briefly go through some of the experiments I was doing at Plymouth. One of them was about how we can teach robots to walk and perform actions. We know that humans are good at learning actions: we do the same repetitive motions over and over, and the muscles somehow learn to make the proper movement.

But it’s a little bit different when it comes to robotics. It’s reasonably easy to get a robot to do things, but teaching the robot to do things so that it can generalise and adapt to situations it hasn’t previously experienced, to demonstrate a little of that capability humans have — generalisation — is where neural networks become very interesting.

I created a neural network model with lots of neurons arranged in different layers, and it was connected to the robot. Then I took the robot and showed him how to do the actions, basically. At the same time, I was recording all of the sensorimotor patterns while the robot was doing the actions.

The iCub is quite interesting because it’s really complex — it’s got 51 degrees of freedom, which means 51 different ways the motors can move — and the reason my work was interesting to NVIDIA and other people was the complexity of the controller. So I used a large neural network with thousands of neurons, which allowed the robot to control all 51 degrees of freedom. It had input from the eyes, and it also had linguistic input, so I was able to speak to the robot and give him simple verb/noun combinations like, ‘Push the cube’, ‘Touch the ball’.
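[A minimal, hypothetical sketch of the kind of controller being described: a visual feature vector and an encoded verb/noun command are concatenated, pushed through a hidden layer, and mapped to one target value per joint. The layer sizes and names below are assumptions, not the network from the talk.]

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Assumed sizes: visual features plus an encoded "verb noun" command in,
// one motor target per degree of freedom out.
constexpr int kVisionIn = 64;   // assumed visual feature size
constexpr int kLangIn   = 16;   // assumed language encoding size
constexpr int kHidden   = 256;  // stand-in for the "thousands of neurons"
constexpr int kJoints   = 51;   // iCub's 51 degrees of freedom

// One fully connected layer with a tanh activation.
std::vector<float> layer(const std::vector<float>& in,
                         const std::vector<float>& weights,  // out*in, row-major
                         int out_size) {
    const int in_size = static_cast<int>(in.size());
    assert(static_cast<int>(weights.size()) == out_size * in_size);
    std::vector<float> out(out_size, 0.f);
    for (int o = 0; o < out_size; ++o) {
        float sum = 0.f;
        for (int i = 0; i < in_size; ++i) sum += weights[o * in_size + i] * in[i];
        out[o] = std::tanh(sum);
    }
    return out;
}

// Forward pass: fuse vision and language, predict the next joint targets.
std::vector<float> controller_step(const std::vector<float>& vision,    // kVisionIn
                                   const std::vector<float>& language,  // kLangIn
                                   const std::vector<float>& w_hidden,  // kHidden*(kVisionIn+kLangIn)
                                   const std::vector<float>& w_motor) { // kJoints*kHidden
    assert(static_cast<int>(vision.size()) == kVisionIn &&
           static_cast<int>(language.size()) == kLangIn);
    std::vector<float> fused(vision);
    fused.insert(fused.end(), language.begin(), language.end());
    const std::vector<float> hidden = layer(fused, w_hidden, kHidden);
    return layer(hidden, w_motor, kJoints);  // one command per degree of freedom
}
```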

“We were training the robot to do certain actions on certain objects, but we didn’t train the robot to do all the actions on all the objects.”

What is interesting here is that we were training the robot to do certain actions on certain objects, but we didn’t train it to do all the actions on all the objects. So the goal here — something we were expecting and hoping for — was that after the neural network training, the robot would also be able to do the actions he’d never done before.

[iCub] is able to predict a step into the future: it receives certain input and, depending on that, he’s able to predict the next step, where the motors should be going. It’s fairly expensive; £200,000, I think it was.

When you compare the performance of neural network training — how fast it can learn — on a GPU rather than a CPU, you will see a certain amount of speed-up. That speed-up is related to the number of neurons the network has: the more neurons, the more speed-up you see compared to the CPU. That’s simply because a neural network is a computational model that can easily be parallelised, and therefore it’s a good fit for the GPU: the CPU has to do one neuron at a time, while the GPU can do all the neurons at the same time.
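[A minimal sketch of why this maps so well onto the GPU: each neuron’s output depends only on the shared input vector and that neuron’s own weights, so one CUDA thread can compute one neuron, while a CPU loop has to visit them one after another. Names and the launch configuration are illustrative.]

```cuda
#include <cmath>

// GPU version: one thread per neuron, all neurons updated concurrently.
__global__ void layer_forward(const float* input,    // [n_inputs]
                              const float* weights,  // [n_neurons * n_inputs]
                              float* output,         // [n_neurons]
                              int n_inputs, int n_neurons) {
    int neuron = blockIdx.x * blockDim.x + threadIdx.x;
    if (neuron >= n_neurons) return;
    float sum = 0.f;
    for (int i = 0; i < n_inputs; ++i)
        sum += weights[neuron * n_inputs + i] * input[i];
    output[neuron] = tanhf(sum);
}

// CPU version: the same arithmetic, but one neuron after another.
void layer_forward_cpu(const float* input, const float* weights,
                       float* output, int n_inputs, int n_neurons) {
    for (int neuron = 0; neuron < n_neurons; ++neuron) {
        float sum = 0.f;
        for (int i = 0; i < n_inputs; ++i)
            sum += weights[neuron * n_inputs + i] * input[i];
        output[neuron] = std::tanh(sum);
    }
}

// Launch example (device buffers assumed already allocated and filled):
//   int threads = 256;
//   int blocks  = (n_neurons + threads - 1) / threads;
//   layer_forward<<<blocks, threads>>>(d_input, d_weights, d_output,
//                                      n_inputs, n_neurons);
```

The more neurons a layer has, the more threads the GPU can keep busy at once, which is exactly the scaling effect described above.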

“Animals, humans, we have an amazing ability to recognise objects and to quickly know what’s where, but standard computer vision algorithms have been failing at many simple tasks like recognising objects.”

The interesting part is that animals and humans have an amazing ability to recognise objects and to quickly know what’s where, yet standard computer vision algorithms have been failing at many simple tasks like recognising objects. So I looked into another approach, one inspired by biological vision and especially by the fact that we don’t process the entire visual field at the same time the way a computer does — taking one frame, looking through ‘x’ number of pixels and trying to find something. The brain just doesn’t work like that.

When I was working at NVIDIA I created another neural network model, and its task was to recognise 3D objects. The interesting part was that I never really programmed it to do anything — I used genetic algorithms, which are algorithms inspired by natural evolution. You just say, ‘Neural network, I want you to recognise these objects’, and you give it a simple fitness function. Then you run it for hours and hours — if you use a GPU it’s even better because it will be faster — and if you’re lucky, at the end of this evolution you end up with a neural network control system that is able to automatically move from one place to another and recognise 3D objects. Evolutionary robotics is the approach where you combine neural networks with genetic algorithms; many people have done lots of cool things with it in the past, for example evolving the entire morphology of a system.

You can evolve virtual organisms in a simulation — they will have their own links and everything, they will be able to do tasks, avoid obstacles and all that, and you don’t have to do anything except design the experimental set-up. There are people who have evolved creatures in simulation and then 3D-printed them, and they function in real life. But of course there are many, many other applications.

My colleagues and I also worked a lot on obstacle avoidance — at the European Space Agency we designed a simulation of a Mars rover, Curiosity — and we designed a system with just a simple neural network and let it evolve. The rover was able to use a simple camera autonomously: it could avoid obstacles, avoid holes and recognise different types of terrain, whether it’s slippery and so on.

[With the evolutionary robotics approach] you don’t just have one organism; you create an entire population. So you treat these things like organisms and you create, say, a hundred of them. At the beginning the neural network has no clue — it has no knowledge or understanding, it is randomly initialised, and its behaviour is determined by the weights on the synaptic connections. When you run the entire population, each one is basically trying to make sense of the environment, trying to recognise the object, and doing really badly. But there will be one or two that by accident did it correctly, at least in some cases. So what you do is take those two and let them reproduce to create an entirely new generation.

These two will be like parents for the next generation, and the next generation will inherit their so-called genes (the genes in this case being the weights of the neural networks). So you have a second generation that is slightly better because it was based on the previously successful parents, and then you run it a hundred times, a thousand times, a million times — whatever is necessary — and through that process you can arrive at a control system that, in this case, is able to recognise 3D objects.
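[A minimal sketch of that loop, under the simplification that a genome is the flat vector of synaptic weights and that a user-supplied evaluate function runs the controller and returns its fitness. The two-parent scheme follows the description above; everything else is an assumption for illustration.]

```cpp
#include <algorithm>
#include <functional>
#include <random>
#include <utility>
#include <vector>

using Genome = std::vector<float>;  // the network's synaptic weights ("genes")

// evaluate: runs the neural controller with these weights and scores how well
// it recognised the objects / performed the task.
Genome evolve(const std::function<float(const Genome&)>& evaluate,
              int genome_size, int population_size, int generations,
              float mutation_std, std::mt19937& rng) {
    std::normal_distribution<float> noise(0.f, mutation_std);
    std::uniform_real_distribution<float> init(-1.f, 1.f);
    std::uniform_int_distribution<int> pick(0, 1);

    // Generation zero: random weights, no knowledge at all.
    std::vector<Genome> population(population_size, Genome(genome_size));
    for (auto& g : population)
        for (auto& w : g) w = init(rng);

    for (int gen = 0; gen < generations; ++gen) {
        // Score every individual and keep the two best as parents.
        std::vector<std::pair<float, int>> scored;
        for (int i = 0; i < population_size; ++i)
            scored.push_back({evaluate(population[i]), i});
        std::sort(scored.rbegin(), scored.rend());
        const Genome parent_a = population[scored[0].second];
        const Genome parent_b = population[scored[1].second];

        // Next generation: children inherit a mix of the parents' genes,
        // plus a little mutation so the search keeps exploring.
        for (auto& child : population)
            for (int j = 0; j < genome_size; ++j)
                child[j] = (pick(rng) ? parent_a[j] : parent_b[j]) + noise(rng);

        population[0] = parent_a;  // elitism: keep the best individual unchanged
    }
    return population[0];  // best genome found
}
```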

I was training this neural network on all of these orientations, not just a single image, and under different lighting conditions, because the database also had different lighting conditions. So it was quite robust in that sense.

Its perception was loosely based on biology. It wasn’t anything we would pre-programme; it was something it evolved, so it just learned to do it by itself. You train it and you specify some criteria — you say, ‘I want you to be able to recognise that this is a cup, this is this, this is that’, and then I measure its performance. Depending on how well it does, it will either die, or survive and be able to have offspring.

I deliberately left out half of the data set to test how well the neural network had actually learned to recognise an object. When I give it a different view of that object, is it still able to recognise it or not? In the end the neural network was, in most cases, able to recognise it even from an angle it had not seen before.

Another interesting part was this little retina that could move from one place to another — it’s computationally very cheap and doesn’t really need much processing power; only the training is expensive. It takes ages to train, but once it’s trained it gives you an answer extremely fast.
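[A rough sketch of why such a moving retina is cheap at run time: only a small window around the current fixation point is cropped and passed on, however large the full frame is. The patch and frame sizes below are made up.]

```cpp
#include <algorithm>
#include <vector>

// Extract a small "retina" patch centred on (cx, cy) from a grayscale frame.
// Only patch*patch pixels are ever handed to the network, regardless of the
// resolution of the full frame.
std::vector<float> retina_glimpse(const std::vector<float>& frame,
                                  int width, int height,
                                  int cx, int cy, int patch) {
    std::vector<float> glimpse(patch * patch, 0.f);
    const int half = patch / 2;
    for (int dy = 0; dy < patch; ++dy) {
        for (int dx = 0; dx < patch; ++dx) {
            const int x = std::clamp(cx - half + dx, 0, width - 1);
            const int y = std::clamp(cy - half + dy, 0, height - 1);
            glimpse[dy * patch + dx] = frame[y * width + x];
        }
    }
    return glimpse;
}

// A 32x32 glimpse is about a thousand values per step, whereas a full
// 640x480 frame is around 300,000 -- one reason inference stays so light
// once training is done.
```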

When I started in 2008, NVIDIA had published a toolkit, CUDA, that you could use to program these cards. CUDA was relatively simple compared to what you had to do before, and now it’s super simple. Previously you would have to use DirectX or OpenGL, represent your algorithm as some sort of pixels, give it to the graphics card, and the graphics card would give it back.
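[To give a sense of the difference: with CUDA you write an ordinary C-like function and launch it across many threads, rather than disguising the computation as pixel shading through OpenGL or DirectX. A minimal, self-contained example, not code from the talk:]

```cuda
#include <cstdio>

// Each thread scales one element of the array: plain C-like code, no
// graphics API, no pretending the data is a texture of pixels.
__global__ void scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float* d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_data, 2.0f, n);
    cudaDeviceSynchronize();

    std::printf("kernel finished: %s\n", cudaGetErrorString(cudaGetLastError()));
    cudaFree(d_data);
    return 0;
}
```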

I remember when I presented my first talk at NVIDIA, it was maybe 2010, and they had no clue about neural networks. I said, ‘I want to do neural networks’; my poster was the first one on neural networks and robotics, and they were just looking at it like, ‘What are neural networks good for?’

Everything now is about deep learning, and of course they’re also pushing virtual reality… So they’re creating more efficient platforms — the amount of improvement is incredible. I don’t know if I see anything super-imminent coming from AI. We are definitely making good progress — if you’ve seen Boston Dynamics, those robots running around in the forest with people kicking them, that’s pretty cool. But to design an actual intelligent machine, I think we are still a little bit far from that.

But I’m very grateful for these revolutions in, for example, deep learning and systems that can recognise many different objects in a single image. Now you go to Facebook, you just upload a photo and it goes, ‘Oh, this is your mate, Alex’, with auto-tagging and all these things. It’s just because they’re using these technologies.”

https://www.youtube.com/watch?v=NsmiKJ8BAq4

This talk took place at Second Home, a creative workspace and cultural venue, bringing together diverse industries, disciplines and social businesses. Click here to find out who’s speaking next.
