12 Things Deep-Learning Startups Need To Know
Tips from Arjun Bansal, co-founder of Nervana Systems
With a startup, you have to jump before you can be certain where you’ll land. Being comfortable with that is really important.
Founded in 2014, Nervana Systems is a San Diego-based startup that’s building an easy-to-use cloud platform for deep learning — one that companies can use to make their applications smarter. So far, the startup has raised close to $25 million in funding.
Its three founders — Arjun Bansal, Naveen Rao and Amir Khosrowshahi — met while working at chipmaker Qualcomm. Their shared goal for Nervana, their newest venture, is to bring much-needed simplicity to the application of brain-inspired algorithms.
Many customers don’t really know what deep learning can do or how they could integrate it into their pipeline.
Deep learning typically involves feeding data into artificial neural networks to train them, then feeding in new data for the network to draw inferences from. The technique is praised for making sense of large amounts of data and is popular for processing images, video, text and speech.
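To make that train-then-infer loop concrete, here is a minimal sketch in plain NumPy. It is purely illustrative and is not Nervana's Neon API: a tiny two-layer network is fitted to toy data, then fed new data for predictions.

```python
# Toy illustration of the train-then-infer loop described above.
# Plain NumPy only; this is not Nervana's Neon API.
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: 200 samples, 4 features, binary labels.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

# One hidden layer (tanh) feeding a sigmoid output unit.
W1 = rng.normal(scale=0.1, size=(4, 8))
W2 = rng.normal(scale=0.1, size=(8, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(500):
    # Forward pass: feed the data into the network.
    h = np.tanh(X @ W1)
    p = sigmoid(h @ W2)
    # Backward pass: cross-entropy gradient, propagated to both weight matrices.
    grad_out = (p - y) / len(X)
    grad_W2 = h.T @ grad_out
    grad_W1 = X.T @ ((grad_out @ W2.T) * (1.0 - h ** 2))
    W1 -= lr * grad_W1
    W2 -= lr * grad_W2

# Inference: feed in new data and read off the network's predictions.
X_new = rng.normal(size=(5, 4))
print(sigmoid(np.tanh(X_new @ W1) @ W2).round(2))
```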
Last year, Nervana released Neon, its deep-learning software, under an open-source license, allowing free use to anyone keen to try it. One such user, Kenzo Takahashi, recently tweeted his feedback on the product.
Leading a recent HackerUnit online masterclass (01/18/16), co-founder and head of the machine learning team Arjun Bansal opened up about his early interest in AI and neuroscience, his aspirations for Nervana, and how the startup prefers an open dialogue when it comes to making mistakes. But first — a little about Arjun.
Meet Arjun Bansal
Arjun Bansal has more than 12 years of research experience under his belt, spanning computational neuroscience and brain-machine interfaces, along with stints as a venture capitalist at Slater Fund, and as a software engineer at Microsoft. His post-doc work took place at Harvard Medical School and Boston Children’s Hospital in the neurosurgery department, where he helped analyse large-scale neurophysiological data from epilepsy patients.
This kind of research has applications in prosthetics, where chips implanted in paralysed patients record neural signals that are then used to control such things as a robotic arm or a cursor on a screen.
In his own words, “Since early on, I was interested in AI applications. This was towards the late ’90s when we were kind of entering the second AI winter. My professors and mentors advised me to look into other areas of investigation, so I latched onto neuroscience because I was really interested in how the brain works and, at that time, people were looking at neuroscience as a way to get inspiration for making smarter AI.”
From a corner of his office in San Diego, Arjun answered questions from eager attendees of his online HackerUnit masterclass. Here are some of the experiences he shared from building a deep-learning startup.
Arjun’s wise words on startups
LAUNCH QUICKLY AND ITERATE QUICKLY
For most software startups, it’s quite possible to do. With hardware, however, it can be hard. For any kind of software-based startup, try to get a minimum viable product out there and start collecting feedback so you can begin to iterate. We’ve kind of adopted the Tesla model, which is to build the fastest, biggest kind of sports car that we can, before building the economy version. We’re targeting the cloud as our first product, so our chip will be quite big and very fast, and will allow us to scale. And it’ll be more power efficient than a GPU or CPU.
GETTING THE PRIMITIVES AND API RIGHT
For the kind of work we’re doing, which is trying to build a platform for deep learning that other people can use, and that we can use, what we really focus on is getting the primitives and API right. Then we, or others, can build whatever research applications we want on top of that. So our efforts are mostly engineering, in terms of building things fast, in a scalable way. Right now, that mostly means speeding up and scaling up existing algorithms.
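As a hypothetical illustration of what “getting the primitives and API right” can mean in practice (the names below are invented for this sketch and are not Neon's actual interface), a deep-learning platform typically exposes a small set of layer primitives behind a stable contract, so applications can compose them without caring how each one is implemented, and faster kernels can be swapped in underneath.

```python
# Hypothetical sketch of layer "primitives" behind a common API.
# Names are illustrative only; this is not Neon's actual interface.
import numpy as np

class Layer:
    """The contract every primitive implements."""
    def forward(self, x):      # compute outputs from inputs
        raise NotImplementedError
    def backward(self, grad):  # propagate gradients back to inputs
        raise NotImplementedError

class Linear(Layer):
    def __init__(self, n_in, n_out):
        self.W = np.random.randn(n_in, n_out) * 0.01
    def forward(self, x):
        self.x = x
        return x @ self.W
    def backward(self, grad):
        self.dW = self.x.T @ grad   # gradient for the weights
        return grad @ self.W.T      # gradient for the inputs

class ReLU(Layer):
    def forward(self, x):
        self.mask = x > 0
        return x * self.mask
    def backward(self, grad):
        return grad * self.mask

# Applications compose primitives without knowing how each is implemented,
# which is what lets a faster kernel be substituted underneath the same API.
model = [Linear(4, 8), ReLU(), Linear(8, 1)]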
ON COMPETITORS
A question I’ve been asked is, how did we convince investors that we had something better than Nvidia? It’s interesting because Nvidia has taken the approach of building on top of its GPUs. So whatever card or chip they build, it still has to support graphics. We see that as being similar to how things were when GPUs started off in the late ’80s and early ’90s, where you could do graphics on a CPU, but you could do it a lot faster on a GPU. You can do machine learning and deep learning on a GPU, but you could potentially do it much faster on a dedicated processor, so that’s why we think we can do better. Obviously, there are a few key technical points as to why our chip will be faster than GPUs, and we presented that argument to our investors, people who are hardware savvy. They immediately grasped what the technological opportunity was here.
OUR BEST MOVE SO FAR
I think some of the key decisions that seem to have helped us were achieving the fastest performance for deep-learning applications, even on a GPU, so we can beat Nvidia on their own hardware. That was thanks to the efforts of one of the engineers on our team, who came up with optimised kernels all the way down to the assembly level; then a lot of the other people were able to contribute their systems engineering around that, to speed up the parts of the workload that aren’t pure compute as well. Things like loading the data from disk, and doing that in a very efficient way. All that’s helped us get quite a bit of credibility when it comes to speed and scale for deep learning.
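One common way to speed up the “loading data from disk” part of the workload is to overlap I/O with compute. The sketch below is an illustrative background-prefetch generator written for this article, not Nervana's actual data loader.

```python
# Illustrative sketch of overlapping disk I/O with compute, the kind of
# non-compute optimisation described above. Not Nervana's actual loader.
import queue
import threading

def prefetch(batch_paths, load_fn, depth=4):
    """Load batches on a background thread so compute never waits on disk."""
    q = queue.Queue(maxsize=depth)
    SENTINEL = object()

    def worker():
        for path in batch_paths:
            q.put(load_fn(path))   # e.g. read and decode one batch from disk
        q.put(SENTINEL)

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not SENTINEL:
        yield item

# Usage (hypothetical): for batch in prefetch(paths, np.load): train_step(batch)
```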
SETTING OUR FEES
We’re still fairly early in terms of figuring out exactly what consulting fees should be. I think the market is potentially really big if you look at the kinds of amounts that enterprise companies pay for their IT services on a yearly basis. They can be in the tens of millions, if not hundreds of millions, of dollars. So we’re still trying to figure out exactly what that’s gonna look like. We approach it holistically, as a combination of cloud services plus the consulting aspect. And it really depends on the customer, what their problem is, how quickly they want the solution. All those things go into figuring that out. But we’ve been experimenting with a few different models and, so far, the pricing has come out slightly different each time, but not too different.
MY DEEP-LEARNING PREDICTION FOR 2016
Since I’m representing Nervana, my prediction is that Nervana becomes the platform for deep learning in 2016. In terms of the field itself, I’d hope to see more multi-modal applications. There’s some interesting work going on in robotics, where people are combining vision and speech and language, with motor control and reinforcement learning. There’ll be a tremendous amount of progress in 2016; the problem won’t get solved, but it’s gonna be really exciting what the convergence of all those different models and state-of-the-art results looks like by the end of the year.
MAKE A BET ON NEW HIRES
We’ve hired people who are really talented, either programmers or machine-learning researchers. In terms of the math, deep learning is a little bit easier to understand than some of the more complicated machine-learning methods, like Bayesian graphical models. So we’ve gone after really smart developers and made a bet that they’d be able to pick up the deep-learning aspect really quickly, and most of them have been able to do that.
IF YOU’VE GOT THE BUDGET TO HIRE, START RIGHT AWAY
We’ve definitely used Kaggle a lot. One of the machine-learning engineers on the team used to be No. 2 on Kaggle, and others have had top-10 finishes in its competitions. You’re going to be getting lots and lots of résumés, but you should reach out through your network as well. That’s something we did really well at Nervana: we tapped into the networks of people the founders knew, got them excited about the company, and got them to the point where they wanted to join. It’s definitely a long process.
We were faced with very big engineering challenges every day — that’s part of what makes it really exciting.
WHAT I LOOK FOR IN EMPLOYEES
It’s not one thing but a combination of things, and it also depends on the position. Since most of our [machine learning] programming is in Python, we tend to look for very strong Python programmers with some level of deep-learning or machine-learning experience, as well as an attitude of wanting to learn different things and wanting to solve problems. That can make up for when someone doesn’t have the skills right away; we know we can make a bet on them and they’ll pick up the skills as and when needed.
We’ve kind of adopted the Tesla model, which is to build the fastest, biggest kind of sports car that we can, before building the economy version.
RAISE AS MUCH FUNDING AS YOU CAN
It’s an interesting time in the market; this year has already started off in a very volatile way. The general advice that VCs and advisors are giving out is to try and raise as much as you can now, because it’s hard to predict what’s going to happen; things are a little bit more uncertain. It’s been about six or seven years since the last crash. These things tend to be cyclical, so it’s possible there might be some kind of correction this year.
WHAT THE MONEY ALLOWS
It feels great to have raised almost $25 million. In the past, I’ve worked on a couple of smaller-scale startups where we were trying to bootstrap and not dilute as much. A consequence of that was we didn’t end up growing; we only got to two to four people. Having the money [with Nervana] lets us scale up, and we don’t have to make any short-term trade-offs that don’t make sense for the company. It also allows us to think longer term and optimise over a two-year timeframe rather than three to six months. That’s really valuable for us because we’re building hardware, and that requires longer-term thinking than a software- or web-based product.
CONVINCING CUSTOMERS OF THE VALUE OF DEEP LEARNING
We run into an entire spectrum of customers. Some are already convinced of the value of machine learning and deep learning; they just want the tools and the infrastructure, and somebody to help them get started. Then there are definitely others who need to be convinced or, even if they’re convinced, it’s not a priority for them. So we definitely have these kinds of conversations. Obviously, if they’ve already bought into deep learning and are convinced it’s good for their business, then it’s a lot easier to start working with them.
Are you in the midst of building a startup? Did you find Arjun’s savvy useful? Let us know what’s helped you on your deep-learning journey.
Follow us on Twitter @hacker_unit