AI Research is Both Simpler & Harder Than You Think

Daniel Bohen
Writ 340 Econ, Spring 2024
9 min read · Apr 30, 2024

Artificial intelligence has swept across the world in the past few years, largely due to advancements in deep neural networks and their applications to generative AI and autonomous vehicles. For the first time in modern history, there appears to be a serious intellectual competitor to the human race. Opinions on where this might lead are divided both within the research community and among the general public; some see it as the end of humanity, while others believe it is a path to a technological utopia for all. Regardless of the long-term outcomes, one thing seems abundantly clear to everyone: there’s a lot of money to be made, and companies are willing to pay top dollar for machine learning engineers and researchers. Given the capabilities of artificial intelligence, and the overrepresentation of Stanford and MIT grads in the field, many people assume that the field must be beyond their ability to understand. This assumption is detrimental not only to the future of AI research, but also to the growing number of fields that AI is affecting, including medicine, finance, and law.

To be clear, research into artificial intelligence does involve some inherent complexity. It is a cross-disciplinary field spanning math, statistics, computer science, electrical engineering, neuroscience, and philosophy. There’s almost no field that artificial intelligence hasn’t drawn from, and the notation and terminology are conflicting and vague as a result. Despite the breadth of the field, it’s well within everyone’s capability to understand, because it is largely just many very simple concepts stacked together.

Here it’s important to note the differences among simplicity, difficulty, and complexity. Many people conflate simple things with being easy, and complex things with being hard. In fact, this was one of the first misconceptions in the field of artificial intelligence itself, which led to initial applications focusing on chess and other complex topics rather than the “easy” topics like vision and reading. Simplicity means that a concept is easy to understand and break down, which often correlates with but does not directly imply that something is easy to accomplish. As an example, building habits is a very simple thing to do: just keep doing the thing. That’s all there is to it. However, for the majority of people, building habits is very difficult despite the underlying simplicity of the concept. Similarly, the human system of motion and vision is so complex that it is still not fully understood, yet nearly any toddler can accomplish both with ease.

Artificial intelligence research, and research in general, usually falls into the category of ‘simple but difficult’. All it requires is consistent, focused effort over a period of time: first developing the prerequisite skills, then getting up to speed with the state of the field, and finally advancing the field yourself. Google X mathematician Patrick Kidger has one of the most succinct pieces of advice on the matter: “Just know stuff” (Kidger 2023). Working at arguably the most selective lab on the planet, after earning his doctorate from Oxford and publishing one of the most well-regarded textbooks on neural differential equations, he’s certainly an authority on STEM research. He goes into detail on the list of topics to ‘just know’, from ones as abstract as “optimal Jacobian accumulation for reverse-mode auto-differentiation” to ones as elementary as linear regression, but the central idea remains very simple: research is the result of deep knowledge of the fundamentals, built upwards. OpenAI cofounder John Schulman expressed a similar sentiment in his advice to the company’s young researchers (the same researchers now responsible for ChatGPT and the generative AI explosion), summarizing it this way: “The keys to success are working on the right problems, making continual progress on them, and achieving continual personal growth” (Schulman 2020). Or consider Richard Feynman, one of the greatest Nobel Prize-winning theoretical physicists of all time: “You ask me if an ordinary person—by studying hard—would get to be able to imagine these things like I imagine. Of course. I was an ordinary person who studied hard” (Wonders of Physics 2021).

This argument raises the question: “If it’s so easy, why do we only hear about advancements from CS and math PhDs working at top companies alongside their Ivy League colleagues? And can we really trust the opinions of the above Oxford, Caltech, and Princeton PhDs (respectively) on what is easy?” It’s a fair question, and the reality is that success in artificial intelligence research comes down primarily to three things: time spent, which collaborators you work with, and funding.


Time spent is the obvious one, and is the argument made by the researchers above, yet it is still often overlooked by many who are interested in artificial intelligence. There’s no real way around learning linear algebra, calculus, and statistics, along with some computer programming concepts. Again, this falls into the “simple but difficult” category – anyone who can multiply and add can learn every single one of those concepts if they’re willing to put the time in. The notation and jargon can take some time to get used to, and many concepts take time to develop an intuition for, but there’s nothing in any of those fields that you couldn’t write out by hand as an equation of just multiplication and addition. The complex concepts are just notational abstraction – they make it simpler for practitioners to communicate quickly with each other and record their ideas.
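To make that concrete, here is a toy sketch (mine, not from any textbook): the matrix–vector product, the workhorse hiding behind most linear algebra notation, written out as nothing but multiplication and addition.

```python
# A matrix-vector product "Ax" -- dense linear algebra notation --
# unpacked into plain multiplication and addition.

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector, one multiply-add at a time."""
    result = []
    for row in matrix:
        total = 0.0
        for weight, value in zip(row, vector):
            total += weight * value  # just multiply, then add
        result.append(total)
    return result

A = [[1.0, 2.0],
     [3.0, 4.0]]
x = [10.0, 1.0]
print(matvec(A, x))  # [12.0, 34.0]
```

The notation `Ax` compresses those two loops into two characters – convenient shorthand, not hidden depth.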

Take deep learning as an example. Deep learning has been around for several decades, and is by definition just “function approximation”: it uses an optimization algorithm to find a mapping from a given input x to a desired output y while making the fewest errors possible. The fundamental concepts are just a weighted average (taught in elementary school), the chain rule of calculus (taught in calculus 1 to middle school and high school students around the globe), and a “nonlinear activation function”. Despite the math-y wording, a “nonlinear activation function” just means “if the input a meets some condition, such as a > 0, output one thing; otherwise, output another”. Basic conditional logic is a pre-kindergarten concept taught to us by our parents: “if you behave, you get a cookie; otherwise, you get a time out”. Incidentally, the core “non-convex optimization algorithm” (more technical jargon) of stochastic gradient descent can also be distilled into a concept from early childhood: it’s just basic operant conditioning, rewarding some actions and punishing others. Despite its success in a wide range of applications, there is no magical formula or secret math that AI researchers have hidden away – deep learning boils down to a few simple concepts any high schooler could grasp. One needs to spend time mastering these fundamentals and learning the notation for how they scale (largely linear algebra), but the underlying concepts remain simple to the point of being boring.


What’s more interesting is the importance of early collaboration, which addresses the ‘complex and difficult’ issue in AI research: “What on earth do you actually work on?” Dr. Richard Hamming spent decades at Bell Labs, worked on the Manhattan Project, and received the “Nobel Prize of Computer Science”, the Turing Award. In a speech to Bell Labs researchers, he repeatedly emphasized that “solid work, steadily applied, gets you surprisingly far. The steady application of effort with a little bit more work, intelligently applied is what does it. That's the trouble; drive, misapplied, doesn't get you anywhere… it must be applied sensibly” (Hamming). Deep learning is successful because it satisfies the ‘universal approximation theorem’ – a sufficiently large network can approximate any continuous function to any desired level of accuracy. This flexibility means that there is an effectively infinite number of ideas to work on, just using the simple building blocks defined above. Navigating an infinite list of options is both difficult and complex, and the best solution we’ve found is to rely on experience. Collaborating with great, experienced researchers early in your career allows you to leverage their experience to work on the right problems from the start, rather than attempting to navigate the daily deluge of newly published papers and research directions on your own. A Nature study supports this, finding that early-career collaboration with top researchers is a greater predictor of research success than intelligence metrics like GPA or IQ (Li 2019). Having the right connections early on enables you to capitalize on early successes and compound that advantage, imprinting young researchers with the right attitudes and principles to drive their careers. These attitudes and principles culminate in a research style, with different styles being beneficial depending on the individual’s chosen profession.
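The universal approximation idea can be seen in miniature with the same ReLU building block from before. In this small sketch of mine (a hand-picked special case, not a proof), two ReLU units combine to reproduce the absolute-value function exactly – stack enough such kinks together and you can trace out any continuous curve.

```python
# Universal approximation in miniature: two ReLU "neurons" with weights
# +1 and -1, summed by the output layer, exactly reproduce |x|.

def relu(a):
    return a if a > 0 else 0.0

def abs_via_relus(x):
    # a one-hidden-layer "network": |x| = relu(x) + relu(-x)
    return relu(x) + relu(-x)

for x in [-3.0, -0.5, 0.0, 2.0]:
    print(x, abs_via_relus(x))  # matches abs(x) at every point
```

More units give more kinks, and more kinks give a finer piecewise-linear fit – that, informally, is all the theorem promises.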

One’s research style is determined by a variety of factors and is largely unique to each individual. For artificial intelligence research, however, we can paint in broad strokes and identify four categories: purely theoretical, applied theoretical, applied, and purely empirical. Purely theoretical research takes place almost exclusively in universities and hopes to have applications decades from now. Applied theoretical research most often takes place in universities collaborating with industrial stakeholders, with timelines of roughly one to five years. Most applied research comes from industry labs seeking immediately profitable ideas, and has led companies like Google and Meta to become the largest publishers of artificial intelligence research by a wide margin. Purely empirical research is a looser classification: an amalgamation of ‘best practices’ collected from online forums and personal experience. Most of these have little to no theoretical backing, but they are widely applied in industry and academia alike to bridge the gap between what we know works and what we know should work. Beyond these research areas, each with its own style, we can’t ignore the technological and hardware advancements that have arguably mattered more than anything theory has provided over the last several decades.


These technological and hardware advancements are where the last factor comes in: funding. Successful deep learning is fundamentally an engineering task, supported by advancements in parallel computing software and hardware, necessitating an army of software engineers and IT professionals. The key improvement behind ChatGPT and generative AI is just an architecture (the transformer) that makes it easier to train in parallel rather than sequentially – at the cost of hundreds of millions of dollars in computing power on Nvidia GPUs and Microsoft cloud servers. That is an incredibly simple problem to solve – just reach into your wallet – but that doesn’t make it any less difficult. Compute is getting cheaper by the day, but funding remains a realistic and common obstacle for any solo artificial intelligence researcher. Luckily, venture capitalists and big tech are shelling out for anyone with a little experience, which provides a great opportunity to work at the cutting edge of applied research.

Despite the claims of oversaturation and difficulty in artificial intelligence research, there can never really be enough researchers, and anyone can do it if they are willing to put in the time. Artificial intelligence is not an end in and of itself; it’s a tool that can be applied to fields such as medicine, classical engineering, education, and more. Encouraging more people to join the field will increase the diversity of researchers and research output, which is essential to improving the quality of the technology and the effectiveness of deep learning in real-world applications. The more experienced researchers there are, the easier it becomes for newcomers to succeed, as they benefit from better early collaboration. It might be hard, but the steps are simple and certainly achievable.
