Coming up with research ideas

Marco Tulio Ribeiro
9 min read · Jul 13, 2022

“What project should I do next?” is a recurring question for anyone doing research, from undergrad to PhD (and beyond). This question can be broken down into (1) coming up with ideas, (2) organizing and fleshing out ideas, and (3) deciding which ideas are worth pursuing as research projects. In this blog post, I present a collection of heuristics for step 1, while part 2 deals with steps 2 and 3.

I use a lot of anecdotes from my own work — not because I think everyone should do what I do, but because that’s where I have the most insight into how the ideas came about. Of course, this means my advice and heuristics are biased towards the research I like doing, and thus should be taken with a grain of salt and/or adapted to your own style and preferences.

Precondition: expand your adjacent possible

Ideas tend to come in the ‘adjacent possible’ of your current knowledge. For example, von Neumann was able to immediately conjecture duality for linear optimization after George Dantzig presented him with a linear optimization problem, by making a connection to game theory that would probably not be obvious to anyone else (see this, second column of page 3, for Dantzig’s amusing account). Other examples abound (see this book for more).

A corollary of the above is that the more you know, the easier it is to come up with ideas. I think a good rule of thumb is to get a broad overview of many areas adjacent to your own, and to go into real depth in your own area (however you define it), or as needed (e.g. to solve a subproblem in a project). Courses and survey papers are great for getting broad overviews, since someone else has already gone through the trouble of compiling, organizing, and presenting the important information in a coherent thread.

While great, courses can get outdated quickly, or may not cover a sub-area you want in enough detail (e.g. it’s easy to find an NLP course, but harder to find a course on ‘interpretability in NLP’). Recent papers or talks can be more up-to-date (e.g. I gave a talk on interpretability for QA here which I think is still current, but will probably be outdated soon). The key is to get the big picture rather than getting bogged down in details (unless you’re going for depth, in which case you do want to understand the details). In particular, I would pay careful attention to what kinds of problems people are working on, what techniques are state of the art, and how evaluation is done.

While I make a conscious effort to always have some ‘learning’ time on my schedule, I am not very systematic in terms of figuring out what to read or learn (I figure some randomness is probably good). If I see a problem or technique I’m not familiar with mentioned in different papers, I’ll typically make an effort to learn about it. I also ask collaborators to send me recent papers they really like. Another heuristic I’ve enjoyed using is reading papers that get awards from ML and NLP conferences, even if they have nothing to do with my research (I know awards are biased and somewhat random, but it’s a good filter for papers that are at least good). Finally, I highly recommend walking around at random during poster sessions (and talking to authors) to get fodder for future learning efforts.

Heuristics for coming up with ideas

In addition to trying to always expand my adjacent possible, I like the heuristics below. Note that these are overlapping and obviously not exhaustive, but I think they are still really useful.

Heuristic 1: Set your filters on ‘important problems’

Richard Hamming has a well-known piece of advice that says you should ask what the most important problems in your field are, and work on them. A lesser-known addendum to this advice is how he describes great scientists:

They have something between 10 and 20 important problems for which they are looking for an attack. And when they see a new idea come up, one hears them say “Well that bears on (is relevant to) this problem.”

I think the principle at play here is that when we consciously prioritize something, we tend to notice how other things relate to it, and to downweight everything else (e.g. the famous invisible gorilla). Having a list of important problems is a way to intentionally ‘set your filters’, helping you notice connections while you read papers, watch talks, or work on other things. In other words, it’s a way of making sure you always check whether the new knowledge you acquire opens up the adjacent possible near certain problems. Hamming himself set aside Friday afternoons for thinking “great thoughts”, i.e. where his field was going, what was important, what was going to be important, etc.

I think trying to ‘set your filters’ is a really good practice in general, and one way to do that is to make lists of things you want to have ideas about. I like Hamming’s suggestion of keeping a list of the most important problems, but I change it slightly so that my list has the most important problems for me (i.e. the ones I’m most interested in). My list is of course shaped by my interests and idiosyncrasies, but I’ve found it useful to ask researchers I respect for their lists of most important problems, and then ask them to push back on items in mine.

Heuristic 2: Investigate failures / annoyances

In the course of doing research (or class projects, or whatever), most people fail quite often, i.e. they try things that don’t work. It’s particularly annoying when you feel like some technique or other should solve a problem you’re having, but it does not. Such failures and annoyances are fertile ground for new ideas, but only if you take the time to understand what the problem and its causes actually are. Since these failures are usually tangential to whatever you are actually trying to do at the time, it is tempting to ignore them once you find a workaround or give up. Instead, I suggest writing them down in a list of ‘failures I don’t understand’, to investigate later.

I am very fond of this technique, because I accidentally used it to find my PhD thesis topic. I was doing an internship at Google, and a model I had trained was terrible ‘in the wild’ even though it had good cross-validation accuracy. I spent a lot of time getting new data, perturbing inputs, and looking at predictions, and it really bothered me that it took so long to understand what my model was actually doing. My advisor asked me to give a presentation on my internship project, and I ended up talking exclusively about this annoyance (and a half-baked vision of how great it would be if we could understand the behavior of any ML model), rather than about what I actually did in the internship.
After the talk, he asked me if I wanted to work on this problem. At the time I was working on an unrelated topic (distributed systems + ML), but we made the switch and wrote this paper, which set the trajectory for the rest of my PhD. Many of my other papers came about in a similar fashion, as I tend to work on things that annoy me.

Manufacturing failures: what if you don’t have an existing list of ‘failures I don’t understand’? Worry not: there is a very reliable way of bootstrapping that list. If you read a paper you like, or if there is a new technique or model ‘in the air’ (e.g. BERT, GPT-3, CLIP, DALL-E, whatever), spend time thinking about how the technique or idea could be used to do something cool or useful that was not done in the paper, and then try to do it. It will almost always fail in some way, and you can start investigating that failure.

Heuristic 3: Use analogies

Try to find analogies between existing work and open problems (your list of important problems from Heuristic 1 will come in handy here). This may feel like a rehash of the von Neumann / Hamming examples above, but I think it’s worth pointing out that this heuristic goes beyond ‘applying technique X to problem Y’. Let me illustrate with a couple of anecdotes, with the relevant analogy spelled out.

Back in 2017, there were a lot of papers on adversarial examples for vision models, but nothing really satisfactory for text models. Part of the reason is that images live in a continuous space, so it’s easy to define distance metrics that roughly correspond to human perception (e.g. pixel perturbations with a small L2 norm are mostly imperceptible to the human eye). I was working on another project, trying to find the subset of the input that is ‘sufficient’ for a prediction, to use as an explanation, and it bothered me that this subset was often much larger than I expected for text models, because the models were really sensitive to small changes (notice how Heuristic 2 is also at play here — this annoyance was tangential to that project). At some point it dawned on me that what bothered me was predictions changing with a small change in semantics, not a small change in the number of characters or words. For example, it’s fine if a sentiment analysis model makes different predictions for “This is a good movie” and “This is not a good movie” (even though only one word changed), but not fine if it makes different predictions for “This is a good movie” and “It is a good movie”. I realized that the right analogy to ‘small L2 norm’ in text was ‘small semantic distance’, not ‘small edit distance’, and that turned into this paper.

This project in turn led to another analogy (and another project). In addition to individual adversarial examples, we ended up proposing adversarial rules — regex-like rules (e.g. replace ‘What is’ with ‘What’s’) that caused models to change a lot of predictions. When I showed this to a friend, he told me ‘these rules are just like unit tests for ML’. I loved that analogy, and started thinking along the lines of ‘what other kinds of unit tests do I want to write?’, an investigation that turned into its own project.
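As an aside, here is a minimal sketch of what a rule-based ‘unit test’ of this kind might look like. It is not the actual method from the paper, just an illustration of the idea, and it assumes a hypothetical predict function and a list of questions: a rewrite rule is applied to inputs, and the model’s predictions on the original and rewritten inputs are compared.

```python
import re

def apply_rule(pattern, replacement, sentences):
    """Apply a regex rewrite rule, keeping only the sentences it actually changes."""
    pairs = []
    for s in sentences:
        rewritten = re.sub(pattern, replacement, s)
        if rewritten != s:
            pairs.append((s, rewritten))
    return pairs

def count_flips(predict, pairs):
    """Count how often the model's prediction changes under the rewrite."""
    flips = sum(1 for original, rewritten in pairs if predict(original) != predict(rewritten))
    return flips, len(pairs)

# Hypothetical usage (predict and questions are placeholders for your own model and data):
# pairs = apply_rule(r"\bWhat is\b", "What's", questions)
# flips, total = count_flips(predict, pairs)
# print(f"{flips}/{total} predictions changed under the rewrite")
```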

I think these anecdotes make it clear that using analogies is not simply applying a technique to a new problem, but exploring similarities and dissimilarities. The analogy acts both as a constraint (which can be discarded in the parts where the analogy doesn’t hold) and as a source of ready-made ideas from the original domain, which you can try to apply to the new one.

Heuristic 4: Challenge the status quo

Challenging some status quo can be a good source of ideas, but it is crucial to first try to understand why it is the status quo. The well-known principle of Chesterton’s fence applies here:

There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”

Even though Chesterton is talking about political change, the principle also applies well to research — if you don’t know what led to a status quo, trying to challenge it often leads to wasted effort, where you hit the exact roadblocks that led to the status quo in the first place.

One benefit of this heuristic is that even when it fails, it leads to a better understanding of the status quo, which expands your adjacent possible and leads to better ideas in the future. This is important, since this heuristic often fails, as there is usually a reason why something becomes the status quo (when challenging it is easy, other people tend to do it before you). When it does succeed, however, it leads to really cool work. An example off the top of my head is this paper (which won the test of time award at NeurIPS 2020), which challenged the (very reasonable) assumption that some synchronization (i.e. locks) is needed when parallelizing SGD. One of the reasons the paper is cool is that the status quo was so reasonable — everyone ‘knows’ you need to deal with race conditions when parallelizing any algorithm, and that you do this with locks (except they didn’t).

Are ideas cheap?

“Ideas are cheap, execution is everything”

The quote above (attributed to various authors) may apply to other domains (e.g. startups), but I don’t think either clause holds very well for research. While generating ‘ok’ ideas is easy, good ideas are not cheap, and generating more ideas is one of the best ways of getting a few good ones. Further, while execution is certainly important, the main contribution of some great research papers is the idea, even if execution is not stellar.

This post was an attempt to make ideas a little cheaper. In part 2, I’ll talk about organizing these ideas, and deciding which ones are worth pursuing as research projects, in order to avoid the common pitfalls of wasting time on ill-defined ideas, or working on the first idea that seems ‘good enough’.

Acknowledgments

Alex Cabrera, Gabriel Ilharco, Adarsh Jeewajee, Fereshte Khani, Scott Lundberg, Shikhar Murty, Sameer Singh, Tongshuang Wu, and Yilun Zhou read a previous (and much worse) version of this post, and contributed to it with helpful suggestions. Thanks!
