Applied AI Grants in the AI Summer
Perhaps you’ve had this experience at a conference over the last year or two? You listen to an applied AI project that uses some relatively new model, and the thought that goes through your head is, “Didn’t [OpenAI | Google | Anthropic | some random startup | someone else] just do/solve this a couple of months ago?” Maybe you’re noticing this happening more and more? It’s not your imagination. Welcome to the AI Summer. This feeling that something has been scooped or obsoleted will (a) only grow and (b) be exacerbated in academic activities that require more lead time. Important activities. Specifically, grants.
I’ve sat on several grant panels over the last couple of years for various directorates and agencies. In participating, I noticed a few PI (and sometimes panel) behaviors that I think are hurting applications with applied AI components. My best theory is that there is a problematic mismatch between the fast rate of change in the AI Summer and the long lead time of grants. The result is that problems that are impacting papers are even more pronounced in grants.
Grants in the AI Summer
Here’s the model: you get your idea, you do some preliminary work, you write, you apply for the grant, you get reviewed, you maybe get the funds released, maybe reapply, etc. Time passes between each step. Two months? Three? More? In that time, the world of applied AI is happily living its best life. It is, after all, the AI Summer. And your idea? Sorry…scooped and made obsolete.
I suggested to a program officer that the agency should put out some guidance about this. People who don’t apply often or review enough will not understand the fundamental ways in which times have changed. Writers may burn through multiple cycles trying to get it right. The reply was that maybe some “senior, respected member of the community would want to do it.” I don’t know about respected, but I have gray hair. Welcome to my take.
I strongly believe that learning to apply for applied AI grants in the AI Summer requires different strategies and defensive writing. Thinking this through will hopefully save someone (you? the reviewers?) a needless application cycle. The goal here is to mitigate the risks and hopefully make your reviewers happy.
Scoping and Scooping in the AI Summer
Nothing can save an overly specific idea. There are many reasons ideas can be bad, but I want to emphasize here that overly specific ideas are more likely to be scooped or obsoleted. This was true before the AI Summer but is more true today.
In earlier times, if the idea was too small, there was a non-zero chance that some of it would get scooped by year 1 or 2 of your grant. In the AI Summer, there’s a non-zero chance that you will get scooped by the time your idea ends up in front of a panel. The field is just moving too fast. Your panel reviews will point to published papers or just-released models that weren’t even an idea in someone’s head when you first wrote your grant. The panelists will understand that you had no chance of knowing about this, but that won’t save the grant. You have been scooped and your idea obsoleted.
Realistically, your ideas should never have been too small, too narrow, or too specific. But you may have gotten away with it before, and may think you can now. This is a risky model, and it is more critical now than ever to vet grant ideas for appropriate scoping. Can someone spend a few days with Copilot or ChatGPT and get pretty close to a working system? That’s an idea that may require some rethinking.
Some questions to ponder: First, you may get paper A out of your grant, but will you be able to hold off others from scooping you beyond that? And maybe it’s even worse, and you actually published paper A before you sent in that grant. Now I don’t know you, but I’m guessing you may have already published some part of Aim 1 as an initial paper? Even if not directly part of the aim, these are the preliminary/pilot results that demonstrate feasibility and momentum. It used to be ok to do this (even encouraged by some agencies). And it was fine to publish because it really took a while for someone else to pivot into your space. But that’s a big jump start to give others during the AI Summer. It will be hard to ever get that lead back. Panelists will see this and it will invariably impact your review.

Second, if your idea’s contribution was dealing with current model limitations (error rates, execution speed, hallucinations, etc.), would those limits still apply in 3 months? Will some new model make it so no one cares about that unpublished paper B? Or even worse, will paper A be obsolete on day 1? The nature of the AI Summer is that many ideas, especially poorly scoped ones, will simply have no shelf life.
The scope of ideas for the AI Summer needs to be different. They need to be more immune to someone hacking a solution together in a few days and more independent of any specific technology or model version. Grant ideas that may have slipped by unremarked upon in the past — they may have been simple or small, but still hard for someone else to pivot to — will not receive the same read today.
Just for the sake of having a working example, let’s consider a scheduling feature that automatically adds things to your calendar. Let’s make it Spring of 2022. It’s an exciting time, and LLMs are all the rage. We’re still on Zoom a lot, and you realize that people have chats during the call to schedule follow-up meetings. Instead of reading emails (we solved that in 1999), your tool will listen to video conference calls, find agreed-upon meeting times, and add them to participant calendars after the meeting. You tested the idea and have some initial benchmarks. It mostly works, but you are finding that the LLM extraction sometimes fails.

The semester ends, and you pause for summer vacation. Then there is the CHI deadline, and now it’s October of 2022. It’s time to start writing the grant. You decide to frame the problem with three Aims. Aim 1 will use a noisy speech-to-text model (state-of-the-art in Spring of ’22) and identify a prompt to extract the times and dates (using the best of the best: GPT-3.5). Aim 2 will test different prompts and characterize their behaviors because your initial finding was that results were sensitive to the prompt. Aim 3 is to take a subset of the prompts identified in Aim 2 and have them vote to do the extraction. Seems reasonable in what you assumed was a relatively static world. Except your panel meets in April of 2023, one month after the release of GPT-4. Suddenly all your assumptions about what doesn’t work are broken. GPT-4 yields virtually perfect extractions without prompt variants or voting. Not to mention some startup just built a Zoom plug-in to do this exact task. Scooped and obsoleted. But let’s see if we can help this (admittedly bad) idea…
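Just to make the mechanics of those aims concrete, here is a minimal sketch of what the Aim 3 prompt-voting extraction might look like. Everything in it (the prompt variants, the `call_llm` stand-in, the ISO 8601 output) is a hypothetical placeholder for discussion, not code from any real system.

```python
from collections import Counter

# Hypothetical prompt variants standing in for the ones Aim 2 would identify.
PROMPT_VARIANTS = [
    "Extract the agreed-upon meeting date and time (ISO 8601) from this transcript: {transcript}",
    "What follow-up meeting time did the participants settle on? Answer in ISO 8601. Transcript: {transcript}",
    "List the meeting time the speakers agreed to, formatted as ISO 8601: {transcript}",
]


def call_llm(prompt: str) -> str:
    """Stand-in for a call to whatever model the proposal assumed (GPT-3.5 at the time)."""
    return "2022-05-17T15:00"  # canned answer so the sketch runs end to end


def extract_meeting_time(transcript: str) -> str | None:
    """Aim 3 sketch: run every prompt variant and let the answers vote."""
    answers = [call_llm(t.format(transcript=transcript)).strip() for t in PROMPT_VARIANTS]
    answers = [a for a in answers if a]
    if not answers:
        return None
    winner, votes = Counter(answers).most_common(1)[0]
    # Require at least two variants to agree before trusting the extraction.
    return winner if votes > 1 else None
```

Spelling it out makes the problem visible: all of this machinery (the prompt sensitivity, the voting) exists only to paper over the limitations of one model generation, which is exactly the kind of contribution a newer model can erase overnight.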
Two years is a long time in the AI Summer
Your panelists are not fortune tellers (some may think they are, and I can’t help you there). The fast pace of the AI Summer makes it even harder to predict accurately what the world will look like in 2+ years. But part of the panel’s task is to decide whether it’s worth funding a general line of research for that period.
What they do have is the advantage of more time having passed relative to when you conceived the grant. Their predictions of where the world is going will be better than yours. They have more data. All those papers/projects/models that were in the future when you wrote the grant are in the past when the panelists read it. This will invariably change a person’s conception of where the world is going. Critically, it will change their evaluation of how you and your ideas fit in that world. The assessment will not be whether the idea is future-proof or “defensible” beyond the time of the grant’s completion. It will be: will it be future-proof in the next year? During the AI Summer, the answer will be “no” far more often than “yes.”
One (probably) obvious way of addressing this in a grant is finding ideas general enough that they will be interesting no matter what is happening in the world of models. Can you say that your idea will be as important given current trends and new models? Even better, will it be more important? Here’s the advice. First, make sure the answer is yes (to, hopefully, both questions). If not, maybe rethink/extend the idea. Second, even if it’s obvious, write down your reasoning. If you leave this to the reviewers to guess, you are gambling. Both the hype and the reality of the AI Summer risk leading a panelist to envision a future where the answer is no.
A more “general” grant instance of our working example is possible. Rather than a specific implementation target that depends on a specific model, maybe the focus changes to modeling the ambiguities in how people talk about time and schedules in spoken conversations. For example: what are the fundamental features of spoken language when it comes to this task, and how do we characterize ambiguities that are easy for humans with shared contexts but not for machines? A focus on this formulation of the problem is no guarantee that you won’t get scooped, but it’s less likely. The benefit is that you know the problem won’t go away with newer models. A revised Aim 2 may have the system automatically generate a bunch of recommended prompts given the ambiguity model. Maybe this modeling and generation is itself done by an LLM (LLMs all the way down!) and we will characterize this process? Not only will the problem not go away with new models, but now you can argue that this proposal defines the baseline and that new LLMs will just make it better.
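As a rough sketch of what that revised Aim 2 could look like, you might represent the ambiguity model as explicit categories and ask an LLM to propose extraction prompts aimed at each one. The categories, the meta-prompt, and the `call_llm` stand-in below are hypothetical illustrations, not a proposed design.

```python
from dataclasses import dataclass


@dataclass
class Ambiguity:
    """One category from a (hypothetical) model of where spoken scheduling talk gets ambiguous."""
    name: str
    example: str


# Illustrative categories only; a real proposal would derive these empirically.
AMBIGUITIES = [
    Ambiguity("relative dates", "let's do this again next Tuesday"),
    Ambiguity("implicit time zones", "how about 9 my time?"),
    Ambiguity("hedged commitments", "maybe we should meet after the review"),
]


def call_llm(prompt: str) -> str:
    """Stand-in for a real model call (LLMs all the way down)."""
    return "Extract only meeting times that both speakers explicitly confirm."


def generate_candidate_prompts(ambiguity: Ambiguity, n: int = 3) -> list[str]:
    """Ask a model to propose extraction prompts that handle one ambiguity category."""
    meta_prompt = (
        f"Write an instruction for extracting agreed-upon meeting times from a transcript "
        f"that is robust to {ambiguity.name}, e.g., '{ambiguity.example}'."
    )
    return [call_llm(f"{meta_prompt} (variant {i + 1})") for i in range(n)]
```

The research object is now the ambiguity model and the characterization of the generation process, not any particular model’s error rate, which is what makes the framing harder to obsolete.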
Write good limitations sections for the AI Summer
Program officers admonish panelists to review the grant rather than the person. That is, we should not give undue deference to someone famous if the grant isn’t perfect. We accomplish this with varying levels of success. What’s important here is that successful people are more likely to be “trusted” to make changes and “fix” limitations or concerns in the grant. They are assumed to know how to change directions to avoid being scooped or obsoleted.
While I wish you the good fortune of having no one doubt your ability to pivot, I think it’s possible to signal this ability regardless of fame. Specifically, by writing better risks and limitations sections. This was always a good idea in grants, but I think it is even more important today. Just as it’s worth writing about ideas in ways that demonstrate they are defensible and future-proof, it’s worth acknowledging the possibility of being wrong and providing good evidence that the ideas can pivot. For example, if some new model comes out that does what you didn’t think was possible, how will you use it? Think about specifically answering this in your writing (maybe even on a per-aim basis): “What if the technical gaps here become obsoleted in the meantime? What would we do?”
A related thing that you should consider is the ethical element of AI. Your panel will invariably consist of many people with many different takes (hot or not) on whether it’s a good idea to use AI for whatever you’re trying to do. It’s probably a bad strategy to assume that everyone is with you on this. Now, I’m not a fan of pro forma ethics sections (and if you give me an AI Prop 65 warning, I probably won’t think well of the grant). But for better or worse, these sections have become the norm in multiple communities. Whether you stick with that section or integrate the discussion into the grant some other way, you will likely be helping yourself. For example, you have likely (hopefully?) thought about the societal/user problems of automating whatever it is that you are automating. You’re also probably writing a broader impacts section. Maybe that’s a good opportunity to think, and write, about the broader broader impacts of AI?
Our working example is now immune to new developments in LLMs. Unfortunately, we haven’t directly addressed changes in speech-to-text models. The initial assumption was that they were too slow, so the extractions would happen offline. While it wasn’t as clear at the time how they would change, we could probably acknowledge that evolution will happen. Likely, they’ll become more accurate, and there’s a possibility that they will work in real-time. If that happens, users may expect your calendar adder to also work in real-time so they can quickly make corrections. Unfortunately, the assumption in your proposal was that you would be working offline — after the meeting was over — and the system would have the complete text of the conversation. But now you only get to see part of the conversation, and this is likely to increase ambiguity. A good limitations section might directly acknowledge the possibility of this updated model and these changing expectations. What pivot to your aims would be needed given this possible, even likely, change in capabilities? How does it change your ambiguity models? How will you update your specific research approach? (In case it’s not obvious, this entire proposal example is a pretty bad idea. It’s just for discussion.)
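Bad idea or not, if it helps to picture that pivot, here is a minimal sketch contrasting the original offline assumption with a streaming one. As before, the function names and the toy extractor are hypothetical placeholders rather than a proposed system.

```python
from typing import Iterable, Iterator, Optional


def extract_meeting_time(transcript: str) -> Optional[str]:
    """Toy stand-in for the extractor sketched earlier."""
    return "2022-05-17T15:00" if "Tuesday at 3" in transcript else None


def extract_offline(full_transcript: str) -> Optional[str]:
    """Original assumption: the complete conversation is available after the meeting."""
    return extract_meeting_time(full_transcript)


def extract_streaming(chunks: Iterable[str]) -> Iterator[Optional[str]]:
    """The pivot: re-run extraction on the partial transcript as each chunk arrives.
    Early answers are provisional and may change, which is where the extra ambiguity
    (and the need for a user-facing correction step) comes from."""
    transcript_so_far = ""
    for chunk in chunks:
        transcript_so_far += " " + chunk
        yield extract_meeting_time(transcript_so_far)
```

Even a sketch this small makes the limitation concrete: the offline version always sees the whole conversation, the streaming version never does, and a good limitations section says what the aims do in that gap.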
TL;DR
Again, I don’t have a magic recipe here. These are just some observations of how the AI Summer has changed the way reviewers read grants and evaluate ideas. There are new problems to think about, and some problems that were small in the past are bigger now. So:
- Recognize that no one, not even you, is likely to see the future perfectly during a fast-moving AI Summer.
- Scope your ideas so you don’t get scooped.
- Future-proof your ideas and tell your panelists why and how.
- Describe where, when, and how you will pivot.
- Appreciate that the limits of AI today may not be the limits of AI next year, next month, or even next week.
- Don’t assume panelists agree with your ethical framework.
A few final caveats, all of which should hopefully be obvious. First, I come from a certain sub-area of applied AI work (mostly HCI). I suspect this applies to other areas, but you’re welcome to tell me why (not). Second, the AI Summer has changed the funding situation for everyone, and you can find others’ advice if you’re not doing applied AI. Third, I hope I don’t need to say this, but following my advice here is no guarantee of receiving funding. Funding levels change, and you’re up against many other good applicants. As an aside, I will say that if you’ve never been rejected, you may be doing something wrong (but that’s another article). None of this advice is official from any agency, and you should always check in with your program officer. Finally, some of this applies to individual papers/projects, but I make no promises. Grants are supposed to describe many projects/papers/dissertations over 2–5 years. You may be able to get away with certain things for short projects/papers. That comes with its own costs, but that is also an article for another time.
Thanks to Jessica Hullman, Eric Gilbert, Hari Subramonyam, Michael Bernstein, and Sarita Schoenebeck for their feedback.