Becoming an Independent Researcher and getting published in ICLR with spotlight
Why I became an Independent Researcher, why you should avoid it, and advice for those who make the difficult choice anyway.
In April 2019, I decided to become an independent researcher, to work on getting published at a major conference. Then in December 2019, after 7–8 months of work without funding, I did just that. I got published in ICLR; even better, I got a spotlight!
After I posted my publication on Twitter, it went about as viral as a researcher like me can hope for. As a result, I have now received nearly 100 messages from others asking for advice on how to publish as an independent researcher.
In this article, I want to give that advice. However, more importantly, I also want to discuss why I chose to become an Independent Researcher. Working 7 months without funding is no joke and deserves serious consideration!
Why I became an Independent Researcher
In March 2018, I got published in distill.pub with “Visualizing memorization in RNNs” as sole author, demonstrating how an interactive saliency visualization for NLP reveals that two models with nearly the same accuracy can behave very differently.
I thought this and a Master's degree in Machine Learning would be enough to be considered for a Ph.D., a Research Software Engineer position, a Residency Program, or an ML Engineer position. So I wrote to various professors and applied to Google, Microsoft, Rakuten, ElementAI, Nvidia, Hypefactors, Intel, JD, Amazon, Samsung, Shift Technology, Corti, and a few others. — I did not get a single interview!
For the Research Software Engineer position at Google, a Senior Researcher at Google even reached out to me, encouraged me to apply to his department, and made an official referral. To be safe, I also asked for a referral from a Senior Software Engineer at Google, with whom I have worked a lot (in open source, Node.js). Still, no interview, no contact from HR, nothing!
Reality check, “1–2 publications in top ML venues”
However, I did get this from a professor:
“Thank you very much for your mails. It looks super-interesting. In general, getting a PhD here is currently a bit out of the way, without 1–2 publications in top NLP/ML venues, but Distill and your industry experience partially compensates for this. Right now I have no open positions that can be applied for by Danes, unfortunately.” — Professor at University X [Top 100, in QS ranking].
A friend of mine, who knew a former committee member at another university, got this answer:
“A top PhD program for AI will unfortunatly require a top publication already. I’ve been on the committee at [REDACTED]. If you don’t have an ACL/EMNLP/CVPR/ICCV/ICLR/NeurIPS or ICML paper, it is a very low probability you will get into [REDACTED].” — Former committee member at University Y [Top 5, in QS ranking].
Yikes! I have received other emails (most professors don’t answer) where the professor is less clear but definitely not interested in my application. I suspect they also require 1–2 top publications but don’t want to admit it, because it is a terrible policy that favors only the most fortunate students. At my university, where I did my Master’s, we were definitely not encouraged to publish. And Denmark is not a bad place to get your MSc degree.
To start a PhD without an insider referral, you need to do work equivalent to half of a PhD.
Getting started, how I funded myself
How did I get my research idea? How did I fund myself? Those have by far been the most asked questions. However, while they do deserve to be addressed, I don’t think they are that relevant. There are so many ways to get an idea, and if you don’t have any commitments to others, there are many ways to cut down your expenses. Feel free to skip to “A lonely life, not losing my sanity or hope”; you won’t hurt my feelings.
I had been working as a freelancer from September 2017 to October 2018. I got started when NearForm reached out to me, on the recommendation of my friend Emil Bay, regarding a new project they wanted to do, called clinic.js. The project required a detailed understanding of the internals of Node.js, a statistical background, and web-visualization skills. Having done lots of visualization, been involved in Node.js internals for 6 years, and just finished my MSc in machine learning, I was pretty much as perfect a fit as anyone could imagine. As a result, I was paid quite well: enough to support myself for 3–4 years if I kept my expenses very low.
To say I lucked out regarding funding would be an understatement. But Denmark is also an incredibly expensive country. Funding myself in another country would probably have been easier.
In 2019, they also asked me to develop the TensorFlow part of the IoT smartwatch/badge given out at NodeConf EU 2019. While not enough to cover all my expenses for this year, it definitely helped.
Getting started, where do ideas come from
As I said, there are so many ways to get an idea, so don’t take my words too seriously; apply your own creativity.
In February 2019, I was at an opening event for a Student Society in AI at my former university. I was there hoping to talk to my former supervisor about a Ph.D. Unfortunately, he had nothing to offer me.
However, I did meet Alexander R Johansen, who was a Research Assistant and told me he was looking for people to collaborate with. Later, in March 2019, I wrote to him. He explained that he had had several students try to reproduce the DeepMind paper “NALU” but that none had succeeded, and he asked if I wanted to investigate it; maybe it could become a NeurIPS paper.
Both my MSc thesis and my Distill publication were about thinking critically about others’ work that was either exaggerated or misleading, and then improving on it. And the problem mostly centered around optimization, which was an area I felt somewhat comfortable with. So this felt like something I might be able to do.
Almost all publications exaggerate how well they perform. Just improving others’ work is a viable research strategy.
So, there you have it. It is not a super inspiring strategy for research ideas, and it does come with some significant challenges (just read further), but it is a viable research strategy.
Now on to the stuff that really matters!
A lonely life, not losing my sanity or hope
As an Independent Researcher, you can’t really expect encouragement from anybody. I know, I know, a lot of Ph.D. supervisors don’t encourage their Ph.D. students either, but hopefully, they can get it from their Ph.D. peers who will have similar struggles. That is pretty much impossible as an Independent Researcher and is the number one reason I would recommend against becoming an Independent Researcher.
Not having a support network that is experiencing the same struggles of writing a paper, as first author, is the number one reason I would recommend against becoming an Independent Researcher.
Everybody needs at least some encouragement; don’t think you can go on for 7 months straight without any. I constantly worried about not finding a solution, getting unjustified peer reviews, not getting useful results, discovering a significant flaw, and, even if I got published, having no effect because it is a niche subject.
Spending 7 months funding yourself is a significant risk. If I hadn’t gotten published, it would have been a colossal waste. And as an Independent Researcher, it is reasonable to say that my chances were lower than average, simply because I got less feedback.
However, while you probably won’t have a support network that is experiencing the same struggles, there are other things you can do.
- I usually meet weekly, sometimes less often, with Alexander to discuss ideas. While Alexander does not have a Ph.D., he is excellent at critical thinking. I don’t think it is necessary to discuss with someone who has a lot of first-author publications or many years of supervisor experience. The most important part is to talk with someone who can question your work. In the end, we are likely all lazy, and as such, we will be blind to faults in our own work. Just discussing your paper with someone puts much greater pressure on you not to take any unintended shortcuts.
- I did side-projects. Allocating all your time to one piece of work is too high a risk. Spend some time doing other, smaller projects that you think are useful. Write an open-source tool, implement a known paper, etc. It is helpful to take a break from research, and if your research project fails, you have at least accomplished something. In my case, having some of these side-projects recognized by known researchers was also a great source of encouragement.
Writing a great paper
With only about 20% of submissions getting accepted and your peer-reviewers looking for any excuse to reject your submission, “good enough” is not enough; you need to do “great”! However, you don’t have any supervisor to help you, and you have never submitted before, so how do you actually accomplish “greatness”?
In my case, my first publication was in the distill.pub journal. In retrospect, that was very fortunate. Distill cares a lot more about writing well, to explain and educate, than about getting past peer-reviewers who may themselves not be good writers anyway. For me, writing to explain and educate is a lot easier than writing to please anonymous peer-reviewers, so this was a good start.
However, my first submission to Distill was rejected by the editors! They were confused about what my contribution was. Was it a criticism of Nested LSTM? Was it proposing a new NLP task, “autocomplete”? Or was it the interactive visualization?
There will be 1–2 messages in a paper that, if misunderstood, will completely confuse the reader and be the first cause of getting rejected. Do not be afraid of repeating a message to prevent that.
In my Distill publication, that message was: “Visualization can give critical insight into models that an accuracy measure cannot. However, you also need a problem that anyone can have an intuition about; Chinese Poetry Generation is probably not such a problem.”
In the ICLR publication, it was: “Gating between heterogeneous units is much, much harder than it appears, but other issues need to be solved before solving gating, so we consider gating future work.”
Just writing those messages won’t cut it; they need to be expressed throughout the paper such that even a lazy peer-reviewer will get it.
I submitted to Distill again, with significant changes, and this time they were more open. Chris Olah and Ludwig Schubert, from Distill, were then super helpful in providing feedback before I submitted for peer-review. I’m not sure I could have made it to ICLR without what I learned from them. You may also want to read novelist Cormac McCarthy’s tips on how to write a great science paper; they cover some of the other things I learned reasonably well.
Finally, I want to mention that I spent some long days with Alexander polishing the paper. Especially when it came to getting the abstract and introduction done, his help was super valuable.
Rejected from NeurIPS 2019
Well, you read the subtitle. We submitted our paper to NeurIPS 2019 and got rejected. Wow, I felt very ill after that. All that time spent, and nothing! My dreams were shattered; there was almost no chance left for me to pursue my dream of becoming involved in ML research.
Having anonymous peer-reviewers hold your life in their hands is so strange.
So why did we get rejected? I would summarize it into:
- Some reviewers did not believe we successfully reproduced the results from NALU, the paper we proposed improvements to. “Why are the results in the original paper much better than the results in your proposal,” was a recurring question.
- Some reviewers required our proposal to do everything that the NALU paper claims to do, even though we provided clear evidence that the NALU model didn’t do those things to any satisfying degree. If you read the papers, it is division and gating between addition/multiplication that we did not solve. However, we did improve everything else.
The second point also came down to some reviewers not believing in our results and reproduction. This is the hard part when improving others’ exaggerated results.
Reviewers are likely to side with already published results. They will only think critically about your submission, not about previous publications, especially if they come from DeepMind.
I want to clarify that the results from the DeepMind paper NALU are not fake. They are entirely reproducible. However, the results are not framed in the most sensible way for extrapolation tasks, which was the main objective, and this makes the model look better than it is at first and second read-through (you have to read the results really carefully). Also, the NALU paper only shows results from a single seed, while our paper shows results for 100 seeds. We have a workshop paper on just these issues.
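To illustrate why the number of seeds matters, here is a minimal sketch in Python. It is not the evaluation code from either paper: `simulate_run` is a hypothetical stand-in for training one model, and the 60% success rate is an arbitrary assumption. The Wilson score interval is one common choice for a binomial proportion; other intervals would make the same point.

```python
import math
import random


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half


def simulate_run(seed, true_success_rate=0.6):
    """Hypothetical stand-in for one training run: True if the run 'succeeds'."""
    return random.Random(seed).random() < true_success_rate


def success_rate(seeds):
    """Return (observed rate, lower bound, upper bound) over the given seeds."""
    outcomes = [simulate_run(seed) for seed in seeds]
    successes = sum(outcomes)
    low, high = wilson_interval(successes, len(outcomes))
    return successes / len(outcomes), low, high


single = success_rate([0])         # one seed: the rate is either 0.0 or 1.0
many = success_rate(range(100))    # 100 seeds: a much tighter interval

print("1 seed:    rate=%.2f, interval=[%.2f, %.2f]" % single)
print("100 seeds: rate=%.2f, interval=[%.2f, %.2f]" % many)
```

With a single seed the observed rate can only be 0 or 1, and the interval spans most of the possible range, so the result says very little about how reliably a model converges.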
Submitting to ICLR, what was different
We had already made several improvements to the paper, up to the NeurIPS 2019 rebuttal. And for ICLR, we added even more evidence and experiments to support our claims.
One extra smart thing we did was to publish our experimental setup and reproduction results of NALU as a workshop paper in the SEDL workshop at NeurIPS 2019. We announced it on Twitter and tagged the NALU first author, A. Trask, who responded with “Great work! We can’t improve without good benchmarks.” This helped because we didn’t have to argue for both the experimental setup and our new model. Instead, we could just focus on arguing for our proposed model.
I wish I could tell you that our changes made a big difference. However, I believe we were mostly just lucky to get an outstanding reviewer. Also, the discussion happening on OpenReview encouraged better discussion, more critical thinking, and less abusive commentary, as the comments were not secret.
In particular, we got 4 reviews, which, to me, indicates that our area chair was quite engaged. One of our reviewers had also reviewed our paper at NeurIPS, probably reviewer #3, who was the most constructive reviewer of our NeurIPS submission. After all of our changes, that reviewer went from Weak Reject at NeurIPS to Weak Accept at ICLR, and finally to Accept at ICLR, and even responded to other reviewers’ idea that “the contributions presented in this paper are too incremental.”
“I understand other reviewers’ concerns that the model presented in this paper is incremental, but I don’t see the strength of this paper to be just the model itself but the whole informed theoretical + experimental analysis leading to improvements of the model plus the open code which is there to stay, as opposed to the original paper. This paper nicely reads as a try to work with a recently presented models, the failure of the presented model and then a detailed process of analyzing and fixing the model and the benchmarks. The paper directly confronts the reproducibility issue with the original model, and improves drastically upon it. That is why I think this paper should definitely be accepted.
I’m not sure whether the presented model will make a big change in the area, but the approach might influence and inspire other researchers to do more thorough analyses.
Consequently, I’m increasing my score to accept.” — Reviewer #4 at ICLR
Reading that made me very happy. At the time, there were still two Weak Rejects, so rejection remained a possibility. However, even if we had been rejected, I would at least have felt that it was not my failure, but rather a failure of the review process.
Worth the effort?
In the end, we got published. I wish I could say for sure that it is going to help me get a position where I’m involved in research, but honestly, I don’t know yet. I just saw an email saying you need “2 top-venue publications, preferably with famous co-authors, to get into a top Ph.D. program”, which is something I will never achieve as an Independent Researcher. Let’s hope it is just that university.
Q & A
The above didn’t answer all the questions I received, as I don’t think that would have been very engaging. So here are answers to the questions that didn’t make it.
- Q: How did you make such beautiful plots? A: I use the R library ggplot2 for all my plots; that library just makes me happy. I don’t do much else in R. I only export to a CSV file from Python, import it in R, then use R for calculating confidence intervals and plotting.
- Q: Your work is useless, and you are a joke. A: Thanks. Have you considered becoming a reviewer?
- Q: How much time did you spend? A: I spent about 48h/week just on this publication. Some weeks close to 100h/week, others much less. Remember, I also did side-projects and freelancing.
- Q: Where did you get the computing resources from? A: Alexander was able to provide us with that because he was a Research Assistant.
- Q: I was offered a Ph.D. by my supervisor, should I take it? A: If you really want to research, yes, probably. The competition is crazy now, I think you should take what you can get.
- Q: I’m doing an internship, but I’m not getting anything from it because my supervisor is absent. A: Take charge of your own fate. Don’t expect your supervisor to come to you. Be happy you got an internship; I couldn’t. Start to arrange meetings, and remember there are more people you can ask than just your supervisor.
- Q: I’m doing my Master’s, how can I prepare myself? A: If you can find a professor who is open to it, try to publish. Also, try to get internships while you study. Most internships are only available to those who study; I was offered internships I couldn’t take because I had graduated.
- Q: How do I become a better programmer? A: For me, it was writing open-source for many years. This allowed me to be mentored by some of the most exceptional programmers I know of.
- Q: What extra does it take to get a spotlight? A: To be honest, I think it is mostly just luck. But it sounds good, so I put it in the title.
- Q: I have seen others become a researcher at Google, with just a Master’s degree. How did they do it? A: Yeah, the golden years were 2013–2015. If you finished your Master’s there and got lucky, you could get really far.
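The plotting workflow from the first answer above starts with a CSV export from Python. A minimal sketch of that step could look like the following. The column names (`model`, `seed`, `error`) and the example values are hypothetical placeholders, not the actual data or schema from the paper.

```python
import csv
import io

# Hypothetical column layout: one row per (model, seed) pair.
FIELDNAMES = ["model", "seed", "error"]


def write_results(rows, handle):
    """Write one CSV row per training run to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder values, only there to make the sketch runnable.
rows = [
    {"model": "baseline", "seed": 0, "error": 0.12},
    {"model": "baseline", "seed": 1, "error": 0.15},
    {"model": "proposed", "seed": 0, "error": 0.03},
]

buffer = io.StringIO()
write_results(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, open the file with `open("results.csv", "w", newline="")` and pass that handle to `write_results`. Keeping one row per run, rather than pre-aggregated means, leaves the confidence-interval calculation to the plotting side.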
I want to finish by just repeating my main advice for other Independent Researchers.
- Collaborate with someone. This does not have to be an expert in writing papers, just someone who can constructively criticize your work. Finding someone at a university who has access to large computational resources will most likely be a necessity.
- Expect failure, but try anyway. Only about 20% of submissions get accepted. As an independent researcher, your chances are likely lower than average. Don’t be discouraged just because you get rejected once. And do side-projects, such that if you do fail, it doesn’t feel like a complete waste of time.
- Avoid it. It is hard to think of a reason to become an Independent Researcher if you have an alternative. Maybe your MSc supervisor was often absent, but at least there will be other Ph.D. students to talk to. As an Independent Researcher, you are missing out on a critical support network.
For the researchers, who may also be reading along
Finally, I did go to NeurIPS 2019, because we got a workshop paper accepted. There I had the chance to talk to several recruiters, professors, and researchers. I was shocked! There is a gigantic gap between what recruiters require, researchers want, and professors provide. I want researchers to know that over the last 2 years, the landscape has completely changed. Just getting into a Ph.D. program is now likely harder than completing a Ph.D.
If you finished your Master’s in 2017 or earlier, getting into a good PhD program was achievable. Now it requires 1–2 publications in top venues (NeurIPS/ICLR/ICML), preferably with famous co-authors. I hope researchers, professors, committee members, and conference organizers will help to stop this new elitism that is rapidly developing! It only amplifies the existing biases that are already challenging the industry.