Of (coding) sprints and (half) marathons: how I learned to give it all & to pace myself.
Personal thoughts following the 2019 Bay Area scikit-learn sprint
Not many happy stories start with an email on a Monday morning.
This is the exception.
Last February, a friend, fellow neuroscientist, and open source contributor, @eickenberg, asked me whether I was interested in helping WiMLDS organize a scikit-learn sprint in the Bay Area.
Fast forward to Saturday, November the 2nd. After months of careful planning skillfully orchestrated by Reshama Shaikh, we are finally here: the sprint is happening! It was a great success, both objectively (So. Many. PRs.) and subjectively: you can read all about it, directly from the participants' voices, here. The day after, still a bit hungover from the celebratory drinks that ensued, I put on my running shoes and went out for my weekly long-distance training.

All this might sound rather trivial unless you consider that just a few years ago, when I started my Ph.D., I could not run more than 2 miles, I could not code in any language, and I had no idea of what hackathons were.

Yet here I am. Grad school is over, I’m training for my second half-marathon, and I’m managing those nagging imposter symptoms well enough to wear a t-shirt that consecrates me as a member of the open-source community.
Who would have thought?
Scikit-Learn and the democratization of machine learning
The first time I asked an expert, @fabianp, what machine learning was, and whether what I was doing with (to?) my neuroimaging data was indeed machine learning, the answer was:
whatever you need to import scikit-learn for
For me, it all happened at the same time: learning to code in Python, learning to fit an SVM, learning the scikit-learn API. It was a steep mountain to climb, I won’t lie. And the results are likely suboptimal - do learn proper coding and basic stats early in life if you are serious about this thing called science! Yet I made it, and it was all thanks to the great vision and hard work of a herd of French nerds led by Gaël Varoquaux. He said it best here.
Open-source software & coding sprints: everyone can fit
I do interdisciplinary research, which first and foremost means I constantly feel like an outsider: I’m definitely not a computer scientist, but I’m also not a real neuroscientist nor a true psychologist. How did I get trapped in the open-source community? Honestly, I came for the software, I stayed for the people. I needed the tool to conduct my research, but I stayed involved because I connected. With the people. And their philosophy.
Here comes the magic: in principle, everyone can contribute. Writing code, testing features, debugging, writing documentation. And yes, also planning and organizing coding sprints. This delicate ecosystem needs the full package, which means no one can do it all, but anyone can do some. Everyone can fit.
In practice, not all environments are diverse and inclusive enough to make everyone feel welcome, and the above-mentioned steep learning curve can be daunting for many. Consistency is the key. Keep knocking at the doors that seem impenetrable. Keep stackoverflowing the error message. Keep putting one foot after the other, and you’ll run past the finish line.
The long run: know your limits…to push past them
Becoming a runner, you learn a lot about yourself. About the inevitable weaknesses and surprising strengths of your body and your mind. Day by day, you discover how to push yourself while honoring all the signals you weren’t even able to decode. Now I need a break, now I need more fuel. Today I’ll run a bit faster, a bit longer.
Open-source communities face similar challenges. Sustainable, scalable growth requires acknowledging limitations and roadblocks. Funding is never enough: you need to be constantly on the lookout and collect from multiple sources. Knowledge transfer is hampered by a high rate of turnover among volunteers: some may sprint with you one day, then never contribute to the software again.
Since 2018, scikit-learn has been backed by a foundation, a vital step to provide structure and support for its expansion. Among the staff members, there’s a new figure: a community and operations manager, Chiara Marmo. Hopefully, together with the post-sprint follow-ups Reshama has in mind, this addition will help to ensure that voices are heard, needs are met, and the whole community can get into its stride.

This is the story of how I’ve learned both to sprint & to run (half)marathons. Sometimes you have to give it all, pull an all-nighter, glue your butt to the chair. And just. Keep. Typing. Other times you have to pace yourself, wait for the right moment, the right idea. Slow and steady wins the race, ultimately. I am deeply grateful to all the advisors and (peer)mentors who helped me grow and reach this stage. And I am here to tell you that if I made it, you can make it too.
One thing I haven’t learned? To stop saying “yes!” to all the emails I receive on Monday mornings. Thus, I’m gonna cut this short and move on: the 2019 Bay Area Brainhack won’t organize itself!