Google Summer Of Code 2018 with GA4GH

Somesh Chaturvedi
May 13, 2018

Journey so far!

I got selected as a Google Summer of Code student with the Global Alliance for Genomics and Health for the project Reference Sequence Retrieval API. Results came out on April 23 at 09:30 P.M. IST. I couldn’t sleep that night out of excitement, although I had an exam on the 25th, which got messed up for obvious reasons :P

Here is the mandatory picture that I kept seeing in GSoC blogs. Every time, I used to wonder why everyone posts the same screenshot. Isn’t it boring or too mainstream? But now I understand why.

About GSoC

Google Summer of Code is a program which promotes open source contribution from students all over the globe. You get a three-month coding period, the chance to work with awesome people and great organizations that connect tech enthusiasts, scientists and leaders, and last but not least, a competitive stipend. I won’t take any more lines describing GSoC; you can read a lot about it on the website.

Journey to GSoC

It all started when my friend Gaurav, who was working with a professor from the University of Nebraska on a data visualisation project (for which he too got selected for GSoC), told me about his organization, mentors and project. That conversation ignited the fire in me which eventually led to this. Thanks, Gaurav. Here is Gaurav’s project. I didn’t plan for GSoC. I had even booked tickets for a hiking expedition to Roopkund in Uttarakhand, which I had to cancel, and I was already way too late for GSoC.

After the selected organizations were announced, it was time to choose wisely: I wanted an organization that would give me the opportunity to connect with awesome mentors and hone my technical skills. I finally settled on the Global Alliance for Genomics and Health (GA4GH). I talked to the mentors, went through multiple iterations of the project proposal and finally submitted it on 27th March, the very last date for proposal submission. The mentors were helpful and just as enthusiastic as I was.

Organization and Project

GA4GH is a policy-framing and technical standards-setting organization, seeking to enable responsible genomic data sharing within a human rights framework. It’s a huge organization with 500+ organizational members and 2000+ subscribers, spread across 71 countries. You can read about the organization’s mission, structure, members, current projects etc. in detail on its website.

Being a biotechnology undergrad, I have always wanted to work on a project which is an amalgamation of both biotech and computer technologies. My project, the Reference Sequence Retrieval API, is just that.

Overview of the project

In this project, I’ll be working on the compliance document and test suite for reference servers which provide genomics data via RESTful APIs, a Python client (along with a CLI) to talk to these servers, and finally an end-to-end compatibility matrix and test suite to test the server-client architecture.
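
To give a rough idea of what such a client and a single compliance-style check could look like, here is a minimal sketch. Everything in it is an assumption made for illustration and not the actual specification: the `/sequence/<id>` route, the `start`/`end` query parameters, the plain-text response and all the function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_sequence_url(base_url, sequence_id, start=None, end=None):
    """Build the URL for a hypothetical GET /sequence/<id> endpoint."""
    url = f"{base_url.rstrip('/')}/sequence/{sequence_id}"
    params = {}
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    if params:
        # Assumed query parameters for requesting a sub-sequence.
        url += "?" + urlencode(params)
    return url


def fetch_sequence(base_url, sequence_id, start=None, end=None, opener=urlopen):
    """Fetch a sequence (or sub-sequence) as plain text.

    `opener` is injectable so the client can be exercised without a live server.
    """
    request = Request(
        build_sequence_url(base_url, sequence_id, start, end),
        headers={"Accept": "text/plain"},  # assumed response format
    )
    with opener(request) as response:
        return response.read().decode("ascii")


def check_subsequence(fetch, sequence_id, full_sequence, start, end):
    """Compliance-style check: the returned slice must equal the expected slice.

    `fetch` is any callable taking (sequence_id, start, end) and returning a string.
    """
    return fetch(sequence_id, start, end) == full_sequence[start:end]
```

A real test suite would run many such checks against a server under test. Keeping the check independent of the transport (it only needs a `fetch` callable) means the same check could be reused for both the server tests and the end-to-end client tests.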

Next Steps

Now the hard part begins. From today, 14th May, the official coding period starts. It’s gonna be an exciting summer. I am planning to write a series of four blog posts, this one and three others, one after each monthly evaluation, so stay tuned for some exciting coding adventures. You can check this GitHub repository for regular updates.

Enough talk, start coding already.

"Talk is cheap. Show me the code." - Linus Torvalds

P.S.: I am not a regular blogger. Constructive criticism is welcome.
