The GSoC Selection Experience

Kush Kothari
Developer Students Club, VJTI
5 min read · Jun 8, 2021

Open source sounds scary to every newcomer. I can’t count the number of times I reread my first GitHub issue, about a typo in the docs, before submitting it. For a student who barely has any confidence in their abilities, contributing to open source seems like a huge task. That’s where Google Summer of Code (GSoC) comes in: a perfect opportunity for any student looking to dive into the world of open source.

But what even is Open Source?

In general, open source is source code that is made freely available to use, redistribute, and modify. You can see the very code that makes up a system. But to me, it is much more than that. Open source is a community of some of the most creative coders, building software that makes life much easier for themselves and thousands of others. Software for the people, by the people.

Think of open source as a big machine that takes in budding ideas and coding talent, and churns out some of the best solutions to real-world problems. Want to help improve scientific research through code? You have awesome organizations like CERN or Open Chemistry. Want to work on deep learning? mlpack and TensorFlow are at your service. And to add to that, the organizations selected by Google for GSoC are just a small subset of the large open-source ecosystem! There’s something for everyone, from contributing to documentation to making recipes, videos, and more. You are contributing to help, and every skill counts!

The GSoC selection experience and everything leading up to it

I made my first open-source contributions at VJTI. My first pull request (PR) was to the Community of Coders (CoC) website. It gave me a firm understanding of how to communicate with the maintainers of open-source projects. It also taught me how to ask effective questions about the code changes I was going to make. These first few steps were crucial in getting me used to the larger codebases of large organizations, run by strangers I had never met.

The Initial Trouble

Once the GSoC organizations (orgs) were announced in March, I started hunting for orgs with projects that used a tech stack I was familiar with. But this seemed to be a Herculean task. I was not confident about the skills I had, and I found it difficult to find the most appropriate organizations. The projects that I did find, with tech stacks I knew, had a huge number of people already competing for them.

I tried out multiple orgs in this period, like EOS icons (under the Python Software Foundation (PSF)), FURY (also under PSF), and AutSPACEs (under INCF), but each one either was not my cup of tea or had too many people involved. You can google these orgs if you want to! You never know, maybe you’d find something I couldn’t :).

The GSoC mindset

After consulting other people, I finally realized what was wrong with my approach. It was the mindset.

So far, I had only been looking for projects that used technologies I had already worked with. I was ignoring every new opportunity to learn something because I was scared of exploring domains that were slightly out of my comfort zone.

The Google Summer of Code program has always been a way for students to improve their skills and get familiar with open source. Yes, proposing something with technologies you are slightly new to is not an easy task. You will have many doubts along the way. That’s when the next part of the mindset comes along… communication.

Communication

Communication is super important in every open-source project. A viable project can be delivered only if the contributors communicate properly. I realized something during the proposal and selection period: all the doubts I had about a particular topic could be answered very easily by asking the right questions in the IRC/chat of the project itself, or of the library/tool that was giving the error.

I tried to stay in regular contact with my mentor and to raise my doubts promptly whenever I had them.

My experience with writing a proposal for Data Retriever

As my seniors suggested, a good GSoC project also helps you grow your skills by teaching you new things.

I had experience with Python, Django, and PostgreSQL. While writing the proposal I also learnt a lot about the tools required for spatial data analysis with vector or raster data. Whenever a question came up while writing, I made sure to google it or ask about it promptly.

I am glad I selected this project, because it provides an excellent opportunity to learn and grow through the summer.

What is Data Retriever? And what is my proposal about?

Data Retriever is a sub-org under the NumFOCUS group. It is a Python library that automates the first steps in the data analysis pipeline by downloading, cleaning, and standardizing datasets, and importing them into relational databases, flat files, or programming languages. Automating this process cuts the time a user needs to get most large datasets up and running by hours, and in some cases days.

There is a project under Retriever called the Retrieverdashboard. The Retrieverdashboard helps the team make sure all datasets are downloading. Tabular datasets are additionally tested to check that they install into SQLite.

My proposal aims to extend this dashboard so that it also supports downloading and installing spatial datasets into a PostgreSQL database that has the PostGIS extension enabled.
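To give a feel for what "tested to check if they are installed into SQLite" can mean, here is a minimal sketch using only Python's standard library. It is not the dashboard's actual code: the function names `install_csv_into_sqlite` and `check_installed`, the sample table, and the row-count check are all my own assumptions for illustration.

```python
import csv
import io
import sqlite3


def install_csv_into_sqlite(conn, table_name, csv_text):
    """Load CSV text into a new SQLite table and return the number of data rows.

    Hypothetical helper: table and column names are assumed to be trusted.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', data)
    conn.commit()
    return len(data)


def check_installed(conn, table_name, expected_rows):
    """Return True if the table exists and holds the expected number of rows."""
    found = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    if found is None:
        return False
    (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()
    return count == expected_rows


if __name__ == "__main__":
    sample = "sepal_length,species\n5.1,setosa\n4.9,setosa\n"
    connection = sqlite3.connect(":memory:")
    installed = install_csv_into_sqlite(connection, "iris_sample", sample)
    print(check_installed(connection, "iris_sample", installed))
```

A real check would probably compare more than the row count, but the idea is the same: install, then query the database to confirm the data actually landed.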
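For the spatial case, a similar check could look roughly like the sketch below. It is only an illustration under stated assumptions, not the code from my proposal: `spatial_install_ok` is a hypothetical name, the cursor is assumed to be a DB-API cursor using the `%s` parameter style (as psycopg2 does), and only vector tables are covered, by looking them up in PostGIS's `geometry_columns` view. Raster data would need its own check, which is omitted here.

```python
def spatial_install_ok(cursor, table_name):
    """Return True if PostGIS is enabled and the table is registered as a vector table.

    Hypothetical helper; `cursor` is assumed to be a DB-API cursor
    connected to a PostgreSQL database.
    """
    # pg_extension lists the extensions installed in the current database.
    cursor.execute("SELECT 1 FROM pg_extension WHERE extname = 'postgis'")
    if cursor.fetchone() is None:
        return False

    # geometry_columns is a PostGIS view with one row per geometry column.
    cursor.execute(
        "SELECT 1 FROM geometry_columns WHERE f_table_name = %s",
        (table_name,),
    )
    return cursor.fetchone() is not None
```

With psycopg2, this would be called with something like `spatial_install_ok(connection.cursor(), "my_table")` after the dataset has been installed, where `my_table` is a placeholder name.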

What after the proposal?

Selection for GSoC is based not only on the proposal but also on how well you can provide a proof of concept for it. In my case, that meant writing scripts similar to the dataset testing pipeline that I am going to implement.

What now?

Now that the coding period has begun, I am hoping to complete the aims of my proposal one by one and to provide appropriate documentation for the code I add. I will blog regularly about the things I code and test, and I hope I will get a chance to learn new and exciting things!
