Google Summer Of Code 2019 with CloudCV

Khalid Riyaz
6 min read · Aug 26, 2019


I am Khalid Riyaz (GitHub: @KhalidRmb), an undergraduate at BITS Pilani, Hyderabad, selected for GSoC ’19 with CloudCV for the project “Implement Robust Evaluation Pipeline in EvalAI”. It has been a very productive and exciting three months, during which I got to work with an amazing community while greatly expanding my skills.

A big thanks to my mentors Rishabh Jain (GitHub: @RishabhJain2018), Ram Ramrakhya (GitHub: @Ram81) and Deshraj Yadav (GitHub: @deshraj) for their constant support and guidance throughout the summer.

About CloudCV and EvalAI:

CloudCV organization logo.

CloudCV is an open-source cloud platform that aims to make AI research more reproducible. CloudCV actively builds tools that enable researchers to build, compare and share state-of-the-art algorithms. EvalAI, a CloudCV project, is an open-source platform for evaluating and comparing Machine Learning (ML) and Artificial Intelligence (AI) algorithms at scale. Some of the features it offers:

  • Custom evaluation protocols and phases
  • Remote evaluation
  • CLI support
  • Portability
  • Evaluation inside custom environments

For more information, refer to EvalAI.

Project:

The project was divided into three phases, each dealing with a different aspect of EvalAI, preceded by an initial community bonding period.

Over the course of the project, I worked extensively with REST APIs and AWS. The stack was mainly Python, Django, AngularJS and Docker.

The bonding period doubled as a learning period in which I got in-depth exposure to the codebase while solving various issues and writing unit and integration tests for EvalAI’s evaluation workers.

Phase 1:

My first phase mainly involved migrating the evaluation worker service from EC2 to AWS Fargate and, over this and the next phase, building various management features for the containers, operated through the Django admin dashboard.

Using boto3 (the Python SDK for AWS), I implemented features to start and stop the evaluation worker services on Fargate (with each container configured for a particular challenge) with a single click from the Django admin dashboard. Alongside this, I also implemented horizontal scaling of the worker services to deal with increased load and traffic during busy periods of a challenge.
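
To give a flavour of what these management commands do under the hood, here is a minimal sketch using boto3’s ECS client. The cluster and service names are made up for illustration and are not EvalAI’s actual values.

```python
# Minimal sketch: start/stop/scale a per-challenge Fargate worker service.
# Cluster and service names are hypothetical, chosen for illustration.
import boto3

ecs = boto3.client("ecs")

CLUSTER = "evalai-workers"  # hypothetical cluster name


def service_name(challenge_pk):
    # Assumes one Fargate service per challenge, e.g. "worker-challenge-42".
    return "worker-challenge-{}".format(challenge_pk)


def start_worker(challenge_pk):
    # "Starting" a Fargate service means raising its desired task count to 1.
    return ecs.update_service(
        cluster=CLUSTER, service=service_name(challenge_pk), desiredCount=1
    )


def stop_worker(challenge_pk):
    # Stopping is the inverse: scale the desired count down to 0.
    return ecs.update_service(
        cluster=CLUSTER, service=service_name(challenge_pk), desiredCount=0
    )


def scale_workers(challenge_pk, num_tasks):
    # Horizontal scaling: run several identical worker tasks during busy periods.
    return ecs.update_service(
        cluster=CLUSTER, service=service_name(challenge_pk), desiredCount=num_tasks
    )
```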

Phase 2:

I continued to work on the evaluation pipeline, adding the following features:

  • Restarting the containers manually through the admin dashboard, along with automatic restarting whenever the evaluation specifications or the evaluation script change in the database (sketched below, after this list).
  • Stopping the workers on demand.
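
A rough sketch of those restart mechanics, assuming the same boto3 ECS setup as in the earlier snippet; the Django model and its import path are hypothetical, not EvalAI’s exact code.

```python
# Sketch: restart a challenge's worker, manually or on a database change.
import boto3
from django.db.models.signals import post_save
from django.dispatch import receiver

from challenges.models import Challenge  # hypothetical import path

ecs = boto3.client("ecs")


def restart_worker(challenge_pk):
    # forceNewDeployment makes ECS replace the running tasks with fresh ones,
    # which is effectively a restart of the worker containers.
    return ecs.update_service(
        cluster="evalai-workers",  # hypothetical cluster name
        service="worker-challenge-{}".format(challenge_pk),
        forceNewDeployment=True,
    )


@receiver(post_save, sender=Challenge)
def restart_on_spec_change(sender, instance, created, **kwargs):
    # When a challenge's evaluation script or specs are edited in the DB,
    # restart its worker so the container picks up the new version.
    if not created:
        restart_worker(instance.pk)
```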

In parallel, I continued refining the features from the previous phase to handle various corner cases and additional scenarios after reviews by my mentors, along with tests for all the features.

  • Another task I finished was a feature to resend the verification email to new users, throttled using memcached.
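
The throttling idea looks roughly like this, sketched with Django’s cache framework (which can be backed by memcached) and Django REST Framework; the view, helper function and timeout below are illustrative, not the exact EvalAI implementation.

```python
# Sketch: throttle resend-verification-email requests via the Django cache.
from django.core.cache import cache  # e.g. backed by memcached in settings
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response

RESEND_TIMEOUT = 60 * 5  # illustrative: allow one resend per user every 5 minutes


@api_view(["POST"])
def resend_verification_email(request):
    key = "resend-email-{}".format(request.user.pk)
    if cache.get(key):
        # A resend was requested recently; reject this one.
        return Response(
            {"detail": "Please wait before requesting another email."},
            status=status.HTTP_429_TOO_MANY_REQUESTS,
        )
    send_verification_email(request.user)  # hypothetical helper
    cache.set(key, True, timeout=RESEND_TIMEOUT)
    return Response({"detail": "Verification email sent."})
```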

Link to my work during Phases 1 & 2:

Demo: Management commands for the evaluation worker service on AWS Fargate.
  • Pull Requests:

Final Phase:

This phase got me more involved with other parts of the codebase, including EvalAI’s CLI (command-line interface) tool.

I added 4 main features to EvalAI:

  • Custom requirements for challenges: Challenge hosts want custom environments, and consequently custom requirements, in which to evaluate submissions. I added a feature so that hosts can send EvalAI their custom requirements and we deploy the challenge-specific workers accordingly.
  • Private annotation files: For remote challenges, hosts may not want to share their evaluation metrics or annotation files with EvalAI. I implemented a feature that lets hosts load the annotation files into the workers themselves when creating the challenge, without the files ever entering the EvalAI database.
  • Local challenge config file verification: To create a challenge, hosts upload a config file to EvalAI. Through EvalAI-CLI, I added a feature to verify the file locally for errors and warnings before the challenge is created, so the whole process goes smoothly (a rough sketch of this check follows the list).
  • Database seeding: Lastly, a feature to seed the database with sample submission and leaderboard data when first starting the EvalAI service locally in the dev environment, to give users an overview.
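
To illustrate the local verification idea, here is a minimal standalone sketch; the required keys are placeholders and not EvalAI’s full config schema.

```python
# Sketch: validate a challenge config YAML locally before uploading it.
# The required keys below are illustrative, not EvalAI's actual schema.
import yaml

REQUIRED_KEYS = ["title", "description", "evaluation_script", "challenge_phases"]


def validate_config(path):
    errors, warnings = [], []
    try:
        with open(path) as f:
            config = yaml.safe_load(f) or {}
    except yaml.YAMLError as e:
        return ["Invalid YAML: {}".format(e)], warnings
    for key in REQUIRED_KEYS:
        if key not in config:
            errors.append("Missing required field: {}".format(key))
    if not config.get("challenge_phases"):
        warnings.append("No challenge phases defined.")
    return errors, warnings


if __name__ == "__main__":
    errors, warnings = validate_config("challenge_config.yaml")
    for w in warnings:
        print("WARNING:", w)
    for e in errors:
        print("ERROR:", e)
    if not errors:
        print("Config looks good; safe to create the challenge.")
```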

Here’s a link to my work during Phase 3:

Future:

  • Deploy the features for worker management on Fargate to production.
  • Write tests for the config verification feature (evalai-cli PR #170 & EvalAI PR #2430).

Conclusion:

Major tools & libraries used.

My key takeaways:

  • Learnt about the various tools involved.
  • Understood effective coding practices on large-scale projects while integrating various components.
  • Learnt how to break down a complex problem and solve it effectively.

I am grateful that I got to work with such a great community; this was without doubt one of my best learning experiences. It was a pleasure working under my mentors, and I would gladly work with CloudCV again on more exciting things.

Thanks!
