Building Digital Public Goods

Janhavi Lande
4 min read · Aug 15, 2022

Code4GovTech | Learning Machine Learning Operations

Contributing to the Open Source system

In the summer of 2022 I got an opportunity to contribute to government digital public goods as a part of Code4GovTech. Code for GovTech (C4GT) is an annual summer coding program that aims to create a community that can build and contribute to global Digital Public Goods. I was a contributor to the Machine Learning Platform project. The project broadly involved building a feature store, model training pipelines, a deployment service, model monitoring and a prediction service.

Machine Learning Platform Brief

Code4GovTech Timeline

C4GT ’22 accepts contributor applications from mid-May until June.

  • Proposal Submission Period (20th May 2022 to 10th June 2022): Understand the requirements of the project by breaking it down into its components, with the frameworks and libraries for each. Block diagrams give your submission a more graphic and easy-to-understand flow.
  • Proposal Review Period (10th June to 17th June): Proposals are reviewed and shortlisted candidates are invited for a 1:1 session with project mentors.
  • Announcement of Shortlisted Contributors (18th June): Selected contributors are informed. This year a total of 13 contributors out of 400+ candidates were selected.
  • Coding Period (June 21st, 2022 to August 5th, 2022): Contributors are allotted mentors and the coding period commences.
    - Midpoint Evaluation: Present and demonstrate the code, milestones achieved, a working demo of the project, documentation, and the frameworks you have worked on. The Midpoint Evaluation of C4GT 2022 was on 10th July 2022 in the presence of eminent panelists Mr. Jagadish Babu (COO, EkStep), Mr. Manish Srivastava (CTO, eGov Foundation), and Mr. Ravi Prakash (Head of Architecture, Beckn Foundation).
    - Endpoint Evaluation: The final presentation is scheduled to be held on August 17, 2022 in the presence of Gandhali Samant (Director, Enterprise Advocacy at GitHub), Kiran Anandampillai (Technology Advisor at National Health Authority (NHA)), and Puneet Joshi (Principal Architect, Modular Open Source Identity Platform (MOSIP)).

Experience

As glad as I was to contribute to government use cases that will impact millions, if not hundreds of millions, I was equally excited to learn Machine Learning Operations. My fellow contributor on the Machine Learning Platform project was Pratyaksh Singh, and we worked under our mentor Ashish Yadav. The first couple of weeks were spent building the framework, researching compatible libraries for each component, and integrating them seamlessly. I worked on building a feature store and integrating it with the cloud, TFX pipelines and APIs.

Through the entire experience I was able to learn Feast (an open source feature store library), BigQuery, Amazon Web Services, Amazon SageMaker, Google Cloud Platform and the Hopsworks open source library.

For each dataset, I worked through the full flow: data ingestion into a feature store, data cleaning, data preprocessing, building feature groups, conversion into feature views, feature engineering, and integration with common cloud services such as AWS, Azure and GCP. Along the way I was able to learn the various integrations involved.
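To make this flow concrete, here is a minimal in-memory sketch in plain Python. It does not use the Hopsworks or Feast APIs: `FeatureGroup`, `FeatureView`, `clean` and the sample columns are hypothetical names chosen only for illustration, and a real feature store adds versioning, storage and cloud connectors on top of this idea.

```python
from dataclasses import dataclass, field


def clean(rows):
    """Data cleaning: drop rows that have any missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


@dataclass
class FeatureGroup:
    """A named table of features indexed by a primary key (hypothetical)."""
    name: str
    primary_key: str
    rows: dict = field(default_factory=dict)

    def insert(self, records):
        # Ingestion: store each record under its primary-key value.
        for rec in records:
            self.rows[rec[self.primary_key]] = dict(rec)


@dataclass
class FeatureView:
    """A selection of features drawn from one or more feature groups."""
    name: str
    sources: list  # list of (FeatureGroup, [feature names]) pairs

    def get_feature_vector(self, key):
        # Join the selected features of every group on the shared key.
        vector = {}
        for group, features in self.sources:
            row = group.rows[key]
            for f in features:
                vector[f] = row[f]
        return vector


raw = [
    {"user_id": 1, "age": 30, "clicks": 12, "visits": 4},
    {"user_id": 2, "age": None, "clicks": 5, "visits": 5},  # dropped by clean()
    {"user_id": 3, "age": 41, "clicks": 9, "visits": 3},
]

# Feature engineering: derive clicks per visit from the cleaned rows.
cleaned = clean(raw)
engineered = [
    {"user_id": r["user_id"], "clicks_per_visit": r["clicks"] / r["visits"]}
    for r in cleaned
]

profile = FeatureGroup("profile", "user_id")
profile.insert(cleaned)
activity = FeatureGroup("activity", "user_id")
activity.insert(engineered)

view = FeatureView(
    "user_view", [(profile, ["age"]), (activity, ["clicks_per_visit"])]
)
print(view.get_feature_vector(1))
```

The feature view only selects and joins; the cleaning and engineering happen before the data reaches a feature group, which mirrors the order of steps listed above.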

Basic Framework

The feature store has been built with the Hopsworks open source library and can be easily integrated with AWS and Azure to scale execution and usage. The training, deployment and querying services were built with TensorFlow Extended (TFX), and the API for each service was built with Django and Django REST Framework.
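As a rough illustration of how a querying service could sit between the feature store and a trained model, here is a small sketch using only the Python standard library. It is not the project's Django code: `ONLINE_FEATURES`, `model_predict` and `handle_predict` are hypothetical stand-ins, and the fixed linear score simply takes the place of a real TFX-trained model.

```python
import json

# Hypothetical stand-in for online features served by the feature store.
ONLINE_FEATURES = {1: {"age": 30, "clicks_per_visit": 3.0}}


def model_predict(features):
    """Stand-in for a trained model: a fixed linear score for illustration."""
    return 0.01 * features["age"] + 0.1 * features["clicks_per_visit"]


def handle_predict(request_body):
    """Mimic a prediction endpoint: parse JSON, look up features, score."""
    payload = json.loads(request_body)
    features = ONLINE_FEATURES.get(payload["user_id"])
    if features is None:
        # Unknown entity: report it instead of guessing feature values.
        return 404, json.dumps({"error": "unknown user_id"})
    score = round(model_predict(features), 3)
    return 200, json.dumps({"prediction": score})


status, body = handle_predict(json.dumps({"user_id": 1}))
print(status, body)
```

In a Django REST Framework service the same three steps (parse the request, fetch features, call the model) would live inside a view, with the framework handling routing and serialization.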

In order to keep milestones in check, C4GT also has an internal website and a weekly check-in with mentors for PR merges and code review.

The midpoint evaluation and detailed feedback from the government tech leads and mentors proved extremely beneficial. Moreover, the panelists encouraged me to connect with Bhashini, an NLU-based government project that aims to “Harness natural language technologies to enable a diverse ecosystem of contributors, partnering entities and citizens for the purpose of transcending language barriers, thereby ensuring digital inclusion and digital empowerment in an AatmaNirbhar Bharat”, which I look forward to contributing to soon.

After the midpoint evaluation, contributors were also encouraged to run peer-to-peer sessions on any skill or technology they are familiar with, such as NextJS, GraphQL, and Data Structures/Algorithms. I am truly grateful for everything I was able to take away from these sessions.

What made my experience wonderful was getting in touch with like-minded contributors, my mentors and prominent people in the field of GovTech. I found this programme to be extremely active and engaging throughout my contribution period. I learnt various open source technologies in Machine Learning Operations and pipelines, the importance of good documentation, writing clean code, and following an agile scrum methodology to secure each milestone.

How to apply?

If you want to contribute to GovTech use cases and projects, Code4GovTech is a great opportunity to get started! You can contribute to their use cases here. Join their active Discord community here. Moreover, you can also be a mentor/contributor to various other projects through their campus programme 2022. Apply here!

Machine Learning Platform Resources:

The GitHub repository for the Machine Learning Platform project 2022 can be found here, along with its documentation.

For any queries, please feel free to reach out to me: Janhavi Lande. Thank you for reading!
