Google Summer of Code 2023 Experience with OpenVINO™: Integration of Timm Models to Optimum-Intel

Sawradip · Published in OpenVINO-toolkit · 8 min read · Sep 7, 2023

Hello everyone! This is Sawradip. I am writing this blog at the end of Google Summer of Code 2023, about my experience with my GSoC project, Showcase performance of PyTorch Image Models (Timm) with OpenVINO. The OpenVINO toolkit is an open-source toolkit for optimizing and deploying AI inference, maintained by Intel. This will be a comprehensive guide to my whole journey, so that I can refer future GSoC enthusiasts to it.

At the Beginning

I first heard about Google Summer of Code (GSoC) 2–3 years back. I thought it was an amazing initiative, but I was also sure that cracking GSoC was totally out of my league. After all, what does an inexperienced, non-CS-background, first-year Mechanical Engineering coding enthusiast have to do with one of the most prestigious international open-source coding programs, organized by Google, where some of the competitors are full-time, CS-major, MSc/PhD-level programmers who have had several years to prepare and hone their development skills? But I always had a special spot for the open-source projects, like Linux and PyTorch, that I used on a daily basis.

Then, as time went on, I took coding (more precisely, Machine Learning) more seriously and planned to fully shift my focus to Data Science/Deep Learning in the future. I was reminded of Google Summer of Code 2023 at the beginning of February and found that the 2023 GSoC timelines had already been published. I just thought, why not give it a try? If you never try, you never know whether you have the capability for something or not.

I did some research, read blogs, and watched YouTube videos, and some steps I can suggest for the next GSoC enthusiasts are:

  • Read the official GSoC announcement details carefully. Every year some small details change about GSoC, so keep yourself informed.
  • Select the development domain you want to work in. Ask yourself and settle on your preferred language (e.g., Python, C++, Java) and domain (e.g., Web, Android, Deep Learning, Blockchain).
  • As you need to start preparing even before Google announces the approved organizations, your best bet is to shortlist (at most) 3 organizations that are in your preferred domain and have participated in GSoC for at least the last 3 consecutive years (because those are the ones with a higher chance of joining this year as well). You can use this amazing site for researching the history of organizations.
  • Visit the shortlisted organizations’ dedicated GSoC pages as well as their codebases (GitHub), and check whether you can understand and align with their goals.

As I had plans to work on a Machine Learning project using Python, I shortlisted two organizations, one of which was the OpenVINO toolkit. It is maintained by Intel and focuses on model optimization through various modern techniques, as well as inference on various types of devices, including CPUs, GPUs, and special DNN accelerators.

Approach for Communication

Once I visited the main toolkit repository, I was a little intimidated, as most of it was written in C++, which I did not have much experience with. However, on OpenVINO’s GSoC page from the previous year, I found a number of projects that mainly required Python. So I looked up what prerequisites they had set for participants the previous year.

Some suggestions for future applicants, once you have selected an organization, even before the official announcements:

  • Some organizations publish prerequisite tasks as well as guidance for participants, so try to find those. For example, we had to submit a PR converting some models to the OpenVINO IR format.
  • Sometimes, if you are too early and they haven’t updated the requirements yet, you can find the previous ones. Usual requirements include a preferred background, or a specific type of contribution to the codebase so that the mentors can evaluate the quality of your code.
  • Try to make yourself familiar with the maintainers of the project. You can write an introduction post explaining why you feel the organization’s goals align with yours.
  • You can suggest ideas for improvement if you feel something in the project can be enhanced. It does not matter whether they are accepted or not; you will get more chances to converse with the maintainers and show them your enthusiasm for the project.

Reach Out to Project Mentors

These organizations usually publish a list of expected projects, details about the requirements, and the mentors of the respective projects. Mentors are the ones who will play a prominent role in selecting you as well as guiding you throughout the project. So,

  • Once you are done with the organization’s common prerequisites, you should reach out to the mentors.
  • Some organizations give special guidance about interacting with mentors; if it exists, follow it. Otherwise, reach out and ask the mentors whether you can do additional tasks or solve an initial issue that will help them evaluate your skills and benefit the project.
  • If you have any pet projects related to the target domain, mention and discuss them with the mentors.
  • Try to keep communication with the mentors frequent, but do not annoy them.

Project Proposal

This is one of the most important steps for the contributor. Writing a well-prepared project proposal is a mixture of adhering to the organization’s guidelines, collaborating with your mentor, and your own organizational skills.

  • Most organizations provide some instructions about the expected structure of the proposal.
  • For guidance, you can check out other GSoC proposals from previous years. If available, prefer the ones that were submitted to your intended organization.

This was my project proposal, the one that got me accepted. Once I finished preparing it, I forwarded it to my mentor for an unofficial review, and he gave me some amazing feedback.

My Project

The main goal of my project, Showcasing the performance of PyTorch Image Models (Timm) with OpenVINO, is the integration of the Timm library (hosting 1100+ PyTorch computer vision models, as of writing this) into the OpenVINO ecosystem. As Timm is now part of the Hugging Face ecosystem, the integration is done through Optimum-Intel, the interface between the Hugging Face libraries and the different tools and libraries provided by Intel (including OpenVINO) to accelerate end-to-end pipelines on Intel architectures.

Background

Timm is a collection of pre-trained and optimized models for deep learning in computer vision. By providing a wide range of state-of-the-art models with ease of use, it encourages research and development in the field of computer vision, making cutting-edge technology accessible to both professionals and enthusiasts.

OpenVINO is a toolkit designed to fast-track the development of high-performance computer vision and deep learning inference. By offering optimization across multiple hardware types, including CPUs and GPUs, OpenVINO allows for efficient deployment in various environments.

The integration of Timm with OpenVINO combines the robust and accessible models from Timm with the high performance and flexibility of OpenVINO. This synergy enables enhanced performance and scalability, making it an ideal solution for various applications ranging from research to production deployment.

My Contributions

As most Timm models are for image classification or feature extraction, the integration is done through OVModelForImageClassification, so the user can load these models from the Hugging Face Hub like any other Hugging Face model. Note that previously, attempting to load the models this way raised a number of errors and unexpected behaviors, similar to loading through AutoModelForImageClassification, some of which are mentioned here.

Example Usage:

import requests
from PIL import Image
from optimum.intel.openvino.modeling_timm import TimmImageProcessor
from optimum.intel import OVModelForImageClassification

# Download a sample image from the COCO validation set
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Any image-classification checkpoint from the timm namespace on the Hub
model_id = "timm/vit_tiny_patch16_224.augreg_in21k"

# Preprocess the image according to the model's pretrained config
preprocessor = TimmImageProcessor.from_pretrained(model_id)
inputs = preprocessor(images=image, return_tensors="pt")

# export=True converts the timm model to OpenVINO IR on the fly
ov_model = OVModelForImageClassification.from_pretrained(model_id, export=True)
outputs = ov_model(pixel_values=inputs["pixel_values"])

Additionally,

  • A feature extractor, TimmImageProcessor, has been implemented, as transformers didn’t have a dedicated feature extractor/image processor to handle preprocessing according to the provided model config.
  • Several tests have been added to cover these new features.
  • Relevant changes have been made to the docs.
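To give a feel for what an image processor like this does, here is an illustrative sketch of the core normalization step (this is not the actual TimmImageProcessor code; the ImageNet mean/std constants below are a common default, used purely for illustration — the real processor reads these values from the model’s pretrained config):

```python
# Illustrative sketch only: scale a raw 0-255 pixel value to [0, 1], then
# normalize per RGB channel with the dataset mean/std. In the real processor
# the statistics come from the model config, not hard-coded constants.

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(value, channel):
    """Scale a 0-255 pixel value to [0, 1], then normalize for one channel."""
    scaled = value / 255.0
    return (scaled - IMAGENET_MEAN[channel]) / IMAGENET_STD[channel]

# A mid-gray pixel (128) in the red channel:
print(round(normalize_pixel(128, 0), 4))  # → 0.0741
```

Resizing and tensor conversion happen before this step; the processor bundles all of it behind the single `preprocessor(images=...)` call shown above.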

All my contributions are combined together in this single PR.

Major Challenges

One of the main challenges I faced was that, although Timm is now under the Hugging Face umbrella, the transformers and timm libraries evolved entirely separately, so a large number of their abstractions work very differently. The optimum-intel library’s components, however, were written to closely parallel the transformers abstractions.

So I was facing a lot of errors, especially when attempting to use timm models through the existing classes. Some of those errors I reported as an issue on the transformers repository here.

Creating our own Abstractions

Every model loaded from the Hugging Face Hub and exported through OVModelForImageClassification is expected to expose its model and config through a specific abstraction structure.

We finally created our own wrapper, TimmModel, for timm models, similar to the models from the Hugging Face Hub, which gets its configuration information from the (differently structured) timm model configs, contained in a TimmConfig object. Finally, as we needed to preprocess the images according to each model’s requirements, we also implemented an image processor, TimmImageProcessor.
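The wrapper idea can be sketched in a few lines. This is a hypothetical simplification, not the actual optimum-intel implementation (field names like `num_classes` and `input_size` are illustrative): adapt timm’s flat config dict to the attribute-style interface that transformers-like classes expect, and delegate the forward pass to the wrapped timm model.

```python
# Hypothetical sketch of the TimmConfig/TimmModel wrapper idea.

class TimmConfig:
    """Adapts a raw timm config dict to attribute-style access."""
    def __init__(self, timm_cfg):
        # timm stores values like "num_classes" and "input_size" in a flat
        # dict; expose them the way a transformers PretrainedConfig would.
        self.num_labels = timm_cfg.get("num_classes", 1000)
        self.image_size = timm_cfg.get("input_size", (3, 224, 224))[1]

class TimmModel:
    """Wraps a timm model behind a hub-model-like interface."""
    def __init__(self, model, config):
        self.model = model
        self.config = config

    def __call__(self, pixel_values):
        # Delegate to the underlying timm model's forward pass.
        return self.model(pixel_values)

cfg = TimmConfig({"num_classes": 21843, "input_size": (3, 224, 224)})
print(cfg.num_labels, cfg.image_size)  # → 21843 224
```

The point of the adapter is that downstream export code only ever sees the transformers-style attributes, regardless of how timm stores them.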

In order to test our added features, two additional unit tests were also introduced.

test_timm_save_and_infer: Exports a model to the OpenVINO IR format, saves it to a local directory, then loads the model again from the local copy and runs inference.

test_compare_to_timm: Loads the same model directly with timm.create_model as well as with OVModelForImageClassification.from_pretrained, and checks that the outputs of the two models for the same input do not diverge from each other.
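The comparison logic behind the second test can be sketched as follows. This is a hypothetical stand-in, not the real test code: the model calls are stubbed with fixed lists here, whereas in the real test the logits come from timm.create_model(...) and OVModelForImageClassification.from_pretrained(...), and the tolerance value is my own illustrative choice.

```python
# Hypothetical sketch: run the same input through both models and assert
# their logits agree within a small tolerance.

def outputs_close(a, b, atol=1e-4):
    """Element-wise closeness check for two flat lists of logits."""
    return len(a) == len(b) and all(abs(x - y) <= atol for x, y in zip(a, b))

timm_logits = [0.12, -1.30, 2.71]     # stand-in for the timm model output
ov_logits = [0.12, -1.30, 2.71001]    # stand-in for the OpenVINO output
assert outputs_close(timm_logits, ov_logits)
print("outputs match")  # → outputs match
```

An exact-equality check would be too strict, since conversion to OpenVINO IR can introduce tiny floating-point differences; a tolerance-based comparison is the usual approach.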

Achievement and Benefits

Participating in GSoC has a number of benefits, such as:

  • Being introduced to Open-source culture and workflow.
  • A generous stipend from Google, depending on the project length.

Being a part of the OpenVINO toolkit also had some extra perks.

  • Being introduced to a number of excellent ML Engineers from Intel.
  • An awesome Swag Hoodie! (which I sadly could not redeem due to some complexities of my country’s custom retrieval system)

Some important technical lessons I have learned from this GSoC project would be:

  • Diving deep into the structure and abstractions of Huggingface’s transformers library.
  • Writing my first unit test.
  • Being introduced to ML model optimization frameworks like OpenVINO and Optimum-Intel.
  • Being familiar with industrial practices like code reviews and collaboration of different organizations(Intel and Huggingface in this case).

Finally, it was a huge push to my self-confidence, that I was able to crack GSoC as well as contribute to the amazing OpenVINO organization.

Wrap up

I still can’t believe the GSoC ’23 period is almost at its end. I really want to express my gratitude to my mentors, Alex and Liubov, for answering all my queries (often silly ones!). Both of them are amazing engineers as well as super-friendly and patient human beings. I learned a lot from them about how to approach and design within such a large codebase, as well as from their experiences.

Another huge thanks to Adrian for managing all the GSoC projects as well as the demo meetings with so much patience and perfection! Finally to all the mentors of different OpenVINO GSoC projects. All of you are doing a spectacular job.

Finally, if the reader (a prospective GSoC applicant) has reached this point, I just want to add that the main motivation of Google behind organizing GSoC is to introduce new contributors to these amazing open-source projects. So, the ending of GSoC is not actually an ending, just the beginning of a journey.

Have a great day!
