Fine-Tuning for Image Classification using Transformers
For a presentation at DataHour, I needed to train a Vision Transformer (ViT) model on images of Indian food (more about the talk here).
The overall approach is based on this blog post: Fine-Tune ViT for Image Classification with 🤗 Transformers
I started with this dataset of Indian foods. The walkthrough then covers preparing the dataset, preprocessing the images, training with transfer learning, and running inference with the final model.
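To make the dataset preparation step a bit more concrete, here is a minimal sketch in plain Python. It is not the code from the notebook: it assumes the images are listed as (path, label) pairs, and the class names, file paths, and the helper names `build_label_maps` and `stratified_split` are all hypothetical. It builds the label-to-id mappings that a classification head typically needs and holds out a share of each class for evaluation.

```python
import random
from collections import defaultdict


def build_label_maps(class_names):
    """Map class names to integer ids (sorted for a stable order) and back."""
    names = sorted(set(class_names))
    label2id = {name: i for i, name in enumerate(names)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def stratified_split(samples, test_fraction=0.2, seed=42):
    """Split (path, label) pairs so every class appears in both sets."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        # Hold out at least one image per class.
        n_test = max(1, round(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Hypothetical file listing: three classes with ten images each.
samples = [
    (f"{label}/{i:03d}.jpg", label)
    for label in ("samosa", "biryani", "dosa")
    for i in range(10)
]

label2id, id2label = build_label_maps(label for _, label in samples)
train, test = stratified_split(samples)
```

Splitting per class rather than over the whole list keeps rare dishes represented in the evaluation set, which matters when some classes have far fewer images than others.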
You can find the Colab notebook here: https://bit.ly/raj_foodimage
Here is my GitHub repo: https://github.com/rajshah4/huggingface-demos/tree/main/FoodApp
I have a long YouTube walkthrough video here:
If you want a shorter introduction, check out my TikTok.