Week 3 — Project LEAFS
Hi everyone,
Last week, we talked about our data collection method and dataset. As we mentioned, the dataset contains some duplicated and watermarked images. This week, we deleted these duplicates and cleaned the watermarks.
Deletion of Duplicates
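We are planning to use YOLOv5 for student attitude detection. YOLOv5 is an open-source, CNN-based object detector maintained by Ultralytics and its community of contributors. Convolutional layers are translation equivariant, and together with pooling they make the network approximately translation invariant: it produces nearly the same response regardless of how its input is shifted. An exact or slightly shifted copy of an image therefore gives the model no new information, so duplicated images won't do anything but increase the dataset size and training time for YOLOv5.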
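As an illustration of how such a cleanup could be automated, here is a minimal sketch that finds byte-identical image files by hashing their contents. It is not necessarily the procedure we followed: the function names, the list of file extensions, and the dry-run flag are all assumptions made for the example, and it only catches exact copies. Resized or re-compressed duplicates would need a perceptual hash instead.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    sha = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def find_exact_duplicates(root: Path, extensions=(".jpg", ".jpeg", ".png")) -> list[Path]:
    """Return image files whose bytes match an earlier file under root.

    Files are visited in sorted order, so the first copy is the one kept.
    The extension list is an assumption; adjust it to the dataset.
    """
    seen: dict[str, Path] = {}
    duplicates: list[Path] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        digest = file_digest(path)
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
    return duplicates


def remove_duplicates(root: Path, dry_run: bool = True) -> list[Path]:
    """List exact duplicates under root and delete them unless dry_run is set."""
    duplicates = find_exact_duplicates(root)
    if not dry_run:
        for path in duplicates:
            path.unlink()
    return duplicates
```

Running it with dry_run=True first shows which files would be deleted, which makes it easy to check the list by eye before removing anything.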
Cleaning of Watermarks
YOLOv5 should be able to handle a low level of noise, but in our dataset some of the watermarks mask the significant features that distinguish one label from another. We therefore cleaned up the watermarks that could cause trouble, as far as possible.
Next Week
This week, we finished collecting and cleaning the dataset. Next week, we are planning to label a small sample of images from each class and test YOLOv5 on these labeled samples, to gain insight into how the model behaves depending on the number of images.
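One possible way to draw those samples is sketched below. It assumes a hypothetical folder layout with one sub-folder per class, and the sample sizes and the function name are placeholders rather than values we have settled on. The subsets are nested, so each larger sample contains the smaller ones, which keeps comparisons across sample sizes consistent.

```python
import random
from pathlib import Path


def sample_per_class(dataset_root: Path, sizes=(10, 25, 50), seed: int = 0) -> dict:
    """Pick up to n images from every class folder, for each n in sizes.

    Assumes dataset_root contains one sub-folder per class (hypothetical layout).
    Returns {n: {class_name: [image paths]}}. Each class is shuffled once with a
    fixed seed and sliced, so smaller samples are subsets of larger ones.
    """
    rng = random.Random(seed)
    subsets = {n: {} for n in sizes}
    class_dirs = sorted(p for p in dataset_root.iterdir() if p.is_dir())
    for class_dir in class_dirs:
        images = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(images)
        for n in sizes:
            # Slicing past the end simply returns every available image.
            subsets[n][class_dir.name] = images[:n]
    return subsets
```

Fixing the seed makes the selection reproducible, so the same images can be labeled once and reused for every run.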