K-fold file splitting for segmentation networks like U-net
If you are using Keras' ImageDataGenerator with U-Net and you want a K-fold split of your data, here is something that can help you.
There is no built-in functionality in ImageDataGenerator for K-fold splitting. One way is to use the flow_from_dataframe functionality, but that won’t work with a segmentation dataset, where every image has a corresponding mask.
A possible solution is to write a Python script that randomly splits the image files together with their corresponding masks, and saves those images and masks in sub-folders like “test” and “train”.
Okay, now let's get started.
Step 1 : Import os, shutil and random libraries
Step 2 : Give names to the sub-directories where you want to save the files, and set the root data directory that holds the source images and masks.
Define the number of images that you would like to have in the test set. Everything else will be saved to the train directory.
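Steps 1 and 2 could look roughly like the sketch below. All names and values here are assumptions made for illustration: the root folder is taken to be "data" with "images" and "mask" inside it (as in the file structure of this post), four destination folders are used, and the test-set size of 20 is an arbitrary placeholder.

```python
import os
import random
import shutil

# Root data directory that holds the source images and masks.
# The name "data" is an assumption; point it at your own dataset.
ROOT_DIR = "data"
SOURCE_IMAGE_DIR = os.path.join(ROOT_DIR, "images")
SOURCE_MASK_DIR = os.path.join(ROOT_DIR, "mask")

# Sub-directories that will each receive one test/train split.
# The names are hypothetical; use as many folders as you need.
FOLD_DIRS = ["dir1", "dir2", "dir3", "dir4"]

# Number of images to place in each test set (placeholder value).
# Everything else goes to the train directory.
NUM_TEST_IMAGES = 20
```

Keeping these values at the top of the script makes it easy to change the number of folders or the test-set size later.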
File structure
Source file structure
->data
---->images
---->mask
Destination file structure
->data
-->dir1
------>test
----------->image
----------->mask
------>train
----------->image
----------->mask
-->dir2
------>test
----------->image
----------->mask
------>train
----------->image
----------->mask
.
.
.
Step 3 : Let's read the folder names and start segregating and moving the files based on a random function.
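A minimal sketch of this step is shown below. It is not the only way to do it and rests on a few assumptions: every mask has exactly the same file name as its image, the source folders contain only files, and the files are copied rather than moved so that every destination folder can draw from the full dataset. The function name split_into_folds and its parameters are hypothetical.

```python
import os
import random
import shutil


def split_into_folds(root_dir, fold_dirs, num_test,
                     image_dir="images", mask_dir="mask", seed=None):
    """Copy images and masks into <root_dir>/<fold>/{test,train}/{image,mask}."""
    rng = random.Random(seed)
    src_images = os.path.join(root_dir, image_dir)
    src_masks = os.path.join(root_dir, mask_dir)

    # Sorted so that a fixed seed always produces the same split.
    file_names = sorted(os.listdir(src_images))
    if num_test > len(file_names):
        raise ValueError("num_test exceeds the number of available images")

    for fold in fold_dirs:
        # Draw a fresh random test set for this folder.
        test_names = set(rng.sample(file_names, num_test))
        for name in file_names:
            subset = "test" if name in test_names else "train"
            image_dst = os.path.join(root_dir, fold, subset, "image")
            mask_dst = os.path.join(root_dir, fold, subset, "mask")
            os.makedirs(image_dst, exist_ok=True)
            os.makedirs(mask_dst, exist_ok=True)
            # Assumption: the mask shares the image's file name.
            shutil.copy(os.path.join(src_images, name),
                        os.path.join(image_dst, name))
            shutil.copy(os.path.join(src_masks, name),
                        os.path.join(mask_dst, name))
```

A call such as split_into_folds("data", ["dir1", "dir2", "dir3", "dir4"], num_test=20) would fill the destination structure described in this post. Because each folder draws its test files independently, the test sets of different folders can overlap; a strict K-fold would instead shuffle the file list once and cut it into K disjoint chunks.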
Voilà! It’s done. Now you have 4 directories, each with images and masks randomly split into separate test and train folders. You can now train and test your U-Net on each folder individually.