The NVIDIA paper says they feed the images in as YUV planes, but you don't seem to.
Ivan Kazakov

You can use YUV. I, however, put a simple color transformer network in front so that I don't have to specify the colorspace myself :)
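
To illustrate the idea (this is only a rough sketch, not the actual network from the post): assuming the "color transformer" is something like a 1x1 convolution, it is just a per-pixel linear map over the three channels. The standard BT.601 RGB-to-YUV conversion is one particular setting of those weights, so a trained layer could land on YUV or on whatever mix suits the task better. The names `pointwise_conv` and `BT601_RGB_TO_YUV` are made up for this example.

```python
# Hypothetical sketch: a 1x1 convolution is a per-pixel linear map over channels.
# These are the BT.601 RGB -> YUV coefficients; in a real model the weights would
# be learned together with the rest of the network instead of being fixed.
BT601_RGB_TO_YUV = [
    [0.299, 0.587, 0.114],        # Y
    [-0.14713, -0.28886, 0.436],  # U
    [0.615, -0.51499, -0.10001],  # V
]


def pointwise_conv(image, weights, bias=(0.0, 0.0, 0.0)):
    """Apply a 1x1 convolution to an H x W x 3 image given as nested lists."""
    return [
        [
            [
                sum(w * c for w, c in zip(row_w, pixel)) + b
                for row_w, b in zip(weights, bias)
            ]
            for pixel in row
        ]
        for row in image
    ]


# A 1 x 2 image: one white pixel and one pure red pixel, RGB in [0, 1].
image = [[[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]]]
yuv = pointwise_conv(image, BT601_RGB_TO_YUV)
print(yuv)
```

With these fixed weights the layer reproduces the hand-written YUV conversion; leaving the weights trainable is what removes the need to pick a colorspace up front.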
