Without having seen the code, i’m guessing the network was trained, using images of a fixed size. The network may prefer smaller image sizes, because it was trained using smaller images. The network may be too specialized, also known as “overfitting”, which in this case, means it doesn’t generalize well to other image sizes.
But that’s all guesswork. No facts here ;)