A new kind of pooling layer for faster and sharper convergence
Sahil Singla

What causes the big jumps in the CIFAR-10 and CIFAR-100 plots? They occur at a round number (40,000) and at exactly the same iteration in every run. Is 40,000 the number of samples you're training on, so the jump marks the end of an epoch? I'd guess not, since that would mean we're only looking at one or two epochs. Alternatively, did you run the training for 40,000 iterations and then restart it from a snapshot? Or change the learning rate at that point?
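For what it's worth, a scheduled learning-rate drop would produce exactly this pattern. Here is a minimal sketch of that hypothesis using a PyTorch-style step schedule; the 40,000 milestone and the 10x decay factor are illustrative guesses, not taken from the post:

```python
# Sketch: a step LR schedule dropping at iteration 40,000 would cause a
# sudden jump in the loss/accuracy curves at the same point in every run.
import torch

model = torch.nn.Linear(10, 10)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# MultiStepLR multiplies the LR by gamma at each milestone. Stepped once
# per training iteration, the milestone lands at iteration 40,000.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[40_000], gamma=0.1
)

for iteration in range(60_000):
    # ... forward pass, loss.backward(), optimizer.step() ...
    scheduler.step()  # LR is 0.1 before iteration 40,000, then 0.01
```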
