I have some questions on SSD:
Walid Aly

Hi, my responses are below.

1- During prediction, you output bounding boxes and class scores from each conv layer you take predictions from. You then apply NMS (non-maximum suppression) to the combined outputs to get the final bounding box predictions.
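
To make point 1 concrete, here is a minimal sketch of greedy NMS in plain Python. The box format (x1, y1, x2, y2), the 0.5 IoU threshold, and the sample detections are all assumptions for illustration, not values from any particular SSD implementation. In practice you would run this per class on the boxes pooled from all prediction layers.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections for one class, pooled from several feature maps.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

Deep learning frameworks usually ship a faster built-in NMS, so a hand-written loop like this is mainly useful for understanding what the step does.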

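2- The final detection layer is nothing but a convolution layer with 4+n_class+1 filters per default box: 4 box offsets plus n_class+1 class scores, where the +1 is the background class. With k default boxes per location, the layer has k*(4+n_class+1) filters.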
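
As a quick sanity check on the filter count in point 2, the arithmetic can be sketched as below. The function name, the 20-class example, the 6 boxes per location, and the 19x19 feature map are assumptions picked for illustration. The multiplication by the number of default boxes per location follows the usual SSD formulation rather than anything stated in the question.

```python
def detection_filters(n_class, boxes_per_location):
    """Number of conv filters in one SSD-style prediction layer."""
    per_box = 4 + n_class + 1  # 4 box offsets + class scores incl. background
    return boxes_per_location * per_box


def boxes_from_feature_map(height, width, boxes_per_location):
    """Number of boxes predicted by one feature map."""
    return height * width * boxes_per_location


# Hypothetical setup: 20 object classes, 6 default boxes per location.
n_filters = detection_filters(n_class=20, boxes_per_location=6)
n_boxes = boxes_from_feature_map(19, 19, boxes_per_location=6)
```

With these assumed numbers, the layer needs 150 filters and a 19x19 feature map contributes 2166 candidate boxes before NMS.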

3- You can take care of scale in 2 ways: either scale the ground truth box down to match the feature map, or apply deconvolution to upsample the feature map.
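
For the first option in point 3, a minimal sketch of scaling a ground truth box from image pixels down to feature map coordinates might look like this. The function name, the (x1, y1, x2, y2) box format, and the 300x300 image with a 10x10 feature map are assumptions for illustration only.

```python
def scale_box(box, image_size, fmap_size):
    """Map a box from image pixels to feature map cell coordinates.

    box: (x1, y1, x2, y2) in pixels
    image_size, fmap_size: (width, height)
    """
    x1, y1, x2, y2 = box
    img_w, img_h = image_size
    fm_w, fm_h = fmap_size
    return (
        x1 * fm_w / img_w,
        y1 * fm_h / img_h,
        x2 * fm_w / img_w,
        y2 * fm_h / img_h,
    )


# Hypothetical ground truth box on a 300x300 image, mapped to a 10x10 map.
scaled = scale_box((60, 90, 180, 240), image_size=(300, 300), fmap_size=(10, 10))
```

The second option, deconvolution, goes the other way: it upsamples the feature map with a transposed convolution layer so the ground truth box can stay at a larger resolution.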

