Relation Networks for Visual Question Answering using MXNet Gluon
Overview
Visual Question Answering (VQA) is a multi-modal task that relates text to images, typically through captions or a set of questions about the image. For example, given a picture of a busy highway, there could be…