How Scene Graphs Help in Robust Image Retrieval

Sandeep S Kumar
Published in Razorthink AI
Nov 30, 2018

A scene graph is a data structure that describes the contents of a scene. It encodes object instances, the attributes of those objects, and the relationships between them.

Source: https://hci.stanford.edu/publications/2015/scenegraphs/JohnsonCVPR2015.pdf
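To make the definition concrete, here is a minimal sketch of a scene graph as a small Python data structure. The class and method names (`SceneObject`, `SceneGraph`, `add_object`, `relate`, `triples`) are made up for illustration and do not come from the cited paper or from any library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SceneObject:
    """A node: one object instance plus its attributes (hypothetical class)."""
    name: str
    attributes: frozenset = frozenset()


@dataclass
class SceneGraph:
    """Objects are nodes; (subject, predicate, object) index triples are edges."""
    objects: list = field(default_factory=list)
    relationships: list = field(default_factory=list)

    def add_object(self, name, *attributes):
        self.objects.append(SceneObject(name, frozenset(attributes)))
        return len(self.objects) - 1  # the index identifies this instance

    def relate(self, subject_idx, predicate, object_idx):
        self.relationships.append((subject_idx, predicate, object_idx))

    def triples(self):
        """Relationships as readable (subject, predicate, object) names."""
        return [
            (self.objects[s].name, p, self.objects[o].name)
            for s, p, o in self.relationships
        ]


# A boy holding a baseball bat and wearing a blue cap.
graph = SceneGraph()
boy = graph.add_object("boy")
bat = graph.add_object("bat", "baseball")
cap = graph.add_object("cap", "blue")
graph.relate(boy, "holding", bat)
graph.relate(boy, "wearing", cap)
```

Referring to objects by index rather than by name lets the same graph hold two instances of the same class, for example two boys, without confusing their attributes or relationships.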

Current image retrieval pipelines are mainly of two types: content-based and tag-based. In content-based retrieval, the user supplies a search image and related images are returned. In tag-based retrieval, every image is assigned tags and queries are matched against them. If you want to run a descriptive query like “Boy holding a baseball bat wearing a blue cap”, most of the images returned by both methods will be false positives, because neither method captures how the objects in the query relate to one another.

A newer approach, querying with a scene graph, addresses this problem, because a scene graph is itself a graph representation of an image: each object in the image is a node, and each relationship between objects is an edge. Retrieval based on scene graphs can therefore be more robust for descriptive queries, and it can reduce the time and effort spent on tagging images.
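The difference from tag matching can be shown with a toy ranking sketch. It assumes every image in the collection already has a scene graph attached, which is a simplification, and it is not the retrieval method of the paper linked above. The functions `graph_facts` and `rank_images`, the image ids, and the scoring rule (fraction of query facts found in the image's graph) are all assumptions made for this example. Objects are identified by name only, so several instances of the same class would collapse into one.

```python
def graph_facts(objects, relationships):
    """Flatten a scene graph into a set of comparable facts.

    objects: dict mapping object name -> iterable of attributes
    relationships: iterable of (subject, predicate, object) name triples
    """
    facts = {("object", name) for name in objects}
    facts |= {
        ("attribute", name, attr)
        for name, attrs in objects.items()
        for attr in attrs
    }
    facts |= {("relation", s, p, o) for s, p, o in relationships}
    return facts


def rank_images(query_facts, image_facts_by_id):
    """Rank images by the fraction of query facts found in each image's graph."""
    scores = {
        image_id: len(query_facts & facts) / len(query_facts)
        for image_id, facts in image_facts_by_id.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Query: boy holding a baseball bat, wearing a blue cap.
query = graph_facts(
    {"boy": [], "bat": ["baseball"], "cap": ["blue"]},
    [("boy", "holding", "bat"), ("boy", "wearing", "cap")],
)

images = {
    # Matches the query exactly, plus an extra object.
    "img_a": graph_facts(
        {"boy": [], "bat": ["baseball"], "cap": ["blue"], "field": []},
        [("boy", "holding", "bat"), ("boy", "wearing", "cap")],
    ),
    # Same relationships, but the cap is red.
    "img_b": graph_facts(
        {"boy": [], "bat": ["baseball"], "cap": ["red"]},
        [("boy", "holding", "bat"), ("boy", "wearing", "cap")],
    ),
    # Same objects and attributes (so the same tags), but different relationships.
    "img_c": graph_facts(
        {"boy": [], "man": [], "bat": ["baseball"], "cap": ["blue"]},
        [("man", "wearing", "cap"), ("boy", "next to", "bat")],
    ),
}

ranking = rank_images(query, images)
```

In this toy collection, `img_c` carries every tag the query mentions (boy, baseball bat, blue cap), so a tag-based search could not tell it apart from `img_a`. Only the relationship edges push it to the bottom of the ranking.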

Recently, Conditional Random Field (CRF) models have become popular owing to their ability to directly predict the segmentation or labeling given the observed image, and the ease with which arbitrary functions of the observed features can be incorporated into the training process. To read more, see “How are Conditional Random Fields applied to Image Segmentation?”
