Content Based Image Retrieval Using Deep Learning

san rock
May 30, 2020


SANTOSH S
M.Tech, Computer Network Engineering
Dr. AIT College, Bengaluru
Karnataka, India
shrisanthosh95@gmail.com

Dr B S SHYLAJA
Professor
Dr. AIT College, Bengaluru
Karnataka, India

Abstract: Convolutional Neural Networks (CNNs) with deep learning have achieved outstanding performance in many image processing applications. Similar images can be identified with CNN-based methods by taking image features from the final layers of a single CNN framework. Content-Based Image Retrieval (CBIR) involves extracting learned features and comparing them efficiently for similarity, so both feature extraction and similarity measures play a crucial role. Experiments are performed on two datasets, the UC Merced Land Use dataset and the SceneSat dataset. A model pre-trained on millions of images is used and tailored to the retrieval task: the pre-trained CNN generates the image feature descriptors used for retrieval. The technique extracts features from the two fully connected layers of the VGG-16 network through transfer learning and compares the resulting feature vectors using different similarity measures. The proposed architecture extracts and learns features efficiently without prior knowledge of the images. The results are evaluated and compared using different performance metrics.

I. Introduction

Image retrieval is the task of identifying a suitable collection of images that are close to an input image. Content-based image retrieval is the next step beyond keyword-based frameworks: images are retrieved based on their content. The retrieval performance of a content-based image retrieval system relies primarily on two factors: 1) feature representation and 2) similarity measurement. Convolutional neural networks are a class of learning architectures that can be used in applications such as image retrieval, image classification, image annotation and image recognition. Inspired by the excellent results achieved with deep learning algorithms, this project uses them to retrieve images. The task involves deciding how images are represented and organized with respect to the input query image. The primary aspect of content-based image retrieval is the feature extraction method. Image features used to discriminate between images in content-based image retrieval include colour, texture and shape. Accurate retrieval aims for a complete match with the query, and the result depends on the content, or features, of the images.

II. Implementation

Modules description:

A. Image Acquisition

Image acquisition in image processing can be broadly described as retrieving an image from some source so that it can be passed through whatever processing steps follow. Once the image has been acquired, different processing techniques can be applied to it to accomplish the many different vision tasks.

B. Pre-processing

The proposed technique extracts image feature vectors from the two fully connected layers of a pre-trained CNN model. Because the datasets contain relatively few images, the weights of a CNN pre-trained on ImageNet can be used in the retrieval phase. With models pre-trained on datasets of millions of images, the weights and the learned architecture can be used directly and applied to CBIR tasks. This is transfer learning: the learning of the pre-trained model is transferred to the particular problem statement at hand. With transfer learning, the model can handle images that are distinct from, or outside, the ImageNet dataset. The pre-trained CNN model used here is VGG-16.

C. Feature Extraction:

The input image is given to a pre-trained CNN, and features are extracted from the last two fully connected layers of VGG-16 for both the input image and the database images. The query image can come from any source and does not need to belong to the datasets; here, the query image is drawn from the datasets. VGG-16 has three fully connected layers: Fc1, Fc2 and the final output classification layer. The features are taken from Fc1 and Fc2. The pre-trained model is adapted through transfer learning: the final output classification layer is removed and the rest of the architecture is used as a fixed feature extractor.
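As a rough illustration of how the two layer outputs might be turned into one descriptor, the sketch below assumes the Fc1 and Fc2 activations have already been read out of the network as plain lists of floats. Concatenating them and L2-normalizing the result is an assumption made for illustration, not a confirmed detail of the method, and the function names are hypothetical.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]

def build_descriptor(fc1_features, fc2_features):
    """Concatenate Fc1 and Fc2 activations into one normalized descriptor.

    Assumption: concatenation plus L2 normalization; the source does not
    state how the two layers' features are combined.
    """
    return l2_normalize(list(fc1_features) + list(fc2_features))

# Toy activations standing in for the 4096-dimensional Fc1/Fc2 outputs of VGG-16.
fc1 = [3.0, 0.0]
fc2 = [0.0, 4.0]
descriptor = build_descriptor(fc1, fc2)
print(descriptor)
```

Normalizing the descriptor keeps distance comparisons from being dominated by images whose activations happen to be larger overall.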

D. Detecting Similarities

Using different similarity metrics, the features extracted from the query image are compared with the stored features of the database images. A threshold is set on the measured value to separate images that are similar to the input image from those that are not. When the measured value (a distance, so larger means less similar) exceeds the threshold, the image is filtered out. The proposed model thus assesses image similarity using different distance metrics.
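The comparison and thresholding step could be sketched as follows. The text does not name the metrics used, so Euclidean distance and cosine similarity are assumed here as two common choices; the function names, the toy feature vectors and the threshold value are all hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (larger = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def filter_by_distance(query, database, max_distance):
    """Keep database images whose distance to the query is within the threshold,
    sorted from most to least similar."""
    kept = []
    for name, features in database.items():
        d = euclidean_distance(query, features)
        if d <= max_distance:
            kept.append((name, d))
    return sorted(kept, key=lambda item: item[1])

# Toy two-dimensional features; real descriptors would come from Fc1/Fc2.
database = {
    "img_a": [1.0, 0.0],
    "img_b": [0.0, 1.0],
    "img_c": [0.9, 0.1],
}
query = [1.0, 0.0]
matches = filter_by_distance(query, database, max_distance=0.5)
print(matches)
```

Note that the direction of the threshold depends on the metric: a distance is filtered from above, while a similarity score such as the cosine would be filtered from below.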

E. Statistical analysis

To assess and compare the performance of the CBIR tasks using the proposed technique, several experiments were performed on the two datasets. The experiments ran multiple queries against the VGG-16 network pre-trained on the ImageNet dataset, and the results matching the query image were ranked in order of decreasing similarity. The images are sized 224x224. Feature vectors are obtained from Fc1 and Fc2 of VGG-16 for both the query and the index collection. The features extracted from the query and from the index are provided as inputs to the similarity metrics. Using the different similarity and distance measures, the images are ranked, and the top 5 and top 10 most similar images are taken to assess the performance of the system.
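The ranking and top-k evaluation described above might look like the sketch below. Precision at k is assumed as the performance metric because the text does not name one; the class labels, image names and feature values are toy data, and k is kept small to fit the example.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance; it ranks images the same way as the true distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def rank_by_distance(query, index, distance):
    """Return image names ordered from most to least similar to the query."""
    scored = [(distance(query, features), name) for name, features in index.items()]
    scored.sort()
    return [name for _, name in scored]

def precision_at_k(ranked_names, labels, query_label, k):
    """Fraction of the top-k results that share the query's class label."""
    top_k = ranked_names[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for name in top_k if labels[name] == query_label)
    return float(hits) / len(top_k)

# Hypothetical index of five images from two classes.
index = {
    "farm_1": [0.9, 0.1],
    "farm_2": [0.8, 0.2],
    "farm_3": [0.6, 0.4],
    "river_1": [0.1, 0.9],
    "river_2": [0.2, 0.8],
}
labels = {name: name.split("_")[0] for name in index}

query = [1.0, 0.0]  # assumed to belong to the "farm" class
ranked = rank_by_distance(query, index, squared_euclidean)
print(ranked)
print(precision_at_k(ranked, labels, "farm", 3))
print(precision_at_k(ranked, labels, "farm", 5))
```

With real data, the same two calls would be made with k set to 5 and 10 and averaged over all queries.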

III. Feasibility Study

At this stage, the feasibility of the project is assessed and a business proposal is put forward with a very general project plan and some cost estimates. A feasibility study of the proposed system is carried out during system analysis. This ensures that the proposed system does not become a burden to the business. Some understanding of the main system requirements is essential for the feasibility analysis.

The three main considerations in the feasibility study are:

A. Economical Feasibility

This study is carried out to check the economic impact of the system on the organization. The amount of money the business can put into research and development of the system is limited, so the expenses have to be justified. The developed system was implemented within the budget because most of the technologies used are freely available. Only the customized products were required.

B. Technical Feasibility

This study is carried out to check the technical feasibility, i.e. the technical requirements of the system. Any system developed should not place a high demand on the available technical resources, since that would in turn place high demands on the client. The system developed must have modest requirements, as only minimal or no changes are required to implement it.

C. Social Feasibility

This aspect of the study checks the level of acceptance of the system by the user. It includes training the user to use the system efficiently. The user should not feel threatened by the system, but should accept it as a necessity. The level of user acceptance depends solely on the methods used to educate the user about the system and make them familiar with it. The user's confidence must be raised so that they can offer constructive criticism, which is welcomed because they are the final user of the system.

IV. System Analysis

A. Existing System:

There are many methods for extracting image features from an image database. One way is to use handcrafted features such as the color histogram and the gradient-based histogram. Another is to use handcrafted global descriptors such as color and spatial histograms, Local Binary Patterns (LBP), Gabor filters, the dual-tree complex wavelet transform, the Histogram of Oriented Gradients (HOG) and GIST.

Image retrieval involves indexing, searching and visualizing the retrieved images. Many techniques are available for describing the visual content of an image. These descriptors are further divided into learned and handcrafted features. Handcrafted descriptors use a predefined algorithm to extract features, whereas learned descriptors are features acquired with a CNN. Various techniques, including SIFT and SURF, have proven to be powerful for image retrieval applications. SIFT is a handcrafted local descriptor that helps locate salient patches around selected key points of an image.
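To make the idea of a handcrafted feature concrete, here is a minimal sketch of a color histogram descriptor. It assumes the image is available as a list of (r, g, b) tuples with values from 0 to 255; the function name and the bin count are illustrative choices, not part of any existing system described here.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of an image given as (r, g, b) tuples in 0..255."""
    histogram = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        index = (r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin
        histogram[index] += 1.0
    total = float(len(pixels))
    if total == 0.0:
        return histogram
    # Dividing by the pixel count makes images of different sizes comparable.
    return [count / total for count in histogram]

# Toy "image": two reddish pixels, one blue, one green.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 255, 0)]
hist = color_histogram(pixels, bins_per_channel=2)
print(hist)
```

Two images can then be compared by measuring the distance between their histograms, which captures color distribution but ignores where in the image each color appears.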

B. Problem Statement:

The most difficult problem in the advancing CBIR field is the semantic gap between the high-level semantics perceived by humans and the low-level image features captured by computers.

C. Proposed System:

The proposed technique extracts image feature vectors from the two fully connected layers of the pre-trained CNN model. Because the datasets contain relatively few images, the weights of a CNN pre-trained on ImageNet can be used in the retrieval phase. With models pre-trained on datasets of millions of images, the weights and the learned architecture can be used directly and applied to CBIR tasks. This is transfer learning, which "transfers the learning" of the pre-trained model to the particular problem statement at hand. With transfer learning, the model can handle images that are distinct from, or outside, the ImageNet dataset.

The pre-trained CNN model used here is VGG-16. Two distinct remote sensing image datasets were set up for the experiment, and the entire dataset was provided to the testing procedure. The features are obtained from a pre-trained model with convolutional layers, max-pooling layers and fully connected layers. The first technique introduced was fine-tuning the model on both datasets, which is required when using transfer learning. The weights of the pre-trained CNN were first initialized randomly and learned by training on its original dataset. The weights obtained were then used directly as the starting point for training on the two datasets.

V. System Design

A. System Architecture:

Fig:1 Overall system flow diagram

VI. System Specification

A. System Requirement

Software Pre-Requisites:

  • OS: Windows 7 and above
  • Language: Python 2.7.14
  • IDE: Python IDLE

Hardware Requirements:

  • System: Pentium i3 and above
  • Hard Disk: 40GB and above
  • Monitor: 15’’ LED, 15 VGA Colour
  • Input Device: Logitech
  • RAM: 8GB

VII. Result and Screenshot

Fig:1 GUI of CBIR

Fig:2 Uploading the Query Image

Fig:3 Displaying the Query picture

Fig:4 Search for relevant Query image

Fig:5 Output of the relevant query image

VIII. Conclusion

The efficiency and precision of the image retrieval system are improved by initializing it with ImageNet pre-trained weights for new images. With pre-trained models and transfer learning, new images can be retrieved with better results. Similar images are retrieved by computing the similarity of the combined features taken from the fully connected layers. However, image features are extracted with the help of the Canny edge detector.

A. Final Outcome:

Result Outcome:

  • Accurate retrieval of images similar to the query image.
  • The system can be applied to Google image search, digital libraries, photo archives, retail catalogs, face finding and the textile industry.

Future Enhancement:

  • The query image size cannot be changed; it is fixed at 224*224 pixels.
  • Training, i.e. indexing the images, takes a long time.

