Creating a C++ application to search images for text

Evans Ehiorobo
Published in How-Tos
Oct 21, 2019 · 5 min read

We’ve all been in this situation: trying to remember where we saw or read something, and failing. If what we are trying to recall is stored as text, it’s easy to just search for it, but what if it was in an image?

In this article, I will describe how to build an application to search all the images in a folder for a piece of text with C++.

Getting the text in an image

To determine whether an image contains a piece of text, we first extract all the text from the image, then search the extracted text for the query. To extract the text, we will reuse the code from this article, where I described how to perform Optical Character Recognition (OCR) on an image to obtain the text in it. To reuse the code, I created a header file for it (called basicOCR.h):

#ifndef BASIC_OCR_H
#define BASIC_OCR_H
#include <string>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <opencv2/opencv.hpp>
std::string getText(std::string imagePath);

#endif

where getText() is the function that extracts the text in an image. It takes in the path to the image as its argument, performs OCR on the image and returns the text in the image.

We can now reuse this getText() function by including basicOCR.h:

#include <iostream>
#include "basicOCR.h"
using namespace std;

bool contains(string str1, string str2) {
    // Returns true if str1 contains str2, else false
    return (str1.find(str2) != string::npos);
}

bool imgContainsText(string imgPath, string text) {
    // Returns true if the text in the image contains the specified text, else false
    string imgText = getText(imgPath);
    return contains(imgText, text);
}

In the above, we have included the basicOCR header file, then we defined two functions — contains() which simply determines whether one string contains another, and imgContainsText() which gets the text in the image using the getText() function, then checks if that text contains the query the user is searching for.
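One thing to keep in mind with a plain substring check is that OCR output usually keeps the line breaks of the image, so a multi-word query can fail to match when the words sit on different lines. The following is a minimal sketch of one way to handle that, not part of the original project: contains() mirrors the function above, while normalizeWhitespace() and containsNormalized() are hypothetical helpers introduced only for illustration.

```cpp
#include <cctype>
#include <string>

// Same check as the article's contains(): true if str1 contains str2.
bool contains(const std::string& str1, const std::string& str2) {
    return str1.find(str2) != std::string::npos;
}

// Hypothetical helper: collapses every run of whitespace (spaces, tabs,
// newlines) into a single space and drops leading/trailing whitespace.
std::string normalizeWhitespace(const std::string& s) {
    std::string out;
    bool inSpace = false;
    for (unsigned char c : s) {
        if (std::isspace(c)) {
            inSpace = true;
        } else {
            // Emit one separating space before the next word, but never at the start.
            if (inSpace && !out.empty()) {
                out += ' ';
            }
            inSpace = false;
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Hypothetical variant of the search that ignores how the text is split into lines.
bool containsNormalized(const std::string& text, const std::string& query) {
    return contains(normalizeWhitespace(text), normalizeWhitespace(query));
}
```

With these helpers, imgContainsText() could call containsNormalized(imgText, text) instead of contains(imgText, text) if matching across line breaks is wanted.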

Searching multiple images

We want this application to be able to search all the images in a folder for the text the user entered. To help us get the images in a folder, we will be making use of the dirent.h header, which is part of POSIX.

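#include <cstdio>
#include <cstdlib>
#include <dirent.h>
...
int main(int argc, char **argv) {
    if (argc < 2)
    {
        /* no folder was supplied */
        cerr << "Usage: " << argv[0] << " <image folder>" << endl;
        return EXIT_FAILURE;
    }
    DIR *dir;
    struct dirent *ent;
    char *imFolder = argv[1];
    string fileName;
    if ((dir = opendir(imFolder)) != NULL)
    {
        while ((ent = readdir(dir)) != NULL)
        {
            fileName = ent->d_name;
            // Do something...
        }
        closedir(dir);
    }
    else
    {
        /* could not open directory */
        perror("");
        return EXIT_FAILURE;
    }
    ...
}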

In the code above, we allow the user to pass in the path to the folder containing the images as a command-line argument, argv[1]. We then call opendir(), a function declared in dirent.h, on that path; it returns a pointer to a directory stream.

If the result is not NULL, we call readdir() on the directory in a loop. Each call returns the next entry (a file or a folder) in the directory, and NULL once there are no entries left. The member ent->d_name gives us the name of the current entry.

However, if the result is NULL, perror() prints a message describing why the directory could not be opened, and the program exits with EXIT_FAILURE.
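The opendir()/readdir()/closedir() pattern can also be wrapped in a small function, which keeps main() shorter. The sketch below is one possible way to do it on a POSIX system; listDirectory() is a hypothetical name that does not appear in the original project, and returning an empty vector on failure is a simplification made for this example.

```cpp
#include <dirent.h>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of all entries in dirPath,
// skipping the special entries "." and "..".
// Returns an empty vector if the directory cannot be opened.
std::vector<std::string> listDirectory(const std::string& dirPath) {
    std::vector<std::string> names;
    DIR* dir = opendir(dirPath.c_str());
    if (dir == nullptr) {
        return names;  // could not open directory
    }
    struct dirent* ent = nullptr;
    // Each readdir() call yields the next entry; nullptr marks the end.
    while ((ent = readdir(dir)) != nullptr) {
        const std::string name = ent->d_name;
        if (name != "." && name != "..") {
            names.push_back(name);
        }
    }
    closedir(dir);
    return names;
}
```

The order in which readdir() returns entries is not specified, so the caller should sort the vector if a stable order matters.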

With the above, we are able to get all the files in a folder, but we need only the files that contain images. So we’ll create the following functions:

bool endsWith(string s, string endString) {
    // Returns true if s ends with endString, else false
    return s.length() >= endString.length() && s.substr(s.length() - endString.length()) == endString;
}

bool isImage(string fileName) {
    // Returns true if the file is a GIF, PNG or a JPG image, else false
    return endsWith(fileName, ".gif") || endsWith(fileName, ".png") || endsWith(fileName, ".jpg");
}

The first function, endsWith(), helps us determine if a string ends with another string, while the second function, isImage(), helps check whether the file name of a file ends with an image extension. For simplicity, we will only be working with three image extensions: GIF, PNG and JPG.
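Note that the comparison in isImage() is case-sensitive, so a file named WALLPAPER.JPG would be skipped. If that matters, the extension check can be made case-insensitive. The sketch below shows one way to do this; isImageIgnoreCase() is a hypothetical variant that is not in the original project, and it assumes file names with ASCII extensions.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns true if s ends with suffix (same behaviour as the article's endsWith()).
bool endsWith(const std::string& s, const std::string& suffix) {
    return s.length() >= suffix.length() &&
           s.compare(s.length() - suffix.length(), suffix.length(), suffix) == 0;
}

// Hypothetical variant of isImage() that ignores the case of the extension.
// fileName is taken by value so it can be lower-cased without touching the caller's copy.
bool isImageIgnoreCase(std::string fileName) {
    std::transform(fileName.begin(), fileName.end(), fileName.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return endsWith(fileName, ".gif") || endsWith(fileName, ".png") ||
           endsWith(fileName, ".jpg");
}
```

Like the original, this only looks at the file name; it does not inspect the file contents, and it still treats ".jpeg" as a non-image.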

We are now in a position to search all the images in the folder for the given query:

int main(int argc, char **argv) {
    ...
    string query, results = "";
    cout << "Enter the text to search for: ";
    // Read the whole line so that the query may contain spaces
    getline(cin, query);
    while ((ent = readdir(dir)) != NULL)
    {
        fileName = ent->d_name;
        // Add a separator in case the folder path has no trailing slash
        if (isImage(fileName) && imgContainsText(string(imFolder) + "/" + fileName, query)) {
            results += fileName + "\n";
        }
    }
    closedir(dir);
    ...
}

and print the result to the console:

if (results.length() > 0) {
    cout << "The text was found in: \n" + results;
}
else
{
    cout << "Sorry, the text was not found in any image.";
}
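The search loop and the result message can also be separated from directory access and OCR, which makes the logic testable without any images. The following is a rough sketch under that assumption, not the original project's structure: TextExtractor, findMatches() and formatResults() are hypothetical names, and in the real application the extractor would simply be getText().

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical type: any function that maps an image path to the text in it.
// In the real application this role is played by getText().
using TextExtractor = std::function<std::string(const std::string&)>;

// Hypothetical helper: returns the names of the files in folder whose
// extracted text contains query, in the order they were given.
std::vector<std::string> findMatches(const std::string& folder,
                                     const std::vector<std::string>& fileNames,
                                     const std::string& query,
                                     const TextExtractor& extractText) {
    std::vector<std::string> matches;
    for (const std::string& name : fileNames) {
        const std::string text = extractText(folder + "/" + name);
        if (text.find(query) != std::string::npos) {
            matches.push_back(name);
        }
    }
    return matches;
}

// Hypothetical helper: builds the message the application prints.
std::string formatResults(const std::vector<std::string>& matches) {
    if (matches.empty()) {
        return "Sorry, the text was not found in any image.";
    }
    std::string message = "The text was found in: \n";
    for (const std::string& name : matches) {
        message += name + "\n";
    }
    return message;
}
```

In main(), this would be used as cout << formatResults(findMatches(imFolder, imageNames, query, getText)), where imageNames is assumed to hold the file names that passed the isImage() check.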

Wrapping up

We have successfully created an application that can search all the images in a folder for some piece of text which the user will provide. Now, let’s test the application. Below are the images we will be searching:

random.jpg
wallpaper.jpg
asleep.jpg

Here’s the console output when I run the application and search for the text “to”:

Enter the text to search for: to
The text was found in:
asleep.jpg
wallpaper.jpg

and if I search for “text”, I get:

Enter the text to search for: text
The text was found in:
asleep.jpg
random.jpg

However, if I search for “Medium” which isn’t in any of the images, I get:

Enter the text to search for: Medium
Sorry, the text was not found in any image.

This shows that the application works as expected. Thanks for following this far. You can find the full code for this project here. If you have any suggestions or comments, please drop them below.
