Intruder Detection using Image Processing (Research)

Adya Tiwari
12 min read · Jul 17, 2023


Introduction
In the present world, ensuring the safety and security of various surroundings is a crucial concern, and intrusion detection systems have become essential for protecting residential and commercial buildings, as well as public areas, from potential dangers and unauthorized access. The development of digital image processing technology has allowed for increasingly sophisticated and reliable intrusion detection systems.
The objective of this project is to develop an intruder detection system that can successfully detect intruders during both day and night time and produce a noise-free output with clarity. The system collects image or video data using surveillance cameras, preprocesses the data to remove noise or unwanted elements, detects objects present in the data using object detection algorithms, tracks the objects over time using object tracking algorithms, and finally, identifies any suspicious activity based on the movement and behavior of objects using intrusion detection algorithms.
The use of image processing for intruder detection has gained significant attention in recent years for its potential to enhance security and surveillance systems. The proposed intruder detection system utilizes image processing techniques to detect and track intruders in a video stream. The system involves preprocessing steps such as image acquisition, image enhancement, noise removal, and background subtraction, followed by object detection and tracking using algorithms such as connected component labeling and centroid tracking. The system can detect intruders based on their size and shape and generate alerts to prompt a rapid response from security personnel. The proposed system offers numerous advantages, including high accuracy, speed, automation, and reduced human burden and potential for error.
Overall, the proposed system has the potential to strengthen security and surveillance systems and enhance public safety. Real-time monitoring and alerting mechanisms notify security staff or property owners as soon as an intrusion is detected, and digital image processing algorithms enable effective detection and classification of intruders.

Advantages

  • Automation: Detection and tracking of intruders are automated, which reduces the need for security staff to monitor the video feed in real time and lowers the risk of human error.
  • Speed: Because the whole process is automated, detection is faster than with manual surveillance. Intruders can be detected and tracked in real time, which allows a fast response to any threat.
  • Accuracy of the system: The computer vision algorithms used in this methodology are accurate, which makes detection and tracking more reliable and consistent.
  • Smooth working: The methodology keeps working when environmental factors such as lighting change.

Workflow

Several steps are involved in the methodology that leads to the effective detection and tracking of intruders.

Figure 1: Workflow of the proposed methodology

Below is the detailed workflow of the system.

  • Video Acquisition: The first step is to capture video from the camera and extract its frames, which are then processed to detect and track the intruder.
  • Image Gray-scaling: The captured frames are color images in RGB (Red, Green, Blue) format, so this step converts them to gray-scale images, in which each pixel carries a single intensity value instead of three color values. This step is important because it reduces the amount of data to be processed, which lowers the processing complexity and speeds up the algorithms used in the subsequent steps. Fig. 2 shows an image after conversion to grayscale.
Figure 2: Gray-Scaling
  • Noise Removal: The output of image gray-scaling is the input for noise removal. Noise must be removed before further processing because the grayscale image may contain visual noise caused by low lighting conditions or by camera quality. A Gaussian filter is applied here to remove the noise. Fig. 3 shows an image after the filter has been applied.
Figure 3: Gaussian Blur
  • Background Subtraction: Once the image is denoised, it is ready for background subtraction. In this step, the background of the scene is removed and the foreground image is obtained.
    Moving objects are detected from the background-subtracted image, which distinguishes the video frames that contain movement from those that are stationary. Fig. 4 shows an image after background subtraction.
Figure 4: Background Subtraction
  • Morphological Operation: The foreground image obtained in the previous step contains the real moving object, but it may also contain noise and interference areas that lead to incorrect detection and tracking. The image therefore needs to be cleaned up, and morphological operations such as opening and closing are applied to it. These operations smooth the image contours, fill holes and thin gulfs, and join any breaks and gaps in the image. This helps in describing the shape of each region, such as its boundaries.
  • Object Detection and Tracking: Once the image has been preprocessed and cleaned up with morphological opening and closing, object detection and tracking algorithms are applied to detect the intruder in the scene and track it.
    After opening and closing, connected component labeling is used to obtain the location of each object. It detects the connected regions in the binary image: the image is scanned and its pixels are grouped into components based on pixel connectivity. For any foreground pixel p, the set of all foreground pixels connected to it is called the connected component containing p. After labeling, each region is usually described by a single representative value. The centroid is used here because it does not move when the object rotates and is barely affected by noise. It is a global descriptor whose location is computed from all pixels of the connected component. Because the centroid lies at the center of the object, it is used to label the objects and track them.
  • Intruder detected: Once an object is detected, it is classified as an intruder or a non-intruder based on its shape and size.
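
The preprocessing stages of the workflow above can be sketched as follows. This is a minimal, dependency-free illustration and not the code from the project repository: the function names, the 3x3 kernel, the threshold of 30, and the tiny synthetic scene are all assumptions made for the example, and gray-scaling is assumed to be a simple channel average. A real system would run the same steps on whole arrays with an image processing library.

```python
# Minimal pure-Python sketch of gray-scaling, Gaussian smoothing and
# background subtraction. Images are nested lists; all names and values
# here are illustrative assumptions.

GAUSS_3X3 = [[1, 2, 1],
             [2, 4, 2],
             [1, 2, 1]]  # integer weights that sum to 16


def to_gray(rgb_frame):
    """Average the R, G and B values of every pixel (result in 0-255)."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_frame]


def gaussian_blur(gray):
    """Smooth with a 3x3 Gaussian kernel, replicating pixels at the border."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image bounds
                    xx = min(max(x + dx, 0), w - 1)
                    total += GAUSS_3X3[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = total // 16
    return out


def foreground_mask(frame, background, threshold=30):
    """Set a pixel to 1 where it differs from the background by more than threshold."""
    return [[1 if abs(f - b) > threshold else 0 for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]


# Hypothetical 4x4 scene: dark background with one bright 2x2 object.
background_rgb = [[(10, 10, 10)] * 4 for _ in range(4)]
frame_rgb = [row[:] for row in background_rgb]
for y in (1, 2):
    for x in (1, 2):
        frame_rgb[y][x] = (250, 250, 250)

background = gaussian_blur(to_gray(background_rgb))
frame = gaussian_blur(to_gray(frame_rgb))
mask = foreground_mask(frame, background)

for row in mask:
    print(row)
```

Note that the blur spreads the bright object slightly into its neighbors, so the mask can be a little larger than the object itself. The threshold controls how much of that spread counts as foreground.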

Technology

Techniques: Four major techniques are used in the above methodology: image gray-scaling, background subtraction, morphological operations, and object detection and tracking. Each is described in detail below.

  • Image Gray-scaling: In grayscale images, the brightness of each pixel is represented by a single value, ranging from 0 (black) to 255 (white). This is useful in image processing because grayscale images are easier to process and require less storage space than color images. The average method is used for gray-scaling: the values of the three color channels of a pixel are added and divided by three, and the result is assigned to that pixel. This process is repeated for every pixel in the image to obtain the grayscale image.
  • Background Subtraction: In intruder detection, background subtraction is a key step because it allows the system to detect moving objects in the scene. The background image is typically obtained by averaging the pixel values of several frames of the video sequence, which results in an image that represents the static parts of the scene, such as the walls and floor. This background image is then used as a reference for detecting moving objects. To obtain the foreground image, the current frame and the background image are subtracted pixel by pixel. After the difference is thresholded, the resulting image contains the moving objects in the scene, which appear as white regions on a black background.
  • Morphological Operation: The foreground image from background subtraction may still contain noise and other artifacts, such as shadows and reflections, which can interfere with object detection and tracking. To address this, morphological operations such as opening and closing are applied to the foreground image to remove noise and fill gaps in the object contours. Closing is used in particular to fill gaps in the object contours and remove small holes in the object regions. It is performed by sliding a structuring element, which defines the shape and size of the neighborhood around each pixel, over the binary image in two steps. In the dilation step, every foreground pixel is expanded to cover its neighborhood, which enlarges the object regions and fills in small holes. In the erosion step, the same structuring element shrinks the enlarged regions back to roughly their original size, while the holes and gaps filled during dilation stay filled. The result is a binary image with smoother object contours and fewer holes.
  • Object detection and tracking: Connected pixels are used to identify and track objects in an image or video sequence. Once background subtraction has produced a binary image that represents the foreground objects, that image is analyzed using connected component labeling, which assigns a unique label to each group of pixels that belong to the same object. Pixels that are connected, either horizontally or vertically, are assigned the same label to form a connected component. This process is repeated until all pixels belonging to each foreground object are grouped under the same label. The resulting labeled image contains the same objects as the original binary image, with additional information about which pixels belong to which object. Once the objects have been labeled, the next step is object detection, which produces a set of objects, each with its own attributes such as position, size, shape, and orientation. The final step is object tracking, which analyzes the position and shape of the objects in each frame of the video sequence to determine their trajectory. By tracking the objects over time, it is possible to determine their speed, direction, and acceleration, as well as predict their future positions.
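
The closing, labeling, and centroid steps described above can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the project's implementation: it assumes a 3x3 square structuring element, 4-connectivity (horizontal and vertical neighbors), and a small synthetic mask, and all function names are hypothetical.

```python
from collections import deque


def _neighborhood(mask, y, x):
    """Yield the in-bounds values of the 3x3 neighborhood around (y, x)."""
    h, w = len(mask), len(mask[0])
    for yy in range(max(y - 1, 0), min(y + 2, h)):
        for xx in range(max(x - 1, 0), min(x + 2, w)):
            yield mask[yy][xx]


def dilate(mask):
    """A pixel becomes 1 if any pixel in its 3x3 neighborhood is 1."""
    return [[1 if any(_neighborhood(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def erode(mask):
    """A pixel stays 1 only if every pixel in its 3x3 neighborhood is 1."""
    return [[1 if all(_neighborhood(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]


def close_mask(mask):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(mask))


def label_components(mask):
    """Group 4-connected foreground pixels; returns {label: [(y, x), ...]}."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = {}
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            label = len(components) + 1
            pixels, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components[label] = pixels
    return components


def centroid(pixels):
    """Mean (row, column) position of a component's pixels."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


# Hypothetical 7x11 mask: a 3x3 object with a hole, plus a small second object.
mask = [[0] * 11 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        mask[y][x] = 1
mask[3][3] = 0                 # hole that closing should fill
mask[3][8] = mask[4][8] = 1    # second object

closed = close_mask(mask)
components = label_components(closed)
for label, pixels in components.items():
    print(label, len(pixels), centroid(pixels))
```

Tracking then amounts to matching each centroid in the current frame to the nearest centroid from the previous frame. Closing can also merge objects that are only a pixel or two apart, so the size of the structuring element is a trade-off between filling holes and keeping nearby objects separate.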

Result and Analysis
The result of implementing an intrusion detection system using image processing with day and night vision, along with alert generation, depends on various factors such as the algorithms used, dataset, system configuration, and the evaluation criteria chosen.
In this project, the intruder was detected efficiently by the intruder detection code, and the output shows precisely where the intruder is detected in real time.
This methodology and its results compare favorably with some published studies in which intruders are detected using other methods. One work suggested a pixel-by-pixel comparison method for every frame of the video, but this has drawbacks: the computation is very expensive, which delays the calculation and prevents a real-time response.

This study on the other hand shows the potential of motion detection for robust intrusion detection in challenging lighting conditions. It also proposes a real-time intrusion detection system that combines adaptive background subtraction and object tracking techniques. It focuses on accurately detecting and tracking intruders in dynamic environments. The system achieves high detection rates while minimizing false positives.
The methodology also relies on simpler algorithms and techniques, which typically have lower computational requirements compared to more advanced methodologies. This makes it more feasible to run on devices with limited processing power or in resource-constrained environments. This also helps with having a real-time experience. The methodology also does not rely on complex deep learning models or extensive training data, which can be costly to develop and deploy. Instead, it utilizes traditional computer vision techniques that are more accessible and cost-effective.
Another work suggested an image subtraction method in which two frames are subtracted. This resolved the real-time issue but can give false positives.
This work, on the other hand, uses morphological operations and object tracking to give more accurate, real-time results. It also uses histogram analysis to identify the type of noise present and apply a suitable filter for it, which reduces false detections.

Below is a histogram for a CCTV video recording. Histogram analysis was used to study the noise in the image so that a suitable filter could be applied to remove it. The output of the histogram analysis is shown in Fig. 5.

Figure 5: Histogram of day vision

Fig. 5 shows the histogram for day vision. The graph indicates that speckle noise is the dominant noise and that it affects the whole image frame. The distribution of speckle noise is often non-Gaussian, with tails that are heavier than those of a normal distribution. This means that the noise values can deviate significantly from the mean brightness level of the image, resulting in a distribution that is skewed towards the high and low ends of the range of pixel values. Based on the histogram, the Gaussian filter was chosen to remove the speckle noise because it produces a smoother image.
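
As a rough illustration of the histogram analysis described above, the sketch below counts how often each gray level occurs in a frame and summarizes the spread. It is only an assumed, minimal version: the function names and the tiny sample frame are hypothetical, and the article does not state which statistics were actually inspected.

```python
import math


def intensity_histogram(gray):
    """Count how many pixels take each gray level from 0 to 255."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    return hist


def histogram_stats(hist):
    """Return the mean gray level and its standard deviation."""
    total = sum(hist)
    mean = sum(level * count for level, count in enumerate(hist)) / total
    variance = sum(count * (level - mean) ** 2
                   for level, count in enumerate(hist)) / total
    return mean, math.sqrt(variance)


# Hypothetical 2x3 grayscale frame with one very bright outlier pixel.
gray = [[0, 0, 255],
        [10, 10, 10]]

hist = intensity_histogram(gray)
mean, std = histogram_stats(hist)
print(mean, std)
```

A large standard deviation relative to the mean, or noticeable mass far from the main peak, is one simple sign that the frame contains values deviating strongly from the typical brightness.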

Figure 6: Division of Zones

Fig. 6 shows a set of predefined zones that correspond to the distance of an object from the camera. The zones are an entrance zone, a main zone, and an exit zone, each covering a range of distances from the camera. To determine which zone an object belongs to, the distance of the object from the camera is calculated using its height in meters and the focal length of the camera in pixels. By comparing the calculated distance to the predefined zone ranges, the code identifies the zone that the object is in. If the object's distance falls within one of the predefined zone ranges and its aspect ratio is within the specified range, a green bounding box is drawn around the object, and the name of the zone is displayed next to the bounding box, as shown in Fig. 7 and Fig. 8.
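
One common way to turn an object's height and the focal length into a distance is the pinhole camera relation, distance = real height × focal length / height in pixels. The sketch below assumes that relation and uses it to pick a zone. The zone boundaries, the assumed real height of 1.7 m, the focal length of 800 px, and the aspect ratio range are hypothetical values chosen for illustration and are not taken from the project.

```python
# Hypothetical zone ranges in meters from the camera: (name, near, far).
ZONES = [("entrance", 0.0, 3.0), ("main", 3.0, 7.0), ("exit", 7.0, 12.0)]


def estimate_distance_m(real_height_m, focal_length_px, bbox_height_px):
    """Pinhole camera estimate of the distance to an object of known height."""
    return real_height_m * focal_length_px / bbox_height_px


def zone_for(distance_m, zones=ZONES):
    """Return the name of the zone containing the distance, or None."""
    for name, near, far in zones:
        if near <= distance_m < far:
            return name
    return None


def classify(bbox_width_px, bbox_height_px, real_height_m=1.7,
             focal_length_px=800.0, aspect_range=(0.2, 0.8)):
    """Return the zone name if the box shape is plausible, otherwise None."""
    aspect = bbox_width_px / bbox_height_px
    if not aspect_range[0] <= aspect <= aspect_range[1]:
        return None  # shape outside the assumed range, so no box is drawn
    distance = estimate_distance_m(real_height_m, focal_length_px, bbox_height_px)
    return zone_for(distance)


print(classify(120, 340))   # tall, narrow box
print(classify(400, 200))   # wide box, rejected by the aspect ratio check
```

The estimate is only as good as the assumed real height, so an object much shorter or taller than assumed will be placed in the wrong zone.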

Source code
I am providing the GitHub repository link below. The code is very simple and is written in Python. It also uses some Python libraries which make it fast and usable.
https://github.com/adya14/Intruder-detection-system

Output

Figure 7: Intruder detected in day vision
Figure 8: Intruder detected in night vision

Conclusions and Future Scope
In conclusion, the implemented intrusion detection system using image processing with day and night vision, along with alert generation, has shown promising results. The system efficiently detects intruders in real time and provides information about their location. As future work, deep learning methods could be added so that the intruder is detected more precisely and the system can indicate exactly which part of the scene the intrusion occurred in. This is one of the main shortcomings of the current methodology.
Classification of intruders could also be added in the future, which involves identifying the intruders and classifying them as vehicles, humans, animals, and so on. This would help in understanding exactly who or what the possible intruders are.
