1 Introduction

The Intelligent Surveillance Support System (ISSS) is a sophisticated platform crafted to augment the surveillance infrastructure of public and private domains. ISSS integrates a host of features such as monitoring, recording, noise detection, motion detection, identification, and rectangle selection. The system’s user interface is easy to navigate and is developed using Python’s Tkinter library. The software enables users to interact seamlessly with the graphical interface to conduct diverse surveillance operations, such as face recognition, alarms on theft detection, visitor in/out detection, and motion detection.

Computer vision is a rapidly evolving scientific field within artificial intelligence that aims to give computers human-like vision capabilities. It is the study of how computers can gain sophisticated understanding from digital images or video sources. Understanding and automating processes carried out by the human visual system is beneficial to engineers. The Intelligent Surveillance Support System is designed to be lightweight so that it does not burden the hardware it runs on. The minimum required hardware is a working PC or laptop; a webcam with installed drivers or another camera source such as a CCTV, USB, or wireless camera connected to the PC/laptop; and, for night vision, a flashlight, LED, or night-vision-enabled camera.

Making the Intelligent Surveillance Support System work in real time is quite a complicated task, as the goal is to monitor the frame, identify the person, detect noises, and detect visitors in the frame of focus, in addition to general video recording in real time. This is achieved using different algorithms for each application. In this paper we present an all-in-one lightweight solution that can run on any device that supports Python, with any basic camera setup.

The motivation behind the development of the Intelligent Surveillance Support System (ISSS) stems from the growing need for advanced and efficient security solutions in today’s world. Traditional surveillance systems often rely on manual monitoring and analysis of security footage, which can be time-consuming, error-prone, and overwhelming for human operators. The ISSS aims to address these limitations by leveraging the power of machine perception and real-time analysis.

By incorporating cutting-edge features such as face recognition, theft detection, visitors in/out detection, and motion detection, the ISSS significantly enhances the capabilities of surveillance systems. Face recognition technology enables the identification and tracking of individuals, allowing for proactive threat detection and efficient monitoring of personnel. Alarm systems triggered by theft detection help prevent and mitigate potential security breaches, while visitors in/out detection ensures accurate monitoring of people entering or exiting the premises.

Furthermore, the ISSS’s real-time motion detection capabilities provide instant alerts and notifications when unusual activity is detected, allowing security personnel to respond promptly to potential threats. By leveraging optimised algorithms available in the OpenCV library, the ISSS ensures efficient processing of video footage, enabling real-time analysis and decision-making.

The overarching goal of the Intelligent Surveillance Support System is to improve the overall efficiency and effectiveness of security and surveillance operations. By automating the detection and identification of potential threats, the ISSS minimizes human error, reduces response time, and enables proactive measures to enhance safety and security in public and private spaces. With its potential to revolutionize the field of security, the ISSS holds immense promise in safeguarding communities, businesses, and critical infrastructures in an increasingly complex and challenging security landscape.

This section describes the organization of the rest of the paper. The "Literature survey" section discusses the relevant work on real-time threat systems and other closely related domains. The "Methodology" section discusses the methodology followed to build a base model for the ISSS. The "Results" section presents the results of the proposed system, along with a performance test evaluating system parameters to assess stability and responsiveness under load. The "Conclusion and future scope" section provides the conclusion of the paper and mentions future work.

2 Literature survey

2.1 This section presents a survey of prior work in the same domain

A study [1] drew inferences that emphasised OpenCV, an open-source computer vision library for identifying and transforming useful information from images. Another researcher [2] showed that image processing’s purpose is to help a computer understand the content of an image. The majority of image processing is done using a group of libraries offered by OpenCV, which provides the de facto standard API for computer vision applications. Many issues that arise in real time can be managed using image-processing software. Real-time OpenCV applications for image processing are also presented, along with instructions and examples.

According to some researchers [3], the Local Binary Pattern Histogram (LBPH) technique has been offered as a simple solution to the face identification problem since it can recognise both frontal and side faces. Yet, in the presence of varied lighting, changing expressions, and deflected pose, the LBPH algorithm’s recognition rate decreases. This problem is addressed by a modified LBPH method based on the pixel-neighbourhood grey median (MLBPH). The grey value of the pixel is changed to the median value of its neighbourhood sampling values; after the feature value is extracted by sub-blocks, the statistical histogram is established to create the MLBPH feature dictionary, which is used to recognise the identity of the human face in comparison with the test image. It has been stated [4] that the Internet of Things can be useful for facial recognition to improve smart-home facilities. Recognition is done with the LBPH technique to identify a person, which can be highly useful for home residents. The challenging areas are securing, monitoring, and controlling real-time automation. The required components are a web camera, a speaker, a stepper motor, and a Raspberry Pi 3 system.

Studies by some researchers [5] in 2021 illustrated how important a person’s face is to who they are; in the real world, it is used to tell apart the personalities of two or more people. To ensure that only the right person has access to their particular accounts, both real and virtual, some biological characteristics have recently been utilised. Biometrics, which uses identification methods including fingerprints, palm veins, DNA, palm prints, and facial recognition, is one of the methods that has been developed. Their research demonstrates how image processing, with facial identification and recognition algorithms, can be used to create a tool that recognises students’ frontal faces in a classroom. In 2021, a work [6] explained that, since its debut, the digitisation of images has played a substantial and crucial role in the computer science discipline. It encompasses the techniques and methods used when modifying a digital image using a computer. It is a form of signal processing where the input and output can be either a picture or characteristics of that picture. One crucial area of image processing is image inpainting, a type of picture preservation and restoration. The work additionally selects the most efficient inpainting algorithm based on runtime metrics.

The idea of a facial identification system [7] has been proposed to increase reliability by employing facial recognition for a variety of purposes, such as making it simpler for individuals to gain access with the right security measures during Covid-19, as well as providing security when people try to disguise their identity. The technique considers models such as Eigenfaces, Fisherfaces, and LBPH faces, as well as software such as Python and OpenCV. The units of analysis are still images and video clips that capture facial expressions; facial recognition algorithms are then trained on their patterns. According to the results, LBPH was able to identify faces with 95% certainty and in less time, which increased the accuracy of facial recognition.

Another work [8] presented a project whose goal was to use face recognition to track attendance in real time across all institutional domains, one of the main issues facing all organisations. Compared with other biometric measurements such as fingerprint, iris, hand, and retina scans, the proposed machine learning approach was simpler to process. The LBPH method identifies the face after the Haar cascade classifier has detected it. Real-time face-data creation is the experiment’s focus. According to the study [9], NumPy arrays, which are the Python language’s standard representation for numerical data, enable the efficient implementation of numerical operations in a high-level language; the study shows how to vectorise calculations, eliminate in-memory data copies, and reduce operation counts as ways to improve NumPy speed. It has been stated [10] that Tkinter programming is intended for Python users who need to create applications with graphical user interfaces (GUIs).

Another work [11] explains scikit-image, a collection of Python-based image processing methods made available by a thriving community of volunteers under the permissive BSD open-source licence. Python’s expanding popularity as a scientific programming language and the growing accessibility of a sizeable ecosystem of auxiliary tools make it an ideal environment for developing an image processing toolkit. Goala et al. [12] employ a fuzzy multi-criteria decision support system to prioritise the components of a smart city that could be at risk of terrorist attacks. To achieve this, a novel aggregation operation on intuitionistic fuzzy sets has been introduced, and a case study on a smart city has been conducted to demonstrate the practicality of the proposed approach. A specialised optimal data-aggregation approach, enabled by the Internet of Things (IoT), has been developed for intelligent surveillance systems in smart cities. This approach [13] aims to transform raw data values into refined ones while minimising data loss; furthermore, the proposed scheme ensures that each server is responsible for performing the data refinement process, thus maintaining the desired accuracy and precision ratio. The report by researchers in [14] emphasises the need for the healthcare industry to implement digital protection coordinated by blockchain technology to safeguard crucial clinical assets in the context of artificial intelligence utilisation in clinical settings. By leveraging blockchain-based applications, it becomes possible to accurately identify and address the most critical and potentially harmful errors within the medical field. Decentralised information protection, supported by blockchain, ensures the security of patient health records and protects them from data theft (Figs. 1, 2, 3 and 4).

Fig. 1 Architecture of the proposed model

Fig. 2 Workflow of the proposed model

Fig. 3 Extraction of features with histogram representation

Fig. 4 Identification of related feature

The summary of some other studies in this domain is presented in Table 1.

Table 1 Summary of studies in surveillance systems

3 Methodology

This paper addresses the aspects listed below, each of which requires a particular set of algorithms and techniques: monitor the scene, recognise family members, listen for noises, and look out for visitors in the room. We will go over the methodologies and algorithms used for each function.

First, the GUI has seven buttons, each with its own function (a Tkinter sketch of this launcher follows the list):

  • Monitor (Module 5): This function uses OpenCV to detect motion in a video stream by comparing successive frames and finding contours in the thresholded difference image, and it helps identify missing objects.
  • Rectangle (Module 2): This feature tracks motion in a user-selected region of interest in the camera frame by comparing consecutive frames of the video. It displays "MOTION" in green text if motion is detected and "NO-MOTION" in red text otherwise, and ends when the escape key is pressed.
  • Noise (Module 1): This feature detects motion in a video captured by the default camera by comparing consecutive frames and applying thresholding and contour detection. It displays "MOTION" in green text if motion is detected and "NO-MOTION" in red text if no motion is detected.
  • Record (Module 4): This feature records a timestamped video using the default camera and saves it in AVI format with 640 × 480 resolution and 20 FPS; recording can be stopped and saved by pressing the "Esc" key.
  • In-Out (Module 3): This feature uses OpenCV and date-time libraries to detect motion and track the direction of movement of a visitor in front of the camera. It saves timestamped images in the "in" or "out" folder depending on whether the visitor is moving towards the left or right, respectively.
  • Identify (Module 6): This feature defines three functions, "collect_data()", "train()", and "identify()", which are used for face recognition with OpenCV, Haar cascades, and LBPH. The functions are integrated with a Tkinter GUI for adding new faces and recognizing known faces in real time from a webcam.
  • Exit.
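As a rough illustration of how this launcher can be wired up, the following Tkinter sketch creates a window with the seven buttons. The handler mapping is an assumption: only noise(), find_motion(), and maincall() are named later in this section, so the callbacks passed in here are placeholders for the actual module entry points.

```python
# A minimal sketch of the seven-button Tkinter launcher (handler names are
# illustrative placeholders, not the paper's actual source code).
import tkinter as tk

def build_gui(handlers):
    """Create the main window; `handlers` maps button labels to callbacks."""
    window = tk.Tk()
    window.title("Intelligent Surveillance Support System")
    window.geometry("400x500")

    labels = ["Monitor", "Rectangle", "Noise", "Record", "In-Out", "Identify"]
    for label in labels:
        tk.Button(window, text=label, width=20, height=2,
                  command=handlers.get(label, lambda: None)).pack(pady=5)

    # The seventh button simply closes the application.
    tk.Button(window, text="Exit", width=20, height=2,
              command=window.destroy).pack(pady=5)
    window.mainloop()

if __name__ == "__main__":
    build_gui({})  # pass the real module callbacks here, e.g. {"Noise": noise}
```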

The architecture of the proposed model is shown in Fig. 1. The Intelligent Surveillance Support System is designed as a combination of six modules.

The methodology workflow of the proposed model, shown in Fig. 2, is as follows:

A. Module 1—Noise

This code captures frames from the default camera, calculates the difference between two consecutive frames, applies thresholding and contour detection to identify motion in the video. If motion is detected, it draws a green rectangle around the moving object and displays "MOTION" text on the screen. If no motion is detected, it displays "NO-MOTION" in red text.

Our Noise algorithm consists of the following steps:

  1. Import the OpenCV library.
  2. Define a function called "noise".
  3. Capture a video stream from the default camera using the VideoCapture(0) method.
  4. Start an infinite loop to process each frame of the video stream.
  5. Read two frames from the video stream and store them as frame1 and frame2.
  6. Calculate the absolute difference between the two frames and convert the result to grayscale.
  7. Blur the grayscale image using a kernel of size 5 × 5.
  8. Threshold the blurred image using a threshold value of 25 and set all values above the threshold to 255.
  9. Find all the contours in the thresholded image using the findContours method.
  10. If the length of contours is greater than zero, find the contour with the maximum area using the max method.
  11. Get the bounding rectangle for the maximum-area contour using the boundingRect method.
  12. Draw a green rectangle around the area with motion using the rectangle method.
  13. If no motion is detected, put the text "NO-MOTION" in red colour.
  14. If motion is detected, put the text "MOTION" in green colour.
  15. Display the resulting frame.
  16. If the ‘Esc’ key is pressed, release the video capture object and destroy all windows.
  17. End the loop.
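A minimal, hedged sketch of these steps is shown below. It assumes the OpenCV 4 Python API; the window name and text positions are illustrative choices not specified in the steps.

```python
# Compact sketch of the noise (motion-detection) steps above; parameter
# values follow the listed steps, everything else is an assumption.
import cv2

def noise():
    cap = cv2.VideoCapture(0)                      # default camera
    while True:
        _, frame1 = cap.read()
        _, frame2 = cap.read()                     # two consecutive frames
        diff = cv2.absdiff(frame2, frame1)         # frame difference
        gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        blur = cv2.blur(gray, (5, 5))              # 5 x 5 smoothing kernel
        _, thresh = cv2.threshold(blur, 25, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(thresh, cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            # bound the largest moving region and flag motion
            x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
            cv2.rectangle(frame1, (x, y), (x + w, y + h), (0, 255, 0), 2)
            cv2.putText(frame1, "MOTION", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        else:
            cv2.putText(frame1, "NO-MOTION", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
        cv2.imshow("noise", frame1)
        if cv2.waitKey(1) == 27:                   # Esc key
            break
    cap.release()
    cv2.destroyAllWindows()
```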

B. Module 2—Rectangular Noise

This module allows the user to select a region of interest in the camera frame by left-clicking and dragging the mouse to create a rectangle. The program then tracks motion in this region of interest by comparing the difference between two consecutive frames of the video. If motion is detected, a green rectangle is drawn around the motion area and the text "MOTION" is displayed, otherwise a red rectangle is drawn around the region of interest and the text "NO-MOTION" is displayed. The program ends when the escape key is pressed.

Our Rectangular Noise algorithm consists of the following steps:

  1. Define global variables for the mouse-click coordinates and boolean flags.
  2. Define a select function that is called whenever a mouse event occurs.
  3. Inside the select function, if the left button is clicked, set the x1 and y1 coordinates and set the donel flag to True. If the right button is clicked, set the x2 and y2 coordinates and set the doner flag to True.
  4. Capture video from the default camera.
  5. Create a named window and set the mouse callback to the select function.
  6. In a loop, display the frame and wait for a mouse event to occur or for the escape key to be pressed.
  7. Once a mouse event occurs or the escape key is pressed, destroy the window and proceed to the next loop.
  8. In the next loop, capture video from the default camera.
  9. Select only the region of interest defined by the mouse clicks in the previous loop.
  10. Calculate the difference between two frames and apply thresholding and blurring to the resulting image.
  11. Find contours in the resulting image and draw a rectangle around the contour if motion is detected.
  12. Display the resulting frame with the rectangle and text indicating whether there is motion or not.
  13. Wait for the escape key to be pressed, then release the camera and destroy all windows.
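The selection-then-tracking flow above can be sketched as follows; the flag and coordinate names are lightly adapted from the step list, while the window titles and ROI handling are assumptions.

```python
# Sketch of the region-of-interest selection and motion check described above.
import cv2

# Corner coordinates of the user-selected region and completion flags.
x1 = y1 = x2 = y2 = 0
done_l = done_r = False

def select(event, x, y, flags, param):
    """Mouse callback: left click sets one corner, right click the other."""
    global x1, y1, x2, y2, done_l, done_r
    if event == cv2.EVENT_LBUTTONDOWN:
        x1, y1, done_l = x, y, True
    elif event == cv2.EVENT_RBUTTONDOWN:
        x2, y2, done_r = x, y, True

def rect_noise():
    cap = cv2.VideoCapture(0)
    cv2.namedWindow("select")
    cv2.setMouseCallback("select", select)
    while True:                                    # loop 1: pick the region
        _, frame = cap.read()
        cv2.imshow("select", frame)
        if (done_l and done_r) or cv2.waitKey(1) == 27:
            cv2.destroyWindow("select")
            break
    while True:                                    # loop 2: watch the region
        _, f1 = cap.read()
        _, f2 = cap.read()
        roi1, roi2 = f1[y1:y2, x1:x2], f2[y1:y2, x1:x2]
        diff = cv2.cvtColor(cv2.absdiff(roi2, roi1), cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(cv2.blur(diff, (5, 5)), 25, 255,
                                  cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(thresh, cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            label, colour = "MOTION", (0, 255, 0)
        else:
            label, colour = "NO-MOTION", (0, 0, 255)
        cv2.rectangle(f1, (x1, y1), (x2, y2), colour, 2)
        cv2.putText(f1, label, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, colour, 2)
        cv2.imshow("rect-noise", f1)
        if cv2.waitKey(1) == 27:
            break
    cap.release()
    cv2.destroyAllWindows()
```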

C. Module 3—Visitor In/Out Data

This module uses OpenCV and datetime libraries to detect motion and track the direction of movement of a visitor in front of the camera. It captures frames from the default camera, applies background subtraction, thresholding and contour detection to find the largest contour, and tracks its movement. Based on the position of the bounding box of the largest contour, it determines whether the visitor is moving towards the left or right. If the visitor moves towards the left, a timestamped image is saved in the "in" folder and if the visitor moves towards the right, an image is saved in the "out" folder. The script continues to run until the "Esc" key is pressed, upon which the camera capture object is released and all windows are destroyed.

Our Visitor In/Out Data algorithm consists of the following steps:

  1. Open the default camera using cv2.VideoCapture(0).
  2. Initialize right and left as empty strings to keep track of the direction of the motion.
  3. Start an infinite loop to capture and process video frames.
  4. Capture two consecutive frames from the camera using cap.read().
  5. Compute the absolute difference between the two frames using cv2.absdiff().
  6. Blur the difference image using a kernel of size (5, 5) with cv2.blur().
  7. Convert the blurred difference image to grayscale using cv2.cvtColor().
  8. Threshold the grayscale image to obtain a binary image using cv2.threshold().
  9. Find contours in the binary image using cv2.findContours().
  10. If at least one contour is found, find the contour with the maximum area and draw a bounding rectangle around it using cv2.rectangle().
  11. If both right and left are empty and the maximum x-coordinate of the bounding rectangle is greater than 500, set right to True. If it is less than 200, set left to True.
  12. If right is True and the maximum x-coordinate of the bounding rectangle is less than 200, print "to left", reset right and left, and save the current frame to a file in the "in" folder with the current date and time as the file name.
  13. If left is True and the maximum x-coordinate of the bounding rectangle is greater than 500, print "to right", reset right and left, and save the current frame to a file in the "out" folder with the current date and time as the file name.
  14. Show the current frame with the bounding rectangle and the text "MOTION" or "NO-MOTION", depending on whether a contour was found, using cv2.imshow().
  15. Wait for a key press using cv2.waitKey(). If the key pressed is the "Esc" key (key code 27), release the camera with cap.release(), destroy all windows with cv2.destroyAllWindows(), and break the loop.
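A possible implementation of this direction-tracking logic is sketched below; the folder names and the 500/200 pixel thresholds come from the steps, while the file-name format and remaining details are assumptions.

```python
# Sketch of the in/out direction logic described above.
import os
import cv2
from datetime import datetime

def in_out():
    cap = cv2.VideoCapture(0)
    right, left = "", ""
    os.makedirs("in", exist_ok=True)
    os.makedirs("out", exist_ok=True)
    while True:
        _, frame1 = cap.read()
        _, frame2 = cap.read()
        diff = cv2.blur(cv2.absdiff(frame2, frame1), (5, 5))
        gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(gray, 25, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(thresh, cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if contours:
            x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
            cv2.rectangle(frame1, (x, y), (x + w, y + h), (0, 255, 0), 2)
            stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
            if right == "" and left == "":
                if x + w > 500:
                    right = True            # visitor appeared near the right edge
                elif x < 200:
                    left = True             # visitor appeared near the left edge
            elif right and x + w < 200:
                cv2.imwrite(f"in/{stamp}.jpg", frame1)   # moved to the left
                right, left = "", ""
            elif left and x + w > 500:
                cv2.imwrite(f"out/{stamp}.jpg", frame1)  # moved to the right
                right, left = "", ""
        cv2.imshow("in-out", frame1)
        if cv2.waitKey(1) == 27:
            break
    cap.release()
    cv2.destroyAllWindows()
```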

D. Module 4—Video recording

This script records a video using the default camera on the device and saves it to a file. Each frame of the video is timestamped with the current date and time. The video is recorded in AVI format with a frame rate of 20 frames per second and a resolution of 640 × 480. The video can be stopped and saved by pressing the "Esc" key.

Our Record algorithm consists of the following steps:

  1. Import the necessary libraries—cv2 for OpenCV and datetime to timestamp the video.
  2. Initialize the camera capture object with cv2.VideoCapture(0). This sets up a connection with the default camera on the device.
  3. Initialize a VideoWriter object to write the video to a file with cv2.VideoWriter().
  4. Enter an infinite loop that captures frames from the camera using cap.read().
  5. Add a timestamp to each frame using cv2.putText().
  6. Write the current frame to the video file using out.write().
  7. Display the current frame using cv2.imshow().
  8. If the "Esc" key (keycode 27) is pressed, release the camera capture object and destroy all windows.

E. Module 5—Theft detection

This module defines a function that uses the OpenCV library to detect motion in a video stream from the default camera. The function first captures the first frame and converts it to grayscale. Then it starts an infinite loop where it captures the next frame, computes the absolute difference between the current and previous frames, thresholds the difference image, and finds contours in the thresholded image.

Our algorithm for this module consists of the following steps:

  1. Import the necessary libraries, such as cv2, spot_diff, time, and numpy.
  2. Define the function find_motion to detect motion using a video stream from a webcam.
  3. Set the motion_detected and is_start_done variables to False.
  4. Initialize the webcam using the cv2.VideoCapture() function.
  5. Wait for 2 s before capturing the first frame.
  6. Read the first frame and convert it to grayscale.
  7. Start a while loop that reads each frame from the webcam stream and compares it with the previous frame.
  8. Find the absolute difference between the previous and current frames using cv2.absdiff() and threshold the difference using cv2.threshold() to get a binary image.
  9. Find the contours in the binary image using cv2.findContours().
  10. Keep only those contours whose area is greater than 25.
  11. If the number of contours is greater than 5, set motion_detected to True and set is_start_done to False.
  12. If motion_detected is True and the number of contours is less than 3, start a timer to check whether the motion has stopped.
  13. If the timer exceeds 4 s, capture the next frame and use the spot_diff function to check whether there is any motion between the two frames.
  14. If there is no motion, print "running again" and return to step 7.
  15. If there is motion, print "found motion" and exit the loop.
  16. If the number of contours is less than or equal to 5, set motion_detected to False and print "no motion detected".
  17. Display the binary image using cv2.imshow().
  18. Read the next frame from the webcam stream and convert it to grayscale.
  19. If the Esc key is pressed, exit the loop.
  20. Release the webcam and destroy all windows.
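The sketch below follows the same flow. spot_diff() is the project's own comparison helper and is therefore taken as an injected dependency rather than reimplemented here; the simplified timer handling is an assumption.

```python
# Sketch of the theft-detection flow above. spot_diff() is assumed to compare
# the settled scene with the reference frame and return a truthy value when
# an object is missing or moved.
import time
import cv2

def find_motion(spot_diff):
    cap = cv2.VideoCapture(0)
    time.sleep(2)                                  # let the camera settle
    _, start_frame = cap.read()                    # reference frame
    prev = cv2.cvtColor(start_frame, cv2.COLOR_BGR2GRAY)
    motion_detected, check_timer = False, None
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(cv2.absdiff(prev, gray), 25, 255,
                                  cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(thresh, cv2.RETR_TREE,
                                       cv2.CHAIN_APPROX_SIMPLE)
        big = [c for c in contours if cv2.contourArea(c) > 25]
        if len(big) > 5:                           # significant activity
            motion_detected, check_timer = True, None
        elif motion_detected and len(big) < 3:
            check_timer = check_timer or time.time()
            if time.time() - check_timer > 4:      # scene quiet for ~4 s
                if spot_diff(start_frame, frame):
                    print("found motion")          # object removed or changed
                    break
                print("running again")
                motion_detected, check_timer = False, None
        cv2.imshow("threshold", thresh)
        prev = gray
        if cv2.waitKey(1) == 27:
            break
    cap.release()
    cv2.destroyAllWindows()
```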

F. Module 6—Person Identification

This module defines three functions—"collect_data()", "train()", and "identify()"—that together enable the collection of face images, training of a face recognition model, and real-time recognition of known faces from a webcam. The algorithm uses OpenCV, Haar cascades, and LBPH (Local Binary Pattern Histograms) face recognition to achieve this. The algorithm is implemented with a Tkinter GUI that allows the user to choose between adding new faces to the system or recognizing known faces.

Face identification in pictures or video frames is accomplished using the Haar cascade classifier, a machine learning-based technique for object detection. A sizeable collection of photos containing both positive and negative examples of the item being recognised is used to train the classifier. To find the item, the classifier employs a collection of features represented as Haar wavelets. The LBPH face recogniser is a technique for identifying faces in pictures. First, the system uses a Haar cascade classifier to find faces in the input picture. Local binary patterns are extracted from the detected facial areas, and a histogram of these patterns is then computed. The face is then identified using this histogram.

Using grid parameters, the image produced in the previous step can be divided into several grids, as seen in Figs. 3 and 4. The model is then trained; subsequently, when we want to make predictions, we follow the same steps and compare the histograms with the previously trained model. This is how this functionality works. The OpenCV library is used to implement the proposed methodology [22].
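As an illustration of these grid parameters, the LBPH recogniser in opencv-contrib-python exposes them directly; the values shown are OpenCV's documented defaults, not values reported in this paper.

```python
# Illustration of the grid/histogram parameters discussed above, using the
# LBPH recogniser from opencv-contrib-python (values shown are OpenCV defaults).
import cv2

recognizer = cv2.face.LBPHFaceRecognizer_create(
    radius=1,      # radius of the circular local binary pattern
    neighbors=8,   # sample points around each pixel
    grid_x=8,      # horizontal cells -> one histogram per cell
    grid_y=8)      # vertical cells

# recognizer.train(face_images, labels) builds the per-cell histograms;
# recognizer.predict(face) compares histograms against the trained model.
```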

Algorithm for this module is as follows:

  1. Import the required libraries—cv2, os, numpy, tkinter, and tkinter.font.
  2. Define a function called "collect_data()" that collects images of a person's face using a webcam and saves them to a folder with their name and ID.
  3. Inside "collect_data()", open a connection to the webcam using OpenCV and load the Haar cascade classifier file for face detection.
  4. Continuously capture frames from the webcam, convert them to grayscale, and detect faces using the cascade classifier.
  5. For each detected face, draw a rectangle around it, save the grayscale ROI to a file, and increment the count.
  6. Once 300 face images have been collected or the user presses the "Esc" key, close the webcam connection, destroy all windows, and initiate the "train()" function.
  7. Define a function called "train()" that reads the collected images from the "persons" folder, extracts the faces, and trains an LBPH face recognition model.
  8. Save the trained model as a .yml file in the current directory.
  9. Define a function called "identify()" that uses the trained LBPH model to recognize faces in real time from the webcam.
  10. Inside "identify()", load the trained LBPH model, load the Haar cascade classifier file for face detection, and create a dictionary of labels and IDs from the "persons" folder.
  11. Continuously capture frames from the webcam, convert them to grayscale, and detect faces using the cascade classifier.
  12. For each detected face, draw a rectangle around it and use the trained model to predict the label.
  13. Display the recognized label on the frame, or "unknown" if the prediction score is above 100.
  14. If the user presses the "Esc" key, close the webcam connection and destroy all windows.
  15. Define a function called "maincall()" that creates a Tkinter GUI with two buttons—"Add Member" and "Start with known".
  16. When the "Add Member" button is pressed, the "collect_data()" function is called.
  17. When the "Start with known" button is pressed, the "identify()" function is called.
  18. Start the GUI mainloop.
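A condensed, hedged sketch of the three functions is given below; the folder layout, the 300-image limit and the distance threshold of 100 follow the steps, while file naming, cascade parameters and error handling are assumptions. It requires opencv-contrib-python for the cv2.face module.

```python
# Condensed sketch of collect_data(), train() and identify() described above.
import os
import cv2
import numpy as np

CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def collect_data(name, person_id, out_dir="persons", limit=300):
    """Save up to `limit` grayscale face crops for one person."""
    folder = os.path.join(out_dir, f"{name}-{person_id}")
    os.makedirs(folder, exist_ok=True)
    cap, count = cv2.VideoCapture(0), 0
    while count < limit:
        _, frame = cap.read()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in CASCADE.detectMultiScale(gray, 1.3, 5):
            cv2.imwrite(os.path.join(folder, f"{count}.jpg"),
                        gray[y:y + h, x:x + w])
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
            count += 1
        cv2.imshow("collect", frame)
        if cv2.waitKey(1) == 27:
            break
    cap.release()
    cv2.destroyAllWindows()

def train(data_dir="persons", model_path="model.yml"):
    """Train an LBPH model from the collected face crops."""
    faces, labels, names = [], [], {}
    for label, person in enumerate(sorted(os.listdir(data_dir))):
        names[label] = person
        for f in os.listdir(os.path.join(data_dir, person)):
            img = cv2.imread(os.path.join(data_dir, person, f),
                             cv2.IMREAD_GRAYSCALE)
            faces.append(img)
            labels.append(label)
    model = cv2.face.LBPHFaceRecognizer_create()
    model.train(faces, np.array(labels))
    model.write(model_path)
    return names

def identify(names, model_path="model.yml"):
    """Recognise known faces from the webcam; 'unknown' above distance 100."""
    model = cv2.face.LBPHFaceRecognizer_create()
    model.read(model_path)
    cap = cv2.VideoCapture(0)
    while True:
        _, frame = cap.read()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in CASCADE.detectMultiScale(gray, 1.3, 5):
            label, score = model.predict(gray[y:y + h, x:x + w])
            text = names.get(label, "unknown") if score < 100 else "unknown"
            cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
            cv2.putText(frame, text, (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 0, 0), 2)
        cv2.imshow("identify", frame)
        if cv2.waitKey(1) == 27:
            break
    cap.release()
    cv2.destroyAllWindows()
```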

4 Results

Sample results obtained on running the Intelligent Surveillance Support System are as follows:

Figure 5 shows NO-MOTION when no motion is identified. Figure 6 shows MOTION as the output when motion is identified in the frame. Figure 7 shows that no motion is identified in the selected subframe (rectangle). Figure 8 shows the image that is stored in the visitors folder with a timestamp when a person enters the frame. Figure 9 shows the result where the proposed module correctly identifies that the lamp has been stolen or is missing from the frame.

Fig. 5 Module-1 no motion

Fig. 6 Module-1 motion found

Fig. 7 Module-2 rectangular noise where a frame is selected

Fig. 8 Module-3 visitor found and will be stored in visitors file

Fig. 9 Module-5 identifying the object removed from the table

Here in Fig. 5 one can see that our model correctly identifies that there is no motion in the frame.

In Fig. 6 the model identifies the motion in the frame caused by the movement of a hand. The rectangle selected as the subframe correctly indicates that there is no motion within it, as shown in Fig. 7. Figure 8 shows that, when a person moves across the frame, our model identifies the entry and stores the image shown above, along with a timestamp, in the visitors folder.

Our proposed model uses SSIM (Structural Similarity Index Measure), a metric used to examine how similar two given images are to one another. A great deal of material covers the theory of SSIM, since it has been in use since 2004, but few resources go into implementation detail, especially for gradient-based implementations, even though SSIM is frequently employed as a loss function. The Structural Similarity Index extracts three crucial elements from an image: structure, contrast, and luminance. On the basis of these three attributes, a comparison between the two images is made.
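As an illustration, SSIM between two frames can be computed with scikit-image as sketched below; the paper does not state which SSIM implementation it uses, so this is an assumed equivalent.

```python
# Hedged example of computing SSIM between two same-sized frames.
import cv2
from skimage.metrics import structural_similarity

def frame_similarity(frame_a, frame_b):
    """Return the SSIM score in [-1, 1] for two same-sized BGR frames."""
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    score, diff = structural_similarity(gray_a, gray_b, full=True)
    # `diff` highlights local dissimilarity and can be thresholded to
    # localise the region where an object was removed.
    return score
```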

Regarding datasets, the algorithms rely on real-time frame comparisons for the majority of the features.

The Structural Similarity Index, which ranges in value from -1 to +1, is calculated between the two given images. A value of +1 denotes that the two photographs are identical or extremely similar, whereas a value of -1 denotes that the two images are highly dissimilar. These values are routinely rescaled to fit the range [0, 1], where the extremes carry the same meaning.

We now turn to an analysis of the main aspect. A machine learning approach known as Haar cascading is used to identify the faces in a given frame. The research conducted in [23] on facial identification using Haar cascade and LBP classifiers justifies the adoption of the Haar cascade algorithm over the LBP algorithm. With the help of their work, we are able to evaluate the algorithm in the context of our use case, the smart-CCTV-based real-time threat detection system. The Haar cascade classifier used in this system and its accuracy measures are evaluated with respect to the number of faces detected, as shown in Table 2.

Table 2 Accuracy measures for no. of faces detected

After detection, the LBPH algorithm is used to carry out identification. Instead of LBPH, we could have utilized algorithms like Eigenfaces or Fisherfaces. The paper by researchers [24] is cited in order to support the decision to use the LBPH method.

Face photos taken in unconstrained weather conditions are not part of any currently available datasets. Unfortunately, there are not enough such images in large web-based dataset sources like ImageNet for the experiment. As a result, a dataset called “LUDB” was produced for this type of research and was used by the researchers in [24]. Along with the LUDB dataset, two more well-known datasets, AT&T and 5_Celebrity [24], were employed for their performance study. The popular machine learning platform Kaggle provided the 5_Celebrity dataset. The accuracy, precision, recall, and F1-score metrics of LBPH were determined using the scikit-learn tool. Through their work, we are able to analyse the algorithm in the context of our use case, the Intelligent Surveillance Support System (ISSS), which also contains the facial recognition feature. Based on our ISSS framework, the following summarises the LBPH comparison for the three datasets mentioned above:

Refer to their study, summarised in Table 3, to learn more about how this method compares under various weather conditions. Figure 10 presents the LBPH precision and recall measures for the datasets.

Table 3 Evaluation measures on the AT&T, 5_Celebrity, and LUDB [21] datasets

5 Conclusion and future scope

The Intelligent Surveillance Support System is an innovative solution for advanced video surveillance. Leveraging the power of computer vision using OpenCV, it offers capabilities that far exceed those of traditional systems. With its advanced motion detection, theft detection, facial recognition, and in-out movement tracking, this system provides a highly sophisticated level of security that can keep any environment safe and secure (Fig. 10).

Fig. 10 Precision and recall graph for the datasets

Furthermore, this system has a wide range of future extensions that will further enhance its capabilities. For instance, the system can be integrated with mobile CCTV and used to monitor offline centre-based examinations without the need for human intervention. The addition of built-in night vision will make it ideal for use in low-light environments, while the incorporation of deep learning technology will enable it to identify deadly weapons and detect accidental fires.

This system can also be developed as a standalone device that requires no external support, making it an ideal solution for deployment in remote locations or areas with limited connectivity. Moreover, the creation of a standalone program that does not require any prerequisites, such as Python, will make it more accessible to users who may not have the technical expertise to operate traditional security systems.

In conclusion, the Intelligent Surveillance Support System (ISSS) is a sophisticated software solution that leverages advanced features and optimised algorithms from the OpenCV library to enhance security and surveillance operations. By providing real-time monitoring and analysis of security footage, the ISSS offers a comprehensive and reliable security solution that improves the efficiency and effectiveness of surveillance systems.

Key factors associated with the ISSS include its ability to perform face recognition, theft detection, visitors in/out detection, and motion detection in real time. These features enable proactive threat detection, efficient monitoring of individuals, prevention of security breaches, and instant alerts for unusual activities. By automating these processes and leveraging machine perception, the ISSS minimizes human error, reduces response time, and enhances the overall safety and security of public and private spaces.

In terms of future improvements, possible areas include the incorporation of advanced AI algorithms, multi-camera integration, behavioural analysis, IoT devices, cloud-based processing, and data analytics with predictive insights. By exploring these areas and incorporating the latest advancements in technology and research, the Intelligent Surveillance Support System can continue to evolve and provide even more robust and efficient security solutions in the future.

In summary, the Intelligent Surveillance Support System represents modern video surveillance technology, and its advanced features, extensive functionality, and flexible deployment options make it a valuable asset for a broad range of applications.