1 Introduction

The coronavirus outbreak began at the end of 2019, and it is still wreaking havoc on the livelihoods and businesses of millions of people around the world [13]. As the world recovers from the pandemic, people intend to return to the same state of normality as before it. However, there is widespread unease about returning to normal routines, because the virus spreads through droplets of saliva from an infected person and can affect people within a range of approximately 6 feet. The main symptoms of the infection are fever, headache, cough, respiratory difficulties, and loss of the senses of taste and smell, and in severe cases it can lead to the death of the infected person [41]. The incidence rate of COVID-19 is higher than that of other acute respiratory diseases such as severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS).

To prevent this deadly virus, the World Health Organization (WHO) [35] issued guidelines and SOPs such as wearing a face mask and maintaining social distance in public spots. In this regard, several research studies have also reported that maintaining distance during physical interaction between people can prevent the spread of most respiratory diseases [21]. Tangana et al. [1] presented a mathematical model to demonstrate the impact of physical distance during interaction on the transmission of the virus among people. Another study [15] demonstrated that wearing a face mask is highly effective in mitigating the reproduction of coronavirus. However, manual monitoring and enforcement of the aforementioned SOPs in public places such as schools, universities, shopping malls, and parks is a quite challenging task.

In step with the rapid advancement of Artificial Intelligence (AI), and Deep Learning in particular, the computer vision community has contributed various state-of-the-art methods for intelligent surveillance [65], object detection [6] and recognition [5, 7], and scene understanding [46]. These methods can be employed to develop an intelligent monitoring system for face mask detection and social distance measurement in public places. However, there are two main challenges in this direction. Firstly, to the best of our knowledge, there is no South Asian standard benchmark available to evaluate facial mask detection and social distance measurement methods. Secondly, there is no pipeline available for the development of an end-to-end real-time intelligent monitoring system for facial mask detection and social distance measurement. It is important to mention that several research studies have employed standard single- and multi-stage object detectors such as Faster-RCNN, SSD, and RetinaNet to perform face mask detection [17]. However, these methods do not consider social distance measurement, which makes them insufficient for deployment in actual public places.

To address the aforementioned shortcomings of existing state-of-the-art methods, we make the following contributions in this paper.

  1. A local dataset containing 10,000 images based on two classes (i.e., masked face and unmasked face) has been collected from public places. It is worth noting that these classes are unique in orientation and dress codes, which are not covered in the existing datasets.

  2. Existing state-of-the-art single- and multi-stage object detectors are fine-tuned on the proposed dataset. Based on this analysis, an improved YOLO-v3 based object detection architecture is presented to enhance the robustness of real-time surveillance systems.

  3. Alongside, a machine-vision based distance measurement method is proposed to ensure social distancing in public places.

  4. Lastly, an extensive comparative study is carried out between state-of-the-art face mask detection methods and the proposed method to demonstrate the effectiveness of our method in terms of higher detection and recognition accuracy, and lower inference time.

The rest of the paper is organized as follows. In Section 2, we briefly discuss existing state-of-the-art facial mask detection and social distance measurement methods, along with the available datasets. In Section 3, we present a detailed overview of our proposed end-to-end pipeline for face mask detection and social distance measurement. The experimental results are presented in Section 4. Finally, the paper is concluded in Section 5.

2 Related work

Real-time object detection and recognition methods can play an important role in developing intelligent monitoring methods for face mask detection and social distance measurement to prevent coronavirus transmission. In this section, we analyze the existing state-of-the-art methods employed in developing such intelligent monitoring systems, which include: (i) single- and multi-stage detection methods for masked and non-masked face detection, (ii) available datasets for developing generalized face detection systems, and (iii) social distance measurement methods.

2.1 Facial mask detection

In the majority of existing research works, researchers have focused on face reconstruction and identity recognition while wearing face masks. The aim of this study, however, is to detect the human face in both states (wearing a mask or not wearing a mask) in order to assist in reducing COVID-19 transmission and spread. In recent studies, researchers have demonstrated that wearing face masks minimizes the rate of COVID-19 spread, as masks can intercept airborne germs effectively [38]. However, monitoring people in public places is still a challenging task. In this regard, Zhang et al. [62] proposed a single-shot refinement face detector, namely RefineFace, to detect people not wearing a face mask. In another research work, Jagadeeswari et al. [19] proposed an SSD-based face mask detection method for outdoor environments. Khandelwal et al. [22] presented a deep learning approach for classifying human faces with and without masks. Onyema et al. [40] proposed a convolutional neural network based method for facial expression recognition. Hussain et al. [16] proposed a deep learning based IoT system to detect face masks using a transfer learning approach.

The aforementioned approaches achieved good accuracy on their respective test data; however, real-time face mask detection is still a critical challenge for system developers. In this regard, Snyder et al. [56] introduced a deep learning based approach for mask detection to prevent COVID-19 transmission. Kodali et al. [23] presented a custom CNN-based model to detect faces wearing masks in public spots. Similarly, Sagayam et al. [54] proposed a deep neural network based method for binary-class (i.e., masked and non-masked) face state recognition. Degadwala et al. [9] proposed a YOLO-v4 based face detection method trained and tested on the WIDER-FACE and MAFA datasets. Likewise, Taneja et al. [58] presented a facial mask detection system based on the lightweight MobileNetV2 CNN and achieved 99.98% accuracy. On the other hand, Sethi et al. [55] detect masks using ResNet-50; compared with the RetinaFaceMask detector, their model achieves 11.07% higher precision and 6.44% higher recall.

In another research work, Loey et al. [30] presented a multi-stage detection method for detecting faces with and without masks. Alongside, an ensemble method combined with a deep learning model has been used to detect face masks on real-world and synthetic data to improve the generalizability of machine learning models. These research works, along with their strengths and limitations, are summarized in Table 1. To this end, we conclude that the deployment of the above-discussed face mask detection systems encounters several constraints at the development and deployment levels, such as diverse types of face masks, face orientations, and illumination conditions [52]. Further challenges include stabilizing the accuracy of the object detection model under real-time conditions and placing the detector on systems with limited computing capacity. In the circumstances of the epidemic, facial mask detection is still under-explored in images, videos, and closed-circuit television (CCTV) footage for controlling the transmission chain of the virus [37].

Table 1 An Overview of Existing Machine Learning Methods Used for Face Mask Detection and Recognition Tasks

2.2 Available datasets

In the context of COVID-19, face datasets play an essential role in training deep models for masked and non-masked face detection. Recently, several datasets have been proposed to accelerate research in this direction. In this regard, Ge et al. [12] proposed the MAFA dataset, which contains 30,811 images collected from the Internet. These images cover distinct types of masks, several occlusion degrees, and orientations. Furthermore, Laxel [33] introduced the Face Mask Dataset (FMA), which holds 853 images across three classes, collected from Kaggle. An extended version of the Kaggle dataset, also denoted FMA, was proposed by Wobot [18] and contains 6,024 images across 20 classes. Rahmani et al. [45] proposed the Medical Mask Dataset (MMD), which consists of 9,067 images with three classes and is used to detect only medical masks. On the other hand, Wang et al. [26] studied database configurations across multiple sensor technologies, such as cameras, LiDAR, inertial gyroscopes, and wireless sensors, used as data acquisition stages. Liang et al. [27] utilized various sensors to obtain image and geographic location information simultaneously and built an indoor 3D map using geographic coordinates. Niu et al. [39] addressed the social distancing problem in 3D by using monocular cameras for pedestrian 3D localization. Furthermore, Magoo et al. [31] set up a bird's-eye view framework with the YOLO-v3 model to monitor social distance in public areas. Although the research community has contributed several social distance measurement methods, the deployment of such systems in real-world environments is still a challenging task.

3 The method

To address the above-mentioned issues, we propose a novel pipeline for developing an end-to-end face mask detection method to monitor public spots in order to mitigate the spread of COVID-19, as shown in Fig. 1. Firstly, we present the large-scale MUST Face Dataset (MFD), containing 10,000 images along with binary-class bounding box annotations, i.e., face wearing a mask and face not wearing a mask. Alongside, we analyze existing state-of-the-art single-stage and multi-stage object detectors on our proposed dataset. Specifically, we fine-tuned the existing YOLO-v3 [49], SSD [63], RetinaNet-50 [28], Fast-RCNN [50], Faster R-CNN (FPN) [32], Faster-RCNN (ResNet-50) [25] and Faster-RCNN (ResNet-101) [29] on our proposed dataset through transfer learning. Based on its better performance, we further improved the YOLO-v3 architecture to robustify its performance in outdoor environments. On top of our face detector, we employ our proposed social distance measurement method, which takes input from the face detector and computes the distance between two human beings to mitigate the spread of COVID-19 in public spots.

Fig. 1
figure 1

The Proposed Pipeline for Developing Face Mask Detection and Social Distance Measurement in Public Places

3.1 MUST face dataset

To this end, we collect and release the MUST Face Dataset (MFD), a large-scale dataset to accelerate the development of generalized methods for end-to-end face mask detection in public places. Our MFD contains 10,000 images along with binary-class (i.e., masked face, non-masked face) bounding box annotations. The proposed dataset is generated from video sequences captured by surveillance cameras installed outside departmental buildings. The average height of the installed cameras is in the range of 12 to 15 feet from the ground. After video sequence collection, crowded frames were manually extracted while ensuring quality control parameters such as the positioning of the people and the clarity of the images. It is important to mention that we complied with the regulatory bodies and collected the data from permitted areas. To protect privacy, we do not disclose or release personal identities, geo-locations, or information on people's incoming and outgoing patterns.

After completing frame extraction, and considering the use case of our proposed method, we defined two classes for annotation, i.e., masked face and non-masked face. For this purpose, we employed the LabelImg annotation tool to label the human faces according to the aforementioned classes. One of the reasons for manual annotation instead of automated labeling is to maintain the accuracy of the ground-truth coordinates, which plays an important role in training a robust face detection model. All annotations were cross-validated by a team of experts to ensure the quality of the ground truth. Some samples of our dataset are shown in Fig. 2.
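As a sketch of such an annotation pipeline, the Pascal VOC XML files that LabelImg produces can be converted into YOLO-style normalized labels. The class names and their integer ids below are illustrative assumptions, not the dataset's released format:

```python
import xml.etree.ElementTree as ET

# Assumed mapping from class names to integer ids (hypothetical labels).
CLASSES = {"masked_face": 0, "non_masked_face": 1}

def voc_to_yolo(xml_string):
    """Convert one LabelImg Pascal VOC annotation into YOLO txt lines."""
    root = ET.fromstring(xml_string)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)
    lines = []
    for obj in root.iter("object"):
        cls_id = CLASSES[obj.find("name").text]
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        # YOLO format: class x_center y_center width height, all normalized.
        xc = (xmin + xmax) / 2.0 / img_w
        yc = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines
```

Keeping the conversion explicit like this makes it easy to spot-check the ground-truth coordinates that the experts cross-validate.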

Fig. 2
figure 2

Sample Images From Our MUST Face Dataset

3.2 Suitable face detection method selection

Recently, deep object detection methods have demonstrated good applicability in various real-time object detection and recognition tasks [24]. To select a suitable deep learning object detector, we first fine-tuned the existing state-of-the-art single-stage and multi-stage detection methods, including YOLO-v3 [49], SSD [63], RetinaNet-50 [28], Fast-RCNN [50], Faster R-CNN (FPN) [32], Faster-RCNN (ResNet-50) [25] and Faster-RCNN (ResNet-101) [29], on our proposed MFD through transfer learning. The results show that YOLO-v3 outperformed the other employed detection methods in terms of inference time and accuracy. Based on this better performance, we further improved the YOLO-v3 architecture to robustify its performance in outdoor environments.

3.3 Proposed facial mask detection architecture

In the proposed framework, we employ the YOLO-v3 architecture to perform facial mask detection in real time. YOLO-v3, proposed by Joseph Redmon and Ali Farhadi in 2018 [48], is one of the most outstanding deep learning object detectors and has demonstrated consistent performance on object detection and recognition tasks. One of the main issues in earlier detection networks was the vanishing gradient problem, which commonly occurs as the number of network layers increases. Therefore, the multi-scale YOLO-v3 introduced residual connections, which add the input of a layer to the output of a later layer, similar to the ResNet architecture. As a result, YOLO-v3 achieves good performance even on low-resolution images due to its multi-scale feature extraction property. To this end, we take the existing YOLO-v3 architecture and use k-means clustering to compute 9 anchor boxes, which are then split across three detection scales to obtain more bounding boxes per image than the baseline version.
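The k-means anchoring step can be sketched as below. This is an illustrative implementation that clusters ground-truth (width, height) pairs with 1 - IoU as the distance measure, as is commonly done for YOLO-v3's 9 anchors, and is not the exact code used in the pipeline:

```python
import numpy as np

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchor boxes; for YOLO-v3,
    k=9 anchors are later split into 3 groups of 3, one per scale."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        # IoU of every box against every anchor, assuming aligned corners.
        inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
                 np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
        union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
                (anchors[:, 0] * anchors[:, 1])[None, :] - inter
        assign = np.argmax(inter / union, axis=1)  # highest IoU = closest
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    # Sort by area; in YOLO-v3 the smallest anchors serve the finest scale.
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
```

Using 1 - IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which is why it is the customary choice for anchor selection.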

The input layer takes an RGB image with a size of 416x416 pixels. As the backbone network, we employ DarkNet-53 to achieve the maximum number of floating-point operations per second. The internal structure of the model is a fully convolutional network that does not contain max-pooling layers. As depicted in Fig. 1, the network contains convolution blocks, residual blocks, and scale output layers. In a convolution block, strided convolutions are used instead of max pooling to reduce the size of the input feature maps; each convolution is followed by batch normalization and a ReLU activation. A residual block combines two convolution blocks with different kernel sizes into what we call a mega-block. In the existing YOLO-v3 architecture, the convolution blocks are repeated 1x, 2x, 4x, and 8x. However, considering the use case of our application, we reduced the repetitions of the convolution blocks to 1x, 2x, and 4x in order to improve learning performance and inference time. At the bottom of the architecture, an average pooling layer, followed by a fully connected layer and softmax activation, is employed to down-sample the feature map and obtain the binary-class output probabilities, respectively. To improve the learning process, we applied transfer learning, which utilizes the stored knowledge of a neural network for new tasks by learning only new weights. The ultimate aim of employing this technique is to speed up the learning process.
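A minimal PyTorch sketch of the building blocks described above (strided convolution blocks in place of pooling, and residual mega-blocks); the channel sizes are illustrative assumptions, not the exact layer configuration:

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Convolution + batch norm + activation; stride-2 variants of this
    block replace max pooling for down-sampling, as described above."""
    def __init__(self, in_ch, out_ch, kernel, stride=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel, stride,
                              padding=kernel // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        # ReLU per the text; DarkNet-53 implementations often use LeakyReLU.
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class MegaBlock(nn.Module):
    """Residual 'mega-block': two convolution blocks with different kernel
    sizes plus a skip connection that eases gradient flow."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(ConvBlock(ch, ch // 2, 1),
                                  ConvBlock(ch // 2, ch, 3))

    def forward(self, x):
        return x + self.body(x)
```

Stacking such mega-blocks 1x, 2x, and 4x between stride-2 ConvBlocks reproduces the reduced-repetition backbone outlined above.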

3.4 Social distance measurement methods

With the recent advancement of AI, computer vision based methods have demonstrated good applicability in several applications such as scene understanding, object recognition, and speed and distance estimation [14]. Some research has used proportional-integral-derivative (PID) control [57] due to its simplicity, despite its non-optimal performance; it is suitable for distance measurement and consumes little power and memory. Zhang et al. [64] proposed a distance estimation method for localizing an object in the camera coordinate frame. Their method contains three steps: the first concerns camera calibration, the second constitutes a model for distance measurement between the camera coordinate frame and its projection frame, and the third performs absolute distance estimation.

The distance is computed with respect to the pivot point of the bounding box, known as the centroid, which is calculated using (1):

$$ C_{(x,y)} = \left( \frac{x_{min} + x_{max}}{2} , \frac{y_{min} + y_{max}}{2} \right) $$
(1)

In (1), C denotes the centroid, x_min and x_max are the minimum and maximum horizontal coordinates (width) of the bounding box, and y_min and y_max are its minimum and maximum vertical coordinates (height). After calculating the centroids, the Euclidean distance formula in (2) is used to measure the distance between two centroids (x_1, y_1) and (x_2, y_2), which is then compared with the ground-truth value.

$$ D(C_{1(x,y)}, C_{2(x,y)}) = \sqrt{(x_{2}-x_{1})^{2} + (y_{2}-y_{1})^{2}} $$
(2)

After calculating the centroid of each bounding box, a unique ID is assigned to each centroid. In the next step, the distance between every pair of detected centroids is computed using the Euclidean distance. To validate correctness, the Root Mean Square Error (RMSE), given in (3), is used to estimate the error between the actual value and the value predicted by the model.

$$ RMSE = \sqrt{\frac{1}{N}{\sum}_{i=1}^{N}\left(Predicted_{i} - Actual_{i}\right)^{2}} $$
(3)
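Equations (1)-(3) translate directly into code. The following is a minimal sketch, with bounding boxes given as (x_min, y_min, x_max, y_max) tuples:

```python
import math

def centroid(box):
    """Centre of a (x_min, y_min, x_max, y_max) bounding box, per Eq. (1)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)

def euclidean(c1, c2):
    """Euclidean distance between two centroids, per Eq. (2)."""
    return math.hypot(c2[0] - c1[0], c2[1] - c1[1])

def rmse(predicted, actual):
    """Root mean square error over paired distance samples, per Eq. (3)."""
    n = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
```

For example, `euclidean(centroid(box_a), centroid(box_b))` gives the pixel-space centroid distance that is then mapped against the ground-truth measurement.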

3.5 Proposed algorithm for real-time face mask detection

Here we present a novel algorithm, depicted in Algorithm 1, for developing and deploying an end-to-end face mask detection and social distance monitoring system in public spots.

In the first step, visual frames are captured from the real-time camera stream and passed to our developed face mask detection method for inference. Our proposed method analyzes each frame; if no face is detected, the network returns null. If faces are detected, the method also computes the distance between them. The precautionary measures, according to facial mask status and measured social distance, follow the discussion in Section 2 and cover the following scenarios: if a person wears a mask and the distance is greater than 6 feet, no action is taken. If a person does not wear a mask but the social distance is greater than 6 feet, an alert is raised. Likewise, when a person wears a mask but the social distance is less than 6 feet, an alarm is generated. Finally, if a person neither wears a mask nor maintains social distance, a warning of the highest priority is generated.
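The scenario logic above can be sketched as a small decision function; the alert label strings and the threshold parameter name are illustrative, not part of the deployed system:

```python
def alert_level(wearing_mask, distance_ft, threshold_ft=6.0):
    """Map the four mask/distance combinations to an action.

    Returns "no_action" when both precautions are observed, "alert" when
    exactly one is violated, and "warning" when both are violated.
    """
    safe_distance = distance_ft >= threshold_ft
    if wearing_mask and safe_distance:
        return "no_action"
    if wearing_mask or safe_distance:  # exactly one precaution violated
        return "alert"
    return "warning"                   # no mask and too close
```

Applied per detected pair of faces, this function drives the surveillance system's response at each frame.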

Algorithm 1
figure a

Real-time surveillance Pro.

4 Experiments and results

In this section, we evaluate the effectiveness of the proposed masked/non-masked face detection method and present a comparative study with current cutting-edge techniques. The experiments are conducted on a powerful computer running 64-bit Windows 10 with an RTX 2080 Ti graphics card with 11 GB of GPU memory, a Core i9-9900K CPU, and 32 GB of RAM.

4.1 Training setup

The training process of the proposed pipeline is divided into three fundamental steps: data pre-processing, model training, and model evaluation. Firstly, the whole dataset is randomly split into training, validation, and test sets with an 80:10:10 ratio, and the inputs are normalized to a 416x416 pixel resolution. The PyTorch library is used for the implementation of the proposed pipeline. Moreover, the experiments are organized into three phases: (i) evaluation of existing state-of-the-art object detection networks on the proposed dataset, (ii) evaluation of the improved YOLO-v3 network on the proposed dataset, and (iii) evaluation of the proposed distance measurement method.
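The random 80:10:10 split can be sketched as follows; the fixed seed is an illustrative assumption added for reproducibility:

```python
import random

def split_dataset(items, seed=42):
    """Shuffle and split items into train/validation/test at 80:10:10."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for a reproducible split
    n = len(items)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Assigning the test slice as the remainder guarantees every sample lands in exactly one split even when the dataset size is not a multiple of ten.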

4.2 Evaluation of existing state-of-the-art object detection networks on proposed dataset

The existing state-of-the-art deep object detection models YOLO-v3, SSD, RetinaNet-50, RetinaNet-101, Fast-RCNN, Faster R-CNN (FPN), Faster-RCNN (ResNet-50) and Faster-RCNN (ResNet-101) are fine-tuned on the proposed face mask detection dataset. PyTorch 1.4.0 and CUDA 11.0 are used to configure the training runs. The hyper-parameters, namely learning rate, batch size, and number of epochs, are set to 0.0001, 32, and 100, respectively, with the stochastic gradient descent optimizer used to update the model weights. The performance metrics of the employed models are shown in Table 2.

Table 2 Evaluation of existing state-of-the-art object detection networks on proposed dataset

It can be seen from Table 2 that the single-stage detectors demonstrate better applicability in terms of low inference time due to their less parametric architectures, whereas the multi-stage object detectors are computationally expensive and incur significantly higher inference times. It is also important to mention that YOLO-v3, with its 53-layer backbone, demonstrated better accuracy than SSD, RetinaNet-50, RetinaNet-101, Fast-RCNN, Faster R-CNN (FPN), Faster-RCNN (ResNet-50) and Faster-RCNN (ResNet-101). For instance, YOLO-v3 achieved 64.1% mean accuracy, 59.6% mAP, and 53.1% mAP @ 0.95 with a 28ms inference time. Similarly, SSD achieved 61.8% mean accuracy, 56.2% mAP, and 48.6% mAP @ 0.95 with a 34ms inference time. RetinaNet-50 demonstrated 55.2% mean accuracy, 51.9% mAP, and 44.7% mAP @ 0.95 with an inference time of 37ms on the test set, whereas RetinaNet-101 achieved 51.0% mean accuracy, 46.3% mAP, and 44.7% mAP @ 0.95 with a 39ms inference time, which is comparatively higher than RetinaNet-50. We next analyze the multi-stage object detectors. Fast R-CNN demonstrated 41.7% mean accuracy, 39.4% mAP, and 37.1% mAP @ 0.95 with a 132ms inference time on our test set, which is significantly higher than the employed single-shot detectors. In another experiment, Faster R-CNN based on FPN achieved a mean accuracy of 47.3%, 44% mAP, and 41.5% mAP @ 0.95, whereas Faster R-CNN with a ResNet-50 feature extraction network achieved a mean accuracy of 59.0%, 44% mAP, and 57.4% mAP @ 0.95 with an inference time of 108ms. With ResNet-101 as the backbone, Faster-RCNN showed a mean accuracy of 62.7%, 61.3% mAP, and 59.0% mAP @ 0.95 with an inference time of 98ms. Consequently, we infer that YOLO-v3 with DarkNet-53 can achieve better accuracy after further architectural fine-tuning.

4.3 Evaluation of improved YOLO-V3 architecture on proposed dataset

Based on the above analysis, the architecture of YOLO-v3 is further improved by trimming the less contributing convolutional layers and residual connections. The improved DarkNet feature extractor is then evaluated on the proposed dataset. In order to train the network faster, we employed transfer learning to learn high-level features from the proposed dataset. In the training setup, we employed the SGD optimization algorithm with momentum to train and evaluate the improved network on our proposed dataset for masked/non-masked face detection. The well-known performance metrics mean accuracy, mAP, mAP @ 0.95, and inference time are used to evaluate the performance of our improved masked/non-masked face detector. Mean accuracy refers to the number of correct predictions divided by the total number of data samples, mAP denotes mean average precision, and mAP @ 0.95 is the average precision at an intersection-over-union threshold of 0.95. Furthermore, inference time refers to the total time taken from receiving an input to producing an output.
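For reference, the intersection over union underlying mAP @ 0.95 can be computed as follows; this is a generic sketch rather than the exact evaluation code:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes.

    Under mAP @ 0.95, a detection counts as a true positive only when its
    IoU with a ground-truth box is at least 0.95.
    """
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

The strict 0.95 threshold explains why the mAP @ 0.95 figures in Tables 2 and 3 sit well below the plain mAP figures.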

Table 3 Evaluation of improved YOLO-v3 on proposed dataset

It can be seen from Table 3 that our improved YOLO-v3 based detection network outperformed the baseline YOLO-v3 on masked/non-masked face detection on our proposed dataset. One of the main reasons behind the increase in accuracy is the trimming of the less contributing residual connections, which accelerated the performance of our model compared to the baseline. Some sample results are shown in Fig. 3 to demonstrate the effectiveness of our proposed masked/non-masked face detection method.

Fig. 3
figure 3

Qualitative examples of our masked/non-masked face detection method on our face mask dataset

4.4 Evaluation of proposed distance measurement method

After evaluating our proposed masked/non-masked face detection method, we next evaluated our proposed machine-vision based distance measurement method for ensuring social distancing in public places. Following standard performance metrics, we employed the root mean square error to analyze the correctness of our method against the ground truth. A quantitative analysis is shown in Table 4.

Table 4 Results of proposed distance measurement methods

The vision-based system detects the faces of people and returns the bounding box information. It then determines the central point (centroid) of each bounding box around a face and measures the distance between two centroids using the standard Euclidean distance equation. The error rate is computed using the RMSE, which captures the difference between the ground-truth and predicted values of the model. For instance, in the Distance 1 sample, the actual distance (ground truth) between two persons is 2.44 feet, whereas our proposed vision-based distance measurement method predicts 2.37 feet, with a rather small error rate of 0.035 RMSE. In the next data sample, Distance 2, the actual distance is 2.99 feet, and our model inferred 2.95 feet with an RMSE of 0.020. Similarly, in the Distance 3 sample, the ground-truth value is 3.16 feet, whereas the proposed method predicts 3.10 feet with an error rate of 0.030, which is quite effective performance on our test set.

5 Conclusion

In this paper, a novel pipeline for developing an end-to-end masked/non-masked face detection method is proposed to improve the effectiveness of real-time surveillance systems in public places. Alongside, a new dataset containing 10,000 images of two classes (masked face, non-masked face) is constructed to support generalized masked/non-masked face detection and social distance measurement in outdoor public places. While fine-tuning existing state-of-the-art single-stage and multi-stage detection methods, we observed that YOLO-v3 outperformed the other networks in terms of accuracy and inference time. Based on this analysis, we further improved the baseline YOLO-v3 by eliminating the less contributing residual connections in the network. Consequently, the results indicate that our customized YOLO-v3 performed better than the baseline version, showing an improvement of 5.3% in accuracy. In the future, we aim to extend our work to develop an image segmentation based system that can provide accurate pixel-level information and greater clarity for detecting face masks.