Enhancing construction safety: predicting worker sleep deprivation using machine learning algorithms

Sathvik, S.; Alsharef, Abdullah; Singh, Atul Kumar; Shah, Mohd Asif; ShivaKumar, G.

doi:10.1038/s41598-024-65568-2

Article
Open access
Published: 08 July 2024

Volume 14, article number 15716, (2024)

Scientific Reports

S. Sathvik¹,
Abdullah Alsharef²,
Atul Kumar Singh^1,3,
Mohd Asif Shah^4,5,6 &
…
G. ShivaKumar¹

Abstract

Sleep deprivation is a critical issue that affects workers in numerous industries, including construction. It adversely affects workers and can lead to significant concerns regarding their health, safety, and overall job performance. Several studies have investigated the effects of sleep deprivation on safety and productivity. Although the impact of sleep deprivation on safety and productivity through cognitive impairment has been investigated, research on the association of sleep deprivation and contributing factors that lead to workplace hazards and injuries remains limited. To fill this gap in the literature, this study utilized machine learning algorithms to predict hazardous situations. Furthermore, this study demonstrates the applicability of machine learning algorithms, including support vector machine and random forest, by predicting sleep deprivation in construction workers based on responses from 240 construction workers, identifying seven primary indices as predictive factors. The findings indicate that the support vector machine algorithm produced superior sleep deprivation prediction outcomes during the validation process. The study findings offer significant benefits to stakeholders in the construction industry, particularly project and safety managers. By enabling the implementation of targeted interventions, these insights can help reduce accidents and improve workplace safety through the timely and accurate prediction of sleep deprivation.

Introduction

Sleep deprivation (SD), characterized by frequent interruptions in breathing during sleep owing to increased blockage of the airways, affects 7% of female and 20% of male employees¹. Apart from excessive daytime tiredness and frequent awakenings, the most common symptom of obstructive sleep apnea (OSA) is snoring. Moreover, hypertension, atherosclerosis, coronary heart disease, and vasculitis have been linked to this ailment^2,3. Furthermore, SD is associated with diabetes mellitus, insulin resistance, dyslipidemia, obesity, and psychological abnormalities⁴.

Early detection of this disease is essential given its prevalence in various parts of the world, including South Indian states, and its substantial impact on individuals’ health and quality of life^5,6. According to the international diagnostic criteria, patients with symptoms of SD and an apnea–hypopnea index (AHI) between 5 and 15/h are diagnosed with SD as a disorder of the central nervous system⁷. The diagnosis of sleep apnea necessitates a comprehensive polysomnography (PSG) test. Nonetheless, owing to its high cost, clinical practitioners often reserve this test for select patients^8,9.

Numerous studies have explored the development of sleep deprivation (SD) and apnea–hypopnea index (AHI) prediction models tailored to individuals likely affected by these conditions. These models incorporate a grading system that does not require polysomnography^9,10. Regression analyses were conducted, accounting for demographic variables such as age and sex and incorporating measurements of factors such as body mass index and waist or neck size^11,12. However, prior prediction model studies have demonstrated that construction workers’ country, ethnicity, and characteristics significantly influence the results^13,14. For instance, SD is common in East Asians, including Indians, Chinese, and Japanese, and even in non-obese individuals^14,15. One particular reason is their narrower cranial traits^14,15. Consequently, several ethnic groups should be tested to ascertain the applicability of similar prediction models to populations including Indians. Recent advancements in computational capabilities have significantly increased the reliance of predictive modeling approaches on machine learning techniques.

Machine learning enables computers to learn from real-world data and discover previously unknown patterns^16,17,18. Traditional data analysis methods often rely on subjective opinions with analysts choosing specific methodologies. In contrast, machine learning progressively improves results over time through iterative processes^19,20. For example, precisely defining diseases for identification using mathematical models poses a significant challenge in the medical field. Machine learning is particularly applicable in data-rich sectors of medicine, which require extensive data for learning, processing, training, and validation^21,22. Consequently, various machine learning algorithms can be utilized to achieve specific objectives. Common classical machine learning models include logistic regression (LR), support vector machine (SVM), random forest (RF), and decision tree (DT). These methods are categorized as supervised learning that uses labeled data²³.

Previous studies on prediction models for the Indian population have predominantly utilized multivariate analysis or support vector machine (SVM). One study on sleep deprivation (SD) in construction workers utilized logistic regression (LR) with data from 433 individuals, achieving a sensitivity of 74.6% and a specificity of 66.3%^24,25,26. Another study employed an SVM to create an SD prediction model based on data from 566 individuals, resulting in an accuracy rate of 84.15%²⁷. Apart from regression analysis and SVM, other prediction methods have been utilized by several researchers due to the increase in computational power along with the ability to conduct the analysis in the cloud environment. The rationale for employing machine learning techniques in this study is multifaceted. First, machine learning algorithms excel at identifying intricate patterns and relationships within complex datasets, making them well-suited for the analysis of multidimensional data associated with sleep deprivation. Secondly, these techniques can handle a wide range of input variables, enabling the incorporation of diverse factors that may influence sleep patterns, such as physiological, environmental, and behavioral variables. Furthermore, machine learning models have the ability to learn and adapt dynamically, allowing for continuous refinement and improvement as new data becomes available. This adaptive nature is particularly advantageous in the context of sleep deprivation, where individual variations and evolving circumstances necessitate flexible and responsive models. Additionally, it is worth noting that no machine learning-based prediction models have been developed to predict SD among South Indian construction workers. The primary objective of this study was to evaluate the performance of various machine learning algorithms in predicting sleep deprivation among construction workers. 
In addition to validating the feasibility of predicting sleep deprivation, this study assessed the predictive efficacy of different machine learning algorithms. The remainder of this paper is structured as follows: section two outlines the research methodology. Section three presents the results, followed by section four which offers a discussion of these findings. Finally, section five concludes the paper with a comprehensive summary of the study's key outcomes.

Methodology

To accomplish the study's aims and objectives, the authors employed a multifaceted research methodology. Figure 1 provides a summary of the adopted research methodology. The subsequent sub-sections offer detailed explanations of the implemented research approach.

Data collection

In this study, a total of 295 construction workers were selected from the SRM Medical College and Research Centre, located in Tamil Nadu, India, for the collection of relevant data. The cohort consisted exclusively of Indian construction workers, each exhibiting symptoms of sleep deprivation (SD) such as daytime sleepiness, frequent snoring, and witnessed sleep apnea. These symptoms were critical in identifying the target group for this study. A detailed analysis was conducted to further understand the extent of SD among these workers. This analysis involved correlating 92 different variables with workers' SD and overall sleep status, providing a comprehensive view of the factors influencing their sleep health. The data collection process adhered to rigorous protocols to ensure the reliability and validity of the gathered information. A comprehensive set of questionnaires was administered; these were specifically tailored to the Indian context to ensure cultural relevance and accuracy. They included localized versions of well-established tools such as the Pittsburgh Sleep Quality Index (PSQI), Fatigue Severity Scale (FSS), and Epworth Sleepiness Scale (ESS). Furthermore, to complement the self-reported data, a series of anthropometric measurements was performed on the participants. These measurements provided baseline physical data for understanding the correlation between physical characteristics and sleep disorders among workers. The entire data collection process was conducted in accordance with strict ethical guidelines, ensuring the protection of participants' rights and well-being.

The responses of 288 construction workers were analyzed after the data from seven workers were excluded owing to incompleteness. The participating construction workers were systematically categorized into two distinct groups based on the presence or absence of sleep deprivation (SD): 220 construction workers were identified as experiencing SD, whereas the remaining 68 workers were classified as not suffering from SD. The methodology for this classification involved carefully defining the inclusion and exclusion criteria. Moreover, detailed anthropometric measurements were obtained, and polysomnography (PSG) tests were performed on these workers to ensure accurate categorization and data collection. The study was conducted according to strict ethical guidelines. The Ethics Committee of SRM Medical College Hospital and Research Center reviewed and approved this research (reference number 2186/IEC/2020). This approval was contingent upon obtaining written consent from all participating construction workers, ensuring that they were fully informed about the nature of the study and their role in it. This process underscores the commitment to ethical standards and the importance of informed consent in research that involves human subjects. It not only reinforced the integrity of the research but also ensured protection of and respect for the rights and well-being of the participants.

Feature selection

The features of the model were selected using a permutation feature selection method involving 92 permutations²². This process began by training the model with all the variables. The importance of each feature was then evaluated based on the change in a specific performance score following a random shuffle of the feature's values²⁶. This shuffle disrupted the relationship between the feature and the target outcome, revealing the significance of the feature²⁸. Figure 2 shows the relative importance of the final seven features selected using the permutation algorithm. The seven variables are hypertension, stomion to subscale, snoring from the PQ, intensity of snoring from the PQ, total FSS score, waist circumference, and frequency of falling asleep from the PQ.
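The shuffle-and-rescore idea described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' Scikit-learn pipeline; the toy data, the `toy_model` rule, and the 90 cm threshold are hypothetical and chosen only so that one column carries all the signal.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column to break its link with the target.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: column 0 (a waist-circumference-like value) decides
# the label, column 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.uniform(70, 110), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 90) for row in X]

def toy_model(row):
    """Stand-in for a trained classifier (assumption, not the study's model)."""
    return int(row[0] > 90)

scores = permutation_importance(toy_model, X, y)
print(scores)  # large drop for column 0, no drop for column 1
```

A feature whose shuffling leaves the score unchanged contributes nothing to the prediction, which is why the noise column scores zero here.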

Prediction of OSA using machine learning algorithms

Following a comprehensive review of the existing articles, the research team identified four widely employed machine learning algorithms to model the data effectively. These algorithms include logistic regression (LR), support vector machine (SVM), random forest (RF), and decision tree (DT). Each of these algorithms was chosen for its distinct methodological strength and suitability for the collected data. Prior to fitting the four identified models, the data were split into training and test sets. For each sleep disorder (SD) and non-sleep disorder (non-SD) group, 30% of the data were randomly chosen for testing (SD: 88 instances; non-SD: 25 instances), and the remainder for training (SD: 152 instances; non-SD: 49 instances). After completing the process, the four models (i.e., LR, DT, RF, and SVM) were trained.
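A per-group random split of this kind is usually called a stratified split. The sketch below shows one way to do it with the standard library; the function name `stratified_split`, the seed, and the group sizes are illustrative assumptions and do not reproduce the study's counts.

```python
import random

def stratified_split(labels, test_fraction=0.3, seed=42):
    """Return (train_idx, test_idx), sampling test_fraction of each class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Hypothetical labels: 1 = SD, 0 = non-SD (sizes chosen for illustration only).
labels = [1] * 200 + [0] * 60
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each group keeps the SD to non-SD ratio the same in the training and test sets, which matters when one group is much larger than the other.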

LR is a widely used machine learning algorithm for classification tasks, such as determining whether an email is spam²⁹. It calculates probabilities to classify new observations into categories, excelling in situations in which data points relate linearly to these categories. It is commonly applied in healthcare, marketing, and finance, and is valued for its simplicity and ease of interpretation. Similarly, DT is a commonly utilized classification approach for categorizing data. It functions like a flowchart, with each branch representing a data-driven decision that leads to a definitive categorization. Its popularity stems from the ease with which it can be understood and visually represented, making it a preferred choice for a variety of classification problems across numerous industries and research specializations. RF is a widely used machine learning algorithm known for its adaptability to complex data, accounting for correlations and interactions among features. It is an ensemble learning tool that combines multiple decision trees to form a powerful classifier, thereby reducing the risk of overfitting.

For the SVM, the process began by transforming the input data into a high-dimensional space to establish an optimal boundary between the groups. Subsequently, an ensemble RF model was developed, which extended from multiple decision trees. The RF approach involves generating a series of decision trees and selecting the best classification based on their collective output. Additionally, the DT algorithm was employed to enhance the gradient boosting model (GBM). DT stands out for its faster execution and higher prediction accuracy compared to GBM. To prevent overfitting, a regularization function was incorporated into the model. Finally, a grid search was conducted to identify the most effective parameters for each machine learning model to ensure optimal performance²⁹.

Performance evaluation of machine learning models

The performance of LR and the other three machine learning techniques (SVM, RF, and DT) was assessed by calculating key metrics: specificity, sensitivity, positive predictive value (PPV), accuracy, and negative predictive value (NPV). These metrics were derived from the true-positive (TP), false-positive (FP), true-negative (TN), and false-negative (FN) outcomes of each model. In addition, the area under the curve (AUC) was computed to assess the overall performance of each model. The analysis was conducted in Python using the Scikit-learn library (version 0.23.2)²⁶. The receiver operating characteristic (ROC) curves and their comparative analyses were performed using MedCalc software (version 14.0)³⁰. IBM SPSS for Windows was used for further statistical analyses³¹. Statistical significance was set at p < 0.05.
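For reference, the five threshold-based metrics follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study's tables.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),  # share of SD cases correctly flagged
        "specificity": tn / (tn + fp),  # share of non-SD cases correctly cleared
        "ppv": tp / (tp + fp),          # share of positive predictions that are correct
        "npv": tn / (tn + fn),          # share of negative predictions that are correct
        "accuracy": (tp + tn) / total,
    }

# Hypothetical counts for illustration only.
m = diagnostic_metrics(tp=40, fp=5, tn=20, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts, sensitivity and specificity are equal while PPV exceeds NPV, showing how a larger positive group can raise PPV and lower NPV even for a classifier that treats both groups alike.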

Ethics statement

This research was approved by the Ethics Committee of the SRM Hospital and Research Centre (2186/IEC/2020) and conducted according to the principles of the institutional ethical committee. The patients provided written informed consent to participate in the study. Informed consent was obtained from all subjects involved in this research study.

Results

Table 1 presents a comparative profile of construction workers in the sleep disorder (SD) and non-sleep disorder (non-SD) groups. The statistics reported in Table 1 indicate a statistically significant difference (p < 0.05) between the two groups for all variables except sex and total sleep duration (min). Thus, sex is not a distinguishing factor between the OSA and non-OSA groups, and sleep duration did not differ between the two groups.

Table 1 Comparison of participant profiles in non-OSA and OSA groups.

Each of the four machine learning models (i.e., LR, DT, RF, and SVM) was trained using seven variables to predict SD. Their performance was evaluated using a designated set of test data, with the results summarized in Table 2. The accuracy of the models in predicting SD was as follows: RF achieved 85.2% (confidence interval [CI] 78.7–89.5), SVM scored 99.4% (CI 97.8–97.8), DT attained 92.8% (CI 89.4–95.8), and LR recorded 99.3% (CI 96.7–99.8). The area under the curve (AUC) for these models, depicted in Fig. 3, was 0.95 (CI 0.89–0.97) for RF, 0.98 (CI 0.97–1.3) for SVM, 0.95 (CI 0.93–0.97) for DT, and 0.95 (CI 0.96–1.3) for LR. These results demonstrate the effective training of each model, indicating their proficiency in accurately predicting sleep disorders among construction workers.

Table 2 Comparison of participant profiles in non-OSA and OSA groups.

The performance of sleep disorder (SD) prediction was assessed using a distinct set of test data. Among the four models tested (SVM, RF, DT, and LR), the SVM demonstrated the highest accuracy. It achieved a sensitivity of 81.34% (CI 65.37–84.82), specificity of 87.97% (CI 68.6–99.4), positive predictive value (PPV) of 90.62% (CI 82.64–95.63), negative predictive value (NPV) of 69.77% (CI 52.99–83.85), and an overall accuracy of 85.45% (CI 75.64–92.70), as depicted in Fig. 4. In comparison, the DT model exhibited the lowest performance, with a sensitivity of 80.69% (CI 67.4–88.2), specificity of 74.92% (CI 52.7–90.2), PPV of 86.72% (CI 79.29–92.43), NPV of 54.58% (CI 38.57–68.05), and accuracy of 76.0% (CI 65.37–84.82). In the receiver operating characteristic (ROC) analysis, SVM also exhibited the highest area under the curve (AUC) at 0.88 (CI 0.78–0.94), followed by LR (0.85, CI 0.75–0.92), RF (0.83, CI 0.73–0.90), and DT (0.81, CI 0.71–0.89). Overall, the differences in the AUCs were not statistically significant (p = 0.41), indicating a comparable level of performance among the examined models.
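As a reminder of what the AUC measures, it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name `roc_auc`, the labels, and the scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # Count a win when the positive outranks the negative; a tie counts half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities for three SD and two non-SD workers.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Because the AUC depends only on ranking, it is unaffected by the choice of decision threshold, unlike sensitivity, specificity, PPV, and NPV.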

Figure 5 presents a heatmap detailing the effect of each feature on sleep disorder prediction across the different machine learning models. Notably, waist circumference emerged as the most influential factor in SD prediction for all models, with importance scores of 0.15 in RF, 0.13 in SVM, 0.14 in DT, and 0.13 in LR. Additionally, the loudness of snoring, as measured by the PSQI, also significantly affected the SD prediction, with scores of 0.04 in RF, 0.09 in SVM, 0.05 in DT, and 0.13 in LR. Based on these findings, an application was developed for SD prediction that calculates the likelihood of SD by providing values for these seven key features. This application provides users with a probability estimate of having SD, leveraging the predictive power of the developed machine learning models as part of this study.

Discussion

Based on the responses of 282 participants, this experiment selected 92 variables as key indices for predicting SD and compared the performance of several machine learning methods for SD prediction (LR, SVM, RF, and DT). As observed, seven indices affected the prediction of SD (hypertension, subscale of stomion, PSQI from snoring, snoring loudness from the PSQI, falling asleep from the PSQI, total FSS score, and waist circumference)^1,32. Using the selected indices as inputs, these models predicted the SD with an accuracy of 88% (SVM (0.88), LR (0.85), SVM (0.83), RF (0.84), and DT (0.84)). This outcome was primarily due to foreign terrorist organizations (0.80). In terms of the supported accuracy (76.0%), SVM (84.44%) ranked first, followed by RF (79.68%), LR (76.0%), and DT (76.0%).

Overall, the PPV was considerably high and the NPV was low for most of the models tested in this experiment. According to the present findings, several models exhibited few FPs and numerous FNs²⁴. An FP occurs when the model predicts SD (SD group) for a case in which SD is absent, and an FN occurs when the model predicts the absence of SD (non-SD group) for a case in which SD is present²⁸. If one of the two groups contains more training data than the other, the model training is likely to be biased toward the group with more data. As the training data for the SD group was three times more than that of the non-SD group, a potential for training bias existed in this trial. The sensitivity and specificity predictions for most models were not biased toward the non-SD group (p = 0.08) and displayed no excessive discrepancy^30,33.

The SD prediction performance was tested using the separately created test data, and the results indicated that the SVM model achieved the highest accuracy, with a sensitivity of 81.34% (CI 65.37–84.82), specificity of 87.97% (CI 68.6–99.4), PPV of 90.62% (CI 82.64–95.63), NPV of 69.77% (CI 52.99–83.85), and accuracy of 85.45% (CI 75.64–92.70), as illustrated in Fig. 4. In contrast, the DT model showed the lowest performance, with 80.69% sensitivity (CI 67.4–88.2), 74.92% specificity (CI 52.7–90.2), 86.72% PPV (CI 79.29–92.43), 54.58% NPV (CI 38.57–68.05), and 76.0% accuracy (CI 65.37–84.82). In the ROC analysis, SVM exhibited the highest AUC (0.88, CI 0.78–0.94), followed by LR (0.85, CI 0.75–0.92), RF (0.83, CI 0.73–0.90), and DT (0.81, CI 0.71–0.89). Overall, the differences between the AUCs were not statistically significant (p = 0.41).

The heatmap in Fig. 5 illustrates the effect of each feature on SD prediction for each model. Across all models, waist circumference had the greatest influence on SD prediction (RF = 0.15, SVM = 0.13, DT = 0.14, and LR = 0.13), and PSQI snoring loudness was also influential (RF = 0.04, SVM = 0.09, DT = 0.05, and LR = 0.13). Accordingly, the machine learning models developed for SD prediction were integrated into an application that yielded a probability for SD upon inputting these seven features.

Based on this outcome, we deduced that the training data accurately represented SD and could be used in machine learning algorithms. The older SVM model outperformed the newer DT and RF models in this study. In general, the SVM performs well with small datasets, which justifies its frequent and wide application. However, SVM suffers from the limitation that it loses precision after a certain number of boundary overlaps^34,35. Consequently, as the number of data points increases, the accuracy decreases when the boundary between the groups being predicted is uncertain. The dataset was not sufficiently large to obtain reasonable performance because of the prominent deviation in the number of participants between the SD and non-SD groups³¹. Because the training dataset was small, a larger volume of data should be obtained in the future to ensure distinct characteristics between the SD and non-SD groups. Among the seven main features, waist circumference and PSQI snoring volume were the two characteristics that significantly affected SD prediction, with snoring being a vital sign of sleep apnea and one of the most noticeable signs of the disease.

As expected, snoring volume was directly correlated with the severity of SD (AHI). Notably, a large waist circumference is a significant risk factor and predictor of SD, as well as a significant contributor to the severity of the condition. In a previous study, this correlation was observed among Indians. Snoring volume and waist circumference have been highlighted as crucial factors in previous investigations of SD prediction models^36,37. Based on relevant data from individuals suspected of SD from Asian countries, as well as the most recently developed algorithms, this study constructed SD prediction models and compared their performances to identify the most suitable machine learning model for predicting SD. Thus, machine learning models offer promising potential for predicting SD, as demonstrated by the high accuracy of the four machine learning models^38,39. In particular, the SVM is the most effective model for predicting SD based on small datasets. However, this study has certain limitations owing to the limited sample size. Owing to the imbalance between the SD and non-SD groups, a reasonable degree of uncertainty perturbed the model performance because the entire dataset was clearly insufficient for training and validating the machine learning models⁴⁰.

In addition, overfitting was a limiting factor during model training and validation. Moreover, there is a possibility of bias when using randomly selected data in the validation process⁴¹. Notably, the validation results differed completely when alternative randomly selected data were used for the testing. Nonetheless, overfitting was unlikely because of the limited size of the dataset, and the model performance remained consistent throughout the training and testing stages. However, validation requires additional overfitting analysis.

Thus, additional training data must be acquired for machine learning, and in the future, overfitting analyses should be conducted using validation methods, such as cross-validation and external validation. Further research on more extensive datasets and additional analyses will potentially improve the performance of the DT, RF, and SVM. Notably, machine learning algorithms such as LR, SVM, RF, and DT are considerably promising for SD prediction using data from India. Thus, machine learning is critical in the context of SD prediction. According to previous studies, the analytical technique is notable from the findings of construction workers. Instead of using only LR or SVM to predict SD, several machine learning methods can be applied, and their results can be comparatively analyzed to determine the most appropriate strategy for SD prediction. The current SD prediction model achieved an AUC of 0.87 in comparison to an AUC of 0.78 obtained by the LR-based SD prediction model of the Spanish group.
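The cross-validation suggested here as future work could be organized as stratified k-fold splitting, in which every sample is used for validation exactly once and each fold keeps the class ratio. The following is a minimal standard-library sketch; the function name `stratified_kfold`, the fold count, and the label counts are assumptions for illustration, and this procedure was not part of the reported analysis.

```python
import random

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with class proportions kept per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal each class round-robin into folds
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, val

# Hypothetical labels: 1 = SD, 0 = non-SD (sizes chosen for illustration only).
labels = [1] * 30 + [0] * 10
splits = list(stratified_kfold(labels, k=5))
for train, val in splits:
    print(len(train), len(val), sum(labels[i] for i in val))
```

Averaging a metric over the k validation folds gives an estimate that depends less on any single random split, which addresses the split-to-split variation noted in the discussion.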

The SD prediction model constructed based on the responses from worker groups delivered an accuracy of 87.72% when compared to the current model, which displayed an accuracy of 83.33% and used a constant machine learning model similar to the SVM model used in this study. The South Indian SD prediction model exhibited a sensitivity and specificity of 80.33% and 86.96%, respectively, using the same SVM model with marginal deviations and stable performance. In contrast, the SD prediction model proposed by the construction labor group exhibited an extremely large deviation between sensitivity and specificity (42.86% and 94%, respectively), with an extremely low sensitivity. This indicated that training was biased toward the non-SD group and that learning did not proceed as expected. This outcome could be a result of the imbalance in data composition between the SD and non-SD groups, along with unoptimized training parameters or methods for feature selection. Thus, a superior SD prediction model was preferred.

Compared with previous studies, the dataset for this research was smaller, which could be perceived as a drawback. Consequently, further studies are required to obtain more comprehensive data. In the future development of digital healthcare, machine learning approaches could be combined with mobile applications for tailored monitoring of SD. More importantly, the daily progression of SD risk and AHI risk can be tracked using physiological data, such as oxygen saturation, snoring sound, respiratory pattern, and pulse recorded during sleep using wearable devices or mobile phones. Furthermore, data related to cardiovascular disease and anthropometric parameters can be combined and analyzed using machine learning methods.

In summary, the key findings from this study demonstrate the efficacy of machine learning techniques, particularly the support vector machine (SVM) algorithm, in accurately predicting sleep deprivation among construction workers based on seven identified predictive factors. The SVM model achieved superior performance with 85.45% accuracy, 81.34% sensitivity, and 87.97% specificity, outperforming other models like random forest, decision tree, and logistic regression. Moreover, waist circumference and snoring loudness emerged as the most influential factors contributing to sleep deprivation prediction across all models. A subsequent study would aim to examine indicators such as feature importance and Shapley Additive exPlanations (SHAP) to assess their significance⁴².

Conclusions

Sleep deprivation poses a significant challenge across various sectors, particularly in the construction industry. This issue not only impacts the health and safety of workers but also their overall job efficiency. Numerous studies have explored how lack of sleep affects safety and productivity, particularly through cognitive deficits. However, there is a notable scarcity of research on the identification and mitigation of workplace hazards. To address this research gap, the current study focuses on the application of machine learning algorithms to predict hazardous conditions. This particularly highlights the effectiveness of specific techniques, such as support vector machines and random forests, in predicting sleep deprivation among construction workers. Based on the data collected from 240 construction workers, 92 variables related to sleep deprivation were identified. Seven key indices were chosen for detailed analysis. This study developed and validated four types of machine learning models for predicting SD: SVM, RF, logistic regression, and DT. Using data from South Indian construction workers with suspected SD, all four models exhibited strong SD prediction performance, wherein the SVM yielded the best SD prediction result. Therefore, machine learning techniques are essential for developing a viable digital sleep health system to predict sleep deprivation and sleep disorders in the future.

The findings of this study have significant real-world implications and contribute to the academic discourse on construction safety and worker well-being. By developing and validating machine learning models for predicting sleep deprivation among construction workers, this research paves the way for practical applications that can proactively identify at-risk individuals and facilitate targeted interventions. Project managers and safety professionals can leverage these predictive models to implement tailored strategies, such as adjusting work schedules, providing sleep hygiene education, or offering counseling services, to mitigate the adverse effects of sleep deprivation. Moreover, the academic contribution of this study lies in its demonstration of the efficacy of machine learning techniques in addressing a critical issue within the construction industry, thereby expanding the knowledge base and fostering further research in this domain.

One limitation of this study was the relatively small sample size. Accordingly, future follow-up studies should apply machine learning and artificial intelligence (AI) approaches to predicting SD in construction workers using larger datasets, which would improve the performance of the relevant machine learning methods. Specifically, future studies should investigate fitting deep learning models for structured data, such as convolutional neural networks, recurrent neural networks, long short-term memory networks, and deep neural networks, to predict worker sleep deprivation. Applying machine learning algorithms to predict sleep deprivation can significantly improve the health and safety of construction workers by identifying those at risk. Early detection of sleep deprivation can lead to interventions that prevent accidents and health issues. Additionally, mitigating the risks associated with sleep deprivation can enhance overall job efficiency and reduce the likelihood of accidents, which in turn can decrease costs related to workplace injuries and inefficiencies and improve overall construction project success.
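
To make the sample-size limitation concrete, the sketch below computes a Wilson score interval for a classifier's accuracy at several hold-out sizes. All numbers are hypothetical: the 90% accuracy is not a result from this study, and 48 is simply what a 20% hold-out of 240 responses would be, which is an assumption about the split rather than something reported here. The function name `wilson_interval` is likewise illustrative.

```python
import math


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = correct / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Hypothetical scenario: about 90% of predictions correct at each hold-out size.
widths = {}
for n in (48, 240, 2400):
    correct = round(0.9 * n)
    lo, hi = wilson_interval(correct, n)
    widths[n] = hi - lo
    print(f"n={n}: {correct}/{n} correct, interval {lo:.3f} to {hi:.3f}")
```

The interval narrows as the hold-out set grows, which is one way to see why larger datasets would allow firmer conclusions about which model performs best.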

Data availability

The corresponding author will provide data on request to support the findings of this study.

Author information

Authors and Affiliations

Department of Civil Engineering, Dayananda Sagar College of Engineering, Bengaluru, Karnataka, 560111, India
S. Sathvik, Atul Kumar Singh & G. ShivaKumar
Department of Civil Engineering, College of Engineering, King Saud University, P.O. Box 800, 11421, Riyadh, Saudi Arabia
Abdullah Alsharef
Department of Civil Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, 603203, India
Atul Kumar Singh
Kabridahar University, P.O Box 250, Kebri Dehar, Ethiopia
Mohd Asif Shah
Division of Research and Development, Lovely Professional University, Phagwara, Punjab, 144001, India
Mohd Asif Shah
Centre of Research Impact and Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, 140401, India
Mohd Asif Shah

Authors

S. Sathvik
Abdullah Alsharef
Atul Kumar Singh
Mohd Asif Shah
G. ShivaKumar

Contributions

S.S. wrote the main manuscript text; A.K.S. and G.S.K. prepared tables in the paper; M.A.S. and A.A. prepared figures in the paper. All authors reviewed the manuscript. Thanks to SRM Institute of Science and Technology and my PhD supervisor, Dr. L. Krishnaraj, for their support and supervision throughout this research.

Corresponding authors

Correspondence to S. Sathvik, Atul Kumar Singh or Mohd Asif Shah.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Sathvik, S., Alsharef, A., Singh, A.K. et al. Enhancing construction safety: predicting worker sleep deprivation using machine learning algorithms. Sci Rep 14, 15716 (2024). https://doi.org/10.1038/s41598-024-65568-2

Received: 28 January 2024
Accepted: 21 June 2024
Published: 08 July 2024
DOI: https://doi.org/10.1038/s41598-024-65568-2
Springer Nature Limited
