Detection of atmospheric radon concentration anomalies and their potential for earthquake prediction using Random Forest analysis

Tsuchiya, Mayu; Nagahama, Hiroyuki; Muto, Jun; Hirano, Mitsuhiro; Yasuoka, Yumi

doi:10.1038/s41598-024-61887-6

Detection of atmospheric radon concentration anomalies and their potential for earthquake prediction using Random Forest analysis

Article
Open access
Published: 31 May 2024

Volume 14, article number 11626, (2024)
Cite this article

Download PDF

You have full access to this open access article

Scientific Reports

Detection of atmospheric radon concentration anomalies and their potential for earthquake prediction using Random Forest analysis

Download PDF

Mayu Tsuchiya¹,
Hiroyuki Nagahama¹,
Jun Muto¹,
Mitsuhiro Hirano³ &
…
Yumi Yasuoka²

598 Accesses
10 Altmetric
Explore all metrics

Abstract

Various anomalies occurring before earthquakes are currently being studied to predict seismic events, with one of them being the radioactive element radon (²²²Rn). Radon concentrations in the soil, water, and atmosphere fluctuate in response to crustal movement. Recent research has statistically detected anomalies by analyzing the fluctuations in radon concentrations before earthquakes and conducting quantitative evaluations of radon. However, the method used to determine the parameters in the analysis was problematic. Therefore, in this study, we compared observed atmospheric radon concentration data with predicted values based on typical annual patterns using Random Forest analysis. We conducted a more objective analysis by employing this method and statistically determining anomalies using thresholds. This analysis was conducted using atmospheric radon concentration observation data obtained at Kobe Pharmaceutical University (KPU) before the 1995 Kobe Earthquake, and ionization currents emitted when radon decays were obtained at Fukushima Medical University (FMU) before the 2011 Tohoku-oki Earthquake. Consequently, before the major earthquakes occurred at both locations, the difference between the predicted and observed values exceeded the standard deviation by a factor of three. These results indicate the potential of Random Forest analysis to identify anomalies in atmospheric radon concentrations before earthquakes occur.

Method for prediction of landslide movements based on random forests

Article 14 October 2016

Identification of precipitation trend and landslide susceptibility analysis in Miandoab County using MATLAB

Article 02 June 2022

Prediction of the Kostanjek Landslide Movements Based on Monitoring Results Using Random Forests Technique

Introduction

Currently, various researchers are searching for the precursor phenomena of earthquakes, with a particular focus on the study of the radioactive element radon¹. Radon (²²²Rn) is an element with an atomic number of 86 and a half-life of 3.82 days. It is a noble gas and exhibits behavior similar to that of an ideal gas. Previous studies have revealed that radon concentrations in the soil, water, and atmosphere vary in response to crustal movement. In the Gulf of Corinth in Greece, atmospheric radon concentration suddenly decreased immediately before an earthquake². Concerning underwater radon concentrations, the radon in groundwater decreased three months before the Izu-Oshima-kinkai Earthquake, which occurred on January 14, 1978, and experienced a sharp decline five days before the earthquake³. Moreover, the radon concentration in groundwater increased several months before the 1995 Kobe Earthquake, peaking nine days before the earthquake⁴. Atmospheric radon concentrations have been shown to vary in response to crustal strain in the range of ${10}^{-6}$ to ${10}^{-8}$⁵. Thus, among water, soil, and air, atmospheric radon concentration serves as an indicator of fluctuations in response to even small strains.

In the aforementioned atmospheric radon concentration studies, the data exhibited seasonal cycles⁶. These seasonal fluctuations can be represented by sinusoidal waves^7,8. Previous research has aimed to detect anomalies in atmospheric radon concentrations by identifying and removing seasonality from observational data during earthquake-free periods and then analyzing the residuals to correlate to earthquakes^7,9. The average radon concentrations and the smoothed atmospheric residuals showed anomalies prior to the Kobe Earthquake of 1995¹⁰. Fluctuations in atmospheric radon concentration in response to tidal loading have also been observed during this period¹¹. A similar anomalous increase was also observed before and after the 2011 earthquake in northern Wakayama Prefecture⁹. In contrast, before the 2018 Northern Osaka Earthquake, an abnormal decrease in atmospheric radon concentration associated with quiescent seismic activity was observed¹². Thus, abnormal phenomena in atmospheric radon concentrations before earthquakes are not limited to increases but also include abnormal decreases. However, in these analyses of atmospheric radon concentrations, the magnitudes of the residuals in atmospheric radon concentrations were influenced by the determination of the sine curve representing seasonal variations, leading to uncertainties in anomaly detection.

To address this issue, Iwata et al. (2018) conducted anomaly detection using singular spectrum transformation (SST) without removing seasonal variations by directly utilizing observational data¹³. SST analysis can detect anomalies in periodic data, eliminating the need for preset sine curves that reflect seasonal variations. This advantage allows anomaly detection even in data where the underlying physical processes are not explicitly known. Using this approach, Iwata et al. (2018) analyzed atmospheric radon concentrations observed before the 2011 Tohoku-oki Earthquake. By comparing changes in the cumulative seismic moment with the rate of change in atmospheric radon concentrations, which was determined using SST, they were able to quantitatively evaluate anomalies. The results of this study reveal a statistically significant correlation between the detected anomalies in atmospheric radon concentrations and changes in the cumulative seismic moment. Furthermore, an increase in the rate of change was observed before the 2011 Tohoku-oki Earthquake. This period coincides with various anomalies, such as slow-slip events in the Japan Trench¹⁴. Additionally, a decrease in groundwater level and temperature in the wells was observed near the northern end of the main rupture region of the earthquake¹⁵. In addition, when seismic activity was analyzed using natural time analysis, the entropy changes in seismic motion reached a minimum of approximately three months before the Tohoku-oki Earthquake¹⁶. The alignment of these anomalies with crustal movements and other anomalies not limited to radon made the occurrence of preseismic anomalies more evident. However, SST analysis still has uncertainties in anomaly detection, owing to the influence of the parameters used in the analysis and the ambiguity in determining the criteria for considering an increase in the rate of change as abnormal.

To address the SST problem, attempts were made to detect anomalies in atmospheric radon concentrations before the Kobe and Tohoku-oki Earthquakes (Fig. 1) using Random Forest analysis, as such anomalies have been reported in previous studies. These two earthquakes were used in this study because observational data were available for a long period before the mainshock, and previous studies have made detailed comparisons with seismic events. Random Forest analysis is a type of machine learning that utilizes models created from learned data to predict or classify out-of-sample data without an arbitrary selection of parameters (see Methods section for details). Furthermore, adopting a 3σ threshold as the criterion for considering anomalies resolves the ambiguity in anomaly determination¹⁷. This analysis targeted the detection of anomalies with greater objectivity than conventional anomaly detection.

Results and discussion

To create a Random Forest prediction model, the observation period without particularly large earthquakes was designated as the normal period, and the subsequent periods of earthquakes were considered the preseismic period (Fig. 2). The data were split into normal and preseismic periods for Kobe Pharmaceutical University (KPU) and Fukushima Medical University (FMU) (Table 1). The KPU had a deficit of approximately one month in the observation data after the earthquake due to the damage caused by the Kobe Earthquake. The ionization current at the FMU increased after the Tohoku-oki Earthquake due to the high environmental radiation caused by the accident at the Fukushima Daiichi Nuclear Power Plant¹⁸. Therefore, observational data up to immediately before the earthquake was used in this study. Based on the data from the normal period, we created prediction models and then validated the accuracy of the generated predictive model by calculating the coefficient of determination. The date of observation was used as an explanatory variable, and the observed data were used as objective variables. Furthermore, to assess the feasibility of performing anomaly detection in real time and explore whether it can be analyzed using simpler methods, we utilized the observed data for training without excluding outliers.

Table 1 Locations of observational sites and periods of the ionization current data in the analysis.

Full size table

The case of KPU

To analyze the KPU, atmospheric radon concentration data from 1984 to 1995 was utilized, and a prediction model was developed using data from 1984 to 1988. This model is illustrated in Fig. 3a. As shown in Fig. 3a, the shapes of the graphs for the predicted and observed values were generally consistent. The coefficient of determination for this period was 0.816, indicating high reproducibility of the prediction model. Using this prediction model, we were able to replicate changes in the atmospheric radon concentration that occurred during the preseismic period. Figure 3a shows that during the preseismic period, there were many instances in which the observed values fell below the predicted values. In particular, a trend towards lower values was evident from 1990 to 1991. From 1992 to 1993, there were a few instances in which the observed values deviated significantly from the predicted values. However, from mid-1994 to early 1995, the predicted values significantly exceeded the observed values. To quantitatively evaluate fluctuations during the preseismic period, we calculated the coefficient of determination R² for the period (1990–1995). The overall R² value for the preseismic period was − 0.504. Until 1991, the values remained consistently low, with 1990 at − 7.107 and 1991 at − 2.326, both falling below − 1. Leading up to 1994, there was a gradual increase in values, with 1992 at − 0.374, 1993 at − 0.464, and 1994 at − 0.008. However, in 1995, the value dropped significantly to − 46.899.

As shown in Fig. 3b, an examination of the residual values between the observed and predicted values revealed variations in the atmospheric radon concentration during the normal and preseismic periods. As shown in Fig. 3a, the period from 1990 to 1991 exhibited values significantly below the predicted values. From 1992 to 1994, there were numerous instances in which the observed values fell below the predicted values during spring and summer and exceeded the predicted values during autumn and winter. From mid-1994 on, there was an extended period during which the observed values consistently exceeded the predicted values.

As mentioned previously, there was a period of low atmospheric radon concentrations from 1990 to 1991. There have been instances of seismic quiescence before significant earthquakes; indeed, seismic quiescence was reported before the 1995 Kobe Earthquake¹⁹. Prior to the 2018 Northern Osaka Earthquake, a decrease in atmospheric radon concentration linked to seismic quiescence was documented. This suggests that, similar to the Northern Osaka earthquake, seismic quiescence occurred in the vicinity of KPU from 1990 to 1991, as reflected by the low atmospheric radon concentration.

From the latter half of 1994 until just before the Kobe Earthquake, there was a period when predicted values were much higher than observed values. Between 1992 and 1994, the difference between the observed and predicted values was small, and the graph indicates that seasonal variations in atmospheric radon concentration were evident in the observed values. Consequently, it is believed that from the end of 1994, there was a noticeable divergence in trends between the normal and preseismic periods. From Fig. 3b, the difference between the observed and predicted values was calculated, and its standard deviation was set as σ. 2σ showed no correspondence with the earthquake and no continuous variation. However, the difference exceeded 3σ from December 23 to 28, 1994, when the earthquake was imminent¹⁷.

The increase in radon concentration prior to the Kobe Earthquake can be attributed to changes in the stress field of the area. In the vicinity of KPU, a transition from a contractive to an extensional strain field has been reported since mid-1994²⁰. It has been revealed that an extensional stress field promotes the release of carbon dioxide from near-surface fluids²¹. Therefore, it is likely that the transition from the compressional stress field to the extensional stress field enhances radon emanation, leading to a significant increase in atmospheric radon concentration exceeding 3σ at the end of 1994. Thus, the significant increase in atmospheric radon concentrations detected at the end of 1994 in the Random Forest analysis in this study is considered an anomaly that has been confirmed in previous studies. This confirms the robustness of the prediction and anomaly-detection methods based on Random Forest analysis used in this study.

The case of FMU

During this application to the FMU, we utilized data consisting of the ionization current measured at the RI facility of the FMU. The ionization current values can be linearly converted to atmospheric radon concentrations (see Eq. (1) in the “Methods” section for details). The forecasting was done using data from 2008 to 2011, while the prediction model was developed using FMU data from 2002 to 2007. Figure 4 displays the model and forecast results after 2008. As shown in Fig. 4a, the observed values exhibited periodicity resembling a sinusoidal curve with peaks in both the normal and preseismic periods. Additionally, it can be noted that from 2008 to mid-2010 the shape of the predicted values generally matched the shape of the observed values. However, in the latter half of 2010, a shift in the peak seasonal variation became apparent. Given that the observed values demonstrate a periodicity similar to a sinusoidal curve throughout the normal and preseismic periods, this characteristic is consistent with previous research⁷. The presence of minima and maxima in the vicinity of the peaks indicates that a relatively accurate prediction model was created for the FMU. The coefficient of determination R² of this model for the normal period (2003–2008) was 0.727. A high determination value indicates that the prediction model is highly accurate and demonstrates its robustness.

Next, we compared the observed and predicted preseismic ionization current data from 2008 to 2011 using a Random Forest analysis. No phase shift was observed in the seasonal variation of the ionization current between the predicted and observed data from 2008 to mid-2010. However, a phase shift was observed after September 2010. In addition, the predicted values reproducing the normal period show a clear delay in the positive peak values prior to the Tohoku-oki Earthquake. Furthermore, higher ionization currents were observed at the end of 2010, immediately before the earthquake.

The observed variation in the preseismic ionization current data was also recognized in the coefficient of determination R² between the prediction model and the observed values. To quantitatively evaluate this variation, we calculated the R² before the earthquake (2008–2011) to be -0.341. This indicates that the prediction accuracy during this period was lower than during the normal period (2003–2008). To identify the periods during which the observed values of the ionization current deviated significantly from the predictions based on seasonal variations, the standard deviation of the difference between the predicted and observed values was calculated. R² calculated for each year within the preseismic period showed an anomalous variation toward the Tohoku-oki Earthquake. The R² values remained somewhat low at − 0.271 in 2008, − 0.029 in 2009, and − 0.367 in 2010, but dropped significantly to − 7.219 in 2011. The significant decrease in the R² value in 2011 indicates that the atmospheric radon concentration estimated from the ionization current data deviated significantly from the normal period.

An anomalous variation was also observed in the residual variation of the ionization current (Fig. 4b). Figure 4b shows the difference between the observed and predicted values for the preseismic period in the FMU, with a line drawn at 3σ. From Fig. 4b, it is evident that the residual is generally larger during the preseismic period than during the normal period. In mid-2008, late 2008, and late 2010, the residual notably increased, but particularly by late 2010, it is apparent that the observed ionization current values exceeded 3σ, significantly deviating from the seasonal variation.

From the GNSS time series measured before the Tohoku-oki Earthquake, it was determined that northeastern (NE) Japan underwent east–west deformation from 2003 to 2010^22,23, with significant eastward deformation occurring in 2008 and 2010²⁴. The eastward motion leading to volumetric expansion was caused by an aseismic slip in the Japan Trench²³. This suggests that similar to KPU, extensional deformation occurred during the preseismic period²¹, leading to an increase in the ionization current during this period. Iwata et al. (2018) conducted cumulative seismic moment calculations simultaneously with SST analysis and reported an increase in the cumulative seismic moment associated with aseismic slip¹³. Radons in soil are released into the atmosphere through soil fissures, and changes in soil fissures due to crustal deformation are expected to sensitively alter the concentration of radon in the atmosphere^25,26. Based on these observations, Iwata et al. (2018) suggested that the expansion of fractures caused by volumetric expansion in NE Japan due to aseismic slip led to an increase in the rate of radon emanation to the surface¹³. Therefore, the significant increase in atmospheric radon concentration detected in this Random Forest analysis was likely associated with changes in the fracture width due to aseismic slip, which is similar to the findings of Iwata et al. (2018).

Practical challenges for real-time monitoring of atmospheric radon concentrations

To establish a real-time monitoring network for earthquakes and crustal movement in the future, recent developments in the monitoring networks of atmospheric radon concentrations measured by Radioisotope facilities (RI facilities) across Japan have enabled us to synchronously monitor anomalies related to crustal deformation in Japan using radon concentrations¹². Muto et al. (2021) pointed out that these measurements reveal that atmospheric radon concentration reflects the average values of radon exhalation and is independent of local heterogeneity in geological and hydrological structures. In the network, the non-parametric analysis used to estimate change points in the time series clearly indicates that the change in atmospheric radon concentration is statistically correlated with the seismic activities in the Kobe Earthquake and Tohoku regions prior to the 2011 Tohoku-oki Earthquake¹³.

Moreover, for real-time monitoring, it is necessary to validate the effectiveness of this method for other earthquakes occurring in various tectonic settings. To validate this method, we applied Random Forest analysis to Wakayama Medical University (WMU) during 2011–2013, where we obtained data on atmospheric radon concentration variations during the earthquake swarm period⁹. The results show that the difference between observed and predicted values before the 2011 and 2013 earthquakes exceeded 3σ (Refer to the supplementary Fig. S1, S2). This indicates that Random Forest analysis of atmospheric radon concentration variations is also effective for earthquake swarms with short cycles. Further analysis is required to demonstrate the versatility of these methods for earthquakes occurring in other regions, including other countries.

Previous studies have shown that seismic activity and weather are influenced by solar activity^27,28,29. Because the solar triggering cycle is an 11-year cycle³⁰, continuous observation data longer than a multiple of 11 years is required to accurately consider the effects of solar activity in the analysis. To achieve this, further research is essential to verify the validity of the method in various locations, analyze the method using longer-term data, and examine the lead time between anomaly detection and earthquake occurrence. During the periods adopted for the “Teaching Phase” and “Testing Phase,” when local seismicity M6 + is incorporated into the data, the seasonal levels of radon are periodic and seismic events are randomly related to the time scale. Therefore, training, testing, and anomaly detection can potentially be biased by the data used to predict the radon time series. Therefore, it is recommended that radon concentration fluctuations during the seismic quiet period be used in the “Guidance phase” and “Test phase.” With the resolution of these issues, anomaly detection for earthquake prediction has become considerably more important. However, in this study, including M6 + in the catalog did not affect the main results capturing the anomalies of the mainshock, and anomalies were observed before the M7 earthquake. To seek greater accuracy in targeting smaller earthquakes, a future challenge will be to create a new dataset that excludes the confidence of M6 + and reanalyze it. Therefore, the results of the Random Forest analysis of atmospheric radon concentration variations indicate the potential to identify abnormalities in atmospheric radon concentrations before earthquake occurrence, suggesting the possibility of capturing earthquake precursors.

Conclusion

In this study, we conducted Random Forest analysis at both KPU and FMU to minimize ambiguity for parameter determination using a statistical approach for anomaly detection based on numerical values for atmospheric radon concentration before earthquakes. This allowed for more objective anomaly detection when compared with SST. The results of the Random Forest analysis revealed anomalies in the ionization current data, indicating significant discrepancies between the predicted and observed data during specific periods before the 1995 Kobe Earthquake at KPU and before the 2011 Tohoku-oki Earthquake at FMU.

Consequently, it has become possible to analyze the observed ionization current values as raw data, leading to the potential for real-time anomaly detection in short-term earthquake predictions using atmospheric radon concentrations. The ionization currents were measured at other RI facilities. This study detected anomalies from the observed ionization current data without converting them into atmospheric radon concentrations, suggesting the possibility of promptly detecting anomalies, even without prior knowledge, to convert ionization currents into atmospheric radon concentrations. Additionally, the direct utilization of observational data can simplify the procedures for real-time detection, potentially contributing to the establishment of a nationwide real-time monitoring network for earthquakes and crustal movement using RI facilities. If a prediction model for atmospheric radon concentrations beyond normal periods is established, real-time anomaly detection becomes feasible by comparing the observed atmospheric radon concentration with that of the model, potentially contributing to the development of a crustal deformation-monitoring network.

Future improvements are needed, such as analyzing data from other locations over a longer period of time to validate its effectiveness and excluding the confidence of M6 + from the training dataset to achieve more accurate anomaly detection.

Method

Atmospheric radon concentration data

The data used for the analysis were derived from atmospheric radon concentration measurements at KPU before the 1995 Kobe Earthquake (Mw 6.9)¹⁰ and ionization current data measured at FMU before the 2011 Tohoku-oki Earthquake (Mw 9.0)¹³. Atmospheric radon concentration at KPU was measured with a gas flow-type ionization chamber (Fuji Electric Systems Co., Ltd., Japan; model: NAG513; effective volume: 1.8 × ${10}^{-2}$ m³)⁵. Measurements were performed at FMU using another gas flow-type ionization chamber (Aloka Co., Ltd, Japan; model: DGM-101; effective volume: 1.4 × ${10}^{-2}$ m³; equipped with ⁹⁰Sr calibration source)⁷. The ionization current data included both the background radiation values and ⁹⁰Sr radiation levels provided by the instrument.

In this study, it was assumed that the ionization current data represented the concentration of radon in the atmosphere through a linear transformation using the following formula³¹:

$$\begin{array}{c}C=\left(FIW/eEV\right)\times {10}^{-15}=fI,\end{array}$$

(1)

where $C$ is the atmospheric radon concentration, $F$ is the correction coefficient, $I$ is the ionization current, $W$ is the W-value of alpha particles in air, $e$ is the elementary charge, $E$ is the alpha energy, $V$ is the effective volume, and $f$ is the conversion coefficient³¹. Records at hourly intervals using the daily minimum values among the data, which were least affected by weather and geographical conditions, were deemed to best represent the daily atmospheric radon concentration³². Days with missing data were treated as missing values and data from leap days were excluded.

Random Forest analysis

Introduced by Breiman³³ in 2001, Random Forest analysis belongs to the ensemble learning category of machine learning³³. The main goal is to integrate insights from multiple models to derive definitive answers. In the RF domain of Random Forests, ensemble learning models are created by using a methodology known as bagging. This process generates multiple groups of training data, including duplicates, from the entire training data pool. By creating a large number of decision trees characterized by low accuracy and minimal intermodel correlation, Random Forests effectively prevent overfitting during the machine learning process. Random Forest analysis is versatile and can be used for data classification and regression. In the classification scenario, the final decision was made by majority vote among the results provided by the multiple decision trees. Conversely, in the regression task, the final result is computed by averaging the predictions generated by multiple decision trees. This averaging technique plays an important role in mitigating the effects of the noise inherent in each decision tree. The decision trees generated using the bagging technique adhered to a consistent distribution³⁴. This consistency guarantees that the expected value of the mean derived from a collection of X decision trees aligns with the anticipated value of each decision tree. We used Random Forest analysis for multiple regression in this study, which allowed us to forecast data outside of the training data timeframe.

Creating the prediction model

In this study, we employed multiple regression using a Random Forest in order to forecast future data beyond the training data period. The analyses were performed using Python version 3.8³⁵, Scikit-learn³⁶, Pandas³⁷, and NumPy³⁸. To assess the robustness of the Random Forest time-series prediction model, we randomly selected 70% of the data from the normal period for both FMU and KPU to create prediction models. Subsequently, we compared the predicted data with the remaining 30% of the data from the normal period and calculated the coefficient of determination (R²) to confirm the model’s prediction accuracy (Fig. 2). To construct Random Forest prediction models and perform model fitting, we utilized the observed data as explanatory variables. Previous studies have indicated that atmospheric radon concentrations are temperature dependent³⁹ and fluctuate seasonally with the northward and southward movement of the rainy season fronts. Because the teacher data for the Random Forest analysis includes annual variations that reflect seasonal variations due to the movement of such fronts, they do not affect the discrepancy between the observed and predicted values. When creating the model, we executed a Random Forest with an unrestricted depth as well as depths of 5, 10, 15, 20, and 25. The depth with the highest coefficient of determination between observed and predicted values was selected. Consequently, the depth of 10 m had the highest coefficient of determination. Supplementary Table S1 shows the calculated coefficients of determination at each depth.

Determination coefficient

To assess the agreement between the predicted and observed data, we employed coefficients of determination derived from their differences. The coefficient of determination, denoted as R², was calculated based on the predicted value u, measured value v, and mean value $\overline{v}$ of the measured data v as follows³⁶:

$$\begin{array}{c}{R}^{2}\left(u,v\right)=1-\left\{\sum_{i=0}^{n-1}{\left({v}_{i}-{u}_{i}\right)}^{2}\right\}/\left\{\sum_{i=0}^{n-1}{\left({v}_{i}-\overline{v }\right)}^{2}\right\}.\end{array}$$

(2)

If the predicted and observed values were in agreement, then R² was equal to 1. If there was a large discrepancy between the predicted and observed values, R² had a negative value. If the slopes or waveforms of the actual and predicted data differed, the R² value was negative. After evaluating the accuracy of the model by R² in this way, the model was applied to predict the ionization current values for the specified data period. The R² values were then compared with the model values to explore anomalies in preseismic atmospheric radon concentrations relative to normal conditions. Furthermore, the coefficient of determination of the model was validated through cross-validation by dividing the normal period into five segments. Please refer to Supplementary Table S2 for the respective values.

Anomaly identification method

In order to detect anomalies in the observed atmospheric radon concentrations and ionization current values, we employed a 3σ criterion. As there is no general definition of anomalies for time-series data of geochemical and geophysical observations, it is necessary to set standards for anomalies suitable for radon observations¹⁷. We considered that both the amplitude and duration were sufficient to distinguish it from seasonal fluctuations, and defined radon changes that maintained a level of 3σ or higher for a period of one day or more as an anomaly.

First, we subtract the predicted values obtained through Random Forest analysis from the observed values to calculate the residuals. Next, we computed the standard deviation σ of the residuals. Based on the properties of the Gaussian distribution, periods with values exceeding 3σ were considered as times when anomalous values were observed, as they fell outside the range encompassing 99.7% of the data.

References

Huang, P., Lv, W., Huang, R., Luo, Q. & Yang, Y. Earthquake precursors: A review of key factors influencing radon concentration. J. Environ. Radioactiv. 271, 107310 (2024).
Article CAS Google Scholar
Karastathis, V. K. et al. Observations on the stress related variations of soil radon concentration in the Gulf of Corinth, Greece. Sci. Rep. 12, 5442 (2022).
Article ADS CAS PubMed PubMed Central Google Scholar
Wakita, H., Nakamura, Y., Notsu, K., Noguchi, M. & Asada, T. Radon anomaly: A possible precursor of the 1978 Izu-Oshima-kinkai earthquake. Science 207(4433), 882–883 (1980).
Article ADS CAS PubMed Google Scholar
Igarashi, G. et al. Ground-water radon anomaly before the Kobe earthquake in Japan. Science 269(5220), 60–61 (1995).
Article ADS CAS PubMed Google Scholar
Yasuoka, Y. et al. Preseismic changes in atmospheric radon concentration and crustal strain. Phys. Chem. Earth. A/B/C 34(6–7), 431–434 (2009).
Article ADS Google Scholar
Omori, Y. et al. Variation of atmospheric radon concentration with bimodal seasonality. Radiat. Meas. 44(9–10), 1045–1050 (2009).
Article CAS Google Scholar
Hayashi, K. et al. Normal seasonal variations for atmospheric radon concentration: A sinusoidal model. J. Environ. Radioact. 139, 149–153 (2015).
Article CAS PubMed Google Scholar
Kobayashi, Y. et al. Annual variation in the atmospheric radon concentration in Japan. J. Environ. Radioact. 146, 110–118 (2015).
Article CAS PubMed Google Scholar
Goto, M. et al. Anomalous changes in atmospheric radon concentration before and after the 2011 northern Wakayama Earthquake (Mj 5.5). Radiat. Prot. Dosim. 174(3), 412–418 (2017).
CAS Google Scholar
Yasuoka, Y. & Shinogi, M. Anomaly in atmospheric radon concentration: a possible precursor of the 1995 Kobe, Japan, earthquake. Health. Phys. 72(5), 759–761 (1997).
Article CAS PubMed Google Scholar
Omori, Y., Nagahama, H., Yasuoka, Y. & Muto, J. Radon degassing triggered by tidal loading before an earthquake. Sci. Rep. 11, 4092 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Muto, J. et al. Preseismic atmospheric radon anomaly associated with 2018 Northern Osaka earthquake. Sci. Rep. 11, 7451 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Iwata, D., Nagahama, H., Muto, J. & Yasuoka, Y. Non-parametric detection of atmospheric radon concentration anomalies related to earthquakes. Sci. Rep. 8, 13028 (2018).
Article ADS PubMed PubMed Central Google Scholar
Ito, Y. et al. Episodic slow slip events in the Japan subduction zone before the 2011 Tohoku-Oki earthquake. Tectonophysics 600, 14–26 (2013).
Article ADS Google Scholar
Orihara, Y., Kamogawa, M. & Nagao, T. Preseismic changes of the level and temperature of confined groundwater related to the 2011 Tohoku earthquake. Sci. Rep. 4(1), 6907 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Varotsos, P., Sarilis, N. & Skorolas, E. Compatibility of the SES generation model with the precursory phenomena before the Tohoku M 9 Earthquake in Japan in 2011. In Natural Time Analysis: The New View of Time, Part II: Advances in Disaster Prediction Using Complex Systems (eds Varotsos, P. et al.) 189–208 (Springer Nature, Switzerland, 2023).
Chapter Google Scholar
Igarashi, G. & Wakita, H. Groundwater radon anomalies associated with earthquakes. Tectonophysics 180(2–4), 237–254 (1990).
Article ADS Google Scholar
Saito, K. et al. Detailed deposition density maps constructed by large-scale soil sampling for gamma-ray emitting radioactive nuclides from the Fukushima Dai-ichi Nuclear Power Plant accident. J. Environ. Radioact. 139, 308–319 (2015).
Article CAS PubMed Google Scholar
Mogi, K. Some features of seismic activities before the recent large earthquakes in and near Japan - the 1995 Kobe earthquake and the 1995 Iturup-oki earthquake. 120–138. In Proceedings of the First Joint Meeting of the U.S.-Japan Conference on Natural Resources (UJNR) Panel on Earthquake Research, Pasadena, California, November 12 - 14, 1996：Open-file report / U.S. Geological Survey, 97-467 (1997)
Kyoto University and Earthquake Research Institute, The University of Tokyo. Observations of crustal movements and discharge change at Rokko-Takao station. Rep. Coord. Comm. Earthq. Predict. 54, 695–707 (1995). (In Japanese)
Google Scholar
Tamburello, G., Pondrelli, S., Chiodini, G. & Rouwet, D. Global-scale control of extensional tectonics on CO₂ earth degassing. Nat. Commun. 9, 4608 (2018).
Article ADS PubMed PubMed Central Google Scholar
Suito, H., Nishimura, T., Tobita, M., Imakiire, T. & Ozawa, S. Interplate fault slip along the Japan Trench before the occurrence of the 2011 off the Pacific coast of Tohoku Earthquake as inferred from GPS data. Earth Planets Space 63, 615–619 (2011).
Article ADS Google Scholar
Ozawa, S. et al. Preceding, coseismic, and postseismic slips of the 2011 Tohoku earthquake, Japan. J. Geophys. Res-Sol. Ea. 117, B07404 (2012).
Article ADS Google Scholar
Mavrommatis, A. P., Segall, P. & Johnson, K. M. A decadal-scale deformation transient prior to the 2011 Mw 9.0 Tohoku-oki earthquake. Geophys. Res. Lett. 41(13), 4486–4494 (2014).
Article ADS Google Scholar
Kawada, K. et al. Time-scale invariant changes in atmospheric radon concentration and crustal strain prior to a large earthquake. Nonlin. Processes Geophys. 14, 123–130 (2007).
Article ADS Google Scholar
Koike, K., Yoshinaga, T., Ueyama, T. & Asaue, H. Increased radon-222 in soil gas because of cumulative seismicity at active faults. Earth Planet Space 66, 57 (2014).
Article ADS Google Scholar
Marchitelli, V., Harabaglia, P., Troise, C. & De Natale, G. On the correlation between solar activity and large earthquakes worldwide. Sci. Rep. 10, 11495 (2020).
Article CAS PubMed PubMed Central Google Scholar
Carslaw, K. S., Harrison, R. G. & Kirkby, J. Cosmic rays, clouds, and climate. Science 298, 1732–1737 (2002).
Article ADS CAS PubMed Google Scholar
Marsh, N. & Svensmark, H. Cosmic ray, clouds, and climate. Space Sci. Rev. 94, 215–230 (2000).
Article ADS CAS Google Scholar
Hathaway, D. H. The solar cycle. Living Rev. Sol. Phys. 7(1), 4 (2010).
Article ADS PubMed PubMed Central Google Scholar
Tajika, Y. et al. Radon concentration of outdoor air: Measured by an ionization chamber for radioisotope monitoring system at radioisotope institute. J. Radioanal. Nucl. Chem. 295, 1709–1714 (2013).
Article CAS Google Scholar
Porstendorfer, J., Butterweck, G. & Reineking, A. Daily variation of the radon concentration indoors and outdoors and the influence of meteorological parameters. Health Phys. 67(3), 283–287 (1994).
Article CAS PubMed Google Scholar
Breiman, L. Random Forests. Mach. Learn. 45, 5–32 (2001).
Article Google Scholar
Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction 2nd edn. Springer Series in Statistic (Springer, 2009).
Book Google Scholar
Van Rossum, G. & Drake, F. L. Python 3 Reference Manual (CreateSpace, 2009).
Google Scholar
Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
MathSciNet Google Scholar
McKinney, W. Data structures for statistical computing in Python. In: van der Walt, S. & Millman, K. J. (eds) Proc. 9th Python in Science Conf. 56–61 (2010).
Harris, C. R. et al. Array programming with NumPy. Nature 585, 357–362 (2020).
Article ADS CAS PubMed PubMed Central Google Scholar
Yasuoka, Y. & Shinogi, M. The variation of atmospheric ²²²Rn concentration in Kobe. Radioisotopes 43, 688–694 (1994).
Article CAS Google Scholar
Uieda, L. et al. PyGMT: A Python interface for the Generic Map** Tools (v0.3.1). Zenodo (2021).

Download references

Acknowledgements

This study was supported by the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) of Japan under its Earthquake and Volcano Hazards Observation and Research Program (1207, The Second Program THK_10, The Third Program THK_08). Also, ionization current data from Fukushima Medical University was provided by Mr. Toshiyuki Suzuki of the Radioisotope Center, Fukushima Medical University (FMU), and ionization current data from Wakayama Medical University in the supplement was provided by Dr. Hayato Ihara of the Radioisotope Laboratory, Wakayama Medical University (WMU).

Author information

Authors and Affiliations

Department of Earth Science, Graduate School of Science, Tohoku University, 6-3 Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8578, Japan
Mayu Tsuchiya, Hiroyuki Nagahama & Jun Muto
Radioisotope Research Center, Kobe Pharmaceutical University, 4-19-1 Motoyamakitamachi, Higashinada-ku, Kobe, 658-8558, Japan
Yumi Yasuoka
School of Engineering, Utsunomiya University, 7-1-2 Yoto, Utsunomiya, 321-8585, Japan
Mitsuhiro Hirano

Authors

Mayu Tsuchiya
View author publications
You can also search for this author in PubMed Google Scholar
Hiroyuki Nagahama
View author publications
You can also search for this author in PubMed Google Scholar
Jun Muto
View author publications
You can also search for this author in PubMed Google Scholar
Mitsuhiro Hirano
View author publications
You can also search for this author in PubMed Google Scholar
Yumi Yasuoka
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

M.T., H.N., J.M. and M.H. designed the study. Y.Y. and H.N. contributed to collecting data from radioisotope facilities. M.T. analyzed the data; M.T. wrote the manuscript. All the authors contributed to the discussion and review of the paper.

Corresponding author

Correspondence to Mayu Tsuchiya.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Tsuchiya, M., Nagahama, H., Muto, J. et al. Detection of atmospheric radon concentration anomalies and their potential for earthquake prediction using Random Forest analysis. Sci Rep 14, 11626 (2024). https://doi.org/10.1038/s41598-024-61887-6

Download citation

Received: 28 February 2024
Accepted: 10 May 2024
Published: 31 May 2024
DOI: https://doi.org/10.1038/s41598-024-61887-6
Springer Nature Limited

Detection of atmospheric radon concentration anomalies and their potential for earthquake prediction using Random Forest analysis

Abstract

Similar content being viewed by others

Method for prediction of landslide movements based on random forests

Identification of precipitation trend and landslide susceptibility analysis in Miandoab County using MATLAB

Prediction of the Kostanjek Landslide Movements Based on Monitoring Results Using Random Forests Technique

Introduction

Results and discussion

The case of KPU

The case of FMU

Practical challenges for real-time monitoring of atmospheric radon concentrations

Conclusion

Method

Atmospheric radon concentration data

Random Forest analysis

Creating the prediction model

Determination coefficient

Anomaly identification method

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Supplementary Information

Supplementary Information.

Rights and permissions

About this article

Cite this article

Navigation

Detection of atmospheric radon concentration anomalies and their potential for earthquake prediction using Random Forest analysis

Abstract

Similar content being viewed by others

Method for prediction of landslide movements based on random forests

Identification of precipitation trend and landslide susceptibility analysis in Miandoab County using MATLAB

Prediction of the Kostanjek Landslide Movements Based on Monitoring Results Using Random Forests Technique

Introduction

Results and discussion

The case of KPU

The case of FMU

Practical challenges for real-time monitoring of atmospheric radon concentrations

Conclusion

Method

Atmospheric radon concentration data

Random Forest analysis

Creating the prediction model

Determination coefficient

Anomaly identification method

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher's note

Supplementary Information

Supplementary Information.

Rights and permissions

About this article

Cite this article

Share this article

Search

Navigation