Introduction

The outdoor exposure test is one of the standard methods for evaluating the durability of wood-based boards. Because a test specimen is exposed to a natural weathering environment over a long period, its deterioration depends strongly on the climate conditions at the exposure location. Consequently, the results of outdoor exposure tests conducted at specific locations are not generally applicable to locations with different climate conditions. This is a disadvantage of the outdoor exposure test.

To overcome this problem, Sekino et al. [1] and Kojima et al. [2, 3] extensively collected the outdoor exposure data of commercial wood-based boards exposed at eight representative locations across Japan, and quantified the deterioration of wood-based boards at different locations. The relationships between board deterioration and outdoor exposure conditions were modeled by introducing a combination of climate factors, called “weathering intensity (WI)”. The best combination of climate factors was determined based on the coefficient of correlation from simple linear regression analysis, and it was found that the WI defined by the logarithm of \( \sum {({\text{Temperature}} \times {\text{Precipitation}})} \) was highly correlated to the deterioration of mechanical properties, namely internal bond strength (IB), modulus of rupture, and lateral nail resistance. Although the WI data were well fitted by the regression analysis, it remains unclear whether the WI is applicable to predict the deterioration of wood-based boards, because no validation was performed with an independent set of data.

Furthermore, there are three questions regarding the data analysis. Firstly, the WI is a combination of climate factors involving mathematical transformation, so it is difficult to examine the impact of individual climate factors on the deterioration of wood-based boards. Secondly, simple linear regression uses one variable to explain patterns in the data by looking for a relationship between two continuous variables. If there is more than one possible explanatory variable and an important third variable is not included, we could miss significant relationships between the first two variables, or even come to the wrong conclusion [4]. Instead, it would then be more efficient to include all the information available in a multivariate analysis. Thirdly, a group of parametric tests, including linear regression, t test and analysis of variance (ANOVA), relies on the four assumptions: independence, homogeneity of variance, normality of error, and linearity [4]. If these assumptions are contravened, the parameter estimates are no longer valid and the statistical significance of WI cannot be assessed. It is highly likely that one or more assumptions were contravened by the simple linear regression between the WI and the deterioration of mechanical properties. Therefore, in this paper, we addressed these questions using multiple linear regression (MLR), presenting how to check the assumptions of MLR.

In contrast, an artificial neural network (ANN) is a nonlinear computational model capable of modeling complex, undefined, and nonlinear relationships between variables with better results than traditional statistical regressions [5]. Background information on ANNs can be found in the literature [6, 7]. The characteristic feature of ANNs is that they are not programmed; they are trained from a series of examples, by adjusting the weights of the relations between the variables, without needing to know beforehand the relations that may exist between the variables involved in the process. In the field of wood science, ANNs have been applied to predict mechanical and physical properties of wood [8–10], fracture toughness [11], thermal conductivity [12], hygroscopic equilibrium moisture content [13], nonisothermal diffusion of moisture [14], dielectric loss factor [15], and the drying process of wood [16–19]. In the field of wood-based composite materials as well, some mechanical and physical properties of wood-based boards have been successfully predicted using ANNs based on their physical properties [20–24] or on processing parameters that are to be optimized in a board manufacturing process [25–28].

In this study, we focused on the IB of particleboard subjected to outdoor exposure. An MLR and an ANN were developed to predict the IB of particleboard during outdoor exposure based on climate data collected at eight locations in Japan. The ANN model and the MLR model were developed with the data from five locations, and the performance of the models was assessed for the remaining three locations. In addition, techniques to check the assumptions in MLR were presented, and the impact of climate exposure on the IB under outdoor exposure was examined by means of statistical analysis.

Materials and methods

Samples and outdoor exposure tests

Industry-manufactured phenol–formaldehyde resin-bonded particleboards (hereafter called “boards”) were obtained for the experiments. The boards were manufactured from wood processing residues, and satisfied the requirements for Type 18 and Type P boards under JIS A-5908 [29]. Type 18 boards satisfy 18 MPa in modulus of rupture, and Type P indicates waterproof particleboard. The density and thickness of the boards were 0.75 g/cm³ and 12.2 mm, respectively. Because the boards were industry-manufactured, information on the processing parameters, such as particle size and hot pressing temperature and time, was not available. Further details are provided in the references [30, 31]. Thirty specimens measuring 50 × 50 mm were prepared as controls, and the IB test was conducted according to JIS A-5908 [29]. The initial IB was 0.83 MPa on average with a standard deviation of 0.09 MPa.

The outdoor exposure tests were conducted at eight locations in Japan from February 2004 to March 2011. Figure 1 and Table 1 show the latitude, longitude, and climate conditions of the locations. Twelve particleboards measuring 300 × 300 mm were set up for each location on an exposure stand that faced south at an angle of 90° to the ground. The cut edges of the boards were coated with enamel paint as a waterproof agent prior to outdoor exposure. Two boards were collected from each location after 1, 2, 3, 4, 5, and 6 (or 7) years of exposure, and were subjected to the IB test. Prior to the IB test, the boards were conditioned to the moisture content of approximately 10 %. For each location, thirteen specimens measuring 50 × 50 mm were cut from the boards. The detailed cutting pattern was described by Korai et al. [30, 31].

Fig. 1 Eight locations for outdoor exposure tests in Japan

Table 1 Summary of \(T_{\rm m}\), \(S_{\rm m}\), and \(P_{\rm m}\) for each location

Data preparation

Climate data for the eight locations over the exposure period were collected from the website of the Meteorological Agency in Japan [32]. The data included annual mean temperature (T), annual sunshine duration (S), and annual precipitation (P). The individual annual data were averaged over the exposure period (t; t = 1–7 years) using the following equations:

$$ T_{mtj} = \frac{1}{t}\sum\limits_{k = 1}^{t} {T_{kj} } \quad \left( {t = 1 {-} 7,\;j = 1 {-} 8} \right) $$
(1)
$$ S_{mtj} = \frac{1}{t}\sum\limits_{k = 1}^{t} {S_{kj} } \quad \left( {t = 1{-}7,\;j = 1{-}8} \right) $$
(2)
$$ P_{mtj} = \frac{1}{t}\sum\limits_{k = 1}^{t} {P_{kj} } \quad \left( {t = 1{-}7,\;j = 1{-}8} \right) $$
(3)

where j is the location number, and \(T_{mtj}\), \(S_{mtj}\), and \(P_{mtj}\) are the means of T, S, and P, respectively, at location j during the exposure period t. Table 1 lists a summary of the mean annual climate data for each location.
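The running means in Eqs. (1)–(3) can be illustrated with a minimal Python sketch. The function name `running_mean` and the annual temperatures below are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
from itertools import accumulate


def running_mean(annual_values):
    """Mean over the first t years, for t = 1..len(annual_values) (Eqs. 1-3)."""
    cumulative = accumulate(annual_values)
    return [total / t for t, total in enumerate(cumulative, start=1)]


# Hypothetical annual mean temperatures (deg C) at one location, years 1-4.
annual_temperature = [10.0, 12.0, 11.0, 13.0]
t_mean = running_mean(annual_temperature)
print(t_mean)  # [10.0, 11.0, 11.0, 11.5]
```

The same function applies unchanged to annual sunshine duration and annual precipitation, giving one averaged value per location and exposure period.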

The IB values of the thirteen specimens obtained from the same location were averaged to remove the variability in IB between the specimens. The mean IB (\({\text{IB}}_{\rm m}\)) of each location is shown in Fig. 2, which provides strong evidence that the IB deterioration depended on the exposure duration and the climate conditions at each location.

Fig. 2 Mean internal bond strength (IB) of each location as a function of exposure period

The data obtained from the eight locations were separated into two groups by analyzing the variability in the climate data of each location, as discussed later in the results and discussion section. The data from the locations 1, 2, 4, 7, and 8 were used to develop MLR and ANN models, while the data from the remaining three locations 3, 5, and 6 were used to evaluate the predictive ability of the models.

MLR model

We considered models that assume that the IB deterioration of particleboard under outdoor exposure is influenced by exposure period, temperature, sunshine duration, and precipitation. An MLR model was developed assuming that \({\text{IB}}_{\rm m}\) depends linearly on t, \(T_{\rm m}\), \(S_{\rm m}\), and \(P_{\rm m}\). The MLR model was expressed as follows:

$$ {\text{IB}}_{\rm m} = \beta_{0} + \beta_{1} \times t + \beta_{2} \times T_{\rm m} + \beta_{3} \times S_{\rm m} + \beta_{4} \times P_{\rm m} + \varepsilon $$
(4)

where \(\beta_{i}\) (i = 0–4) are the parameters to be estimated, and \(\varepsilon\) is the error term, which follows a normal distribution with a mean of zero and constant variance.
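As an illustration of how the parameters in Eq. (4) can be estimated, the following Python sketch fits a linear model by ordinary least squares through the normal equations. It is not the R procedure used in this study; the function names, the predictor rows, and the "true" coefficients are hypothetical, and the response is generated without noise so that the fit can be checked against known values.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_mlr(predictors, response):
    """Least-squares estimates [b0, b1, ...] for y = b0 + b1*x1 + ... + error."""
    design = [[1.0] + list(row) for row in predictors]  # intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical rows of (t, T_m, S_m, P_m) in arbitrary units.
rows = [(1, 10, 5, 2), (2, 12, 4, 3), (3, 9, 6, 1),
        (4, 14, 3, 5), (5, 11, 7, 2), (6, 13, 5, 4)]
true_beta = [1.0, -0.1, -0.02, 0.01, -0.05]  # hypothetical coefficients
ib_values = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], row))
             for row in rows]

estimates = fit_mlr(rows, ib_values)
print([round(b, 4) for b in estimates])
```

With real, noisy measurements the estimates would not reproduce any underlying coefficients exactly, and a statistical package would normally be preferred because it also reports standard errors and diagnostics.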

The assumptions of MLR, such as homogeneity of variance, normality of error, and linearity, were diagnosed by looking for patterns in certain plots [4]: standardized residual plots against fitted values and quantile–quantile plot (QQ plot) against normal distribution. In addition, the variance inflation factor (VIF) was used to assess the levels of multicollinearity. VIFs measure how much the variances of the estimated regression coefficients are inflated as compared to when the predictor variables are not linearly related. VIFs larger than 5–10 imply problems with multicollinearity between input variables, which can lead to models with poor prediction [33]. The analysis of MLR was performed using R, version 3.0.1 [34].
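The VIF calculation can be sketched in Python as follows. The sketch relies on the standard identity that the VIF of predictor j, 1/(1 − R_j²), equals the j-th diagonal element of the inverse of the predictor correlation matrix. The function names and the predictor values are hypothetical, and this is not the R routine used in this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(columns))]


# Hypothetical predictor columns (t, T_m, P_m) in arbitrary units.
t = [1, 2, 3, 4, 5, 6]
t_m = [10, 12, 9, 14, 11, 13]
p_m = [2, 3, 1, 5, 2, 4]
print([round(v, 2) for v in variance_inflation_factors([t, t_m, p_m])])
```

A VIF of 1 means that a predictor is uncorrelated with the others; values above the 5–10 range quoted above would flag multicollinearity.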

ANN model

An ANN model was constructed to predict \({\text{IB}}_{\rm m}\) using NeuralWorks Predict (NWP) software (NeuralWare Inc., Pittsburgh, PA, USA). The input variables were t, \(T_{\rm m}\), \(S_{\rm m}\), and \(P_{\rm m}\), while the output variable was \({\text{IB}}_{\rm m}\). The input variables were nonlinearly transformed to avoid complex representation of the model. A genetic algorithm [35] was employed to make a suitable choice of input variables from the set of all input variables and transformations of input variables [36]. The types of transformation selected included the linear, square, and hyperbolic tangent, whereupon ANNs were constructed by a cascade-correlation learning algorithm [37]. Cascade-correlation is a method of incrementally adding processing elements. Instead of adjusting the weights in an ANN of fixed topology, cascade-correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multilayer structure. Once a new hidden unit has been added to the ANN, its input-side weights are frozen. This unit then becomes a permanent feature detector in the ANN, available for producing outputs or creating other, more complex feature detectors. The flowchart of the ANN modeling is depicted in Fig. 3.

Fig. 3 Flowchart of the ANN modeling. t, exposure period; T, annual mean temperature; S, annual sunshine duration; P, annual precipitation; IB, internal bond strength

To show the degree of contribution of each input variable to the network output, a sensitivity analysis was performed with NWP, which computes the partial derivatives of the output variable with respect to each of the input variables. The sensitivity analysis produces a quantitative measure of the variation in the \({\text{IB}}_{\rm m}\) calculated by the network when each variable changes. The normalized sensitivity for each input variable was calculated according to Eq. (5):

$$ {\text{Normalized sensitivity}} = \frac{{\sum\nolimits_{i = 1}^{N} {\left( {\frac{{\partial y_{i} }}{{\partial x_{i} }}} \right)^{2} } }}{{\sigma^{2} N}} $$
(5)

where \(\sigma^{2}\) is the variance of the partial derivatives for each input variable, N is the number of data sets, and \(x_{i}\) and \(y_{i}\) are the input and output vectors for each data set. High values of this sensitivity indicate that a slight variation of the variable produces considerable changes in the output \({\text{IB}}_{\rm m}\), and vice versa. Furthermore, the average value of sensitivity for each input variable was calculated according to Eq. (6):

$$ {\text{Average value of sensitivity}} = \frac{{\sum\nolimits_{i = 1}^{N} {\frac{{\partial y_{i} }}{{\partial x_{i} }}} }}{N} $$
(6)

A positive average value of sensitivity indicates a positive relationship between the input and output variables, while a negative value indicates an inverse relationship. This is a standard diagnostic procedure commonly used to gain insight into a multilayer neural network solution [36].
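Eqs. (5) and (6) can be sketched in Python as follows. How NWP obtains the partial derivatives internally is not described here, so the sketch approximates them by central finite differences on a toy model. The function names, the toy model, and the derivative values are hypothetical, and the use of the population variance for \(\sigma^{2}\) is an assumption.

```python
from statistics import pvariance


def partial_derivative(model, inputs, index, step=1e-6):
    """Central finite-difference estimate of d(model)/d(inputs[index])."""
    up, down = list(inputs), list(inputs)
    up[index] += step
    down[index] -= step
    return (model(up) - model(down)) / (2.0 * step)


def normalized_sensitivity(partials):
    """Eq. (5): sum of squared partial derivatives over (variance * N)."""
    n = len(partials)
    variance = pvariance(partials)  # assumption: population variance
    return sum(d * d for d in partials) / (variance * n)


def average_sensitivity(partials):
    """Eq. (6): mean partial derivative; its sign gives the direction."""
    return sum(partials) / len(partials)


# Hypothetical partial derivatives of the output with respect to one input,
# evaluated at four data sets.
partials = [-0.10, -0.08, -0.12, -0.06]
print(normalized_sensitivity(partials))
print(average_sensitivity(partials))


def toy_model(v):
    """Hypothetical stand-in for a trained network with two inputs."""
    return 1.0 - 0.1 * v[0] + 0.01 * v[1] ** 2


print(partial_derivative(toy_model, [2.0, 3.0], index=1))
```

Note that Eq. (5) is undefined when the partial derivatives are identical across all data sets, because the variance in the denominator is then zero.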

Results and discussion

Climate variability between locations

Principal component analysis was applied to the climate variables, namely \(T_{\rm m}\), \(S_{\rm m}\), and \(P_{\rm m}\), so that the climate variability between locations could be observed in a two-dimensional score plot (Fig. 4). Locations 5 and 8 are clearly clustered in the top right-hand corner, indicating that the climate conditions at these locations differ from those at the other locations. These two locations had relatively high \(T_{\rm m}\) and \(P_{\rm m}\) (Table 1), and showed the lowest \({\text{IB}}_{\rm m}\) among the eight locations over the entire exposure period (Fig. 2), whereas location 1, in the bottom left-hand corner, had the lowest \(T_{\rm m}\) and \(P_{\rm m}\), and showed the highest \({\text{IB}}_{\rm m}\). Therefore, the score plot shows the tendency of IB deterioration in association with climate conditions. Kojima et al. [2, 3] geographically divided the eight locations into northern Japan (locations 1–4) and southern Japan (locations 5–8), and the IB deterioration was evaluated for each group separately. This geographical difference is apparently reflected in the first principal component in Fig. 4.

Fig. 4 Score plot for the first two principal components. The individual plots represent locations for each exposure period t
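The scores plotted in a figure of this kind can be computed along the following lines. This Python sketch standardizes each climate variable and extracts the first two principal components by power iteration with deflation. Whether the variables were standardized in this study is not stated, so that step is an assumption, and the function names and climate values are hypothetical.

```python
import math


def standardize(column):
    """Center a column and scale it to unit (population) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric matrix."""
    size = len(matrix)
    vector = [1.0 / math.sqrt(size)] * size
    for _ in range(iterations):
        product = [sum(a * v for a, v in zip(row, vector)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    product = [sum(a * v for a, v in zip(row, vector)) for row in matrix]
    eigenvalue = sum(p * v for p, v in zip(product, vector))
    return eigenvalue, vector


def pca_scores(columns):
    """Scores on the first two principal components of standardized data."""
    z = [standardize(c) for c in columns]
    n, k = len(z[0]), len(z)
    corr = [[sum(a * b for a, b in zip(ci, cj)) / n for cj in z] for ci in z]
    value1, axis1 = dominant_eigenpair(corr)
    # Deflation: subtract the first component, then find the next one.
    deflated = [[corr[i][j] - value1 * axis1[i] * axis1[j] for j in range(k)]
                for i in range(k)]
    _, axis2 = dominant_eigenpair(deflated)
    return [(sum(x * a for x, a in zip(row, axis1)),
             sum(x * a for x, a in zip(row, axis2))) for row in zip(*z)]


# Hypothetical T_m, S_m, and P_m values for six observations.
temperature = [8, 11, 13, 15, 16, 17]
sunshine = [1600, 1700, 1900, 1800, 2000, 2100]
precipitation = [1100, 1300, 1500, 1900, 2200, 2400]
scores = pca_scores([temperature, sunshine, precipitation])
for pc1, pc2 in scores:
    print(round(pc1, 3), round(pc2, 3))
```

Each printed pair is one point of a score plot. The sign of a principal component is arbitrary, so a plot may appear mirrored relative to one produced by other software.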

If the data for model construction are collected only from locations with similar climate conditions, the model will not be applicable to external locations with different climate conditions. The climate variability between locations was therefore checked with the score plot, and the eight locations were divided so as to avoid substantial bias between groups. Consequently, locations 1, 2, 4, 7, and 8 were selected for model development, while the data from the remaining three locations (3, 5, and 6) were used to evaluate the predictive ability of the models.

Checking assumptions of MLR

An important aspect of regression involves assessing the tenability of the assumptions upon which its analyses are based. The assumption of independence of observations, which is fundamental to all statistics, was fulfilled at the design stage of this study.

The residuals from the model were examined to check the assumptions of linearity and homogeneity of variance. In Fig. 5, the standardized residuals were more or less evenly scattered above and below their mean of zero, indicating that the data satisfied both assumptions.

Fig. 5 Plots of fitted values versus standardized residuals

The normality assumption was assessed through the normal QQ plot in Fig. 6. The displayed points should follow a straight line if the data values are from a normal distribution. The QQ plot shows that the residuals were approximately normally distributed, since the points did not depart from the expected identity line. These results confirm that the data satisfied the assumptions of MLR.

Fig. 6 Normal quantile–quantile plots. The solid line is the line that goes through the 25th and 75th percentiles of the data and of the theoretical distribution

Interpretation of MLR model

The results of the type II ANOVA for the MLR model are listed in Table 2. Inspection of the ANOVA table suggests that t, \(T_{\rm m}\), and \(P_{\rm m}\) were significant factors of IB (p < 0.05). This finding is consistent with the report that the WI (the logarithm of the sum of (T × P) over the exposure period) was highly correlated with IB [2, 3]. The sums of squares and F values were highest for t, followed by \(T_{\rm m}\) and \(P_{\rm m}\) in decreasing order. Therefore, the exposure period was found to be the most influential factor on IB deterioration, followed by \(T_{\rm m}\) and \(P_{\rm m}\).

Table 2 Type II ANOVA table for MLR model

In contrast, \(S_{\rm m}\) was insignificant (p = 0.685), indicating that sunshine duration had little influence on the IB deterioration of particleboard under outdoor exposure. This finding is consistent with the suggestion that sunlight does not directly affect the internal deterioration of board and that IB is an interior property of boards [2].

The VIF was used to measure collinearity among the explanatory variables. The VIF values were lower than 5 for all variables, indicating the absence of multicollinearity. The estimated coefficient for \(S_{\rm m}\) was zero, and as a result, the developed MLR model was expressed as follows:

$$ {\text{IB}}_{\rm m} = 1.1521 - 0.0841 \times t - 0.0340 \times T_{\rm m} - 0.0009 \times P_{\rm m} $$
(7)

The coefficients for all variables were negative, and the MLR model gave an \(R^{2}\) of 0.89 (adjusted \(R^{2}\) of 0.88). This means that 88–89 % of the \({\text{IB}}_{\rm m}\) variance could be explained by the linear model assuming that exposure duration, temperature, and precipitation negatively affect IB in an additive manner. These results demonstrate that the MLR model was useful to assess the impact of climate on the IB of particleboard under outdoor exposure. It should be noted, however, that the MLR model is applicable only within the limited exposure period of this study, as discussed later.
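Eq. (7) can be evaluated directly, as in the following Python sketch. The function name `predict_ib_mlr` and the input values are hypothetical. The climate inputs are assumed to be in the same units as in Table 1, which is not reproduced here, so the printed values only illustrate the arithmetic.

```python
def predict_ib_mlr(t, t_m, p_m):
    """Eq. (7): predicted mean IB (MPa) from exposure period and climate means.

    t   : exposure period (years)
    t_m : mean annual temperature, in the units of Table 1 (assumption)
    p_m : mean annual precipitation, in the units of Table 1 (assumption)
    """
    return 1.1521 - 0.0841 * t - 0.0340 * t_m - 0.0009 * p_m


# Hypothetical climate means, chosen only to illustrate the calculation.
t_m_example, p_m_example = 10.0, 100.0
for years in range(1, 4):
    print(years, round(predict_ib_mlr(years, t_m_example, p_m_example), 4))

# The model is additive: each extra year lowers the prediction by the same
# amount, whatever the climate inputs are.
yearly_change = (predict_ib_mlr(2, t_m_example, p_m_example)
                 - predict_ib_mlr(1, t_m_example, p_m_example))
print(round(yearly_change, 4))
```

Because the yearly decrement is constant, the predicted value eventually becomes negative if the exposure period is extended far enough, which is the limitation of the linear form noted in the text.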

Interpretation of ANN model

The architecture of the ANN model consisted of 4 input neurons, 9 neurons with a hyperbolic tangent transfer function in the hidden layer, and 1 output neuron with a sigmoid transfer function. To estimate the relative importance of the individual climate variables to the model predictions, the normalized sensitivity was calculated for each input variable (Fig. 7). It is apparent that t, \(T_{\rm m}\), and \(P_{\rm m}\) had a negative influence, and that t was the variable with the highest impact on \({\text{IB}}_{\rm m}\), followed by \(T_{\rm m}\) and \(P_{\rm m}\). This order is consistent with the results of the MLR model. Conversely, \(S_{\rm m}\) had a relatively small and positive impact on \({\text{IB}}_{\rm m}\). This likely reflects a nonlinear relationship between \(S_{\rm m}\) and \({\text{IB}}_{\rm m}\), which was not detected by the statistical test of the MLR model. Although \(S_{\rm m}\) does not directly affect the internal deterioration of board, solar radiation facilitates the evaporation of rain water from the surface and prevents water from penetrating into the interior of the board. This may be a reasonable explanation for the positive impact of \(S_{\rm m}\).

Fig. 7 Results of sensitivity analysis in the ANN model. Gray bars represent average sensitivity <0; white bars represent average sensitivity >0

Predictive ability of ANN model and MLR model

Figure 8 shows the plots of the experimentally measured versus predicted \({\text{IB}}_{\rm m}\) for the ANN model and the MLR model. The ANN model gave an \(R^{2}\) of 0.93 and a root mean square error (RMSE) of 0.05 MPa, while the MLR model gave an \(R^{2}\) of 0.87 and an RMSE of 0.07 MPa. Thus, the predictive ability of both models was good, and they were demonstrated to be robust and applicable to boards exposed at different locations in Japan.

Fig. 8 Plots of the experimentally measured versus predicted IB using the ANN model (left) and the MLR model (right). The solid line represents a one–one relationship between measured and predicted values
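The two performance measures used to compare measured and predicted values can be computed as in the following Python sketch. The exact definition of \(R^{2}\) used for the validation data is not stated in the text; the sketch assumes the coefficient of determination, 1 − SS_res/SS_tot, and the measured and predicted values are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted mean IB values (MPa).
measured_ib = [0.8, 0.6, 0.4, 0.2]
predicted_ib = [0.75, 0.65, 0.35, 0.25]
print(round(rmse(measured_ib, predicted_ib), 4))       # 0.05
print(round(r_squared(measured_ib, predicted_ib), 4))  # 0.95
```

The RMSE is expressed in the units of the measurement (MPa here), whereas \(R^{2}\) is dimensionless, so the two measures are complementary when comparing models.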

The ANN model gave a higher \(R^{2}\) and a lower RMSE than the MLR model. Therefore, the ANN model outperformed the MLR model in predicting \({\text{IB}}_{\rm m}\). As can be seen in Eq. (7), the MLR model assumes that \({\text{IB}}_{\rm m}\) decreases by 0.0841 MPa per year, independent of the magnitude of \({\text{IB}}_{\rm m}\). Consequently, in Fig. 8, the MLR model predicted a negative value when the measured IB value was close to zero. A concern is that the MLR model will become unreliable when the exposure period is extended and the boards continue to degrade, because the assumption of linearity will no longer be valid. In contrast, an ANN is capable of describing nonlinear effects of climate variables without any such assumptions. This flexibility makes the ANN more suitable for predicting the IB of particleboard under outdoor exposure.

Conclusion

The IB deterioration of a commercial particleboard under various outdoor exposure conditions was examined by developing an MLR model and an ANN model based on climate data including exposure duration, annual mean temperature, annual sunshine duration, and annual precipitation. Our results showed that both models could be used to predict the IB of Type 18 and Type P particleboards exposed at different locations in Japan, and that the ANN model yielded slightly better performance than the MLR model.

It should be noted that MLR is based on four assumptions: independence, homogeneity of variance, normality of error, and linearity. It is important to check the assumptions made in fitting the data. If these assumptions are contravened, the significance levels are no longer valid and the worth of the whole analysis is cast into doubt. In contrast, ANNs allow for flexible modeling of both linear and nonlinear behavior of IB deterioration without any of the assumptions required for MLR. Therefore, ANNs are considered to be more suitable for predicting the IB of particleboard under outdoor exposure.