1 Introduction

Mass timber can play a crucial role in enhancing the sustainability of the construction industry. Its increasing adoption stems from the advantages it offers over conventional construction materials such as concrete and steel, including aesthetic appeal, sustainability and low environmental impact, whilst providing a high strength-to-weight ratio [1,2,3]. Consequently, numerous projects are now embracing timber, such as Mjøstårnet in Norway, standing at 85 m. However, projects like these present challenges to the construction industry because of fire safety questions associated with timber in tall buildings [4]. There has been insufficient research into the fire safety of tall mass timber structures, and the problems arising from timber's flammability, fuel load and structural strength need to be studied further [5,6,7,8]. When exposed to elevated temperatures, timber undergoes several processes: preheating, drying, charring and oxidation, as illustrated in Figure 1. Charring is driven by pyrolysis, the thermal degradation that breaks down the polymer chains into solid char and flammable volatiles [9, 10]. The boundary between the charred and uncharred layers of wood is referred to as the char-line, typically located around the 300 °C isotherm [11, 12]. Understanding the charring process and its rate is crucial when calculating the fire resistance of timber structures, as it enables the determination of the residual cross-section and load bearing capacity of structural timber members. However, the current methods for predicting the charring process remain limited [9].

Figure 1

The process of charring occurs when wood is exposed to intense heat, breaking down the timber into char and volatiles and reducing the load bearing capacity [9]

Industry standards such as Eurocode-5 (EC5) adopt a simplistic approach for structural calculations, quantifying char depth linearly over time, as represented in Eq. (1) [13].

$$\begin{array}{c}\delta =\beta t\end{array}$$
(1)

Equation (1) relies on a linear correlation between the final char depth, \(\delta\), and the duration of fire exposure, \(t\), with the gradient representing the average charring rate, \(\beta\) [9, 13,14,15]. In reality, the charring rate is often not constant over time, but it can be approximated as such when using simple models for calculations [16]. Studies show the influence of various factors on the charring rate, such as wood species, moisture content, density, permeability and heat flux [15, 17, 18]. As such, there is a need to develop sophisticated tools to accurately calculate the charring rate of timber in fire. EC5 assumes a constant charring rate depending on the wood type and density; for instance, softwood is assigned a charring rate of 0.65 mm/min, whilst hardwood (with the exception of beech) is allocated 0.5 mm/min [6, 13]. More advanced models predict charring rates across various design fires using a multi-scale approach [12]. This study employs the EC5 standard as a benchmark against which the data-driven models discussed in this paper are developed and compared [17].
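As an illustration, the following minimal Python sketch evaluates Eq. (1) with the constant EC5 one-dimensional charring rates quoted above; the function and dictionary names are ours, and this is not a full EC5 design calculation (notional charring, corner rounding and protection phases are ignored).

```python
# Minimal sketch of Eq. (1): char depth as a linear function of exposure time.
# The rates below are the constant one-dimensional values quoted in the text;
# names and structure are illustrative only.
EC5_CHARRING_RATE = {"softwood": 0.65, "hardwood": 0.50}  # beta, mm/min

def char_depth(exposure_minutes: float, wood_type: str = "softwood") -> float:
    """Return the char depth delta = beta * t in mm."""
    return EC5_CHARRING_RATE[wood_type] * exposure_minutes

print(char_depth(60, "softwood"))  # 60 min of exposure -> 39.0 mm of char
```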

Alternative methods to EC5 and multi-scale models include the use of data-driven tools such as machine learning (ML). Naser (2019) conducted a study on temperature-dependent models for the charring rate using artificial intelligence (AI). Naser's study applied a neural network (NN) to predict a uniform charring rate equivalent to Eq. (1), which demonstrates the potential of AI as a substitute for empirical models and highlights the need for additional investigation. The study served as a proof-of-concept for employing AI to address fire engineering challenges [6, 19, 20]. One of the challenges faced when employing ML in fire engineering is the requirement for large datasets to train the models [17, 18, 21], which may not always be readily available. Although only limited fire datasets are available, ML algorithms offer substantial advantages, as they can learn and predict what cannot yet be explained by simple physical models. As more data becomes available in the field of fire engineering, the accuracy of predictive models will improve over time.

This paper presents a novel database comprising timber charring experiments and tests, and subsequently employs statistical methods and AI algorithms to predict the average charring rate of timber in fire for structural calculations.

2 VAQT Database of Charring Rates of Timber Products

Machine learning (ML) relies on the quantity and quality of data to train and create an accurate model. The initial step in this study involved creating a comprehensive database of average charring rates obtained from fire resistance tests following the ISO 834 standard. Collaborations with the testing houses ARUP and IBS facilitated access to confidential data. Additionally, a thorough search of the available literature was conducted. By consolidating data from 21 different sources, the VAQT database was established, totalling 231 individual furnace tests, significantly larger than previous datasets [6, 21]. Figure 2 illustrates the distribution of data sources within VAQT, with 94% originating from the literature. VAQT encompasses both softwood and hardwood, including species such as beech, spruce, fir and oak. The dataset incorporates numerical input variables, such as density, moisture content and thickness, while others were treated as categorical, such as insulation, which refers to material applied to the top, bottom and side faces of the sample in the fire tests, namely calcium silicate and gypsum. The average charring rate is derived through a linear fit of Eq. (1) through the origin, from which the gradient is computed. Numerical variables are retained in their original numerical format, while categorical variables are one-hot encoded; for instance, categories (A, B, C) are represented as (1, 0, 0), (0, 1, 0) and (0, 0, 1), respectively [40]. VAQT was structured to incorporate essential features pertinent to timber charring experiments, as shown in Figures 3 and 4, where some features exhibit skewness that is addressed later in this study. VAQT also comprises information on specimen composition, material type, size and moisture content, details about the test environment (furnace specifications, temperature), and measurements taken during the experiments (char depth, mass loss). To address missing values, for example in moisture content, which serves as an input parameter for the models, mean imputation was applied, whereby missing values were replaced by the mean value of the entire dataset [6, 41]. A comparison with the study conducted by Liu et al., which reviewed charring rates among various timber species, shows that the distribution of density against average charring rate reported an R2 of 0.4, whilst VAQT yields 0.3 [7]. Both studies therefore demonstrate a weak, but not negligible, correlation between density and charring rate, showing that other variables are also important. It also highlights the need for future experiments to encompass a broader range of densities.
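As a sketch of this preprocessing, assuming a flat CSV export of VAQT with hypothetical file and column names (the exact schema is not reproduced here), mean imputation and one-hot encoding could be performed with pandas as follows.

```python
import pandas as pd

# Hypothetical file and column names; VAQT's exact schema is not reproduced here.
df = pd.read_csv("vaqt.csv")

# Mean imputation: replace missing moisture content values with the dataset mean.
df["moisture_content"] = df["moisture_content"].fillna(df["moisture_content"].mean())

# One-hot encoding of categorical variables, e.g. (A, B, C) -> (1,0,0), (0,1,0), (0,0,1).
df = pd.get_dummies(df, columns=["timber_type", "insulation_type"], dtype=int)

X = df.drop(columns=["charring_rate"])  # predictor variables
y = df["charring_rate"]                 # target: average charring rate (mm/min)
```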

Figure 2

VAQT fire resistance tests of mass timber elements from literature and test house reports [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39], whilst [*] refers to the confidential reports. This figure illustrates the distribution of the sources, with the two images representing example cut-outs of the specimens

Figure 3

Box and whisker plot for the most important numerical variables in VAQT. The red curve outlines the distribution of the data for each variable

Figure 4

Bar plot for the most important categorical variables in VAQT

The Eurocode-5 (EC5) standard was compared against VAQT using Eq. (2), yielding an \({\varepsilon }_{VAQT}\) of 27%. This considers both softwood and hardwood. \(P\) represents the total number of samples, \(i\) denotes the \(i\)-th sample, and \({y}_{m,i}\) and \({y}_{p,i}\) represent the measured and predicted charring rates, respectively.

$$\begin{array}{c}{\varepsilon }_{VAQT} \left(\%\right)= \frac{1}{P}\sum_{i=1}^{P}\frac{{y}_{m,i }-{y}_{p,i }}{{y}_{p,i}}\end{array}$$
(2)
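A small helper implementing Eq. (2) is sketched below; taking the absolute value of each relative error, so that over- and under-predictions do not cancel, is our assumption rather than something stated explicitly in the equation.

```python
import numpy as np

def epsilon_vaqt(y_measured: np.ndarray, y_predicted: np.ndarray) -> float:
    """Mean relative error of Eq. (2), returned as a percentage.

    Taking the absolute value of each term is an assumption made here to avoid
    cancellation between over- and under-predictions.
    """
    relative_error = (y_measured - y_predicted) / y_predicted
    return 100.0 * float(np.mean(np.abs(relative_error)))
```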

Despite a low \({\varepsilon }_{VAQT}\), there is a noticeable discrepancy when this single value is compared to the distribution of measured charring rates in VAQT, as represented in Figure 5. Eurocode-5 underpredicts the charring rate for about 50% of the samples, sometimes by a large margin, which is unsafe. It overpredicts for the other 50%, which is cautious, sometimes overly so. We want to know when Eurocode-5 underpredicts or overpredicts, and by how much. The uncertainties specific to the measurement of charring rates are poorly documented in the scientific literature, and a quantification of these falls beyond the remit of this paper.

Figure 5

Measured average charring rates in VAQT compared to the value in Eurocode-5. Note that Eurocode-5 is close to the median, so it underpredicts the average charring rate for about 50% of the samples and it overpredicts it for the other 50%

3 Data-Driven Models

In ML, a dataset the size of VAQT is generally considered small for tackling complex problems; as a result, it poses an increased risk of model overfitting and/or less accurate predictions [42]. Overfitting occurs when a model fits the training data well but fails to generalize effectively to unseen data, a phenomenon caused by overly complex models, skewed data distributions and small dataset size. It is noteworthy that skewed data distributions imply a scarcity of data within specific ranges; consequently, predictions made in these regions may exhibit higher probabilities of inaccuracy [43]. Furthermore, ML models tend to perform optimally when interpolating data; hence, they may encounter challenges in learning the underlying function in regions with limited data. Addressing this issue involves including more data in VAQT but, given the limited availability of data in the field, this approach may not be feasible.

To minimise the risk of overfitting, VAQT was randomly divided into training (80%) and testing (20%) sets, following standard practice in most ML frameworks to facilitate validation and reduce bias [44]. Moreover, the training set was further subdivided to include a validation set, accounting for 20% of the training data. The inclusion of a validation set is vital in ML training, as it facilitates evaluating the model's performance during the training process and early detection of overfitting before the testing set is used for predictions [43]. For a neural network (NN), the need for a validation set is equally crucial due to the possibility of random fluctuations or noise in the data, which can adversely affect the model's overall performance. The validation set also becomes instrumental in tuning hyperparameters to optimize a NN's performance. NNs have parameters that need to be specified before training the model, called hyperparameters. These include the learning rate, number of layers, number of neurons, and the activation and error functions [45, 46]. To enhance hyperparameter tuning, this study utilizes Bayesian optimization.
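A minimal sketch of this splitting with scikit-learn is given below; the seed value is illustrative, and X and y are assumed to be the encoded feature matrix and charring-rate vector from the preprocessing step.

```python
from sklearn.model_selection import train_test_split

SEED = 42  # illustrative fixed seed; later randomised to test robustness

# 80/20 train/test split, then 20% of the training data held out for validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=SEED)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.20, random_state=SEED)
```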

To further mitigate the risk of overfitting and to determine the best model for VAQT, a k-fold cross-validation approach was applied before training [43]. This technique involves dividing the training set into k equal random subsets (folds); cross-validation iteratively designates one subset as the validation set while using the rest for training, repeating k times, as represented by Eq. (3), with \(E\) as the final evaluation score and k as the number of folds [43].

$$\begin{array}{c}E = \frac{1}{k}\sum_{i=1}^{k}{E}_{i}\end{array}$$
(3)
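A sketch of Eq. (3) with scikit-learn follows; the number of folds (k = 5) and the ridge estimator are illustrative choices, as the text does not fix either, and the split variables come from the earlier splitting sketch.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

k = 5  # assumed number of folds
kfold = KFold(n_splits=k, shuffle=True, random_state=42)
fold_scores = cross_val_score(Ridge(alpha=0.001), X_train, y_train,
                              cv=kfold, scoring="neg_mean_squared_error")
E = -np.mean(fold_scores)  # Eq. (3): mean validation MSE over the k folds
```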

When utilizing ML to address a specific problem, it is crucial to answer some fundamental questions. Understanding the type of problem at hand is essential; in our context, identifying it as a regression problem is significant, as it influences the choice of appropriate algorithms and techniques for the analysis. Additionally, determining the factors that contribute to the success of the ML model is critical. This involves assessing various evaluation metrics, ensuring that the model's performance aligns with the intended objectives and requirements. To ensure reproducibility of the results, the seed for the data splitting was fixed, which determines the distribution of the dataset split and allows a consistent comparison of the different models. However, it is important to note that the fixed seed was later randomised to demonstrate the robustness of the trained models. In summary, addressing these crucial questions creates a foundation for a successful application of ML.

3.1 Statistical Methods: Regression

Regression models, specifically ridge and lasso, were chosen for their ease of implementation and foundational understanding, aiming also to provide a proof of concept for ML, as they allow the model to be expressed as an algebraic expression. These models are regularized versions of the fundamental linear regression equation, represented as the objective functions given in Eqs. (4) and (5), where regularization is used to minimize overfitting [43, 47,48,49].

$$\begin{array}{c}SS{E}_{{L}_{2}}=\sum_{i=1}^{P}{\left({a}_{0}+ \sum_{n=1}^{N}{a}_{n}{x}_{n,i} -{y}_{i}\right)}^{2}+ \lambda \sum_{j=1}^{N}{a}_{j}^{2}\end{array}$$
(4)
$$\begin{array}{c}SS{E}_{{L}_{1}}=\sum_{i=1}^{P}{\left({a}_{0}+ \sum_{n=1}^{N}{a}_{n}{x}_{n,i} -{y}_{i}\right)}^{2}+\lambda \sum_{j=1}^{N}\left|{a}_{j}\right|\end{array}$$
(5)

In this context, \({x}_{n,i}\) denotes the \(n\)th predictor variable (e.g., density, moisture content) of sample \(i\), \({y}_{i}\) represents the target variable (charring rate), \({a}_{n}\) represents the regression coefficients (weights), and \(N\) is the number of predictor variables. The degree of regularization is determined by the value of \(\lambda\), typically set between 0 and 1, and is selected by comparing the respective models' mean squared errors (MSE), as indicated in Eq. (6). The MSE measures how well the model fits the data in Euclidean space, using the L2 (Euclidean) distance.

$$\begin{array}{c}MSE=\frac{\sum_{i=1}^{P}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}{P}= \frac{SSE}{P}\end{array}$$
(6)

Another regression model, Bayesian ridge, which uses a Gaussian distribution over the regression coefficients and adapts to the given data, was also used [50]. This approach is probabilistic, contrasting with the deterministic approach employed in the previous regression models. The probabilistic framework is deemed appropriate as it offers rapid inference and efficiency, aligning well with the scale of VAQT.
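A hedged sketch of these three statistical models with scikit-learn is shown below; the regularization strengths mirror the λ values reported later in the results, the training and validation arrays come from the earlier splitting sketch, and the validation MSE of Eq. (6) is used to compare the models.

```python
from sklearn.linear_model import Ridge, Lasso, BayesianRidge
from sklearn.metrics import mean_squared_error

# lambda in Eqs. (4)-(5) corresponds to scikit-learn's alpha parameter.
models = {
    "ridge (lambda = 0.001)": Ridge(alpha=0.001),
    "ridge (lambda = 1)": Ridge(alpha=1.0),
    "lasso (lambda = 0.001)": Lasso(alpha=0.001),
    "bayesian ridge": BayesianRidge(),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    val_mse = mean_squared_error(y_val, model.predict(X_val))  # Eq. (6)
    print(f"{name}: validation MSE = {val_mse:.5f}")
```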

3.2 Supervised Machine Learning: Support Vector Machine

Supervised learning encompasses a variety of ML algorithms, with the support vector machine (SVM) standing out as an influential and widely used tool [51]. Although this study considered the inclusion of alternative models such as random forests, they were ultimately not used here but could be part of future studies. SVM, known for error-based learning, has been used in many different fields and, more specifically, in fire engineering [44]. Panev et al. (2021) used SVM to assist in the prediction of the fire resistance of composite shallow floor systems and concluded that this method rapidly assesses the feasibility of different details at the early stages of the design process [52, 53]. In this study, we utilize the support vector regressor (SVR), a variant of SVM, as predicting the charring rate is a regression task. SVM uses the kernel trick, a technique that allows non-linear learning through convex optimization in an infinite-dimensional space [54]. The kernel used is a Gaussian radial basis kernel, which helps control the convergence of the algorithm by capturing non-linear relationships [54]. Its ability to generalize well with small datasets, and hence produce accurate predictions, makes SVM a viable choice for this study. SVM and a NN share similarities as parametric models; however, SVM distinguishes itself by utilizing kernels. SVM serves as a bridge between statistical methods such as regression, which may not generalize well on specific datasets, and more complex models such as a NN, which is also implemented in this study.
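The corresponding SVR sketch with a Gaussian radial basis kernel is below; feature standardization and the C and epsilon values are implementation choices of ours, not taken from the text.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Support vector regression with an RBF (Gaussian) kernel. Scaling the inputs
# is standard practice for kernel methods; hyperparameter values are illustrative.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.01))
svr.fit(X_train, y_train)
y_val_pred = svr.predict(X_val)
```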

3.3 Neural Network

To introduce more advanced predictive methods, a NN was developed for this study, which captures hidden patterns through adaptive learning [6]. The architecture of the network is sequential, with no loops connecting different parts of the network. Our NN consists of three layers: an input layer with 17 nodes, corresponding to the 17 predictor variables of VAQT, a single dense hidden layer with 19 nodes, and an output layer with a single node to predict the charring rate. A general rule of thumb suggests that increasing the depth of the network is more effective than indefinitely increasing its width, which creates a network with multiple hidden layers called a multilayer perceptron (MLP) [44]. The model uses the rectified linear unit (ReLU) activation function in the hidden layer and a linear function in the output layer, since the output represents continuous data. ReLU is a common choice for most NN architectures due to its efficiency and speed, making it suitable for regression problems and thus for this study. The mean squared error (MSE), as represented previously in Eq. (6), was selected as both the error function and the evaluation metric to allow a direct comparison with the statistical models. Using MSE as the error function is standard practice for regression problems [54]. The Adam optimizer, a variant of stochastic gradient descent, was chosen with a learning rate of 0.0001. Furthermore, the weights and biases of the network are randomly initialized, and the same value chosen for the L2 regularization of the statistical models was applied to the network layers. The role of the optimizer is to minimize the error, pushing the model towards a global minimum whilst avoiding local minima; therefore, selecting an appropriate learning rate is crucial for optimal performance.
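A minimal Keras sketch of this architecture is given below; the L2 factor and the number of epochs are assumptions, since the text only states that the regularization value matches the statistical models and does not report the epoch count.

```python
import tensorflow as tf

# 17 inputs -> 19 ReLU neurons -> 1 linear output, Adam (lr = 0.0001), MSE loss.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(17,)),
    tf.keras.layers.Dense(19, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(0.001)),
    tf.keras.layers.Dense(1, activation="linear"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001), loss="mse")
history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
                    epochs=500, verbose=0)  # epoch count is illustrative
```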

4 Results & Discussion

Amongst the statistical models, the ridge regression model (\(\lambda\) = 0.001) performed the best, predicting the charring rate of timber with a minimum \({\varepsilon }_{VAQT}\) of 11% and thus performing better than EC5. The performance of all the models is represented in Figure 6, with the corresponding ridge regression model depicted in Figure 7. To reiterate, the seed is fixed for these results and is later randomised to show reproducibility. Initial results from these regression models are encouraging, suggesting that further pursuing this approach could lead to more accurate charring rate predictions. The goal is to have the predicted charring rate close to the measured one, as visually demonstrated by the datapoints lying close to the regression line in Figure 7.

Figure 6

Comparison of the average error using Eq. (2) in VAQT for the statistical models, where all models show a similar prediction whilst ridge regression (λ = 0.001) is the best and ridge (λ = 1) is the worst

Figure 7

Comparison of measured vs predicted for the best regression model—ridge regression (\(\uplambda\) = 0.001). The datapoints falling above the regression line indicate cases where the model is not conservative for design

These results could potentially be improved further using a more advanced model such as a NN, in which both linear and non-linear relationships are considered. The statistical models used for predicting the charring rate, notably ridge regression (\(\lambda\) = 0.001), rely on linear relationships, with the exception of Bayesian ridge and SVM, both of which indicate good predictive performance. However, it is important to note that an improvement is not always guaranteed, and there may be only a marginal increase in accuracy at a disproportionately increased computational cost.

4.1 Iteration Feasibility Study

The nature of training a NN is stochastic, meaning that it can follow a different learning path even with a fixed data split. This is due to the random initialization of weights and biases at the start of training. Weights are included in each neuron in the layers of a network, and a bias is a constant term added to the output of the neuron. To address this issue and ensure reliable results, an iteration feasibility study was conducted to show reproducibility [55]. The iteration feasibility study revealed that when the total number of iterations exceeded 200, the results became independent of further iterations. This finding is illustrated in Figure 8, where the validation error (MSE) values reach a plateau after approximately 200 iterations. Consequently, 250 unique training iterations were selected for each NN model. It is essential to distinguish between "iterations" and "epochs" in this study, as epochs are mentioned in the next section of this paper. Iterations refer to the number of times the model is uniquely trained, meaning a new model with new weights and biases, while epochs determine the number of times the model repeats over the entire training dataset during the training process. This feasibility study aims to obtain the single model with the lowest validation error from the 250 unique training iterations.
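The loop below sketches how such an iteration study can be run: the network is rebuilt and retrained with a fresh random initialization on every pass, and the model with the lowest validation MSE is retained. The build_nn() helper is hypothetical and stands in for the architecture of Section 3.3; the epoch count is illustrative.

```python
import numpy as np
import tensorflow as tf

N_ITERATIONS = 250  # value selected from the feasibility study
best_val_mse, best_model = np.inf, None

for i in range(N_ITERATIONS):
    tf.keras.utils.set_random_seed(i)  # fresh weights and biases each iteration
    model = build_nn()                 # hypothetical helper building the NN of Section 3.3
    history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
                        epochs=500, verbose=0)
    val_mse = min(history.history["val_loss"])
    if val_mse < best_val_mse:
        best_val_mse, best_model = val_mse, model
```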

Figure 8

Iteration feasibility study represents the plateau of MSE (mm/min) values using Eq. (6) over 1000 iterations where 250 is the optimal point where results become independent of further iterations

The first NN trained uses a simple network architecture with a single dense hidden layer comprising 19 neurons. The model predicted the charring rate of timber with a minimum \({\varepsilon }_{VAQT}\) of 10% over the 250 iterations, demonstrating an improvement over Eurocode-5 (EC5) and ridge regression (λ = 0.001). This initial model represents the simplest form of a general NN architecture. However, to explore the potential for further performance gains, other hyperparameters were tuned: the number of layers and the number of neurons in each layer were altered while keeping the other hyperparameters constant, which is known as a heuristic training approach [56]. The first trained model had the best performance, indicating that increasing model complexity through the network architecture does not always lead to better performance. Figure 9 depicts the result of this NN. However, it is evident that this model suffers from slight overfitting, as shown in Figure 10 through the training-validation loss curve. The two curves meet, but the validation curve then starts to ascend, indicating that the model is beginning to overfit, with predictions fitting the training set more closely than the testing set. Ideally, the two curves should converge and plateau during the training process. The other trained models exhibit similar trends, suggesting that other hyperparameters, such as the learning rate or layer regularization, should be optimized to find a balance between a low validation error and mitigating overfitting. Stopping training earlier, or reducing the number of epochs, might lead to the convergence of the training and validation curves, indicating a more optimal model.

Figure 9

Comparison of measured vs predicted for a single-layer neural network. The datapoints falling above the regression line indicate cases where the model is not conservative for design

Figure 10

Learning curve of the single-layer neural network using Eq. (6). The two curves should converge during training, but in this case they do not and therefore the hyperparameters need optimizing

5 Neural Network Hyperparameter Optimization

Tuning hyperparameters for a NN is a stochastic process, during which the parameters are modified to optimize the model's performance by minimizing the error function until a satisfactory criterion is met. The set of hyperparameters includes choices such as the number of neurons, the number of layers and the learning rate. Tuning hyperparameters using a purely stochastic approach can be computationally expensive and time-consuming due to the numerous possible combinations of choices. There are simple algorithms such as Grid/RandomizedSearchCV [54, 57] and more sophisticated ones, namely random forests [58]. However, these algorithms are still performed manually and iteratively, lacking a systematic way to build upon previous results. Bayesian optimization (BO) is the preferred method for this study as it considers previous hyperparameter choices to guide the search for better solutions in the next iteration. BO takes a probabilistic approach by mapping a distribution over the objective function onto the region of interest using a Gaussian process as a surrogate model. Although BO is computationally expensive, it is well suited to optimizing models with many hyperparameters as it covers a wider range of possibilities through the probabilistic distribution. Through the application of BO, a multi-layered NN was trained, which predicted the charring rate with a minimum \({\varepsilon }_{VAQT}\) of 9%; its corresponding loss graph is shown in Figure 11. The graph demonstrates that overfitting has been negated, as the two curves converge and meet when the training process is complete. This outcome highlights the critical importance of striking a good balance between minimizing the validation error and controlling overfitting in the NN training process; in our case, the Bayesian optimization approach proved effective in achieving this balance.
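The text does not name a specific BO implementation; the sketch below uses KerasTuner's Gaussian-process-based BayesianOptimization as one plausible realization, with search ranges and trial counts chosen for illustration only.

```python
import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # Hyperparameter search space (illustrative ranges): depth, width, learning rate.
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(17,)))
    for i in range(hp.Int("n_hidden_layers", 1, 3)):
        model.add(tf.keras.layers.Dense(
            units=hp.Int(f"units_{i}", 8, 64, step=8), activation="relu",
            kernel_regularizer=tf.keras.regularizers.l2(0.001)))
    model.add(tf.keras.layers.Dense(1, activation="linear"))
    model.compile(optimizer=tf.keras.optimizers.Adam(
        learning_rate=hp.Float("learning_rate", 1e-5, 1e-3, sampling="log")),
        loss="mse")
    return model

tuner = kt.BayesianOptimization(build_model, objective="val_loss",
                                max_trials=30, seed=42)
tuner.search(X_train, y_train, validation_data=(X_val, y_val),
             epochs=200, verbose=0)
best_nn = tuner.get_best_models(num_models=1)[0]
```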

Figure 11

Learning curve for a multi-layer neural network using Eq. (6) which shows there is no overfitting as the two lines converge to show training is complete

6 Overall Model Comparison

A model comparison shows that the multi-layered NN trained through Bayesian hyperparameter optimization exhibited the best performance compared to the initial single-hidden-layer NN and Eurocode-5 (EC5). It also demonstrates the best balance between overfitting and validation error, making it the top-performing model in this study. The comparison is illustrated in Figure 12, where the seed remains fixed for all models. Nevertheless, this leads us to question whether the computational cost associated with training a multi-layered NN, in addition to probabilistically optimized hyperparameters, outweighs the simplicity and efficiency of the statistical models, especially when there is a minimal difference in predictions. Despite this consideration, both the statistical models and the NNs outperformed EC5 in predicting the charring rate of timber based on VAQT, achieving higher accuracy in their predictions. This indicates the viability of using data-driven models to predict the average charring rate of timber for structural engineering calculations, comparable to existing data-driven models in the literature given by empirical correlations to predict the fire resistance of timber beams and columns [59].

Figure 12

A model comparison using Eq. (2) with a fixed seed between the models implemented in this study whilst indicating the multi-layered neural network giving the most accurate predictions

6.1 Problem of Reproducibility and Overfitting

Reproducibility poses a significant challenge in the field of machine learning (ML), which strives to attain consistent and reliable results with minimal variability [60]. Reproducibility is achieved when the trained algorithm repeatedly produces similar or equal results during each run. The stochastic nature of ML models contributes to this challenge, with various factors introducing randomness, such as dataset splitting prior to training. To mitigate this, it is common to select an even distribution for each variable when splitting the data. However, in this study, due to the large number of data points from different experiments, such an approach would be counterproductive. Instead, we opt for random splitting and k-fold cross-validation, following standard practice, to achieve reproducibility. Additional sources of randomness in ML models include the random initialization of weights and biases before training a neural network (NN). Each model is trained uniquely 250 times, as determined by the preliminary iteration feasibility study. The charring rate predictions from these iterations are presented in Figure 13. The results exhibit stochastic behaviour, as evidenced by the distribution of percentage error for each model. Less complex models, such as ridge regression with strong regularization and the Bayesian regression model, show the most reproducible results, with a smaller range of results and fewer outliers. The more complex models, the NNs, exhibit higher stochasticity in their \({\varepsilon }_{VAQT}\) values but give more accurate predictions. To address the variability in predictions, a larger dataset would help achieve more consistent results.

Figure 13

Distribution in stochastic results of the models chosen in this study and their \({\upvarepsilon }_{\text{VAQT}}\) values using Eq. (2) for predicting the charring rate of timber in VAQT. The statistical models have the best reproducibility whilst the multi-layer-NN has the lowest error by a small margin

6.2 A Sensitivity Study

A sensitivity study using permutation importance was conducted on the fully trained model to identify the features in VAQT with the greatest influence on the predicted charring rate [61]. The sensitivity was measured by observing the resulting change in the model's mean squared error (MSE). The method is governed by Eq. (7), where the importance of feature \(j\) is measured by \({i}_{j}\), \(K\) is the number of repetitions (indexed by \(k\)) and \(s\) is the score of the model [61].

$$\begin{array}{c}{i}_{j}=s-\frac{1}{K}\sum_{k=1}^{K}{s}_{k,j}\end{array}$$
(7)
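With scikit-learn, Eq. (7) corresponds to permutation_importance, which shuffles each feature column K times and averages the resulting drop in score. The sketch below assumes the fitted model exposes a scikit-learn-style interface (the Keras network would need a thin wrapper, e.g. scikeras) and that feature_names lists the VAQT columns; both are assumptions for illustration.

```python
from sklearn.inspection import permutation_importance

# n_repeats corresponds to K in Eq. (7); the value is illustrative.
result = permutation_importance(fitted_model, X_test, y_test,
                                scoring="neg_mean_squared_error",
                                n_repeats=30, random_state=42)

ranking = sorted(zip(feature_names, result.importances_mean, result.importances_std),
                 key=lambda row: -row[1])
for name, mean_importance, std_importance in ranking:
    print(f"{name}: {mean_importance:.5f} ± {std_importance:.5f}")
```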

Figure 14 illustrates the outcome of this analysis, where insulation is the most important factor, which in turn indicates that it is a crucial determinant in the model's predictions. This is well known from previous studies, which also conclude that insulation significantly reduces the charring rate [5, 30]. In addition, variables such as moisture content, density and type of timber have a similar effect on the model, which is supported by a study by Bartlett et al. reporting that these variables have a strong effect on the charring rate of timber [5]. Studies by Richter et al. demonstrated through sensitivity analysis that timber charring is highly sensitive to extrinsic parameters of the test, such as heat flux, oxygen concentration and convective heat transfer [8]. Another concern is multicollinearity, as it directly affects sensitivity studies. It occurs when two or more features are correlated with each other, making it challenging to attribute the model's accuracy to individual features. In this study, regularization techniques were employed in the models to minimize multicollinearity [48, 49]. Overall, the sensitivity study conducted in this research is based on the predictivity of the model and not on the VAQT data directly, which would relate the factors to the underlying physics.

Figure 14

A sensitivity analysis was performed using Eq. (7) on VAQT through permutation importance, which indicates that the insulation and its thickness have the largest effect on the models. The ± indicates the positive or negative change of the charring rate of timber

7 Conclusions

Industry standards such as Eurocode-5 offer a simple and quick approach to quantifying an average charring rate of timber in fire for structural calculations, yielding an \({\varepsilon }_{VAQT}\) of 27%. However, the tests and experiments in VAQT show that there is no single value representative of the charring rate of timber, as many different factors play a role, including the type of timber product, insulation, thickness and density. Insulation was found to be the most important factor. The best statistical model was ridge regression (\(\lambda\) = 0.001), which predicted the charring rate with a minimum \({\varepsilon }_{VAQT}\) of 11%. A more complex method, a multi-layer neural network, was employed and predicted the charring rate with a minimum \({\varepsilon }_{VAQT}\) of 9%. These errors can be further reduced by adding additional data to VAQT, which would reduce the skewness and increase data variability. For this study, there is only a small added benefit to increasing model complexity with a neural network, since all models predict the charring rate within a small range of each other. However, all models outperformed Eurocode-5. Prediction reproducibility is important when utilizing data-driven methods, as they are stochastic; thus, it is important to improve models so that there is minimal change in the predicted average charring rate. This study shows that the statistical models produced the most repeatable results, while artificial intelligence is the most accurate, but by a small margin. This study presents VAQT, a novel database of timber charring experiments, and provides a set of data-driven predictive models, all of which calculate the average charring rate with significantly higher accuracy than Eurocode-5 for a wide range of mass timber products.