Predicting the Inelastic Response of Base Isolated Structures Utilizing Regression Analysis and Artificial Neural Network

Using a base isolation system in RC structures can markedly reduce the likelihood of failure, particularly in seismic-prone countries. However, designing these structures is a lengthy procedure that involves choosing an appropriate isolator to optimize the nonlinear behavior of the superstructure, and the numerical simulations demand considerable computational effort when high accuracy is required. In recent decades, scientists and engineers have applied numerous estimation approaches, such as multiple linear regression and artificial neural networks, to reduce the cost and time of everyday design problems. The main objective of this study is therefore to address the problem of rapid response prediction by using soft-computing techniques. It also examines the capability of multiple linear regression and artificial neural networks to estimate the seismic performance of base-isolated RC structures under earthquakes. A nonlinear response history analysis of four different lead rubber bearing isolated RC structures was performed to determine the responses of these structures. Subsequently, prediction models were developed using the responses of the structures as the dataset for multiple linear regression and artificial neural networks. Lastly, the reliability of both estimation approaches was investigated by comparing the capability of the prediction models. In general, the results show that artificial neural networks estimate the response of base-isolated structures considerably more accurately than multiple linear regression and yield reliable predictions.


Introduction
The base isolation system has been shown to effectively control and reduce the responses of reinforced concrete (RC) buildings, particularly under strong shaking intensities, by minimizing the inter-story drifts generated by the inelastic deformations of the structures [1,2]. Accordingly, many studies have been carried out in the past decades to understand the performance of these systems [3][4][5]. Base isolation decreases the seismic response of a building by increasing its natural period, which ultimately results in a significant improvement in structural behavior under ground motion [6]. A nonlinear time history analysis is necessary to understand the behavior of a base-isolated structure comprehensively. In addition, optimizing the response of isolated structures requires a lengthy procedure of testing different isolator characteristics, which dramatically increases the time and effort needed to attain the appropriate level of performance. In recent years, regression methods have been widely applied in civil engineering, including the estimation of coseismic landslide displacement [7], the seismic behavior of reinforced concrete walls using a performance-based backbone model [8], and the development of collapse fragility curves of RC bridges [9].
Similarly, artificial neural networks (ANNs) have gained popularity and importance in civil engineering due to their superior capability to solve complex problems and predict solutions precisely. They may also be employed in self-diagnosis and reinforcement learning approaches [10]. In addition, artificial intelligence has been adopted to predict concrete compressive strength [11][12][13][14] and assess slope stability [15,16]. Furthermore, it has been applied to the damage detection of steel portal frames based on modal vibration parameters [17], the prediction of the structural response of two-story shear structures [18], and the estimation of earthquake magnitude using seismicity indicators [19][20][21]. The present techniques of analysis, mainly nonlinear response history analysis, have a few drawbacks: the limited capacity of finite element packages to handle this sort of assessment, the difficulty of conducting the analysis and interpreting its outcomes, and the time and effort required, because the analysis goes through a lengthy trial-and-error process. Hence, given the current trend toward automation and artificial intelligence, applying these tools to seismic evaluation is essential.
As a result, this research aims to assess and compare the efficiency of multiple linear regression and ANN in predicting the behavior of base-isolated frames subjected to ground motion loads. The novelty of this study is that it proposes a rapid method for estimating the response of base-isolated structures and reports the capability of two of the most commonly used soft-computing models. This has not been discussed previously, and the performance of the selected models for such a prediction is a gap in the literature. In this investigation, a nonlinear response history analysis is carried out on several RC buildings to determine the response of irregular models when combined with various lead rubber bearing systems, using a suite of ground motion records. The responses of the buildings to the chosen earthquakes are then used as a dataset for generating the estimation models of the two approaches described above. This study will help researchers in structural engineering to establish more automated techniques for the performance-based design and assessment of base-isolated structures, as well as practicing engineers who wish to optimize base-isolated structures rapidly or develop numerical analysis software for designing them.

Materials and Methods
Estimating a realistic building response in base-isolated structures is essential for safe and reliable design. A proposal to expedite the process of designing base-isolated structures by adopting soft-computing techniques is discussed within the scope of this study. It evaluates and highlights the capabilities of two of the most commonly used approaches in this context. The research methodology of this investigation is indicated in Figure 1.

Finite Element Modeling and Analysis
Previous studies investigated the performance of regular and irregular base-isolated RC structures and highlighted the differences between their responses. Consequently, four different base-isolated RC structures (see Figure 2) equipped with lead rubber bearings were used to represent regular, extreme soft story, heavy story, and stepped structure models. The extreme soft story and heavy story cases were defined according to ASCE 7-16 [22].
The sixteen chosen structural models were equipped with lead rubber bearing isolators with 15, 20, and 25% effective damping ratios. A lead rubber bearing isolator consists of a lead core, laminated rubber layers, attachment steel plates, and stiffening steel plates. The thin stiffening steel plates and the rubber layers are stacked alternately and bonded by vulcanization. In addition, attachment steel plates are provided at the top and bottom of the bearing. The stiffness of this isolator depends on direction: the bearing is very stiff in the vertical direction and very flexible in the horizontal one.
Based on this, a lead rubber bearing can decouple the structure from the horizontal components of ground motion by introducing a highly flexible horizontal layer between the foundation and the building. This arrangement provides vertical support, restoring force, horizontal flexibility, and damping [23]. A bilinear hysteretic model is applied to model the lead rubber bearing; its behavior is defined by the characteristic strength, the post-elastic stiffness, and the yield displacement. The characteristics of the lead rubber bearing were calculated for modeling the isolator as specified by Hwang & Chiou (1996) [24].
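To illustrate how the three bilinear parameters define the isolator behavior, the following minimal sketch evaluates the backbone force, the effective (secant) stiffness, and the equivalent viscous damping ratio at a given displacement. It uses the commonly quoted bilinear relations, which are an assumption here and are not necessarily the formulation of Hwang & Chiou [24]; the function names and the numerical values are hypothetical.

```python
import math

def bilinear_force(d, q, k_d, d_y):
    """Backbone force of a bilinear isolator at displacement d >= 0."""
    k_u = q / d_y + k_d              # elastic (pre-yield) stiffness
    if d <= d_y:
        return k_u * d
    return q + k_d * d               # post-yield branch

def effective_stiffness(d, q, k_d):
    """Secant stiffness at displacement d beyond yield."""
    return k_d + q / d

def effective_damping(d, q, k_d, d_y):
    """Equivalent viscous damping ratio from the area of one hysteresis loop."""
    k_eff = effective_stiffness(d, q, k_d)
    energy_per_cycle = 4.0 * q * (d - d_y)
    return energy_per_cycle / (2.0 * math.pi * k_eff * d ** 2)

# Hypothetical isolator: characteristic strength (kN), post-elastic
# stiffness (kN/m), yield displacement (m), and design displacement (m).
q, k_d, d_y, d = 100.0, 1000.0, 0.01, 0.2
print(effective_stiffness(d, q, k_d), effective_damping(d, q, k_d, d_y))
```

With these assumed values, the damping ratio comes out at roughly 20%, which falls within the 15 to 25% range considered in this study.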

Figure 2. Elevation of used structures
In order to model and evaluate the chosen structures in a two-dimensional (2D) system, the finite element program SAP2000 was used. The building was assumed to be located on site class D, with the MCER response spectrum described by parameters SMS = 1.875g and SM1 = 0.9g to be comparable with Kitayama & Constantinou [1]. In addition, C16 concrete and S420 reinforcement were employed. The linear elastic approach was used to model the beam and column sections (Figure 3) with effective stiffness as prescribed in ACI 318-19 [25]. Thereafter, the equivalent lateral force method was applied using ASCE/SEI 7-22 [22] to decide whether or not the structure required retrofitting. In general, the same section size and reinforcement were defined for all models to reduce the number of parameters that can alter the period of the structure.

Figure 3. Illustration of the columns and beams sections used in the models
The nonlinear modeling of the selected structures was carried out according to the NIST GCR 17-917-46v3 guideline published by the National Institute of Standards and Technology [26]. The stress-strain behavior of the concrete in both the unconfined and confined states followed Mander et al. [27], while the stress-strain behavior of the steel reinforcement followed Park & Paulay [28] and was taken as symmetrical in compression and tension. The fiber hinge model was applied to simulate the nonlinearity in the beams and columns by dividing their sections into three parts, as reported by Kalantari and Roohbakhsh [29]: an unconfined concrete model for the cover, a confined concrete model for the core, and a steel model for the reinforcement. In this study, the superstructure damping ratio was assumed to be 2.5%.
A set of sixteen earthquakes was selected from the Pacific Earthquake Engineering Research Center (PEER) database for this investigation. The criterion proposed by Baker [30] was used to identify pulse behavior in the sixteen chosen ground motion records (Table 1). This criterion determines the pulse indicator (Ip) and checks the peak ground velocity (PGV): a ground motion record is deemed to possess pulse behavior if Ip > 0.85 and PGV > 0.3 m/s. In addition, a final criterion regarding the directivity effect was considered by using wavelet analysis to decompose the original ground motion record into a pulse signal and a residual one. Various methods can be used to scale the earthquakes, including ACT [31], ASCE [32], and the mean square error (MSE). The MSE method was used here since it gave the best results compared to the other scaling approaches, as concluded by Michaud and Léger [33]. MSE scaling was first performed using the PEER website to determine a single factor for each earthquake signal; these single scaling factors were then adjusted using a modification factor to reduce the MSE between the mean spectrum of the records and the target spectrum so that the two spectra match. The 16 earthquakes were scaled to minimize the MSE value, Equation 8, over the period range of 0 to 5 seconds, as illustrated in Figure 4. Finally, 15 seconds of zeroes were added to each ground motion record to account for the free vibration response of the structures [1].
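The record selection and scaling steps can be sketched as follows. The pulse check uses the thresholds quoted above, and the zero padding follows the 15-second rule. The MSE is assumed here to be the mean squared difference between the natural logarithms of the target and scaled record spectral accelerations, which is an assumption about the scaling tool rather than something stated in this paper. Under that assumption, the single factor that minimizes the MSE is the geometric mean of the target-to-record ratios. The subsequent modification factor is not reproduced. All names and numbers are hypothetical.

```python
import math

def is_pulse_like(pulse_indicator, pgv_m_per_s):
    """Threshold check quoted in the text: Ip > 0.85 and PGV > 0.3 m/s."""
    return pulse_indicator > 0.85 and pgv_m_per_s > 0.3

def log_mse(target_sa, record_sa, factor=1.0):
    """Mean squared difference of log spectral ordinates (assumed MSE form)."""
    diffs = [math.log(t) - math.log(factor * r)
             for t, r in zip(target_sa, record_sa)]
    return sum(d * d for d in diffs) / len(diffs)

def mse_scale_factor(target_sa, record_sa):
    """Single factor minimizing log_mse: geometric mean of target/record."""
    mean_log_ratio = sum(math.log(t / r)
                         for t, r in zip(target_sa, record_sa)) / len(target_sa)
    return math.exp(mean_log_ratio)

def pad_with_zeros(accel, dt, pad_seconds=15.0):
    """Append pad_seconds of zero acceleration to capture free vibration."""
    return list(accel) + [0.0] * int(round(pad_seconds / dt))

# Hypothetical spectral ordinates (g) sampled at periods between 0 and 5 s.
target = [1.25, 1.25, 0.90, 0.45, 0.18]
record = [0.70, 0.55, 0.50, 0.20, 0.10]
factor = mse_scale_factor(target, record)
print(factor, log_mse(target, record, factor))
```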

Multiple Linear Regression
Multiple linear regression is a statistical analysis that is applied to develop a linear relationship between the response (dependent) variable and several predictors (independent variables) [34]. According to Achen [35], the mathematical model of multiple linear regression is given in Equation 6.

$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i \quad (6)$$

where $y_i$ is the $i$th observation of the dependent variable, $x_{i1}, \dots, x_{ik}$ are the $i$th observations of the independent variables, $\beta_0$ is the intercept term, $\beta_1, \dots, \beta_k$ are the coefficients to be estimated, and $\varepsilon_i$ is the residual error of the $i$th observation.
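As a minimal sketch of how the coefficients of Equation 6 can be estimated, the following example fits the model by ordinary least squares through the normal equations. The estimation method, the function names, and the data are assumptions for illustration; the paper does not state which solver was used, and in practice a library routine would normally be preferred.

```python
def fit_linear_regression(rows, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]   # [b0, b1, ..., bk]

def predict(beta, row):
    """Evaluate the fitted linear model for one observation."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

# Hypothetical data generated from y = 2 + 3*x1 - x2 without noise.
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y = [3, 7, 6, 11, 9]
beta = fit_linear_regression(rows, y)
print(beta, predict(beta, (6, 4)))
```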

Artificial Neural Network
An artificial neural network mimics the biological neurons found in the human brain and gives a computer the ability to learn and make decisions in a way that resembles human reasoning [36]. Because it can simulate the human learning process, it has been used in all fields of civil engineering, where it makes complicated problems tractable and opens the way to real-world applications [37]. A simple ANN consists of inputs, weight coefficients, summation and activation functions, and outputs [38]. The weight coefficients are a vital component of an ANN because they represent the significance of each neuron in the input layer by reflecting the ability of each input to stimulate the neurons [33][34][35][36]. The weighted sum of the input components can be calculated using Equation 7.

$$(net)_j = \sum_{i=1}^{n} w_{ij} X_i + b \quad (7)$$

where $w_{ij}$ is the weight between neurons $i$ and $j$, $X_i$ is the output of neuron $i$, $b$ is the bias used to model the threshold, and $n$ is the number of neurons.
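The weighted sum of Equation 7 can be written in a few lines. In the sketch below, the hyperbolic tangent activation is an assumption, since the activation function is not specified in this section, and the weights and inputs are hypothetical.

```python
import math

def weighted_sum(weights, inputs, bias):
    """Weighted sum of the inputs plus the bias for one neuron (Equation 7)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def neuron_output(weights, inputs, bias, activation=math.tanh):
    """Pass the weighted sum through an activation function (assumed tanh)."""
    return activation(weighted_sum(weights, inputs, bias))

# Hypothetical neuron with three inputs.
weights = [0.5, -0.25, 0.1]
inputs = [2.0, 4.0, 10.0]
net = weighted_sum(weights, inputs, bias=0.2)
print(net, neuron_output(weights, inputs, bias=0.2))
```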
Feed-forward backpropagation is a learning process in which the calculations are carried out iteratively to modify the weights and lower the MSE between the observed and predicted data, as presented in Alshihri et al. [39]. This process was adopted for this investigation: the computation moves forward from the input nodes to the output node, and the error is then propagated backward from the output through the hidden layer to the inputs. The modified weights are generally calculated using the steepest gradient descent principle shown in Equation 8.

$$\Delta w_n = \alpha \, \Delta w_{n-1} - \eta \, \frac{\partial E}{\partial w} \quad (8)$$

where $w$ is the weight between any two nodes, $\Delta w_n$ and $\Delta w_{n-1}$ are the changes in this weight at iterations $n$ and $n-1$, $E$ is the error being minimized, $\alpha$ is the momentum factor, and $\eta$ is the learning rate.
All prediction models were trained with the Levenberg-Marquardt backpropagation function, following Toan & Menhaj [40] and the Marquardt algorithm [41]. In this procedure, 70% of the total dataset is selected at random for training the prediction model, while the remaining 30% is used for testing and validating it. The network consists of one input layer, one hidden layer, and one output layer. The number of neurons in the hidden layer was determined by trial and error to ensure the highest possible performance.
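The weight update with momentum can be illustrated on a one-dimensional problem. The sketch below applies a gradient descent step with momentum to a hypothetical quadratic error E(w) = (w - 3)^2. The learning rate, the momentum factor, and the error function are illustrative assumptions and are not the settings used in this study.

```python
def momentum_step(weight, prev_delta, gradient, learning_rate, momentum):
    """One update: new change = momentum * previous change - rate * gradient."""
    delta = momentum * prev_delta - learning_rate * gradient
    return weight + delta, delta

def minimize(gradient_fn, weight, learning_rate=0.1, momentum=0.5,
             iterations=200):
    """Repeat the momentum update starting from a zero initial change."""
    delta = 0.0
    for _ in range(iterations):
        weight, delta = momentum_step(weight, delta, gradient_fn(weight),
                                      learning_rate, momentum)
    return weight

# Hypothetical error E(w) = (w - 3)^2 with gradient dE/dw = 2 * (w - 3).
final_weight = minimize(lambda w: 2.0 * (w - 3.0), weight=0.0)
print(final_weight)
```

The momentum term reuses part of the previous change, which tends to smooth the search path and can speed up convergence compared with a plain gradient step.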

Developing Estimation Models
To establish the neural network model, this research adopts the process described in Figure 5. First, a dataset that suitably represents the problem domain was assembled and divided into training (70%) and test (30%) sets. Then, a set of hyperparameters was determined for each neural network prediction model. After the model was constructed and its parameters were optimized, several metrics were applied to analyze the accuracy of the algorithm and evaluate it against the code-based technique.
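The random division of the dataset can be sketched as follows. The function name, the fixed seed, and the sample count are hypothetical, and how the held-out 30% is shared between testing and validation is not specified here, so it is left as a single set.

```python
import random

def split_dataset(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into a training set and a held-out set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)         # reproducible shuffle
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    held_out = [samples[i] for i in indices[n_train:]]
    return train, held_out

# Hypothetical dataset of 100 analysis results, identified by index.
samples = list(range(100))
train, held_out = split_dataset(samples)
print(len(train), len(held_out))
```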

Results and Discussions
The first stage of the research was devoted to finite element modeling and nonlinear time history analysis, and the results of these numerical simulations are presented here. The base shear forces of the four structures are plotted in Figure 6. In general, for the bare structures, the heavy story irregularity produced the largest values compared to the regular model. For the isolated structures, the isolator proved effective in reducing the base shear for the three models at 15, 20, and 25% damping ratios. The largest drop in shear force among the four models was obtained for the heavy story model, which reflects the best performance of the isolator, while the soft-story model represents the worst. This is because soft-story irregularity reduces the first-story stiffness, which increases the period of the structure. This increase in period reduces the efficiency of the lead rubber bearing isolator compared to the other irregular models, as highlighted by previous studies [42,43].

Figure 6. Base shear of the investigated structures
On the other hand, implementing a lead rubber isolator reduced the roof acceleration of the four models regardless of the damping ratio (Figure 7). The isolator in the stepped building produced the largest reduction in roof acceleration, showing the highest efficiency among the four structural models. The isolator was least efficient for the soft story irregularity, which can be attributed to the increase in the period of the structure caused by the soft story. Previous studies reached a similar conclusion [42,43].

Figure 7. Roof acceleration of the investigated structures
As presented in Figure 8, the lead rubber bearing systems lowered the roof displacement response at all considered damping ratios. The responses of the base-isolated heavy story and soft story models were close to each other, whereas the regular and stepped models showed much lower values. Moreover, the isolator performed best for all models at a 25% damping ratio. Finally, the ratio of roof displacement in the isolated soft-story building to that of the bare structure was the lowest among all cases.
The selection of input parameters is the key to achieving high accuracy in any prediction model. This study considered three main aspects in selecting the predictors: the investigated earthquake, the structure being evaluated, and the isolator used. Each earthquake is characterized by many parameters that reflect, to some degree, the severity of the event. However, because using all of these parameters as inputs to the prediction models can lead to high computational effort, this study uses only the significant parameters that are correlated with the response being estimated. Accordingly, the correlation coefficient (R), also known as Pearson's correlation coefficient or bivariate correlation, was adopted to describe the strength of the linear association between each earthquake parameter and the corresponding structural response. The associated significance test takes as its null hypothesis that there is no linear correlation between the two variables. The correlation coefficient ranges from -1 to +1, where values close to -1 and +1 indicate a strong linear correlation, as discussed by Asuero et al. [44]. The significance value (P-value) was also determined for each parameter and response. The R and P-value were calculated for each response (base shear, roof acceleration, etc.) with respect to one parameter (fault distance, Arias intensity, etc.) at a time, where the sign of R indicates a positive or negative relationship. A P-value less than or equal to 0.05 is considered significant, and such a parameter must be considered when developing the multiple linear regression and ANN models, whereas parameters with P-values greater than 0.05 are not significant and can be ignored. The results of the correlation tests are given in Table 2.
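The screening described above can be sketched as follows. The correlation coefficient is computed from its definition, while the two-sided P-value is approximated with the Fisher z-transform and a normal distribution. This approximation is an assumption made to keep the example free of external libraries and is not necessarily the test used in this study. The data and function names are hypothetical.

```python
import math
from statistics import NormalDist

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def approx_p_value(r, n):
    """Approximate two-sided P-value via the Fisher z-transform.

    Requires n > 3 and |r| < 1 (atanh is undefined at exactly +/-1).
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def is_significant(x, y, alpha=0.05):
    """Keep a parameter when its P-value is at or below the threshold."""
    return approx_p_value(pearson_r(x, y), len(x)) <= alpha

# Hypothetical earthquake parameter and two hypothetical responses.
parameter = [1, 2, 3, 4, 5, 6, 7, 8]
related = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
unrelated = [5, 1, 4, 2, 6, 3, 5, 2]
print(is_significant(parameter, related), is_significant(parameter, unrelated))
```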
The second aspect, the structural properties, was represented directly by the response of the bare structure corresponding to the variable being estimated. This approach was chosen to overcome the difficulty of representing the capacity of the structural elements, especially in the nonlinear range. Finally, to take the properties of the lead rubber isolation system into account, the damping ratio and effective stiffness were used as inputs to the prediction models. Based on the results in Table 2, the significant independent variables are fault distance, T5, T75, T95, SD, PGD, Vs30, EQ duration, damping ratio, effective stiffness, and base shear of the bare structure. In general, the analysis of variance gave P-values for the considered parameters that differ from those of the correlation analysis in Table 2. This difference arises because the P-value in the correlation test was computed for each response in relation to one parameter at a time, while the analysis of variance computes the P-value of multiple predictors (parameters) in relation to one response, which can remarkably change the degree of significance of a predictor when its influence is evaluated together with other predictors. Moreover, this study found that D5-75 and D5-95 were given a coefficient of zero in several multiple linear regression models even though they were significant, as shown in Table 2. This can be attributed to multicollinearity, which occurs when one or more predictors in the multiple linear regression model are correlated with, and can be calculated directly from, other predictors. Since D5-75 is the difference between T75 and T5 and D5-95 is the difference between T95 and T5, a model that has both T5 and T75, or both T5 and T95, as inputs will drop D5-75 or D5-95 directly.
Generally, removing either D5-75 or D5-95 does not negatively influence the fit of the regression model, as discussed by Vatcheva et al. [45]. Multicollinearity, on the other hand, does not arise in ANN models. However, to be consistent with the multiple linear regression model, this study investigated ANN models with and without D5-75 and D5-95 and found that removing these parameters significantly reduces the prediction accuracy, as presented in Table 3. Hence, all earthquake parameters significant for the response were taken as inputs to the ANNs, including D5-75 and D5-95. An ANN model is typically developed by testing different numbers of hidden layers and neurons, and the setup with the highest correlation coefficient was selected to build the model. Based on Figure 9, using only one hidden layer with 11 neurons to develop the base shear model gives the best R value. Based on Table 4, the correlation coefficient was very high for all stages of developing the ANN model except for some variation in the testing results, even though validation showed high accuracy. This is caused by an outlier in the testing dataset. Overall, however, the model is considered to give good accuracy. The base shear response is provided for both the multiple linear regression and ANN models in Figure 10. The fitting rate of the ANN model is better, and its accuracy is higher, than that of the regression model (Figure 11). The error analysis of the multiple linear regression and ANN models for the base shear is given in Table 5. This analysis indicated the superior capability of the neural network model relative to the regression model: the ANN yielded an R of 0.94 and an MSE of 322.49, compared with an R of 0.75 and an MSE of 1287.27 for the regression.
Finally, a similar observation regarding the capability of ANN compared to multiple linear regression in predicting concrete properties was previously reported [46,47].
As in the previous sections, the Pearson correlation was used to determine the significance of the independent variables. The significant independent variables for the roof acceleration are D5-75, SD, PGA, PGD, PGA/PGV, damping ratio, effective stiffness, and roof acceleration of the bare structure. The selected ANN model for estimating roof acceleration consisted of one hidden layer with 18 neurons and achieved a high R value of 0.98, as illustrated in Figure 12. The correlation coefficient for all datasets in the ANN model for predicting the roof acceleration was remarkably high, as illustrated in Table 6. The fitting rate of the roof acceleration shows good performance for both the multiple linear regression and ANN models (see Figures 13 and 14), although the ANN was more accurate than the linear regression model. The accuracy and efficiency of the two models in predicting the roof acceleration response were examined using the error functions listed in Table 7. On this basis, the ANN model estimated the roof acceleration more accurately than the multiple linear regression model.
The significant variables in estimating the roof displacement based on the correlation test were the fault distance, T5, T75, T95, SD, PGD, PGA/PGV, Vs30, EQ duration, damping ratio, effective stiffness, and roof displacement of the bare structure. The ANN model for estimating roof displacement comprises one hidden layer with 11 neurons. Although 3 hidden layers with 8 neurons provided a slightly higher correlation coefficient, the difference is minimal and can be neglected, especially since using 3 hidden layers requires significantly more computational effort.
The estimation model for the roof displacement response developed via ANN showed high performance, with an R value of 0.95, as seen in Figure 15. The roof displacement response is plotted for the multiple linear regression and ANN models, and the fitting rates show that both models predicted the roof displacement accurately (see Figures 16 and 17), with the ANN model exhibiting the better fit and higher accuracy. As seen in Table 8, the testing and validation accuracies are fairly close, which suggests that all observations lie within the range of the dataset, and the overall R value is considerably high. As in the preceding section, the multiple linear regression and ANN models were compared through error analysis using different error functions. Based on Table 9, both the linear regression and artificial neural network models gave suitable results relative to the measured data.
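The multicollinearity described above can be demonstrated numerically. When a predictor is an exact linear combination of others, as D5-75 = T75 - T5, the matrix of predictor cross-products becomes singular, so the regression coefficients are no longer uniquely defined. The sketch below checks this through the determinant; the duration values are hypothetical and the function names are illustrative.

```python
def gram_matrix(columns):
    """Cross-product matrix X^T X for predictors given as columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                           # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det

# Hypothetical durations (s) for four records.
t5 = [2.0, 3.0, 1.5, 4.0]
t75 = [8.0, 10.0, 6.5, 12.0]
d5_75 = [b - a for a, b in zip(t5, t75)]         # exact linear combination

det_without = determinant(gram_matrix([t5, t75]))
det_with = determinant(gram_matrix([t5, t75, d5_75]))
print(det_without, det_with)
```

The determinant is clearly nonzero for T5 and T75 alone and vanishes, up to rounding, once D5-75 is added, which is consistent with the regression assigning that term a zero coefficient.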
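The error functions used to compare the two models can be sketched as follows. R and MSE are the measures named in the text; the root mean square error and mean absolute error are added as common companions and are assumptions, since the full list of error functions is given only in the tables. The observed and predicted values are hypothetical.

```python
import math

def mse(observed, predicted):
    """Mean square error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square error (assumed additional metric)."""
    return math.sqrt(mse(observed, predicted))

def mae(observed, predicted):
    """Mean absolute error (assumed additional metric)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def correlation_coefficient(observed, predicted):
    """Correlation coefficient R between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical responses from analysis and from a prediction model.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(mse(observed, predicted), correlation_coefficient(observed, predicted))
```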

Conclusion
This study aimed to develop numerical models using multiple linear regression and ANN techniques to predict the response of base-isolated structures considering different ground motions and geometries. Its significance is that it suggests a new way of rapidly estimating the response of base-isolated structures. Numerical simulations showed that the structural performance is considerably improved when a base isolation system is used. In line with the originality of the study, three input groups related to the properties of the selected earthquake, the behavior of the bare structure, and the properties of the isolator were used to achieve high accuracy in predicting the behavior of base-isolated structures. Moreover, all ANN models were optimized for the number of hidden layers and the number of neurons. The efficiency and performance of the multiple linear regression and ANN models in estimating the seismic behavior of the base-isolated structures were investigated through error assessments. The results indicate that both approaches are capable of predicting the response of base-isolated buildings, although the multiple linear regression model reaches lower correlation coefficient values than the ANN for the base shear, roof displacement, and roof acceleration responses. The limitation of this study is that the results were based on 2D frame structures; future investigations need to extend the concept to 3D structures, including buildings and bridges.

Data Availability Statement
The data presented in this study are available on request from the corresponding author.