Bayesian Confidence Interval for Ratio of the Coefficients of Variation of Normal Distributions: A Practical Approach in Civil Engineering

The coefficient of variation (CV) is a useful statistical tool for measuring relative variability, and the ratio of two CVs can be used to compare the dispersion of two populations. In statistics, the Bayesian approach is fundamentally different from the classical approach. In the Bayesian approach, the parameter is a quantity whose variation is described by a probability distribution. This distribution is called the prior distribution and is based on the experimenter’s belief. The prior distribution is updated with sample information using Bayes’ rule. In the classical approach, the parameter is a fixed but unknown quantity that is estimated from the observed values in the sample. Herein, we develop a Bayesian approach to construct the confidence interval for the ratio of CVs of two normal distributions. The efficacy of the Bayesian approach is compared with two existing classical approaches: the generalised confidence interval (GCI) and the method of variance estimates recovery (MOVER) approaches. A Monte Carlo simulation was used to compute the coverage probability (CP) and average length (AL) of the three confidence intervals. The results of the simulation study indicate that the Bayesian approach performed better in terms of the CP and AL. Finally, the Bayesian and the two classical approaches were applied to real data to illustrate their efficacy. This study targets the application of these approaches to classical civil engineering topics. Two real datasets are used: the compressive strength data for the investigated mixes at 7 and 28 days, and the PM2.5 air quality data of two stations in Chiang Mai province, Thailand. The Bayesian confidence intervals performed better than the other confidence intervals for the ratio of CVs of normal distributions.


Introduction
In 2018, the air quality in Thailand was worse than in previous years because smog has steadily increased. Smog is exacerbated by human activities and comprises particulate matter (PM), a complex mixture of particles and liquid droplets. PM is classified by size, and PM2.5 (particles with a diameter of 2.5 µm or less) is of particular concern. The main causes of PM are the exhaust from diesel engines, the burning of biomass, and industrial activity. The World Health Organization has stated that the level of PM2.5 should not exceed 25 µg/m³ over a 24-hour mean period [1], whereas the Pollution Control Department in Thailand has set 50 µg/m³ over a 24-hour mean period as acceptable [2]. However, some air quality monitoring stations in Thailand have recorded levels more than three times higher than this. Excessive exposure to PM2.5 can lead to health problems such as heart and respiratory diseases, so people should avoid outdoor activities and wear a facemask on high-PM2.5 days. The dispersion of PM2.5 levels in different areas can be compared using the coefficient of variation (CV). In civil engineering, compressive strength refers to the ability of a material or structural element to withstand loads tending to reduce its size. The compressive strength is often tested to evaluate whether the actual mix meets the requirements of the design specification. The compressive strength test is conducted at ages of 7, 14, 28, and 56 days. The CV has been used to describe the compressive strength and can be used to compare the variability of the compressive strength of the investigated mixes at two or more different ages. In this study, the Bayesian confidence interval for the ratio of the CVs of normal distributions is evaluated for these two applied problems.
The ratio of the standard deviation and the mean of a population is called the CV, which is a metric that is free of the unit of measurement. The larger the CV value, the greater the dispersion, with the lowest value indicating the lowest risk. The CV has been widely used in many fields, such as atmospheric and medical sciences. For instance, the CV has been used to compare the dispersion of PM2.5 in many areas [3] and blood sample measurements taken from the various laboratories [4]. Several methods for estimating the confidence interval of the CV have been suggested. Vangel [5] analyzed the small-sample distribution of a class of approximate pivotal quantities for the CV of a normal distribution. Tian [6] presented a new approach for making inferences about the common CV of several independent normal distributions using the concepts of generalized variables. Mahmoudvand and Hassani [7] introduced an approximately unbiased estimator for the population CV of a normal distribution. Panichkitkosolkul [8] improved the confidence interval for the CV of a normal distribution by replacing the sample CV in Vangel's confidence interval with the maximum likelihood estimator. Moreover, Wongkhao et al. [9] proposed confidence intervals for the ratio of two independent CVs of normal distributions based on the generalized confidence interval (GCI) and method of variance estimates recovery (MOVER) approaches. Kalkur and Rao [10] provided a Bayesian estimator for the CV and inverse CV of a normal distribution. Thangjai et al. [11] proposed adjusted GCIs for the common CV of several normal distributions. Thangjai et al. [12] presented Bayesian confidence intervals for the means of normal distributions with unknown coefficients of variation. Thangjai et al. [13] presented confidence intervals for the single CV of a normal distribution and the difference between the CVs of two normal distributions.
Statistical inference can be divided into two approaches, the Bayesian approach and the classical approach [14], which differ in both method and analysis. In the Bayesian approach, the parameter is a random variable with a probability distribution. The prior probability is the probability assigned to an event before any evidence is observed. Once the sample is observed, the prior probability is updated with the sample information using Bayes' rule, and the updated probability is called the posterior probability. The posterior distribution is used for constructing the confidence interval of the parameter. The Bayesian approach has been successfully utilised to establish confidence intervals for parameters of interest. For example, Thangjai et al. [12] presented the Bayesian confidence interval for the mean of a normal distribution with an unknown CV and the Bayesian confidence interval for the difference between two means of normal distributions with unknown coefficients of variation. Maneerat and Niwitpong [15] compared medical care costs using Bayesian credible intervals for the ratio of means of delta-lognormal distributions. Maneerat et al. [16] proposed a Bayesian approach to construct the interval estimation for the difference between variances of delta-lognormal distributions. Moreover, Thangjai et al. [13] proposed the Bayesian confidence interval for the CV of a normal distribution and the Bayesian confidence interval for the difference between the CVs of normal distributions. In addition, Thangjai et al. [17] constructed the Bayesian confidence interval for the CV of a log-normal distribution and the Bayesian confidence interval for the difference between CVs of log-normal distributions. In the classical approach, the parameter is an unknown constant, and the interval estimate of the parameter is obtained from the results of experiments.
The classical approach includes many methods, such as the GCI method and the MOVER method. The GCI approach has been widely used to estimate the confidence interval for parameters. For instance, Tian [6] introduced the GCI for constructing the confidence interval for the common CV of normal distributions.
Tian and Wu [18] constructed the confidence interval for the common mean of log-normal distributions using the GCI approach. Ye et al. [19] presented the confidence interval for the common mean of inverse Gaussian distributions based on the GCI approach. Niwitpong and Wongkhao [20] constructed the confidence interval for the difference between the inverse means of normal distributions. Thangjai et al. [21] presented the GCI approach to construct the confidence interval for the mean and difference between means of normal distributions with unknown coefficients of variation. Thangjai et al. [22] proposed the GCI approach to construct the confidence interval for the difference between variances of one-parameter exponential distributions. Thangjai and Niwitpong [23] used the GCI approach to estimate the confidence intervals for the signal-to-noise ratio and difference between signal-to-noise ratios of log-normal distributions. Moreover, the MOVER approach has been widely used to construct the confidence interval for the parameter. For example, Donner and Zou [24] introduced the MOVER approach for constructing the confidence interval for a function of the normal standard deviation. Niwitpong [25] proposed a confidence interval for the difference between CVs of normal distribution with bounded parameters based on the MOVER approach. Wongkhao et al. [26] constructed the confidence intervals for the ratio of CVs of normal distributions. Thangjai et al. [27] proposed the MOVER approach to construct the simultaneous confidence intervals for all differences in CVs of log-normal distributions.
It is of theoretical importance to develop approaches for confidence interval estimation for the ratio of the CVs of two independent normal distributions. In this study, we developed an approach using the concept of Bayesian statistics. The problem of constructing the confidence interval for the CV of a normal distribution based on the Bayesian approach has been considered by many authors (for example, see Kalkur and Rao [10] and Thangjai et al. [13]). In the present study, the Bayesian approach was used to construct the confidence interval for the ratio of CVs of normal distributions, followed by comparing its performance with that of the GCI and MOVER approaches proposed by Wongkhao et al. [9].
This study is organised as follows: In Section 2, the Bayesian confidence interval for the ratio of CVs of normal distributions is presented. In Section 3, simulation results of estimated coverage probabilities (CPs) and average lengths (ALs) are obtained by using Monte Carlo studies. In addition, a Bayesian approach is applied to a real-life dataset. Concluding remarks are summarised in Section 4.

Bayesian Confidence Interval for Ratio of the CVs
Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample of size $n$ from a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$. Also, let $Y = (Y_1, Y_2, \ldots, Y_m)$ be a random sample of size $m$ from a normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2$.
The CV is the ratio of the standard deviation to the mean. The CVs of $X$ and $Y$ are defined as:
$$\tau_X = \frac{\sigma_X}{\mu_X}, \qquad \tau_Y = \frac{\sigma_Y}{\mu_Y}$$
The ratio of CVs is defined as:
$$\theta = \frac{\tau_X}{\tau_Y} = \frac{\sigma_X / \mu_X}{\sigma_Y / \mu_Y}$$
Let $\bar{X}$, $\bar{Y}$, $S_X$, and $S_Y$ be the sample means and sample standard deviations of $X$ and $Y$, respectively. The maximum likelihood estimator of $\theta$ is obtained by:
$$\hat{\theta} = \frac{\hat{\tau}_X}{\hat{\tau}_Y} = \frac{S_X / \bar{X}}{S_Y / \bar{Y}}$$
The confidence interval for the ratio of CVs of normal distributions is constructed using the Bayesian approach. The Bayesian approach derives the posterior probability, which is based on the likelihood function and the prior probability [28]. In this paper, the highest posterior density (HPD) interval is used to construct the Bayesian confidence interval.
Definition: Suppose that $X = (X_1, X_2, \ldots, X_n)$ denotes a random sample. Let $x = (x_1, x_2, \ldots, x_n)$ be the observed value of $X$. Also, let $p(\theta \mid x)$ be a posterior density function. Box and Tiao [29] define a region $\Re$ in the parameter space of $\theta$ to be the HPD region of content $(1 - \alpha)$ if the following two conditions are satisfied:
(i) $P(\theta \in \Re \mid x) = 1 - \alpha$;
(ii) for $\theta_1 \in \Re$ and $\theta_2 \notin \Re$, $p(\theta_1 \mid x) \geq p(\theta_2 \mid x)$.
The HPD interval can be based on various priors, such as the Jeffreys prior and the reference prior. In this paper, the independence Jeffreys prior is used to construct the HPD interval within the Bayesian framework. The independence Jeffreys priors are:
$$p(\mu_X, \sigma_X^2) \propto \frac{1}{\sigma_X^2}$$
and
$$p(\mu_Y, \sigma_Y^2) \propto \frac{1}{\sigma_Y^2}$$
Considering the mean $\mu_X$, the posterior distribution of $\mu_X$ given $\sigma_X^2$ and $x$ is the normal distribution:
$$\mu_X \mid \sigma_X^2, x \sim N\!\left(\bar{x}, \frac{\sigma_X^2}{n}\right) \tag{7}$$
Similarly, the posterior distribution of $\mu_Y$ given $\sigma_Y^2$ and $y$ is the normal distribution:
$$\mu_Y \mid \sigma_Y^2, y \sim N\!\left(\bar{y}, \frac{\sigma_Y^2}{m}\right) \tag{8}$$
Considering the variance $\sigma_X^2$, the posterior distribution of $\sigma_X^2$ given $x$ is the inverse gamma distribution:
$$\sigma_X^2 \mid x \sim IG\!\left(\frac{n-1}{2}, \frac{(n-1)s_X^2}{2}\right) \tag{9}$$
The posterior distribution of $\sigma_Y^2$ given $y$ is the inverse gamma distribution:
$$\sigma_Y^2 \mid y \sim IG\!\left(\frac{m-1}{2}, \frac{(m-1)s_Y^2}{2}\right) \tag{10}$$
The Bayesian approach uses the posterior distribution of the ratio of CVs to construct the confidence interval through Monte Carlo simulation. The posterior distribution of the ratio of CVs of normal distributions is defined as:
$$\theta_{BS} = \frac{\sigma_X / \mu_X}{\sigma_Y / \mu_Y} \tag{11}$$
where $\mu_X$ and $\mu_Y$ are simulated from the posterior distributions defined in Equations 7 and 8, respectively, and $\sigma_X^2$ and $\sigma_Y^2$ are simulated from the posterior distributions defined in Equations 9 and 10, respectively. Therefore, the $100(1 - \alpha)\%$ two-sided confidence interval for the ratio of CVs based on the Bayesian approach is obtained as:
$$CI_{BS} = [L_{BS}, U_{BS}] \tag{12}$$
where $L_{BS}$ and $U_{BS}$ are the lower limit and the upper limit of the shortest $100(1 - \alpha)\%$ highest posterior density interval of $\theta_{BS}$, respectively.
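The posterior simulation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the function names `hpd_interval` and `bayes_cv_ratio_ci`, the default number of draws `q`, and the simulated input data are all hypothetical. The inverse gamma draws are obtained as the scale parameter divided by a gamma variate, and the HPD interval is approximated by the shortest window that contains the required share of sorted draws.

```python
import numpy as np

def hpd_interval(draws, alpha=0.05):
    """Shortest interval containing a (1 - alpha) share of the draws."""
    s = np.sort(np.asarray(draws, dtype=float))
    q = s.size
    k = int(np.ceil((1.0 - alpha) * q))      # draws that must fall inside
    widths = s[k - 1:] - s[:q - k + 1]       # width of every window of k draws
    i = int(np.argmin(widths))               # shortest window starts here
    return float(s[i]), float(s[i + k - 1])

def bayes_cv_ratio_ci(x, y, alpha=0.05, q=20000, seed=None):
    """HPD interval for the ratio of CVs from q posterior draws."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = x.size, y.size
    s2x, s2y = x.var(ddof=1), y.var(ddof=1)
    # sigma^2 | data ~ IG(a, b) is drawn as b / Gamma(shape=a, scale=1)
    sig2x = ((n - 1) * s2x / 2.0) / rng.gamma((n - 1) / 2.0, 1.0, size=q)
    sig2y = ((m - 1) * s2y / 2.0) / rng.gamma((m - 1) / 2.0, 1.0, size=q)
    # mu | sigma^2, data ~ N(sample mean, sigma^2 / sample size)
    mux = rng.normal(x.mean(), np.sqrt(sig2x / n))
    muy = rng.normal(y.mean(), np.sqrt(sig2y / m))
    # Posterior draws of the ratio of CVs
    theta = (np.sqrt(sig2x) / mux) / (np.sqrt(sig2y) / muy)
    return hpd_interval(theta, alpha)

# Hypothetical usage with simulated data (not the data of this study)
demo_rng = np.random.default_rng(0)
x_demo = demo_rng.normal(50.0, 5.0, size=40)
y_demo = demo_rng.normal(50.0, 5.0, size=40)
print(bayes_cv_ratio_ci(x_demo, y_demo, seed=1))
```

The sorted-window rule is a common way to approximate an HPD interval from simulation output when the posterior is unimodal; other HPD estimators could be substituted without changing the sampling step.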
The following algorithm is used to construct the Bayesian confidence interval for the ratio of CVs of normal distributions:

Algorithm 1.
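Step 1: Generate $\sigma_X^2$ and $\sigma_Y^2$ from the posterior distributions in Equations 9 and 10; Step 2: Generate $\mu_X$ and $\mu_Y$ from the posterior distributions in Equations 7 and 8; Step 3: Compute $\theta_{BS}$ from Equation 11; Step 4: Repeat Steps 1 to 3 a total of $q$ times and obtain an array of $\theta_{BS}$ values; Step 5: Compute the lower and upper limits of the shortest $100(1 - \alpha)\%$ highest posterior density interval from the array.
Here, the two approaches of Wongkhao et al. [9] for constructing confidence intervals for the ratio of CVs of normal distributions are briefly discussed. These are the GCI approach and the MOVER approach.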
Firstly, the GCI approach uses the generalized pivotal quantity to construct the confidence interval. The generalized pivotal quantities for the CVs of $X$ and $Y$, denoted $R_{\tau_X}$ and $R_{\tau_Y}$, are defined in Equations 13 and 14, where $\bar{x}$, $\bar{y}$, $s_X$, and $s_Y$ are the observed values of $\bar{X}$, $\bar{Y}$, $S_X$, and $S_Y$, respectively. Also, $t_{n-1}$ and $t_{m-1}$ denote Student's t distributions with $n-1$ and $m-1$ degrees of freedom, respectively. The generalized pivotal quantity for the ratio of CVs is defined as:
$$R_\theta = \frac{R_{\tau_X}}{R_{\tau_Y}} \tag{15}$$
where $R_{\tau_X}$ and $R_{\tau_Y}$ are defined in Equations 13 and 14, respectively.
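The idea of a generalized pivotal quantity can be illustrated with a short Python sketch. It is important to note the assumption: the sketch uses the textbook pivots for a normal mean and variance, built from a chi-square and a standard normal variate, which may differ in form from the quantities in Equations 13 and 14. The function name `gci_cv_ratio` and its defaults are hypothetical.

```python
import numpy as np

def gci_cv_ratio(x, y, alpha=0.05, q=20000, seed=None):
    """Equal-tailed generalized confidence interval for the ratio of CVs.

    Assumed pivots (textbook form, not necessarily that of the source):
      R_sigma2 = (n - 1) s^2 / U,               U ~ chi-square(n - 1)
      R_mu     = xbar - Z * sqrt(R_sigma2 / n), Z ~ N(0, 1)
    """
    rng = np.random.default_rng(seed)

    def gpq_cv(sample):
        n = sample.size
        s2 = sample.var(ddof=1)
        u = rng.chisquare(n - 1, size=q)
        z = rng.standard_normal(q)
        r_sigma2 = (n - 1) * s2 / u
        r_mu = sample.mean() - z * np.sqrt(r_sigma2 / n)
        return np.sqrt(r_sigma2) / r_mu          # pivot for a single CV

    r_theta = gpq_cv(np.asarray(x, dtype=float)) / gpq_cv(np.asarray(y, dtype=float))
    lo, hi = np.quantile(r_theta, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)

# Hypothetical usage with simulated data (not the data of this study)
demo_rng = np.random.default_rng(0)
x_demo = demo_rng.normal(50.0, 5.0, size=30)
y_demo = demo_rng.normal(40.0, 4.0, size=30)
print(gci_cv_ratio(x_demo, y_demo, seed=1))
```

The interval limits are the lower and upper percentiles of the simulated pivot for the ratio, which is the usual way a GCI is read off from Monte Carlo draws.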
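Therefore, the $100(1 - \alpha)\%$ two-sided confidence interval for the ratio of CVs based on the GCI approach is defined as:
$$CI_{GCI} = [L_{GCI}, U_{GCI}] = [R_\theta(\alpha/2), R_\theta(1 - \alpha/2)] \tag{16}$$
where $R_\theta(\alpha/2)$ and $R_\theta(1 - \alpha/2)$ denote the $100(\alpha/2)$-th and $100(1 - \alpha/2)$-th percentiles of $R_\theta$, respectively.
Next, the MOVER approach is considered. The lower and upper limits of the confidence interval for the CV of $X$, denoted $l_X$ and $u_X$, are defined in Equations 17 and 18. Similarly, the lower and upper limits of the confidence interval for the CV of $Y$, denoted $l_Y$ and $u_Y$, are defined in Equations 19 and 20. The lower and upper limits of the confidence interval for the ratio of CVs, denoted $L_{MOVER}$ and $U_{MOVER}$, are defined in Equations 21 and 22 in terms of $\hat{\tau}_X = S_X / \bar{X}$, $\hat{\tau}_Y = S_Y / \bar{Y}$, $l_X$, $u_X$, $l_Y$, and $u_Y$. Therefore, the $100(1 - \alpha)\%$ two-sided confidence interval for the ratio of CVs based on the MOVER approach is defined as:
$$CI_{MOVER} = [L_{MOVER}, U_{MOVER}] \tag{23}$$
where $L_{MOVER}$ and $U_{MOVER}$ are defined in Equations 21 and 22, respectively.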
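To make the MOVER step concrete, the following Python sketch combines separate limits for two parameters into limits for their ratio. It assumes the general closed-form MOVER ratio limits commonly attributed to Donner and Zou, which may not match Equations 21 and 22 term by term. The function name `mover_ratio_ci` and the numerical inputs in the usage lines are hypothetical, and the single-CV limits are taken as given inputs.

```python
import math

def mover_ratio_ci(tau_x, tau_y, l_x, u_x, l_y, u_y):
    """MOVER limits for tau_x / tau_y from point estimates and separate limits.

    Assumed form (general MOVER ratio limits):
      L = [ab - sqrt((ab)^2 - l_x u_y (2a - l_x)(2b - u_y))] / [u_y (2b - u_y)]
      U = [ab + sqrt((ab)^2 - u_x l_y (2a - u_x)(2b - l_y))] / [l_y (2b - l_y)]
    with a = tau_x and b = tau_y.
    """
    ab = tau_x * tau_y
    # max(..., 0.0) guards against tiny negative values from rounding
    lower_disc = max(ab ** 2 - l_x * u_y * (2 * tau_x - l_x) * (2 * tau_y - u_y), 0.0)
    upper_disc = max(ab ** 2 - u_x * l_y * (2 * tau_x - u_x) * (2 * tau_y - l_y), 0.0)
    lower = (ab - math.sqrt(lower_disc)) / (u_y * (2 * tau_y - u_y))
    upper = (ab + math.sqrt(upper_disc)) / (l_y * (2 * tau_y - l_y))
    return lower, upper

# Hypothetical inputs: both sample CVs equal 0.10 with limits [0.08, 0.13]
print(mover_ratio_ci(0.10, 0.10, 0.08, 0.13, 0.08, 0.13))
```

When the separate limits collapse onto the point estimates, both ratio limits reduce to the plain ratio of the estimates, which is a quick sanity check on the formula.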

Simulation Study
A Monte Carlo simulation study was carried out to analyze the performance of the Bayesian confidence interval and to compare it with two classical confidence intervals, the GCI and MOVER intervals, for the ratio of CVs of two normal distributions. The CP and the AL were computed for each confidence interval. Table 1 summarizes the empirical CPs and ALs of the GCI, MOVER, and Bayesian confidence intervals. According to Thangjai et al. [30], the best confidence interval has a CP in the range [0.9440, 0.9560] at the 95% confidence level together with the shortest AL. The results show that the CPs of all three confidence intervals are in the range [0.9440, 0.9560]. Moreover, only the Bayesian confidence interval provided CPs greater than the nominal confidence level of 0.95 for all cases. In addition, the ALs of the Bayesian confidence interval were shorter than those of the others. Therefore, the results indicate that the Bayesian approach performed satisfactorily for small, moderate, and large sample sizes.
In this study, the Bayesian approach performs satisfactorily for constructing the confidence interval for the ratio of CVs of normal distributions. This is similar to the results of Thangjai et al. [12,13,17], Maneerat and Niwitpong [15], and Maneerat et al. [16]. Furthermore, the GCI and MOVER approaches did not perform as well as the Bayesian approach for constructing the confidence intervals for the ratio of CVs of normal distributions. However, the GCI and MOVER approaches are recommended for constructing the simultaneous confidence intervals for all differences in means of normal distributions with unknown CVs [31].
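The way the CP and AL are estimated can be sketched in Python. This is an illustration under assumptions, not the simulation code of the study: the parameter values, sample sizes, and number of replications are hypothetical, and the interval being evaluated is a simple equal-tailed interval from posterior draws, used here only as a stand-in for any of the three methods.

```python
import numpy as np

def posterior_ratio_interval(x, y, rng, alpha=0.05, q=2000):
    """Stand-in interval: equal-tailed limits from posterior draws of the ratio."""
    def cv_draws(sample):
        n = sample.size
        # (n - 1) s^2 / chi-square(n - 1) is an inverse gamma draw of sigma^2
        sig2 = (n - 1) * sample.var(ddof=1) / rng.chisquare(n - 1, size=q)
        mu = rng.normal(sample.mean(), np.sqrt(sig2 / n))
        return np.sqrt(sig2) / mu
    theta = cv_draws(x) / cv_draws(y)
    lo, hi = np.quantile(theta, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)

def coverage_and_length(interval_fn, n, m, mu_x, sigma_x, mu_y, sigma_y,
                        reps=1000, seed=0):
    """Estimate the coverage probability and average length of interval_fn."""
    rng = np.random.default_rng(seed)
    true_theta = (sigma_x / mu_x) / (sigma_y / mu_y)
    hits, total_length = 0, 0.0
    for _ in range(reps):
        x = rng.normal(mu_x, sigma_x, size=n)
        y = rng.normal(mu_y, sigma_y, size=m)
        lo, hi = interval_fn(x, y, rng)
        hits += int(lo <= true_theta <= hi)   # interval covers the true ratio
        total_length += hi - lo
    return hits / reps, total_length / reps

# Hypothetical setting: n = m = 20, both populations with CV = 0.1
cp, al = coverage_and_length(posterior_ratio_interval, 20, 20,
                             50.0, 5.0, 50.0, 5.0, reps=300, seed=1)
print(f"CP = {cp:.4f}, AL = {al:.4f}")
```

The CP is the share of replications whose interval contains the true ratio, and the AL is the mean of the interval widths; any other interval function with the same signature can be passed in for comparison.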

Empirical Application
Example 1: This example uses the datasets of compressive strength for the investigated mixes at 7 and 28 days obtained from Ali et al. [32].
Example 2: Figure 3 presents the geographical regions of Thailand [33]. These are the north, northeast, central, east, west, and south regions. Northern Thailand is a region with frequent air pollution problems.

Figure 3. Geographical regions of Thailand [33]
Chiang Mai province is the hub of northern Thailand. This example focuses on Mueang Chiang Mai district, which is the capital district of Chiang Mai province, because PM2.5 has been a seasonal problem in the north of Thailand, including Chiang Mai province. PM2.5 pollution occurs from January to April, and extremely dry conditions increase the magnitude of forest fires in March. Real PM2.5 data are used to illustrate the GCI, MOVER, and Bayesian approaches. The data were reported by the Regional Environment Office 1 (http://www.reo01.mnre.go.th).
The PM2.5 data of two stations in Mueang Chiang Mai district, Chiang Mai province, from 1 March 2018 to 30 April 2018 are reported in Table 3. The data sets consist of 54 measurements in Tambon [...]. The Shapiro-Wilk normality test gave p-values of 0.08361 and 0.1077 for the Tambon Chang Phueak and Tambon Sri Phum stations, respectively. Since normality is not rejected at the 5% significance level for either station, both data sets are consistent with the normal distribution. The ratio of CVs computed from the data was 1.0118. The data were used to establish the confidence interval for the ratio of CVs using the GCI, MOVER, and Bayesian approaches. Firstly, the 95% two-sided GCI was [0.7471, 1.3850] with an interval length of 0.6379. In addition, the 95% two-sided MOVER confidence interval was [0.7447, 1.3771] with an interval length of 0.6324. Finally, the 95% two-sided Bayesian confidence interval was [0.7207, 1.3306] with an interval length of 0.6099. All three confidence intervals contain the computed ratio of CVs. The results agree with the simulation results in the previous section in that the Bayesian confidence interval has the shortest length. Therefore, the Bayesian approach is recommended for establishing the confidence interval for the ratio of CVs of normal distributions.
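The interval lengths reported above can be checked with a few lines of Python. The limits and the ratio are copied from the text; the dictionary layout and variable names are only illustrative.

```python
# Reported 95% limits for the ratio of CVs (values taken from the text)
intervals = {
    "GCI": (0.7471, 1.3850),
    "MOVER": (0.7447, 1.3771),
    "Bayesian": (0.7207, 1.3306),
}
ratio = 1.0118  # ratio of CVs computed from the PM2.5 data

for name, (lo, hi) in intervals.items():
    inside = lo <= ratio <= hi
    print(f"{name}: length = {hi - lo:.4f}, contains ratio = {inside}")

# Method with the shortest interval
shortest = min(intervals, key=lambda k: intervals[k][1] - intervals[k][0])
print("Shortest interval:", shortest)
```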

Conclusion
The CV is used to measure the dispersion of a probability distribution. A lower CV indicates lower dispersion or lower risk, whereas a larger CV indicates greater dispersion or greater risk. It is commonly used in many fields, such as engineering and environmental science. The ratio of CVs is used to compare the dispersion of two populations. This paper considers the ratio of CVs of two normal populations using engineering and environmental data. For civil engineering data, the ratio of the CVs is used to compare the dispersion of the compressive strength at the ages of 7 and 28 days. Moreover, the ratio of the CVs is evaluated to describe the spread of PM2.5 data for the Tambon Chang Phueak and Tambon Sri Phum stations in Mueang Chiang Mai district, Chiang Mai province. In practice, behavioural models have more often been derived by using the classical approach rather than the Bayesian approach. This paper is an extension of previous works by Wongkhao et al. [9] and Thangjai et al. [13]. Wongkhao et al. [9] proposed the classical GCI and MOVER approaches for constructing the confidence intervals for the ratio of CVs of two normal distributions. Moreover, Thangjai et al. [13] proposed the Bayesian approach for the CV and the difference of CVs of normal distributions. Therefore, we proposed a Bayesian approach for confidence interval estimation for the ratio of the CVs of two normal distributions. The Bayesian approach is compared with the existing classical approaches: the GCI and MOVER approaches. The Bayesian and GCI approaches rely on simulation in software packages to estimate the confidence intervals, whereas the MOVER approach uses a closed-form formula to construct the confidence interval. The simulation results indicate that the Bayesian approach performed better than the classical approaches, and it is thus recommended for constructing the confidence interval for the ratio of the CVs of two normal populations.
The results of this investigation were similar to those of Thangjai and Niwitpong [3] and Thangjai et al. [12,13]. Further research will consider other approaches for comparison.

Data Availability Statement
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest
The authors declare no conflict of interest.