Bayesian Inference of a Finite Mixture of Inverse Weibull Distributions with an Application to Doubly Censoring Data

Families of mixture distributions have a wide range of applications in fields such as fisheries, agriculture, botany, economics, medicine, psychology, electrophoresis, finance, communication theory, geology and zoology. They provide the flexibility needed to model failure distributions of components with multiple failure modes. The Bayesian procedure for estimating the parameters of mixture models has mostly been described under Type-I censoring; Bayesian analysis of mixture models under doubly censored samples has not yet been considered in the literature. The main objective of this paper is to develop Bayes estimation for the inverse Weibull mixture distribution under double censoring. Posterior estimation is conducted under gamma and inverse Levy priors using the precautionary loss function and the weighted squared error loss function. The estimators are compared on the basis of simulated and real-life data sets.


Introduction
In survival analysis, data are often subject to censoring. The most common type is right censoring, in which the survival time is known only to be larger than the observed right-censoring time. In some cases, however, data are subject to left as well as right censoring. When left censoring occurs, the only information available to the analyst is that the survival time is less than or equal to the observed left-censoring time. A more complex censoring scheme arises when both the initial and final times are interval-censored. This situation is referred to as double censoring, and data with both right- and left-censored observations are known as doubly censored data.
Analysis of doubly censored data for a simple (single) distribution has been studied by many authors. Fernandez (2000) investigated maximum likelihood prediction based on type II doubly censored exponential data. Fernandez (2006) discussed Bayesian estimation based on trimmed samples from Pareto populations. Khan et al. (2010) studied predictive inference from a two-parameter Rayleigh life model given a doubly censored sample. Kim and Song (2010) discussed Bayesian estimation of the parameters of the generalized exponential distribution from doubly censored samples. Khan et al. (2011) carried out a sensitivity analysis of predictive modeling for responses from the three-parameter Weibull model with a follow-up doubly censored sample of cancer patients. Pak et al. (2013) proposed estimation of the Rayleigh scale parameter under doubly type-II censoring from imprecise data.
In statistics, a mixture distribution is defined as a convex combination of other probability distributions. It can be used to model a statistical population with subpopulations, where the component densities of the mixture are the densities of the subpopulations. Soliman (2006) derived estimators for the finite mixture of Rayleigh model based on progressively censored data. Some properties of the mixture of two inverse Weibull distributions have also been discussed in the literature. Saleem and Aslam (2008) presented a comparison of the maximum likelihood (ML) estimates with the Bayes estimates assuming the uniform and the Jeffreys priors for the parameters of the Rayleigh mixture. Kundu and Howalder (2010) considered Bayesian inference and prediction for the inverse Weibull distribution with type-II censored data. Saleem et al. (2010) considered the Bayesian analysis of the mixture of power function distributions using complete and censored samples. Empirical Bayes estimates for the two-parameter exponential distribution under type I censoring have also been studied. Eluebaly and Bouguila (2011) presented a Bayesian approach to analyzing finite generalized Gaussian mixture models, which incorporate several standard mixtures widely used in signal and image processing applications, such as the Laplace and Gaussian mixtures. Sultan and Moisheer (2012) developed approximate Bayes estimation of the parameters and reliability function of a mixture of two inverse Weibull distributions under type-II censoring.
The article is organized as follows. In section 2, we define the mixture model, the sampling scheme and the likelihood function of the inverse Weibull model. In section 3, the posterior distributions are derived under different priors. Expressions for the Bayes estimators and the corresponding posterior risks are derived in section 4. Elicitation of hyperparameters via the prior predictive approach is discussed in section 5. The simulation study and a comparison of the estimates are given in section 6. A real data set illustrating the methodology of the proposed mixture model is discussed in section 7. Some concluding remarks close the paper.

The Model and Likelihood Function
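The component lifetimes are assumed to follow the inverse Weibull distribution, which is specified through its probability density function (pdf) and cumulative distribution function (CDF). The CDF of the mixture model combines the two component CDFs with mixing proportions p_1 and (1 - p_1). Assuming the shape parameters to be known, the likelihood function for the doubly censored sample is then obtained. Under the gamma priors for the rate parameters and the beta prior for p_1, multiplying equation (10) by equation (7) gives the joint posterior density of the parameter vector given the data.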
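A sketch of the model in one common parameterization may help fix ideas. It is not necessarily the form the authors use: the rate symbol \lambda_i, the order-statistic notation x_{(r)}, ..., x_{(s)}, and the pooled-sample form of the doubly type II censored likelihood are all assumptions made for illustration, and the paper's own likelihood may instead track the two components separately.

```latex
% Component density and CDF (rate \lambda_i > 0, known shape \theta_i > 0), for x > 0
f_i(x) = \lambda_i \theta_i \, x^{-(\theta_i + 1)} \exp\!\left(-\lambda_i x^{-\theta_i}\right),
\qquad
F_i(x) = \exp\!\left(-\lambda_i x^{-\theta_i}\right), \quad i = 1, 2.

% Two-component mixture with mixing proportion p_1
f(x) = p_1 f_1(x) + (1 - p_1) f_2(x),
\qquad
F(x) = p_1 F_1(x) + (1 - p_1) F_2(x).

% Generic doubly type II censored likelihood:
% only the order statistics x_{(r)} \le \dots \le x_{(s)} are observed
L(\lambda_1, \lambda_2, p_1 \mid \mathbf{x}) \propto
\left[F(x_{(r)})\right]^{r-1}
\left[1 - F(x_{(s)})\right]^{n-s}
\prod_{i=r}^{s} f(x_{(i)}).
```

The first factor accounts for the r - 1 left-censored units, the second for the n - s right-censored units, and the product for the fully observed lifetimes.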

Bayesian Estimation using Inverse Levy Prior
The prior for each rate parameter, i = 1, 2, is assumed to be the inverse Levy distribution with hyperparameter v_i. The prior for p_1 is the beta distribution. Multiplying equation (14) by equation (7) gives the joint posterior density of the parameter vector given the data.
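The sketch below shows one form of an inverse Levy prior, stated here as an assumption rather than as the authors' equation. The rate symbol \lambda_i is a placeholder, and the beta hyperparameters are written c_2 and d_2 to match the names used in the elicitation section. Under this form the prior coincides with a gamma density with shape 1/2 and rate v_i / 2.

```latex
% Assumed inverse Levy prior for each rate parameter, \lambda_i > 0, v_i > 0
\pi(\lambda_i) = \sqrt{\frac{v_i}{2\pi}} \; \lambda_i^{-1/2}
\exp\!\left(-\frac{v_i \lambda_i}{2}\right), \quad i = 1, 2.

% Beta prior for the mixing proportion, 0 < p_1 < 1
\pi(p_1) = \frac{p_1^{\,c_2 - 1} (1 - p_1)^{d_2 - 1}}{B(c_2, d_2)}.

% Joint posterior, assuming independent priors
\pi(\lambda_1, \lambda_2, p_1 \mid \mathbf{x}) \propto
L(\lambda_1, \lambda_2, p_1 \mid \mathbf{x}) \,
\pi(\lambda_1) \, \pi(\lambda_2) \, \pi(p_1).
```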

Bayes Estimation of the Vector of Parameters 
Bayesian point estimation is tied to a loss function, which quantifies the loss incurred when the estimate differs from the true parameter. It is often noticed that in some situations the Bayes estimate under another loss function works better than the Bayes estimate under the true loss function, when a true loss function exists. Since there is no specific rule for identifying the appropriate loss function, more than one loss function is considered.
Two loss functions are used: the precautionary loss function (PLF) and the weighted squared error loss function (WSELF). In this section, the marginal posterior distribution of each parameter is used to derive the Bayes estimators and posterior risks of the two rate parameters and p_1 under PLF and WSELF, for the conjugate (gamma and beta) priors.
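For reference, the two loss functions are commonly written as below, together with the Bayes estimator and posterior risk that follow from minimizing the posterior expected loss. The notation (a generic parameter \lambda, an estimate d, and posterior expectations E[\cdot \mid \mathbf{x}]) is assumed here and may differ from the authors' notation.

```latex
% Precautionary loss function (PLF)
L_{\mathrm{PLF}}(\lambda, d) = \frac{(d - \lambda)^2}{d},
\qquad
\hat{\lambda}_{\mathrm{PLF}} = \sqrt{E[\lambda^2 \mid \mathbf{x}]},
\qquad
\rho_{\mathrm{PLF}} = 2\left\{ \sqrt{E[\lambda^2 \mid \mathbf{x}]} - E[\lambda \mid \mathbf{x}] \right\}.

% Weighted squared error loss function (WSELF)
L_{\mathrm{WSELF}}(\lambda, d) = \frac{(\lambda - d)^2}{\lambda},
\qquad
\hat{\lambda}_{\mathrm{WSELF}} = \frac{1}{E[\lambda^{-1} \mid \mathbf{x}]},
\qquad
\rho_{\mathrm{WSELF}} = E[\lambda \mid \mathbf{x}] - \frac{1}{E[\lambda^{-1} \mid \mathbf{x}]}.
```

Both results come from setting the derivative of the posterior expected loss with respect to d to zero; for PLF, for example, the expected loss is d - 2E[\lambda \mid \mathbf{x}] + E[\lambda^2 \mid \mathbf{x}]/d.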
Under the gamma prior, four sets of expressions are obtained for the two rate parameters and p_1: the Bayes estimators under PLF, the posterior risks under PLF, the Bayes estimators under WSELF, and the posterior risks under WSELF.
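To illustrate how such estimators behave, the following minimal sketch evaluates the PLF and WSELF estimators and risks for a single gamma-distributed posterior. This is a deliberate simplification: the marginal posteriors arising from a censored mixture likelihood are generally not a single gamma density. The function name `gamma_posterior_estimates` and the shape/rate values are hypothetical.

```python
import math


def gamma_posterior_estimates(shape, rate):
    """Bayes estimates and posterior risks under PLF and WSELF.

    Assumes the posterior of the parameter is Gamma(shape, rate),
    an illustrative simplification rather than the paper's posterior.
    """
    if shape <= 1 or rate <= 0:
        raise ValueError("need shape > 1 (so that E[1/x] exists) and rate > 0")
    m1 = shape / rate                      # posterior mean E[x]
    m2 = shape * (shape + 1) / rate ** 2   # second moment E[x^2]
    m_inv = rate / (shape - 1)             # inverse moment E[1/x]
    plf_estimate = math.sqrt(m2)
    wself_estimate = 1.0 / m_inv
    return {
        "plf_estimate": plf_estimate,
        "plf_risk": 2.0 * (plf_estimate - m1),
        "wself_estimate": wself_estimate,
        "wself_risk": m1 - wself_estimate,
    }


# Hypothetical posterior with shape 5 and rate 2
result = gamma_posterior_estimates(shape=5.0, rate=2.0)
```

For this hypothetical posterior the WSELF estimate is 2.0 with posterior risk 0.5, while the PLF estimate lies above the posterior mean of 2.5, which reflects the precautionary penalty on underestimation.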

Elicitation
In Bayesian analysis, the elicitation of expert opinion is a crucial step. It makes clear what the experts believe and how strongly. In statistical inference, the characteristics of a predictive distribution proposed by an expert determine the hyperparameters of a prior distribution. In this article, we focus on a method of elicitation based on the prior predictive distribution. Eliciting the hyperparameters of the prior directly is a difficult task. Instead, the prior predictive distribution is compared with the experts' judgment about this distribution, and the hyperparameters are chosen so that the judgment agrees as closely as possible with a member of the prior predictive family. Readers desiring more detail may refer to the works of Grimshaw and of Aslam (2003). The prior predictive distributions under all the priors are derived by integrating the sampling density against the prior.
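In general terms, the prior predictive density of a future observation y is the sampling density averaged over the prior. The notation below (\lambda_1, \lambda_2 for the rate parameters and \pi for the prior densities) is assumed for illustration, with the three priors taken to be independent.

```latex
p(y) = \int_0^1 \int_0^\infty \int_0^\infty
f(y \mid \lambda_1, \lambda_2, p_1) \,
\pi(\lambda_1) \, \pi(\lambda_2) \, \pi(p_1) \,
d\lambda_1 \, d\lambda_2 \, dp_1 .
```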

Elicitation under gamma distribution
The prior predictive distribution under the gamma prior is derived first. We have assumed (θ_1, θ_2) = (1, 1) for convenience in calculations. For the elicitation of the six hyperparameters, six different intervals are considered. From equation (16), the experts' probabilities are assumed to be 0.10 for each interval, which gives six integrals for equation (19). Under the inverse Levy prior, four hyperparameters have to be elicited, so four integrals are considered. The expert probabilities are assumed to be 0.15 for each integral, with the following limits for the random variable Y: (0, 15), (15, 30), (30, 45) and (45, 60). Using a similar kind of program, we obtain the hyperparameter values v_1 = 0.062138, v_2 = 0.19136, c_2 = 0.895777 and d_2 = 0.63889.
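The matching of interval probabilities can be sketched as follows. The sketch assumes unit shape parameters, independent Gamma(a_i, rate b_i) priors on the rate parameters and a Beta(c, d) prior on p_1; under those assumptions the prior predictive CDF has the closed form used below, which is derived here and is not quoted from the paper. All function names and the hyperparameter values are hypothetical, and the squared-error discrepancy is only one possible matching criterion.

```python
def prior_predictive_cdf(y, a1, b1, a2, b2, c, d):
    """P(Y <= y) under the assumed gamma/beta priors with unit shapes.

    Uses E[exp(-lam / y)] = (b / (b + 1/y)) ** a for lam ~ Gamma(a, rate b)
    and E[p1] = c / (c + d) for p1 ~ Beta(c, d).
    """
    if y <= 0:
        return 0.0
    w = c / (c + d)
    g1 = (b1 / (b1 + 1.0 / y)) ** a1
    g2 = (b2 / (b2 + 1.0 / y)) ** a2
    return w * g1 + (1.0 - w) * g2


def interval_probability(lo, hi, **hyper):
    """Prior predictive probability that Y falls in (lo, hi]."""
    return prior_predictive_cdf(hi, **hyper) - prior_predictive_cdf(lo, **hyper)


def elicitation_discrepancy(intervals, expert_probs, **hyper):
    """Sum of squared gaps between model and expert interval probabilities."""
    return sum(
        (interval_probability(lo, hi, **hyper) - p) ** 2
        for (lo, hi), p in zip(intervals, expert_probs)
    )


# Hypothetical hyperparameters and expert assessments
hyper = dict(a1=2.0, b1=0.5, a2=3.0, b2=0.2, c=2.0, d=3.0)
intervals = [(0, 15), (15, 30), (30, 45), (45, 60)]
expert_probs = [0.15, 0.15, 0.15, 0.15]
probs = [interval_probability(lo, hi, **hyper) for lo, hi in intervals]
gap = elicitation_discrepancy(intervals, expert_probs, **hyper)
```

Hyperparameters would then be chosen to make `gap` as small as possible, for example with a numerical optimizer. Note that under these assumptions c and d enter only through the ratio c / (c + d), so interval probabilities alone cannot separate them.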

Simulation Study and Comparisons
This section presents a simulation study comparing the performance of the estimators on samples generated from the inverse Weibull mixture distribution under double censoring. We have assumed (θ_1, θ_2) = (1, 1) for convenience in calculations. We take random samples of sizes n = 20, 40 and 80 from the two-component mixture of inverse Weibull distributions. To generate mixture data we adopt probabilistic mixing with probabilities p_1 and (1 - p_1). A uniform random number u is generated n times; if u < p_1 the observation is drawn from F_1 (the inverse Weibull distribution with the first rate parameter), otherwise from F_2 (the inverse Weibull distribution with the second rate parameter). Hence the parameters to be estimated are the two rate parameters and p_1. The censoring times are chosen so that the censoring rate in the resulting sample is approximately 20%. The simulated data sets have been obtained using the following steps:
Step 1: Draw samples of size n from the mixture model.
Step 2: Generate a uniform random number u for each observation.
Step 3: If u < p_1, take the observation from the first subpopulation, otherwise from the second subpopulation.
Step 4: Determine the test termination points on the left and right, that is, determine the values of x_r and x_s.
Step 5: Treat the observations that are less than x_r or greater than x_s as censored in each component.
The simulation study has revealed some interesting properties of the Bayes estimates. In each case the posterior risks of the estimates of the lifetime parameters decrease as the sample size increases. For relatively small values of the rate parameters, i.e. (0.1, 0.15), the precautionary loss function and the gamma prior perform better than their counterparts, since their posterior risks are smaller.
However, the inverse Levy prior produces estimates somewhat closer to the true parameter values. The mixing proportion is underestimated under the inverse Levy prior when p_1 = 0.45, but is estimated well under the gamma prior. For comparatively large values of the rate parameters, i.e. (10, 15), the parameters are again underestimated under both priors and both loss functions, and the extent of underestimation is greater under the precautionary loss function with the gamma prior. Nonetheless, this underestimation is due to random variation and is tolerable; it can be reduced by using larger sample sizes. As far as the efficiency of the priors is concerned, the gamma prior is found to be more efficient than the inverse Levy prior. Moreover, when the two rate parameters are extremely different, i.e. (0.1, 15) and (10, 0.15), where one is small and the other is about a hundred times larger, the parameters are once again underestimated, and the underestimation is greater at every point under the precautionary loss function for both priors. However, the weighted squared error loss function yields good estimates, with few exceptions, in terms of convergence. In general, the estimates under the gamma prior using the precautionary loss function are the best, as their posterior risks are the smallest in almost all cases.
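The data-generation steps described earlier in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the inverse Weibull CDF is taken as exp(-rate * x ** (-shape)), sampling is by inversion, and censoring is applied symmetrically to the pooled sorted sample (half of the censored fraction on each side) rather than within each component. The function names and the symmetric split are hypothetical choices.

```python
import math
import random


def sample_inverse_weibull(rate, shape, rng):
    """Draw one value by inverting F(x) = exp(-rate * x ** (-shape))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return (-rate / math.log(u)) ** (1.0 / shape)


def simulate_doubly_censored(n, p1, rate1, rate2, shape1=1.0, shape2=1.0,
                             censor_frac=0.2, seed=None):
    """Generate a two-component mixture sample and censor both tails."""
    if not 0.0 <= censor_frac < 1.0:
        raise ValueError("censor_frac must be in [0, 1)")
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Probabilistic mixing: component 1 with probability p1
        if rng.random() < p1:
            data.append((sample_inverse_weibull(rate1, shape1, rng), 1))
        else:
            data.append((sample_inverse_weibull(rate2, shape2, rng), 2))
    data.sort()
    k = int(round(n * censor_frac / 2))  # units censored on each side
    r, s = k + 1, n - k                  # 1-based first/last observed ranks
    observed = data[r - 1:s]
    return {
        "observed": observed,            # (lifetime, component) pairs
        "x_r": observed[0][0],           # left termination point
        "x_s": observed[-1][0],          # right termination point
        "n_left": r - 1,
        "n_right": n - s,
    }


sim = simulate_doubly_censored(n=20, p1=0.45, rate1=0.1, rate2=0.15, seed=1)
```

With n = 20 and a 20% censoring rate, two units are censored on each side and 16 lifetimes remain observed.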

Real Data Analysis
In this section, we analyze a real data set to illustrate the methodology discussed in the previous sections. To show the usefulness of the proposed mixture model, we apply the findings of this paper to the survival times (in days) of guinea pigs injected with different doses of tubercle bacilli, given in table 9. This data set has been discussed and analyzed previously in the literature. The regimen number is the common logarithm of the number of bacillary units in 0.5 ml of challenge solution; i.e., regimen 6.6 corresponds to 4.0 × 10^6 bacillary units per 0.5 ml. Corresponding to regimen 6.6, there are 72 observations. We further used the Kolmogorov-Smirnov and chi-square tests to check whether the data follow the inverse Weibull distribution. Neither test rejects the inverse Weibull distribution at the 5% level of significance, with p-values 0.1361 and 0.1290 respectively. We have assumed (θ_1, θ_2) = (1, 1) for convenience in calculations.


Bayes estimates are obtained assuming informative priors under the precautionary loss function and the weighted squared error loss function.
The results in table 11 indicate that the Bayes estimates under the gamma prior are better than those under the inverse Levy prior for both loss functions. Similarly, a comparison of the loss functions shows that the precautionary loss function performs better than the weighted squared error loss function. Larger values of the mixing parameter (p_1) have a positive impact on the estimation of the first component of the mixture. Hence the analysis of real-life data endorses the findings of the simulation study, suggesting a preference for the gamma prior together with the precautionary loss function.

Graphical Representation of Posterior Risks under Different Loss Functions and Various Priors
The risks of the estimators are evaluated empirically in a Monte Carlo simulation study. A number of values of the unknown parameters are considered. The sample size is varied to observe the effect of small and large samples on the estimators, and different combinations of parameters are considered to study the change in the estimators and their risks. The results are summarized in figures 1-4. The risk of an estimator is a function of the sample size, the population parameters and the hyperparameters of the prior distribution. After an extensive study of the results, conclusions are drawn regarding the behavior of the estimators, which are summarized below. Because of space restrictions, not all results are shown in the graphs. As the sample size increases, the risks of all the estimators decrease; see figures 1-4. The effect of varying the parameters on the risks of the estimators has also been studied: the risks of the estimators increase as the parameter values increase.

Conclusion
In this article, we have considered Bayesian inference for the inverse Weibull mixture distribution based on doubly type II censored data. Prior beliefs about the model are represented by independent gamma and beta priors, and by independent inverse Levy and beta priors, on the scale and mixing proportion parameters. Numerical results of the simulation study presented in tables 1-