Inference for the Shape Parameter of Lognormal Distribution in Presence of Fuzzy Data

Traditional statistical analyses of the lognormal distribution have been proposed for precisely defined crisp data. However, there are many situations in which measurement results from continuous quantities are not precise numbers but more or less fuzzy. This article presents statistical inference on the shape parameter of the lognormal distribution for experiments whose observations are described in terms of fuzzy data. The maximum likelihood procedure is developed for estimating the unknown parameter. The asymptotic distribution of the maximum likelihood estimator is used to construct an approximate confidence interval. Also, the Bayes estimate and the corresponding highest posterior density credible interval of the unknown parameter are obtained by using a Markov Chain Monte Carlo technique. In addition, we describe an estimation method based on moments of the lognormal distribution. Extensive simulations are performed to compare the performances of the different proposed methods.


Introduction
The lognormal distribution is one of the distributions commonly used for modeling lifetimes or reaction times, and is particularly useful for modeling data which are long-tailed and positively skewed. It has been discussed extensively by many authors, including Johnson et al. (1994) and Rukhin (1984). Let $Y$ be the original lifetime variable that follows a lognormal distribution with parameters $\mu$ and $\sigma$. The density of $Y$ is given by
$$f(y; \mu, \sigma) = \frac{1}{y \sigma \sqrt{2\pi}} \exp\left( -\frac{(\log y - \mu)^2}{2\sigma^2} \right), \quad y > 0, \qquad (1)$$
where $\mu$ and $\sigma$ are the scale and shape parameters, respectively. In the analysis of lifetime data, it is more convenient to work with the equivalent model for the log-lifetimes. Consider the random variable $X = \log Y$. Then the variable $X$ is normally distributed with density
$$f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), \quad -\infty < x < \infty, \ \sigma > 0. \qquad (2)$$
Several authors have addressed inferential issues for the scale and shape parameters of the lognormal distribution. Among others, Basak et al. (2009) considered the estimation of the parameters of the lognormal distribution from progressively censored data. Inference for lognormal data with left truncation and right censoring is studied by Balakrishnan and Mitra (2011). Several estimators of the expectation, median and mode of the lognormal distribution are derived by Longford (2009). Exact inference procedures (hypothesis tests and confidence intervals) concerning the mean of the lognormal distribution have also been presented by using generalized p-values.
The above inference techniques for estimating the parameters of the lognormal distribution are based on precisely defined crisp data. However, in many practical situations we face data which are not only random but vague as well. Randomness involves only uncertainties in the outcomes of an experiment; vagueness, on the other hand, involves uncertainties in the meaning of the data. As an example, consider an opinion poll during which a number of individuals are questioned on their perception of the relative length of different line segments with respect to a fixed longer segment that was used as a standard for comparison. The answers given by the individuals may be vague statements such as "approximately lower than 45%", "approximately 50 to 55%", "approximately 60 to 65% but near to 65%", "approximately higher than 70%", and so on. In this situation, randomness occurs when the individuals are selected at random, and vagueness is due to the meaning of the answers. Classical statistical procedures are not appropriate to deal with such imprecise cases. Fuzzy numbers are widely used to model the imprecision of data. A fuzzy number is a fuzzy subset, denoted by $\tilde{x}$, of the set of real numbers (denoted by $\mathbb{R}$) and is characterized by the so-called membership function $\mu_{\tilde{x}}(\cdot)$. For more details about fuzzy numbers and probability measures of fuzzy sets, one can refer to Dubois and Prade (1980) and Singpurwalla and Booker (2004).
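To make the notion of a membership function concrete, the following minimal Python sketch encodes one of the vague answers above as a trapezoidal fuzzy number. The trapezoidal shape and the break points 47, 50, 55 and 58 are illustrative assumptions, not values taken from the paper.

```python
def trapezoidal_membership(x, a, b, c, d):
    """Membership degree of x in a trapezoidal fuzzy number with a < b <= c < d.

    Rises linearly on [a, b], equals 1 on [b, c], falls linearly on [c, d].
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def approx_50_to_55(x):
    """Hypothetical encoding of the answer "approximately 50 to 55%"."""
    # Full membership on [50, 55], fading to zero at 47 and 58 (assumed values).
    return trapezoidal_membership(x, 47.0, 50.0, 55.0, 58.0)


for value in (46.0, 48.5, 52.0, 56.5, 60.0):
    print(value, approx_50_to_55(value))
```

A value well inside the stated range receives full membership, while values just outside it are still considered partly compatible with the answer.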
In recent years, many papers on the generalization of classical statistical methods to the analysis of fuzzy data have been published. Wu (2004) discussed Bayesian estimation on lifetime data under fuzzy environments. Gil et al. (2006) presented a backward analysis on the interpretation, modelling and impact of the concept of fuzzy random variable. Viertl (2006) studied the generalization of classical statistical inference methods for univariate fuzzy data. Zarei et al. (2012) considered the Bayesian estimation of failure rate and mean time to failure based on vague set theory in the case of complete and censored data sets. Very recently, Pak et al. (2013) have developed inferential procedures for the parameters of the Weibull distribution in the fuzzy environment. They have derived the estimates of the parameters using maximum likelihood and Bayesian procedures. In this paper, we follow a very similar pattern for the lognormal distribution and discuss different estimation procedures for the shape parameter $\sigma$ when the available information is described by means of fuzzy numbers. We first describe the construction of fuzzy data from imprecise observations, and then discuss the computation of the maximum likelihood estimate (MLE) of the parameter $\sigma$. Based on fuzzy data, there is no closed form for the MLE; therefore, we employ the Newton-Raphson algorithm to determine the maximum likelihood estimate of $\sigma$. We also construct the approximate confidence interval of the unknown parameter by using the asymptotic distribution of the MLE. We further consider Bayesian inference for the shape parameter of the lognormal distribution. Since the Bayes estimate cannot be obtained in explicit form, we use a Markov Chain Monte Carlo technique to compute the Bayes estimate and construct the highest posterior density (HPD) credible interval of the parameter $\sigma$. In addition, estimation via the method of moments is provided by using an iterative process. The rest of this paper is organized as follows.
In Section 2, we obtain the maximum likelihood estimate of the parameter $\sigma$ and also construct the approximate confidence interval by using the asymptotic normality of the MLE. The Bayesian analyses are presented in Section 3. In Section 4, estimation via the method of moments is provided. Extensive simulations are performed in Section 5 to compare the performances of the MLE, Bayes and moment estimates.

ML estimation and confidence interval
Suppose that $n$ identical units are placed on a life test with the corresponding lifetimes $X_1, \ldots, X_n$. It is assumed that these variables are independent and identically distributed with the density given in (2). Consider the situation where the lifetimes of these experimental units cannot be perceived exactly, but may instead be assimilated with fuzzy numbers $\tilde{x}_1, \ldots, \tilde{x}_n$ with the corresponding membership functions $\mu_{\tilde{x}_1}(\cdot), \ldots, \mu_{\tilde{x}_n}(\cdot)$. Based on the fuzzy numbers $\tilde{x}_1, \ldots, \tilde{x}_n$ and by using Zadeh's definition of the probability of a fuzzy event (Zadeh (1968)), we can obtain the likelihood function as
$$L(\sigma; \tilde{x}_1, \ldots, \tilde{x}_n) = \prod_{i=1}^{n} \int f(x; \mu, \sigma)\, \mu_{\tilde{x}_i}(x)\, dx. \qquad (3)$$
Thus, the corresponding log-likelihood function is
$$\ell(\sigma; \tilde{x}_1, \ldots, \tilde{x}_n) = \sum_{i=1}^{n} \log \int f(x; \mu, \sigma)\, \mu_{\tilde{x}_i}(x)\, dx. \qquad (4)$$
The maximum likelihood estimate of the parameter $\sigma$ can be obtained by maximizing the observed-data log-likelihood (4). Equating the derivative of the log-likelihood with respect to $\sigma$ to zero gives the likelihood equation (5). It is not easy to solve equation (5) analytically. When (5) has a unique solution, an iterative numerical search can be used to obtain the MLE. Recently, Denoeux (2011) has used the EM algorithm to obtain the estimates of the parameters of the normal distribution in the presence of fuzzy data. A viable alternative to the EM algorithm is the well-known Newton-Raphson algorithm, which can be implemented easily. By using this method, we can also compute the asymptotic variance of the MLE and construct its asymptotic confidence interval. Therefore, in the following, we describe the Newton-Raphson method to determine the maximum likelihood estimate of the parameter $\sigma$.
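As an illustration of how a likelihood built from Zadeh's probability of a fuzzy event can be evaluated numerically, here is a minimal Python sketch. It assumes triangular membership functions, a known location of zero, and a simple trapezoid-rule quadrature; all function names are hypothetical and none of these choices is taken from the paper.

```python
import math


def normal_pdf(x, sigma, mu=0.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def triangular(a, b, c):
    """Return (membership function, support) of a triangular fuzzy number, a < b < c."""
    def membership(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return membership, (a, c)


def fuzzy_event_probability(membership, support, sigma, grid=400):
    """Zadeh's probability: integral of f(x; sigma) * membership(x) dx (trapezoid rule)."""
    lo, hi = support
    h = (hi - lo) / grid
    total = 0.0
    for k in range(grid + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, grid) else 1.0
        total += weight * normal_pdf(x, sigma) * membership(x)
    return total * h


def fuzzy_log_likelihood(fuzzy_sample, sigma):
    """Sum of the log-probabilities of the observed fuzzy numbers."""
    return sum(math.log(fuzzy_event_probability(m, s, sigma))
               for m, s in fuzzy_sample)


# Three made-up fuzzy observations on the log-lifetime scale.
sample = [triangular(-0.5, 0.0, 0.5),
          triangular(0.3, 0.8, 1.3),
          triangular(-1.6, -1.1, -0.6)]
print(fuzzy_log_likelihood(sample, 1.0))
```

With a membership function equal to one everywhere, the fuzzy event probability reduces to the ordinary probability of the support interval, which is a convenient sanity check.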
In the Newton-Raphson algorithm, the solution of the likelihood equation (5) is obtained through an iterative procedure. In each iterative step, a correction $h$ to the previous estimate $\sigma_0$ is computed. The iteration method is based on a Taylor series expansion of the estimating equation (5) in the neighborhood of the previous estimate. Neglecting powers of $h$ above the first order and using Taylor's theorem, we get the following equation, which needs to be solved for $h$:
$$\frac{\partial \ell}{\partial \sigma}\Big|_0 + h\, \frac{\partial^2 \ell}{\partial \sigma^2}\Big|_0 = 0,$$
where $\ell$ denotes the log-likelihood and the notation $A|_0$, for any partial derivative $A$, means the partial derivative evaluated at $\sigma_0$. The second-order derivative of the log-likelihood with respect to the parameter, required for proceeding with the Newton-Raphson method, is obtained by differentiating the left-hand side of (5) with respect to $\sigma$. The maximum likelihood estimate of $\sigma$ via the Newton-Raphson algorithm is hereafter referred to as "$\hat{\sigma}_{NR}$" in this paper. Once the maximum likelihood estimate of $\sigma$ is obtained, we can use the asymptotic normality of the MLE to construct the approximate confidence interval. It is known that the asymptotic distribution of the MLE of $\sigma$ is normal; see Miller (1981). The approximate confidence interval is then of the form $\hat{\sigma}_{NR} \pm z \sqrt{\widehat{Var}(\hat{\sigma}_{NR})}$, where the asymptotic variance is estimated from the second-order derivative of the log-likelihood at the MLE and $z$ is an upper percentile of the standard normal variate.
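The Newton-Raphson iteration and the resulting asymptotic interval can be sketched in Python as follows. This is only an illustration under several assumptions that are not from the paper: triangular membership functions of half-width 0.3, a known location of zero, finite-difference approximations in place of the analytical derivatives, and a deterministic set of fuzzy-number centers chosen for reproducibility.

```python
import math
from statistics import NormalDist


def log_likelihood(sigma, centers, half_width, grid=200):
    """Fuzzy-data log-likelihood for N(0, sigma^2) with triangular memberships."""
    total = 0.0
    h = 2.0 * half_width / grid
    for c in centers:
        integral = 0.0
        for k in range(1, grid):  # membership is zero at both end points
            x = c - half_width + k * h
            membership = 1.0 - abs(x - c) / half_width
            density = math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
            integral += density * membership
        total += math.log(integral * h)
    return total


def newton_raphson(centers, half_width, sigma0=1.0, eps=1e-3, tol=1e-8, max_iter=50):
    """Return (estimate, estimated asymptotic variance) for sigma."""
    sigma = sigma0
    for _ in range(max_iter):
        l_minus = log_likelihood(sigma - eps, centers, half_width)
        l_mid = log_likelihood(sigma, centers, half_width)
        l_plus = log_likelihood(sigma + eps, centers, half_width)
        first = (l_plus - l_minus) / (2.0 * eps)                # approximates dl/dsigma
        second = (l_plus - 2.0 * l_mid + l_minus) / eps ** 2    # approximates d2l/dsigma2
        step = -first / second                                  # solves first + step * second = 0
        sigma += step
        if abs(step) < tol:
            break
    # Variance from the curvature at the last evaluated point (negligibly far from the estimate).
    return sigma, -1.0 / second


# Deterministic centers: standard normal quantiles at the midpoints (i - 0.5) / n.
n = 30
centers = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
sigma_hat, variance = newton_raphson(centers, half_width=0.3)
z = NormalDist().inv_cdf(0.975)  # upper 2.5% point for a 95% interval
lower, upper = sigma_hat - z * math.sqrt(variance), sigma_hat + z * math.sqrt(variance)
print(sigma_hat, (lower, upper))
```

Analytical derivatives, as used in the paper, would avoid the finite-difference step size `eps`; the numerical version is shown only because it keeps the sketch short.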

Bayesian estimation
In this section we describe the Bayes estimate of the unknown parameter as well as the corresponding highest posterior density credible interval. For this purpose, we re-parameterize the model. In many practical situations, it is observed that the behavior of the parameters representing the various model characteristics cannot be treated as fixed constants throughout the life testing period. Therefore, it is reasonable to assume that the parameters involved in the model behave as random variables with a distribution commonly known as the prior probability distribution. Keeping this in mind, we conduct a Bayesian study by assuming a gamma prior for the parameter. By combining (8) with (9), we obtain the joint density function of the data and the parameter, and from it the posterior density of the parameter given $\tilde{x}_1, \ldots, \tilde{x}_n$. The Bayes estimate is then computed by the following Markov Chain Monte Carlo procedure:
Step 1) Start with an initial guess of the parameter and set $j = 1$.
Step 2) Generate the $j$-th value of the parameter from its posterior density given $\tilde{x}_1, \ldots, \tilde{x}_n$.
Step 3) Repeat Step 2 $M$ times to obtain the generated values for $j = 1, \ldots, M$.
Step 4) The Bayes estimate of the parameter with respect to the squared error loss function is obtained as the mean of the generated values.
Then, following Chen and Shao (1999), the HPD credible interval can be obtained by choosing the interval which has the shortest length.

Moment estimation
Let $\tilde{x}_1, \ldots, \tilde{x}_n$ denote a fuzzy sample of size $n$ from the population given in (2). Note that Eq. (13) cannot be computed analytically; therefore, in the following, we describe an iterative numerical process to obtain the parameter estimate:
1. Start with an initial estimate of $\sigma$, say $\sigma^{(0)}$.

3. Check convergence: if convergence occurs, then the current value is the estimate of $\sigma$ by the method of moments; otherwise, proceed to the next iteration and go to Step 2.
The resultant estimate of $\sigma$ is hereafter referred to as "MME" in this paper.
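One plausible form of such an iterative moment scheme is sketched below in Python. The update rule is an assumption, since the intermediate step is not reproduced here: with a known location of zero, the second population moment $E(X^2) = \sigma^2$ is equated to the average of the conditional second moments of $X$ given each fuzzy observation, evaluated at the current estimate. Triangular membership functions and all function names are likewise hypothetical.

```python
import math
from statistics import NormalDist


def conditional_second_moment(center, half_width, sigma, grid=200):
    """E[X^2 | fuzzy observation] for X ~ N(0, sigma^2) and a triangular membership."""
    h = 2.0 * half_width / grid
    numerator = denominator = 0.0
    for k in range(1, grid):
        x = center - half_width + k * h
        # Membership times the normal kernel; constants cancel in the ratio.
        weight = (1.0 - abs(x - center) / half_width) * math.exp(-0.5 * (x / sigma) ** 2)
        numerator += x * x * weight
        denominator += weight
    return numerator / denominator


def moment_update(centers, half_width, sigma):
    """One iteration: match sigma^2 to the average conditional second moment."""
    mean_square = sum(conditional_second_moment(c, half_width, sigma)
                      for c in centers) / len(centers)
    return math.sqrt(mean_square)


def moment_estimate(centers, half_width, sigma0=1.0, tol=1e-8, max_iter=500):
    """Iterate the update until successive estimates differ by less than tol."""
    sigma = sigma0
    for _ in range(max_iter):
        new_sigma = moment_update(centers, half_width, sigma)
        if abs(new_sigma - sigma) < tol:
            return new_sigma
        sigma = new_sigma
    return sigma


n = 30
centers = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
mme = moment_estimate(centers, half_width=0.3)
print(mme)
```

Under these assumptions the fixed point of the update coincides with a stationary point of the fuzzy-data likelihood, which would be consistent with moment and maximum likelihood estimates that behave similarly.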

Simulation study and comparisons
In this section we present some simulation results to compare the performances of the different methods proposed in the previous sections. We mainly compare the performances of the MLE, Bayes and moment estimates of the unknown parameter in terms of their average biases (ABs) and mean squared errors (MSEs). We also compare the average lengths of the confidence and credible intervals and their coverage percentages. All the computations are performed in R 2.11.0.
For simulation purposes, we have considered different choices of the sample size $n$ and a fixed value of the parameter, namely $\sigma = 1$. In each case, we have generated a fuzzy random sample from the distribution given in (2) by using the algorithm given in Pak et al. (2014), which involves the following steps:
1) Generate $u_1, \ldots, u_n$ from the uniform distribution on $(0, 1)$.
2) Set $x_i = \Phi^{-1}(u_i)$, $i = 1, \ldots, n$, where $\Phi$ is the cumulative distribution function of the standard normal distribution. Now, $(x_1, \ldots, x_n)$ is a sample of size $n$ from the standard normal distribution.

3) Consider the fuzzy information system shown in Fig. 1.
We have also computed the approximate 95% confidence interval and the HPD credible interval of the unknown parameter. Criteria appropriate to the evaluation of the two methods under scrutiny include closeness of the coverage probability to its nominal value and expected interval width. For each simulated sample, we have computed the confidence/credible intervals, checked whether the true value of the parameter lay within the intervals, and recorded the length of the intervals. The estimated coverage probability was computed as the number of intervals that covered the true value divided by 10000, while the estimated expected width of the intervals was computed as the sum of the lengths of all intervals divided by 10000. The coverage probabilities and the expected widths for different sample sizes are presented in Tables 1 and 2.
From the experiments, the following general observations can be made. As expected, the performances of all estimators improve when the sample size increases. We found that using the Newton-Raphson or the MME algorithm for computing the estimate of $\sigma$ gives similar estimation results, but MME is computationally slower.
Because these two procedures differ in the complexity of the iterative numerical search, we let users choose which to use based on their preferences. The performances of the Bayes estimates under the non-informative prior assumption and of the maximum likelihood estimates are identical in terms of ABs and MSEs; however, it is observed that the Bayes estimates with an informative prior are uniformly better. Considering now the confidence and credible intervals, it is observed that the asymptotic results of the MLE work quite well, even when the sample size is small: the confidence interval maintains the coverage percentages in most of the cases. The widths of the confidence/credible intervals narrow as the sample size $n$ increases. The performance of the credible intervals is satisfactory, and their coverage percentages are close to the corresponding nominal level. Moreover, it is seen that an informative prior distribution improves the performance of the Bayesian credible interval compared to the one using a non-informative prior.
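The bookkeeping behind the coverage probabilities and expected widths described in this section can be sketched in Python as follows. To keep the example short it uses crisp (non-fuzzy) data, a known location of zero, the closed-form crisp MLE with its asymptotic variance $\sigma^2/(2n)$, and 2000 replications rather than 10000; these simplifications are assumptions of the sketch and not the paper's simulation design.

```python
import math
import random
from statistics import NormalDist


def simulate_coverage(n, sigma_true=1.0, replications=2000, level=0.95, seed=1):
    """Estimate coverage and mean width of the asymptotic interval for sigma."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2.0)
    covered, total_width = 0, 0.0
    for _ in range(replications):
        sample = []
        while len(sample) < n:
            u = rng.random()          # uniform draw on [0, 1)
            if u > 0.0:               # the inverse CDF is undefined at 0
                sample.append(sigma_true * std_normal.inv_cdf(u))
        # Crisp-data MLE of sigma when the mean is known to be zero.
        sigma_hat = math.sqrt(sum(x * x for x in sample) / n)
        half_width = z * sigma_hat / math.sqrt(2.0 * n)
        if sigma_hat - half_width <= sigma_true <= sigma_hat + half_width:
            covered += 1
        total_width += 2.0 * half_width
    return covered / replications, total_width / replications


for size in (30, 100):
    coverage, width = simulate_coverage(size)
    print(size, coverage, width)
```

Replacing the crisp estimator by a fuzzy-data estimator, and the sample by its fuzzified version, would give the kind of comparison reported in the tables.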

Conclusion
In this paper, we have developed inferential procedures for the shape parameter of the lognormal distribution when the available observations are fuzzy and are assumed to be related to an underlying crisp realization of a random sample. In particular, we have used the Newton-Raphson algorithm to determine the maximum likelihood estimate of the parameter. For computing the Bayes estimate, we have used a Markov Chain Monte Carlo method with different types of prior information. Also, estimation via the method of moments has been presented by means of an iterative process. We have further constructed the approximate confidence interval and the HPD credible interval of the unknown parameter. The performances of the different methods have been compared by Monte Carlo simulations. Based on the results of the simulation study, we see clearly that the performances of the ML and moment estimates are very similar in all aspects. Also, the Bayes estimates based on a non-informative prior and the maximum likelihood estimates give similar estimation results; however, the Bayes estimates with an informative prior have smaller MSE, showing that additional prior information about the parameter $\sigma$ provides an improvement in the estimates. The AB and MSE of all the estimators decrease significantly as the sample size $n$ increases, as one would expect. It can be further observed that, in most of the cases, the coverage probabilities of the confidence/credible intervals are close to the nominal level, and their average lengths also decrease as $n$ increases. Finally, it should be mentioned that the Bayes estimates are more computationally expensive than the MLEs and MMEs.