On Optimal Designs of Some Censoring Schemes

The main objective of this paper is to explore the suitability of some entropy-information measures for introducing a new optimality criterion for censoring, and to apply it to some censoring schemes from some underlying lifetime models. In addition, the paper investigates four related issues, namely: the effect of the parameter of the parent distribution on the optimal scheme, the equivalence of schemes based on the Shannon and Awad sup-entropy measures, the conjecture that the optimal scheme is a one-stage scheme, and a conjecture by Cramer and Bagh (2011) about Shannon minimum and maximum schemes when the parent distribution is reflected power. Guidelines for designing an optimal censoring plan are reported together with theoretical and numerical results and illustrations.


Introduction
Researchers conduct experiments to collect information about a phenomenon of interest. However, several statistical procedures and schemes lead to a loss of information. For example, a sample has less information than the underlying probability model, a statistic has less information than the sample, and a censored sample has less information than the complete sample. The question is how to measure this loss of information and how to reduce it. In this paper, we are concerned with defining a suitable measure of information (objective function) that can select optimal progressive type II censoring schemes, and with applying it to the design of censoring schemes from some parent lifetime distributions.
In a progressive type II censoring scheme, researchers put n units under test, decide to observe only m failure lifetimes, and select a pre-specified censoring scheme $R = (R_1, \ldots, R_m)$ whose entries are the numbers of units that will be removed, respectively, at the observed failure times. The class of all possible censoring schemes is the set of vectors of non-negative integers with $R_1 + \cdots + R_m = n - m$. Assume that the lifetimes of these units are independent and identically distributed as a continuous random variable X whose pdf and cdf are f and F respectively. Let denote the parameter space and be the parameter of the parent distribution of X .
The problem of selecting an optimal progressive type II censoring scheme has been considered by several researchers. Burkschat (2008) used a ψ-optimality criterion and a partial order relation on ( ) to show that ( ) ( ) ( ). Using this result he showed that ( ) minimizes ( ) ( ). Moreover, if the parent distribution is DFR (decreasing failure rate) then ( ) minimizes both ( ) ( ), and ( ) ( ). An extra condition is needed for ( ) to minimize ( ) ( ) when distribution is . He also showed that, if the parent distribution is IFR (increasing failure rate), then ( ) minimizes ( ) ( ) .
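As an illustration of the class of censoring schemes described above, the following is a minimal Python sketch that enumerates every scheme for given n and m. The function names `censoring_schemes` and `one_step_scheme` are hypothetical helpers introduced here, not part of the paper, and the sketch assumes that a scheme is any vector of m non-negative integers summing to n - m.

```python
from itertools import combinations
from math import comb

def censoring_schemes(n, m):
    """Yield every scheme (R_1, ..., R_m) of non-negative integers summing to n - m."""
    total = n - m
    # Stars and bars: pick m - 1 bar positions among total + m - 1 slots;
    # the gaps between consecutive bars are the removal counts R_i.
    for bars in combinations(range(total + m - 1), m - 1):
        edges = (-1,) + bars + (total + m - 1,)
        yield tuple(edges[i + 1] - edges[i] - 1 for i in range(m))

def one_step_scheme(n, m, k):
    """Scheme that removes all n - m spare units at the k-th failure (k = 1..m)."""
    return tuple(n - m if i == k else 0 for i in range(1, m + 1))

schemes = list(censoring_schemes(10, 5))
print(len(schemes), comb(9, 4))      # both counts are C(n - 1, m - 1)
print(one_step_scheme(10, 5, 1))     # first-step censoring
print(one_step_scheme(10, 5, 5))     # Type-II right censoring
```

For n = 10 and m = 5 the enumeration has C(9, 4) = 126 members, which agrees with the total number of schemes quoted in the numerical section of the paper.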
Soliman (2005) obtained the maximum likelihood and Bayes estimators of the reliability, the hazard function and the parameters of the Burr-XII model, using three loss functions. Based on a simulation study, he observed that for fixed and , ( ) seems to provide the smallest variance for the estimates of the reliability and hazard functions.
The BLUEs of the parameters when the underlying distribution is a member of the family have been obtained by several researchers. Based on a numerical approximation to the covariance matrix and numerical search procedures among all possible schemes, it is conjectured that "One-stage censoring scheme is the optimal one". Most frequently $R^{(1)}$ and $R^{(m)}$ are optimal, see e.g. Burkschat et al. (2006, 2007) and Balakrishnan et al. (2007). For the same family, Burkschat (2008) considered best linear equivariant estimation and Löwner ordering on mean squared error matrices. Burkschat et al. (2007) also used the maximum and minimum eigenvalues of these matrices. In both cases, it was concluded that the optimal scheme is either $R^{(1)}$ or $R^{(m)}$, depending on the value of q.
For a location or scale family, Balakrishnan et al. (2008) suggested the optimality criterion ( ) ( ( ( )) where λ(.;.) is the failure rate function. They showed that $R^{(1)}$ and $R^{(m)}$ are optimal when ( ) is increasing and decreasing respectively. Moreover, based on a numerical study they conjectured that the optimal scheme is $R^{(k)}$ for some k. Oja (1981) Based on a numerical search procedure, Awad (2013) showed that the optimal scheme that maximizes the informational efficiency with respect to each of ten sup-entropy measures is ( ) when the underlying distribution is a (shape parameter) Pareto distribution. The same result was obtained theoretically by Haj Ahmad and Awad (2009b) when the underlying distribution is the scale-shape Pareto distribution.
Cramer and Bagh (2011) showed that the best approximation to the distribution of an sample from any distribution is given by first-step censoring $R^{(1)}$, since it minimizes the Kullback-Leibler (KL) divergence measure, whereas the worst plan results from Type-II right censoring $R^{(m)}$, since it maximizes the KL divergence from that distribution. Hofmann et al. (2005) introduced an asymptotic progressive censoring model and found optimal censoring schemes for location-scale families based on the determinant of the covariance matrix of the asymptotic best linear unbiased estimators. The procedure is illustrated numerically when the parent distributions are Weibull and normal. It is observed that in many situations the obtained optimal schemes significantly improve upon regular Type-II right censoring.
Pradhan and Kundu (2009) used the EM algorithm to obtain the maximum likelihood estimators of the parameters of the generalized exponential distribution together with its quantile. They applied the missing information principle based on the Fisher information matrix. Moreover, they applied the quantile information measure. They said that "Till date, we do not have any efficient algorithm to find the optimal censoring scheme in this case." They then proposed a sub-optimal censoring scheme. Through a simulation study it is observed that the sub-optimal scheme need not be a one-step scheme.
This paper aims at
a) Suggesting an optimality criterion, based on an entropy-information measure, for selecting an optimal progressive type II scheme.
b) Investigating the suitability of a collection of entropy-information measures as optimality functions based on the suggested criterion.
c) Exploring the effect of the parameter of the parent distribution on the optimal scheme.
d) Exploring the uniqueness of the optimal censoring scheme together with its being a one-stage scheme.
e) Exploring the conjecture of Cramer and Bagh (2011) about minimum and maximum Shannon entropy censoring schemes when the underlying distribution is reflected power.
f) Commenting on the Burkschat (2008) criterion, which is based on a partial order relation on the space of all censoring schemes ( ).
Optimality criteria aim at minimizing the cost of experimentation, the duration of experimentation and/or total time, or the variability in estimators, or at maximizing the amount of information available in the sampling scheme. In this paper, we are concerned only with objective functions that are based on entropy-information measures.
Let and be the joint densities of a progressive type II censored sample based on a scheme , and a complete sample scheme . We classify an entropy-information measure , with respect to a given censoring scheme ( ), and a given value of as follows.
is called of min-type, (missing information type) with respect to given ( ), ) then is called of equivalent-type with respect to given ( ).
This classification motivates the following definition of optimality criteria.

Definition 1
a) For a given , the efficiency of a scheme
For a given , a censoring scheme
c) For a given , let denote the scheme and denote the joint of the scheme . Then the relative efficiency of a scheme R with respect to the scheme and the measure $I^*$ is .
A measure $I^*$ is considered suitable for selecting an optimal scheme if it is a function of . Moreover, a scheme may be classified as
a) Of low efficiency with respect to if ( ) .
b) Almost optimal with respect to if ( ) and ( ) .
The following information measures will be used in the sequel a) Fisher information (1925):
The paper is organized as follows. Section 2 introduces some preliminary facts about the reflected power distribution and generalized order statistics that are used in the paper. Section 3 investigates the suitability of Fisher information, Kullback-Leibler divergence, Shannon entropy and the Awad sup-modification of Shannon entropy as objective functions for selecting an optimal progressive type-II censoring scheme when the underlying distribution is reflected power. It also deals with properties of both the Shannon entropy and its Awad modification for progressive censoring schemes, and provides a partial proof of the Cramer and Bagh (2011) conjecture. Moreover, it introduces Shannon maximum and minimum types together with Shannon equivalence of schemes at a given value of . Furthermore, it deals with properties of the Awad sup-entropy together with equivalence of schemes on the parameter space. Section 4 provides results of graphical and numerical computations and comparisons between schemes based on the Shannon and Awad sup-entropy measures when the parent distribution is reflected power. Finally, Section 5 provides a discussion of the obtained results and the conclusions of this study.

Preliminaries
Let us put n iid units under test that have a reflected power distribution, ( ( )), with parameter of the probability density function has IFR, and quantile function This density is bounded if * +. Under this condition
For the purpose of evaluating the information measures defined above, notice that progressive type-II order statistics are special cases of generalized order statistics (see e.g. Kamps (1995)). The following lemma is used in the sequel.
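To make the sampling setup above concrete, here is a small Python sketch that simulates one progressively type-II censored sample by running the life test directly. It is only an illustration under stated assumptions: the reflected power cdf is assumed to be F(x) = 1 - (1 - x)^theta on (0, 1), the symbol `theta`, both function names and the example scheme are introduced here, and the direct simulation is a stand-in for, not a transcription of, the generalized order statistics representation used in the paper.

```python
import random

def reflected_power_quantile(u, theta):
    """Quantile of the ASSUMED cdf F(x) = 1 - (1 - x)**theta on (0, 1)."""
    return 1.0 - (1.0 - u) ** (1.0 / theta)

def progressive_type2_sample(scheme, theta, rng):
    """Simulate the life test directly: n units on test, m observed failures."""
    m = len(scheme)
    n = m + sum(scheme)
    # Lifetimes of all n units, drawn by inverting the assumed cdf.
    alive = [reflected_power_quantile(rng.random(), theta) for _ in range(n)]
    observed = []
    for r in scheme:
        x = min(alive)               # next failure among units still on test
        alive.remove(x)
        observed.append(x)
        # Withdraw r of the surviving units at random.
        for unit in rng.sample(alive, r):
            alive.remove(unit)
    return observed

rng = random.Random(2024)            # fixed seed only for reproducibility
sample = progressive_type2_sample((2, 0, 0, 1, 2), theta=1.6, rng=rng)
print(sample)
```

The returned list holds the m observed failure times in increasing order; the value theta = 1.6 is an arbitrary illustrative choice.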

Lemma 1 (Kamps 1995)
There exist ,...,
Corollary 1:
For the progressive type II sample with scheme ( ) from reflected power distribution with parameter , there exist ,...,
Proof:
a) Using Lemma 1 and the quantile function of the distribution, it follows that

Corollary 2:
Consider the progressive type II sample with scheme ( ) from reflected power distribution with parameter . Then , ( ( ))-
Proof:
All required forms follow by using Corollary 1, the independence of the involved uniform random variables, and the facts that $E(-\log U) = 1$ and

Information Measures of Progressive censored scheme
The first condition on an information (objective) function to be suitable for selecting an optimal censoring scheme is being a function of the censoring scheme . This section aims at deciding which of the above defined information measures satisfy this first condition.

Fisher Information
Let us investigate the suitability of Fisher information through the following
Therefore, the Fisher information measure is of maximum-type, but it cannot pick an optimal scheme since it is free of the censoring scheme . Moreover, the efficiency of any scheme is .

Divergence Measures
The Kullback-Leibler divergence measure between and is given in (7). So, it is a function of and through and for This measure looks like the well-known entropy loss function, So, ( ) may be interpreted as the total loss of using as an alternative to The symmetric form of this measure is The smaller this value, the closer the two schemes are. So, it may be used to find alternative schemes to an existing, traditionally used one. This type of problem will be fully investigated in another paper.
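The generic definitions behind the divergence discussed above can be sketched numerically. The Python example below approximates the Kullback-Leibler divergence and its symmetric form for two one-dimensional densities on (0, 1). It is a toy illustration rather than the paper's equation (7), which concerns joint densities of censored and complete samples. The reflected power density theta (1 - x)^(theta - 1) and all function names are assumptions introduced here.

```python
import math

def kl_divergence(f, g, lo=0.0, hi=1.0, steps=20_000):
    """Midpoint-rule approximation of the integral of f * log(f / g) on (lo, hi)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        fx, gx = f(x), g(x)
        if fx > 0.0:                 # points where f vanishes contribute zero
            total += fx * math.log(fx / gx) * h
    return total

def symmetric_kl(f, g):
    """Symmetric form: KL(f, g) + KL(g, f)."""
    return kl_divergence(f, g) + kl_divergence(g, f)

def reflected_power_pdf(theta):
    """ASSUMED density theta * (1 - x)**(theta - 1) on (0, 1)."""
    return lambda x: theta * (1.0 - x) ** (theta - 1.0)

f2 = reflected_power_pdf(2.0)
f3 = reflected_power_pdf(3.0)
ratio = 3.0 / 2.0
closed_form = ratio - math.log(ratio) - 1.0   # exact value for this toy pair
print(kl_divergence(f2, f3), closed_form)
print(symmetric_kl(f2, f3))
```

In this toy case the exact divergence is r - log r - 1 with r the ratio of the two parameters, which is the same functional shape as the entropy loss function mentioned in the text.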

Entropy Measures
In this section we investigate optimality and efficiency of progressive type-II schemes with respect to two entropy measures.

Corollary 3:
Consider the progressive type II sample with scheme ( ) from reflected power distribution with parameter . Then a)

( ) and
( ) are functions of through . So, it is worth investigating them as information functions that may be used to select optimal scheme. This is done in the rest of the paper.
It should be mentioned that Cramer and Bagh (2011) treated the reflected power distribution using Shannon entropy. However, the treatment and the results in this paper are completely different from those of Cramer and Bagh.
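As a rough numerical companion to the entropy discussion above, the following Python sketch evaluates the Shannon entropy of a progressively type-II censored sample in closed form. It rests on assumptions that are not taken from the paper: the parent cdf is assumed to be F(x) = 1 - (1 - x)^theta, and the expression was derived here from the standard joint density of a progressively censored sample, so it may differ in form from the paper's own equation. The names `gammas` and `shannon_entropy` are hypothetical.

```python
import math

def gammas(scheme):
    """gamma_j = number of units still on test just before the j-th failure."""
    m = len(scheme)
    return [m - j + sum(scheme[j:]) for j in range(m)]

def shannon_entropy(scheme, theta):
    """Entropy under the ASSUMED cdf F(x) = 1 - (1 - x)**theta.

    Uses H = m - m*log(theta) - sum_j log(gamma_j)
             - (1/theta) * sum_j (m - j + 1) / gamma_j   (j = 1..m),
    a formula derived for this sketch, not copied from the paper.
    """
    m = len(scheme)
    g = gammas(scheme)
    log_c = sum(math.log(x) for x in g)                # log normalising constant
    a = sum((m - j) / g[j] for j in range(m))          # 0-based index j
    return m - m * math.log(theta) - log_c - a / theta

first_step = (5, 0, 0, 0, 0)       # all removals at the first failure
right_cens = (0, 0, 0, 0, 5)       # Type-II right censoring
for theta in (1.0, 1.6, 3.0):      # illustrative parameter values
    print(theta, shannon_entropy(first_step, theta),
          shannon_entropy(right_cens, theta))
```

With a single uncensored unit the expression reduces to 1 - 1/theta - log(theta), the entropy of the assumed parent density, which gives a quick sanity check of the sketch.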

Theorem 1:
Let ( ), , and denote the Shannon entropy measure, the joint of the progressive type II censoring scheme ( ), and the joint of the complete first order statistics when the parent distribution is reflected power ( ( )). Then
3. There is no scheme that is H-equivalent to the complete scheme C. However, if the true value of ∑ ( ) ( ), then will be the most efficient scheme within ( ).

Proof:
Since plays an important role in the proof, set ) , These two inequalities imply that , i.e. and hence ( ) the maximum value of ( ) is negative. Moreover, Rearrange terms of ( ) in (9), and use the fact that The second part follows by dividing the inequalities in (12) by ( ) and using L'Hôpital's rule to calculate the limits.
Finally, direct differentiation shows that ( ) has a minimum value at .
The minimum value is ).
Since ( ) , there is no scheme that is Shannon equivalent to the complete scheme C. However, if the true value of , then the scheme will be the most efficient scheme within ( ) since it is closest to .

Theorem 2:
Let and be two schemes from (
Since ∑ ( ) , the required result follows.
Therefore the condition on equivalence of the two schemes (Theorem 2), that ∑ ( ) is sufficient but not necessary.
2) There may be more than two schemes that are Shannon equivalent at the same value of . For example, at * + * + ( ) , and at , * + * + ( ) . This means that even if ( ) or ( ) is a minimum or maximum Shannon plan, it may not be unique.
1) Part (b) of Theorem 2 shows that if then it is not necessarily true that S is more informative than R, as might be concluded from the partial order relation given by Burkschat (2008).

2)
For the curves of ( ( ) ) ( ( ) ) are concave downward and intersect at and the entropy curve of any other scheme ( ) is also concave downward and will intersect each of ( ( ) ) ( ( ) ) at ( ) and ( ) respectively. There is always a scheme such that ( ) is either less than or greater than ( ( ) ) ( ( ) ). This remark supports the second part of the Cramer and Bagh (2011) conjecture.

Proof:
It is obvious that ( ) and ( ) ( ) since each is the negative of the expected value of the logarithm of a plausibility function, which is bounded between zero and one.

Remark 3:
1) Theorem 3 implies that ( ) Awad sup-entropy is of maximum type when , of minimum type when , and equivalent type when .

2)
A scheme R is equivalent to the complete scheme at . It is interesting to note that this point is also the point at which is the most Shannon efficient scheme within the class ( ). 3)
The second part follows from the first one and the condition .
For the third part, by contradiction, assume that there is a such that ( ( ) ) ( ( ) ).
This implies that ∑ ( On the other hand, it is clear that It is clear that (13) contradicts (14). So, the required result is obtained.
Finally, for the last part, The required result follows since ∑ ( ) .
It is interesting to note that the condition on equivalence of a pair of censoring schemes is free of the parameter . This means that if the condition ∑ ( )( ) holds then the two schemes and are equivalent for all .

2.
For a given if is a measure of maximum type with respect to then is more informative than . Otherwise, if it is a measure of minimum type with respect to then scheme is more informative than .
Table 2 reports all A-equivalent pairs of schemes when and . For these values of n and m, it is interesting to note that there are no pairs of equivalent schemes other than those reported in Table 2.

Proof
Note that ( ) , and .

Remark 5:
It is interesting to note that this result implies neither that ( ) nor that ( ) is the optimal scheme, since the decision depends on whether the entropy AH is of min- or max-type with respect to these schemes and the parameter . In our terminology, the scheme whose entropy is closest to the entropy of the complete sample is the optimal one.

Numerical Computations
To explore the Cramer and Bagh (2011) conjecture and to illustrate the use of the obtained theoretical results, we developed Mathematica 10 code to produce all necessary illustrations. A typical part of the output of this code is reported in the appendix, which includes the following parts:
1. Table 1 reports the values of Shannon entropy in the vicinity of , (that appeared in the conjecture of Cramer and Bagh (2011)), i.e. at for all schemes in ( ), where Note that ( is the same for all schemes in ( )). From Table 1 we observe that
a) At there is only one pair of schemes * + * + that are Shannon equivalent. These two schemes are both minimum Shannon plans, i.e. such a plan is not unique.
h) This indicates how sensitive the minimum and maximum schemes are to the value of the unknown parameter of the parent distribution.
2. Table 2 reports all pairs of schemes ( ) such that R and S are A-equivalent for ( ) and takes all possible values between 2 and .
3. Table 3 reports values of and ( ) at which ( ) ( ) is minimum and the efficiency of the scheme is greater than . At the same values of , each of those schemes is equivalent to the complete scheme . It is observed that the total number of schemes is 126, out of which only 13 schemes can lead to efficiency more than and the values of belong to the sub-parameter space [1.5954, 1.64563]. This suggests that if someone has prior information that the value of is in this subspace, and is planning to put 10 units under test and to observe only 5 failure times, then only these 13 schemes need to be taken into consideration.
4. Figure 1a provides typical plots of ( ) , ( ), ( ), and , while Figure 1b provides typical plots of ( ) ,A ( ), ( ), and when * +. It is seen from Figure 1b that the curves of ( ) , and A ( ) intersect at exactly one point. This means that the censoring scheme may be of minimum, maximum or equivalent type depending on the value of . From the curve of ( ), the partition points of the parameter space are * +, i.e. If then the scheme has low efficiency, which is less than 0.5, otherwise the efficiency is more than 0.5. The efficiency is one at exactly one value of . If is to the left of the previous value then the scheme is of minimum type, and if it is to the right then the scheme is of maximum type. Furthermore, the efficiency starts decreasing for .
Figure 1c provides typical plots of A-entropies and A-efficiencies of ( ) ( ) * +. It is observed from these tables and solutions for cut points of the obtained curves that each of ( ) ( ) , has A-efficiency equal to one at respectively. In addition, among these three schemes, the most informative scheme is ( ) when ( ), when ( ), otherwise it is ( )

Conclusions and Recommendations
In this paper, it is proved that
a) Any two one-stage censoring schemes from ( ) are not Awad sup-entropy equivalent. However, they are Shannon equivalent at a single value of
b) Any schemes from ( ) may be Awad equivalent for all values of However, they may be Shannon equivalent at a single value of . Moreover, for a given value of there may exist more than two schemes that are H-equivalent.

c) Each scheme R is Awad equivalent to C at the single value There is no scheme that is Shannon equivalent to the complete scheme However, is the most H-efficient scheme when , in the sense that the minimum distance between the curves of ( ) and ( ) occurs at However,
e) ( ) ( ( ) ) ( ) ( ( ) ). However, this does not mean that ( ) is the most informative scheme, and it also does not mean that ( ) is the least informative one within the class ( ). To be more specific, let
b) If ( ) , then ( ) is of minimum type, and it is the A-optimal scheme within the class ( )
c) If ( ) , then is of maximum type while ( ) is of minimum type, and the A-optimal scheme is either ( ) within the class , then is of minimum type while ( ) is of maximum type, and the A-optimal scheme is either ( ) within the class
g) The suggested definition of efficiency, which is based on classifying schemes as being of maximum, minimum, or equivalent type, seems to be more appropriate than a definition that just compares the values of the entropy in the schemes without taking this issue into consideration.
h) At the design stage of a progressive censoring plan one may take into consideration the following remarks.
1) If one observes the failure times of m units out of the n units under test, then it seems reasonable to claim that any scheme with efficiency less than m/n is a scheme with low efficiency. In such a case, it is preferable not to use censoring, since any inference based on a low-efficiency scheme will not be beneficial.
2) Prior information (from past experience or a pilot study on complete samples) about the possible value of the unknown parameter is very helpful in selecting an optimal scheme, since optimal schemes are sensitive to the values of .
3) Since, for a given , there may be no A- or H-equivalent scheme, one may find an equivalent or almost efficient scheme for some in the vicinity of the prior value of and search for a most efficient scheme.
4) The researcher is advised to find all equivalent schemes to the selected optimal one and then to use his judgment based on practical situations and objectives of his experiment to use the most appropriate one.
Finally, for a given underlying distribution, one may construct tables that report ( ) that are greater than . All equivalent schemes should be grouped. Based on such a table, the researcher may use the prior subspace of values of to select m, n and R that may lead to efficiency greater than m/n. If there is no scheme with efficiency greater than m/n, it is advised not to use censoring.
Figure 1c: Plots of A-entropies and A-efficiencies of ( ) ( ) * +. Legend: Censored, Complete, Efficiency, m/n.
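The table-building recommendation in the concluding remarks can be sketched in Python as follows. This is only an outline under stated assumptions: the efficiency measure is supplied by the caller, because its exact form is not reproduced here. Schemes are grouped by rounded efficiency as a simple proxy for equivalence, and `placeholder_efficiency` is an arbitrary stand-in used only to exercise the code, not the paper's A- or H-efficiency. All names are hypothetical.

```python
def censoring_schemes(n, m):
    """Yield all schemes (R_1, ..., R_m) of non-negative integers summing to n - m."""
    def rec(k, remaining):
        if k == 1:
            yield (remaining,)
            return
        for r in range(remaining + 1):
            for tail in rec(k - 1, remaining - r):
                yield (r,) + tail
    yield from rec(m, n - m)

def design_table(n, m, efficiency, ndigits=6):
    """Keep schemes with efficiency above m/n, grouped by rounded efficiency."""
    table = {}
    for scheme in censoring_schemes(n, m):
        e = efficiency(scheme)
        if e > m / n:                          # threshold suggested in the text
            table.setdefault(round(e, ndigits), []).append(scheme)
    # Most efficient groups first.
    return dict(sorted(table.items(), reverse=True))

def placeholder_efficiency(scheme):
    """ARBITRARY stand-in for a real efficiency measure (illustration only)."""
    return 1.0 - scheme[-1] / 10

table = design_table(10, 5, placeholder_efficiency)
for value, group in table.items():
    print(value, len(group))
```

An empty table would correspond to the case in which no scheme exceeds the m/n threshold, where the text advises against censoring.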