Generalized Exponential-Cum-Exponential Estimator in Adaptive Cluster Sampling

In this paper, a generalized exponential-cum-exponential estimator is proposed that uses two auxiliary variables based on the average values of the networks in adaptive cluster sampling. The exponential ratio-cum-exponential ratio, exponential product-cum-exponential product, exponential ratio-cum-exponential product and exponential product-cum-exponential ratio type estimators are special cases of the proposed estimator under simple random sampling without replacement in adaptive cluster sampling. Expressions for the bias and mean square error of the proposed estimator are derived. The special cases of the proposed estimator may be used to estimate the finite population mean; they are comparable with existing estimators when the correlation is high and remain useful when the correlation between the study variable and the auxiliary variables is low. Simulation studies are carried out to demonstrate and compare the efficiencies of the estimators. Under the given conditions, the proposed estimators are shown to be more efficient than the mean per unit estimator and the modified ratio, modified product, exponential ratio and exponential product estimators in adaptive cluster sampling.


Introduction
Adaptive Cluster Sampling (ACS) is suitable and efficient for rare and clustered populations. Examples of such populations include animals and plants of rare and endangered species, fisheries, unevenly distributed mineral deposits, pollution concentrations, the epidemiology of sporadic diseases, noise problems, drug users, HIV and AIDS patients, criminals and hot spot investigations. Field studies of natural populations are a primary motivation for the development of adaptive sampling designs. For rare and clustered populations, adaptive cluster sampling gives more efficient estimation than the existing conventional sampling designs. When a rare, clustered population shows no discernible pattern, it is difficult to sample in the field. Environmental populations such as plants and animals with a patchy distribution of units have been the motivation for adaptive cluster sampling designs. For example, adaptive cluster sampling designs can be used to estimate efficiently the number of disease-affected plants in a particular cultivated region. The use of auxiliary information may improve the efficiency of an estimator in adaptive cluster sampling. For example, in an ornithological survey, improved results are likely when ACS is used to survey a rare and clustered species. If the count of a particular species in a locality is the study variable, then changes in food availability, habitat or temperature could serve as auxiliary variables. As another example, in an agricultural survey the count of disease-affected plants may be the study variable, and the auxiliary variable(s) might be the fertility of the ground, the cultivable area or the climate conditions.
Pak.j.stat.oper.res. Vol.XI No.4 2015 pp553-574
In adaptive cluster sampling the initial sample is selected by a conventional sampling design such as simple random sampling. The neighbourhood of each selected unit is then examined if the value of the study variable for that unit meets a pre-defined condition C, usually y > 0. Each neighbouring unit is added to the sample and, if it also satisfies the condition, its own neighbourhood is examined in turn; the process continues until no new unit meets the condition. The final sample comprises the initial sample and all the units examined. A network consists of those units that meet the predefined condition. Units that are examined but do not meet the condition are known as edge units. A cluster is the combination of a network and its edge units. The neighbourhood can also be defined by social and institutional relationships between units. The first-order neighbourhood consists of the sampling unit itself and its four adjacent units, denoted north (above), south (below), east (right) and west (left). The second-order neighbourhood consists of the first-order neighbouring units together with the northeast, northwest, southeast and southwest units, i.e. the diagonal quadrats.
Thompson (1990) first introduced the idea of adaptive cluster sampling for rare and clustered populations and proposed four unbiased estimators. Smith et al. (1995) studied the efficiency of adaptive cluster sampling for estimating the density of wintering waterfowl and found that its efficiency relative to simple random sampling is highest when the within-network variance is close to the population variance. Dryver (2003) found that ACS performs well in a univariate setting. The simulation results on real data of blue-winged and red-winged ducks show that the Horvitz-Thompson type estimator was the most efficient estimator when the condition was defined on one type of duck in order to estimate that type of duck. For highly correlated variables, ACS performs well for the parameters of interest.
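The adaptive step described above can be illustrated with a short sketch. It assumes a rectangular grid of quadrats, the condition y > 0 and the first-order (four-neighbour) neighbourhood; the function name `adaptive_cluster` and the toy grid are hypothetical and are not taken from the paper.

```python
from collections import deque

# First-order neighbourhood: north, south, east, west (row, column offsets).
NEIGHBOURS = ((-1, 0), (1, 0), (0, 1), (0, -1))

def adaptive_cluster(grid, start, condition=lambda y: y > 0):
    """Grow the cluster reached from one initially sampled unit.

    Returns (network, edge_units) as sets of (row, col) cells.
    A unit that fails the condition forms a network of size one
    and triggers no further sampling.
    """
    rows, cols = len(grid), len(grid[0])
    if not condition(grid[start[0]][start[1]]):
        return {start}, set()
    network, edges = {start}, set()
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # outside the study region
            cell = (nr, nc)
            if cell in network or cell in edges:
                continue  # already examined
            if condition(grid[nr][nc]):
                network.add(cell)   # satisfies C: keep expanding from here
                queue.append(cell)
            else:
                edges.add(cell)     # examined but fails C: edge unit
    return network, edges

# Toy population (hypothetical counts per quadrat).
grid = [
    [0, 0, 0, 0, 0],
    [0, 3, 5, 0, 0],
    [0, 0, 2, 0, 0],
    [0, 0, 0, 0, 7],
]
network, edges = adaptive_cluster(grid, (1, 1))
```

The cluster for the initial unit is the union of `network` and `edges`; repeating the call for every unit of the initial sample and taking the union gives the final sample.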
Chao (2004) proposed the ratio estimator in adaptive cluster sampling and showed that it produces better estimation results than the original estimator of adaptive cluster sampling. Dryver and Chao (2007) suggested the classical ratio estimator in adaptive cluster sampling (ACS) and proposed two new ratio estimators under ACS. Chutiman and Kumphon (2008) proposed a ratio estimator in adaptive cluster sampling using two auxiliary variables. Chutiman (2013) proposed ratio estimators using population coefficient of variation and coefficient of kurtosis, regression and difference estimators by using single auxiliary variable.

Some Estimators in Simple Random Sampling
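Let a sample of size n be selected by simple random sampling without replacement from a population of N units. The variable of interest and the auxiliary variable are denoted by y and x, with population means $\bar{Y}$ and $\bar{X}$, population standard deviations $S_y$ and $S_x$, and coefficients of variation $C_y$ and $C_x$, respectively. Also, $\rho_{xy}$ represents the population correlation coefficient between X and Y, and $\bar{y}$ and $\bar{x}$ denote the sample means. The following estimators are available in simple random sampling. Cochran (1940) and Robson (1957) proposed the classical ratio and product estimators, respectively, for estimating the population mean as
$$t_1 = \bar{y}\,\frac{\bar{X}}{\bar{x}} \tag{1}$$
$$t_2 = \bar{y}\,\frac{\bar{x}}{\bar{X}} \tag{2}$$
To the first order of approximation, the mean square errors (MSE) of the estimators (1) and (2) are
$$MSE(t_1) \approx \left(\frac{1}{n}-\frac{1}{N}\right)\bar{Y}^2\left[C_y^2 + C_x^2 - 2\rho_{xy}C_xC_y\right]$$
$$MSE(t_2) \approx \left(\frac{1}{n}-\frac{1}{N}\right)\bar{Y}^2\left[C_y^2 + C_x^2 + 2\rho_{xy}C_xC_y\right]$$
respectively. Bahl and Tuteja (1991) proposed the exponential ratio and exponential product estimators to estimate the population mean:
$$t_3 = \bar{y}\,\exp\!\left(\frac{\bar{X}-\bar{x}}{\bar{X}+\bar{x}}\right), \qquad t_4 = \bar{y}\,\exp\!\left(\frac{\bar{x}-\bar{X}}{\bar{x}+\bar{X}}\right)$$
with first-order mean square errors
$$MSE(t_3) \approx \left(\frac{1}{n}-\frac{1}{N}\right)\bar{Y}^2\left[C_y^2 + \frac{C_x^2}{4} - \rho_{xy}C_xC_y\right]$$
$$MSE(t_4) \approx \left(\frac{1}{n}-\frac{1}{N}\right)\bar{Y}^2\left[C_y^2 + \frac{C_x^2}{4} + \rho_{xy}C_xC_y\right]$$
Comparing these expressions, the exponential product estimator $t_4$ will be more efficient than the classical product estimator $t_2$ if $\rho_{xy}\,C_y/C_x > -3/4$.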
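The comparison between the classical and exponential estimators can be made concrete with a small sketch. It assumes the standard textbook forms: ratio $\bar{y}\bar{X}/\bar{x}$, product $\bar{y}\bar{x}/\bar{X}$, exponential ratio $\bar{y}\exp[(\bar{X}-\bar{x})/(\bar{X}+\bar{x})]$, exponential product $\bar{y}\exp[(\bar{x}-\bar{X})/(\bar{x}+\bar{X})]$, and their first-order mean square errors under simple random sampling without replacement. The function names and the numerical inputs are illustrative assumptions only.

```python
import math

def ratio(ybar, xbar, X_mean):
    """Classical ratio estimator of the population mean."""
    return ybar * X_mean / xbar

def product(ybar, xbar, X_mean):
    """Classical product estimator of the population mean."""
    return ybar * xbar / X_mean

def exp_ratio(ybar, xbar, X_mean):
    """Exponential ratio estimator (Bahl and Tuteja type)."""
    return ybar * math.exp((X_mean - xbar) / (X_mean + xbar))

def exp_product(ybar, xbar, X_mean):
    """Exponential product estimator (Bahl and Tuteja type)."""
    return ybar * math.exp((xbar - X_mean) / (xbar + X_mean))

def first_order_mse(n, N, Y_mean, Cy, Cx, rho):
    """First-order MSE approximations under SRSWOR."""
    base = (1.0 / n - 1.0 / N) * Y_mean ** 2
    return {
        "mean": base * Cy ** 2,
        "ratio": base * (Cy ** 2 + Cx ** 2 - 2 * rho * Cx * Cy),
        "product": base * (Cy ** 2 + Cx ** 2 + 2 * rho * Cx * Cy),
        "exp_ratio": base * (Cy ** 2 + Cx ** 2 / 4 - rho * Cx * Cy),
        "exp_product": base * (Cy ** 2 + Cx ** 2 / 4 + rho * Cx * Cy),
    }

# Hypothetical population characteristics: high versus low correlation.
high = first_order_mse(n=10, N=100, Y_mean=50, Cy=1.0, Cx=1.0, rho=0.9)
low = first_order_mse(n=10, N=100, Y_mean=50, Cy=1.0, Cx=1.0, rho=0.3)
```

With these illustrative inputs the ratio estimator has the smallest approximate MSE at $\rho_{xy} = 0.9$, whereas at $\rho_{xy} = 0.3$ it is worse than the sample mean while the exponential ratio estimator still improves on it. This pattern is the motivation for exponential-type estimators when the correlation is low.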

Some Estimators in Adaptive Cluster Sampling
Suppose a finite population of size N is labelled 1, 2, 3, …, N and an initial sample of n units is selected by simple random sampling without replacement. Let $w_{yi}$, $w_{xi}$ and $w_{zi}$ denote the averages of the y-, x- and z-values in the network that includes unit i, and let $\bar{w}_y$, $\bar{w}_x$ and $\bar{w}_z$ denote the corresponding means over the initial sample, respectively. Adaptive cluster sampling can be considered as simple random sampling without replacement when the averages of the networks are considered (Thompson, 2002; Dryver and Chao, 2007). Thompson (1990) developed an unbiased estimator of the population mean in ACS based on a modification of the Hansen-Hurwitz estimator, which can be used when sampling is with or without replacement:
$$\bar{w}_y = \frac{1}{n}\sum_{i=1}^{n} w_{yi}$$
Dryver and Chao (2007) proposed a modified ratio estimator of the population mean for ACS:
$$\frac{\bar{w}_y}{\bar{w}_x}\,\bar{X}$$
Shahzad and Hanif (2014) proposed modified product and exponential product estimators of the population mean for adaptive cluster sampling using one auxiliary variable:
$$\bar{w}_y\,\frac{\bar{w}_x}{\bar{X}}, \qquad \bar{w}_y\,\exp\!\left(\frac{\bar{w}_x-\bar{X}}{\bar{w}_x+\bar{X}}\right)$$
Bias and mean square error expressions have been derived for each of these estimators.
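As an illustration of how network averages enter these estimators, the sketch below assumes that each unit i carries the mean of the study variable over its own network, so that the modified Hansen-Hurwitz estimator is the simple average of these network means over the initial sample and the modified ratio estimator is the ratio of two such averages multiplied by the known mean of x. The helper names and the five-unit toy population are hypothetical.

```python
from itertools import combinations

def network_means(values, networks):
    """Map each unit index to the mean of `values` over its network."""
    w = {}
    for net in networks:
        mean = sum(values[i] for i in net) / len(net)
        for i in net:
            w[i] = mean
    return w

def hansen_hurwitz(w, sample):
    """Modified Hansen-Hurwitz estimator: average network mean over the initial sample."""
    return sum(w[i] for i in sample) / len(sample)

def modified_ratio(wy, wx, sample, X_mean):
    """Ratio-type ACS estimator built from network means (x must be positive on the sample)."""
    return hansen_hurwitz(wy, sample) / hansen_hurwitz(wx, sample) * X_mean

# Toy population of N = 5 units; units 1 and 2 form one network.
y = [0, 4, 6, 0, 2]
x = [1, 3, 5, 1, 2]
networks = [[0], [1, 2], [3], [4]]
wy = network_means(y, networks)
wx = network_means(x, networks)

# Average of the Hansen-Hurwitz estimator over all initial samples of size 2.
samples = list(combinations(range(len(y)), 2))
hh_average = sum(hansen_hurwitz(wy, s) for s in samples) / len(samples)
```

In this toy example `hh_average` coincides with the population mean of y (2.4), which illustrates the unbiasedness of the modified Hansen-Hurwitz estimator under simple random sampling without replacement of the initial units.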

Proposed Generalized Exponential-Cum-Exponential Estimator in Adaptive Cluster Sampling
Thompson (1990) first introduced the idea of adaptive cluster sampling and modified the Hansen-Hurwitz (1943) and Horvitz-Thompson (1952) type estimators. A variety of subsequent research on adaptive cluster sampling followed from Thompson (1990). However, how the adaptive design factors, including the predefined condition (or magnitude of the critical value) and the definition of the neighbourhood, affect the efficiency of ACS relative to the non-adaptive design (simple random sampling), and the challenges that arise from case to case, have not been treated concretely. The derivation of the ACS estimators for the specific design in Thompson (1990) can serve as a helpful reference for developing estimators for other modifications of ACS. Chao (2004) proposed the ratio estimator; Dryver and Chao (2007) discussed the classical ratio estimator in adaptive cluster sampling (ACS) and proposed two new ratio estimators under ACS, one of which is unbiased for ACS designs. Chutiman and Kumphon (2008) proposed a ratio estimator, and Chutiman (2013) proposed ratio estimators using the population coefficient of variation and coefficient of kurtosis, as well as regression and difference estimators using a single auxiliary variable. There is still a need to address the efficiency issues and to propose better estimators suited to the adaptive design.
Following Upadhyaya et al. (2011), the generalized form of the proposed exponential-cum-exponential estimator utilizing two auxiliary variables may be written as (4.1), where the first pair of constants are generalizing constants and $a$, $b$ are optimization constants. The bias and mean square error expressions are derived as follows.

The Bias and Mean Square Error of the Proposed Estimator
Using the notation (3.1), the estimator (4.3) may be rewritten, and the exponential function in (5.8) expanded with terms of power three or greater ignored. Taking expectations on both sides of (5.11) then yields the required expression.

Special Cases of Proposed Generalized Estimator
The special cases of the proposed generalized estimator may be obtained by using the different values of the constants.

Exponential Ratio-Cum-Ratio Estimators
The generalized ratio-cum-ratio estimator in the exponential form may be obtained for particular values of the generalizing constants. The member estimators, including $t_{12}$, $t_{13}$ and $t_{14}$, may be obtained by using different values of the generalizing and optimization constants in (5.14) and (5.20), respectively.

Exponential Product-Cum-Product Estimators
The product-cum-product estimator in the exponential form may be obtained for particular values of the generalizing constants. The member estimators, including $t_{16}$, $t_{17}$ and $t_{18}$, may be obtained by using different values of the generalizing and optimization constants in (5.14) and (5.20), respectively.

Exponential Ratio-Cum-Product Estimators
The ratio-cum-product estimators in the exponential form may be obtained for particular values of the generalizing constants. The member estimators, up to $t_{32}$, may be obtained by using different values of the generalizing and optimization constants in (5.14) and (5.20), respectively.

Simulation Study
To compare the efficiency of the proposed estimators with that of the other estimators, a simulated population is used. The condition C for adding units to the sample is y > 0. For each network in the sample, the y-values of the units meeting the condition are obtained and averaged, and the corresponding x-values and z-values are obtained and averaged in the same way. For each estimator, ten thousand iterations were run under simple random sampling without replacement, with initial sample sizes of 5, 10, 15, 20 and 25, to obtain accurate estimates.
In adaptive cluster sampling, the final sample size is usually greater than the initial sample size. Let $E(v)$ denote the expected final sample size in ACS, which is the sum of the inclusion probabilities $\pi_i$ of all quadrats:
$$E(v) = \sum_{i=1}^{N} \pi_i$$
In adaptive cluster sampling the final sample size varies from sample to sample. For the comparison, the sample mean from an srswor of size $E(v)$ has variance
$$Var(\bar{y}) = \frac{S_y^2\,\left(N - E(v)\right)}{N\,E(v)}$$
The estimated mean squared error of the estimated mean is
$$\widehat{MSE}(t^*) = \frac{1}{r}\sum_{i=1}^{r}\left(t^*_i - \bar{Y}\right)^2$$
where $t^*_i$ is the value of the relevant estimator for sample i and r is the number of iterations.
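The quantities used in this comparison can be sketched in a few lines. The inclusion probability formula below, $\pi_i = 1 - \binom{N-m_i-a_i}{n}/\binom{N}{n}$, with $m_i$ the size of the network containing unit i and $a_i$ the number of units in networks of which i is an edge unit, is the standard form from Thompson (1990). It is an assumption here, since the text does not spell it out, and the function names are hypothetical.

```python
from math import comb

def inclusion_prob(N, n, m_i, a_i=0):
    """Probability that unit i appears in the final ACS sample (initial srswor of size n)."""
    return 1.0 - comb(N - m_i - a_i, n) / comb(N, n)

def expected_final_size(N, n, m, a):
    """E(v): sum of the inclusion probabilities over all N units."""
    return sum(inclusion_prob(N, n, m_i, a_i) for m_i, a_i in zip(m, a))

def srswor_variance(S2, N, size):
    """Variance of the sample mean under srswor with (possibly non-integer) sample size."""
    return S2 * (N - size) / (N * size)

def estimated_mse(estimates, true_mean):
    """Simulation MSE: average squared deviation of the r estimates from the true mean."""
    return sum((t - true_mean) ** 2 for t in estimates) / len(estimates)

# Illustrative values only: population variance 10, N = 50, comparison sample size 5.
var_comparison = srswor_variance(S2=10.0, N=50, size=5)
```

In a simulation, `estimated_mse` would be applied to the r values of each estimator and compared with `srswor_variance` evaluated at the expected final sample size.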

Population
In this population, we consider a pair of auxiliary variables. The pair has been taken from Smith et al. (1995); the total area for the original data is 5000 km², which has been divided into 50 quadrats of 100 km² each in central Florida. In the pair, the blue-winged teal data (Table 7.1) and the green-winged teal data (Table 7.2) have been used as auxiliary variables x and z, respectively.
The variability of the study variable is proportional to the auxiliary variable itself in model (7.5), whereas it is proportional to the within-network mean level of the auxiliary variable in model (7.6). Consequently, the within-network variances of the study variable in the two networks consisting of more than one unit are much larger in the population generated by model (7.5).
In the simulated population (Table 7.3), the variance of the study variable is proportional to the sum of the auxiliary variables themselves. Thus, the within-network variances of the study variable in the networks consisting of more than one unit are expected to be much larger in the population generated by model (7.7). The variability would remain low if the study variable were simulated using the within-network mean level of the auxiliary variables. Adaptive cluster sampling is more suitable when the within-network variances are sufficiently large. Model (7.7) provides larger within-network variances, which favours the performance of estimators in adaptive cluster sampling. Under the condition of interest, three networks were found: one is of size one, while the networks (12, 20, 93, 54, 8, 11, 9) and (173, 713, 63770, 52, 4, 486, 820) are of size 7 each. The within-network variance of the study variable is 1042.286 for the network (12, 20, 93, 54, 8, 11, 9) and 574239554.1 for the network (173, 713, 63770, 52, 4, 486, 820). The overall variance of the study variable is 81231457. The within-network variance accounts for a large portion of the overall variance. Thus, the adaptive estimators are expected to perform better and to be more efficient than the comparable usual estimators. The conventional estimators are more efficient than the adaptive estimators if the within-network variances do not account for a large portion of the overall variance (Dryver and Chao, 2007). A very high correlation of 0.999 was found between the study variable and the auxiliary variables, and this correlation remains the same in the transformed population (Tables 7.4-7.6). Thus, there is a high correlation at the sampling unit level as well as at the network (region) level.
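The two within-network variances quoted above can be reproduced directly from the listed network values. The short check below assumes the usual sample variance with divisor m − 1, where m is the network size; this divisor is inferred from the reported figures rather than stated in the text.

```python
from statistics import variance

# Study-variable values of the two networks of size 7 listed in the text.
network_a = [12, 20, 93, 54, 8, 11, 9]
network_b = [173, 713, 63770, 52, 4, 486, 820]

# statistics.variance uses the divisor (m - 1).
var_a = variance(network_a)
var_b = variance(network_b)

print(round(var_a, 3))  # within-network variance of the first network
print(round(var_b, 1))  # within-network variance of the second network
```

Both values agree with the reported 1042.286 and 574239554.1. The second network is dominated by the single quadrat with value 63770, which is why its within-network variance is so large compared with the overall variance.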
Dryver and Chao (2007) showed that the usual estimators in srswor perform better than the adaptive cluster sampling estimators when the correlation is strong at the unit level, but perform worse when the correlation is strong at the network level. Thompson (2002) showed that adaptive cluster sampling is preferable to the comparable conventional sampling methods if the within-network variance is sufficiently large compared with the overall variance of the study variable. He also gave the condition under which the modified Hansen-Hurwitz estimator for adaptive cluster sampling has lower variance than the mean per unit under simple random sampling without replacement of size $E(v)$, namely if and only if
$$\left(\frac{1}{n}-\frac{1}{E(v)}\right) S_y^2 < \frac{N-n}{Nn(N-1)} \sum_{k=1}^{K}\sum_{i \in \psi_k}\left(y_i - w_{yi}\right)^2 \tag{7.9}$$
where $\psi_k$ denotes the k-th of the K networks in the population and $w_{yi}$ is the mean of y in the network containing unit i. The overall variance of the study variable is 81231457, while in the transformed population this variance reduces to 10912964. Applying condition (7.9) gives the efficiency conditions in Table 7.7, from which it is clear that the Hansen-Hurwitz estimator in adaptive cluster sampling will have lower variance than the mean per unit estimator in simple random sampling without replacement for all initial sample sizes. In general, adaptive cluster sampling is preferable to the comparable sampling method if the within-network variance is sufficiently high compared with the overall variance (Thompson, 2002). The results of the simulation studies are given in Table 7.8 (a, b, c) for 32 different estimators including four estimators from srswor and twenty eight estimators from adaptive cluster