A Family of Estimators of a Sensitive Variable Using Auxiliary Information in Stratified Random Sampling

In this article, a combined general family of estimators is proposed for estimating the finite population mean of a sensitive variable in stratified random sampling with a non-sensitive auxiliary variable, based on the randomized response technique (RRT). Under a stratified random sampling without replacement scheme, expressions for the bias and mean square error (MSE) are derived up to the first order of approximation. Theoretical results and empirical results from a simulation study show that the proposed class of estimators is more efficient than the existing estimators, i.e., the usual stratified random sample mean estimator and the ratio and regression estimators of Sousa et al. (2014) for a sensitive variable in stratified sampling.


Introduction
In sample surveys related to agriculture, markets, industry, and social research, it is common practice to observe more than one characteristic from each sampled unit of the population. Stratified random sampling is more suitable than other survey designs for obtaining information from a heterogeneous population, for reasons of economy and efficiency. The problem of estimating the population parameters of a sensitive quantitative variable is also well known in survey sampling. In this study, the main goal is to propose a combined general family of estimators for estimating the finite population mean of a sensitive variable in stratified random sampling with a non-sensitive auxiliary variable, based on the randomized response technique.
Many authors have discussed ratio and regression estimators when both Y and X are directly observable. These include Kadilar and Cingi (2003), Shabbir and Gupta (2005), and Nangsue (2009). Gupta and Shabbir (2008) have suggested a general class of ratio estimators when the population parameters of the auxiliary variable are known. These estimators have also been extended by Kadilar and Cingi (2003) to the stratified random sampling scheme. Koyuncu and Kadilar (2010) have suggested a family of estimators in stratified random sampling following Kadilar and Cingi (2003). Sousa et al. (2010) and Gupta et al. (2012) have introduced ratio and regression mean estimators for a sensitive variable, and Sousa et al. (2014) have suggested mean estimation of a sensitive variable using auxiliary information in stratified random sampling. This paper suggests a combined general family of estimators of the population mean of a sensitive variable using non-sensitive auxiliary information and RRT methodology (Warner 1965; Gupta et al. 2002, 2010) in stratified random sampling. Under a stratified random sampling without replacement scheme, expressions for the bias and mean square error (MSE) are derived up to the first order of approximation. Theoretical results and empirical results from a simulation study support the reliability of the present study.

Terminology
We denote the finite population ... are the known stratum weights.
To estimate ... be the population mean for the scrambled variable $Z$. The respondent is asked to report a scrambled response for $Y$ given by $Z = Y + S$, but is asked to provide a true response for $X$.
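The additive scrambling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `scramble`, the normal distributions, and all numeric values are hypothetical and not taken from the paper. It shows that when the scrambling variable has mean zero, the mean of the reported values $Z$ stays close to the mean of $Y$, while the variance of $Z$ is inflated by the variance of $S$.

```python
import numpy as np


def scramble(y, s_sd, rng):
    """Return reported values Z = Y + S, with S ~ Normal(0, s_sd) drawn independently of Y."""
    s = rng.normal(loc=0.0, scale=s_sd, size=len(y))
    return y + s


rng = np.random.default_rng(0)

# Hypothetical sensitive values; the distribution and its parameters are assumptions.
y = rng.normal(loc=50.0, scale=5.0, size=100_000)
z = scramble(y, s_sd=1.0, rng=rng)

# The means nearly coincide; the variance of Z exceeds that of Y by roughly Var(S).
print(y.mean(), z.mean())
print(y.var(ddof=1), z.var(ddof=1))
```

The interviewer only ever sees `z`, so individual values of $Y$ are masked, yet the population mean remains estimable from the scrambled reports.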
To discuss the properties of the different estimators, ... and the combined stratified regression estimator is given as

Proposed Combined General Family of Estimators
where $k_1$ and $k_2$ are weights whose values are to be determined ... The optimum values of $k_1$ and $k_2$, respectively, are found as ... Substituting these optimum values in (3.5), the minimum MSE of $t_{Sist}$ is given by

Simulation Study
For the simulation study, we use "two bivariate normal populations with different covariance matrices to represent the distribution of $(Y, X)$. The scrambling variable $S$ is taken to be normal distribution with mean equal to zero and standard deviation equal to 10% of the standard deviation of $X$. The reported scrambled responses on $Y$ is given by $Z = Y + S$.
The simulated populations have theoretical mean of ... For each population we considered four sample sizes: $n$ = 30, 60, 150 and 300. The population is divided in two strata according to a certain criteria set for the auxiliary variable. The sample size from each stratum is based on Neyman allocation".
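A minimal Python sketch of this simulation design is given here for the usual stratified mean of $Z$ only. Several ingredients are assumptions made for illustration: the population size, mean vector and covariance matrix, the median split on $X$ used to form the two strata, the use of the stratum standard deviations of $Z$ in the Neyman allocation, and the helper names `neyman_allocation` and `stratified_mean`. The scrambled value is generated once per unit and then treated as fixed.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical population parameters (assumed for illustration).
N = 5000
mean = [5.0, 5.0]                    # (mean of Y, mean of X)
cov = [[4.0, 3.2], [3.2, 4.0]]       # implies corr(Y, X) = 0.8
y, x = rng.multivariate_normal(mean, cov, size=N).T

# Scrambling variable: mean zero, sd equal to 10% of the sd of X.
s = rng.normal(0.0, 0.10 * x.std(ddof=1), size=N)
z = y + s

# Two strata formed by splitting X at its median (an assumed criterion).
stratum = (x > np.median(x)).astype(int)
strata = [np.flatnonzero(stratum == h) for h in (0, 1)]
N_h = np.array([len(idx) for idx in strata])
W_h = N_h / N
S_zh = np.array([z[idx].std(ddof=1) for idx in strata])


def neyman_allocation(n, N_h, S_h):
    """Allocate n across strata proportionally to N_h * S_h, with sizes summing to n."""
    weights = np.asarray(N_h, dtype=float) * np.asarray(S_h, dtype=float)
    raw = n * weights / weights.sum()
    n_h = np.floor(raw).astype(int)
    remainder = n - n_h.sum()
    # Give the leftover units to the strata with the largest fractional parts.
    order = np.argsort(raw - n_h)[::-1]
    n_h[order[:remainder]] += 1
    return n_h


def stratified_mean(values, strata, n_h, W_h, rng):
    """Draw an SRSWOR sample in each stratum and return the weighted mean."""
    means = [values[rng.choice(idx, size=k, replace=False)].mean()
             for idx, k in zip(strata, n_h)]
    return float(np.dot(W_h, means))


n = 60
n_h = neyman_allocation(n, N_h, S_zh)
reps = 2000
estimates = np.array([stratified_mean(z, strata, n_h, W_h, rng)
                      for _ in range(reps)])

# Empirical MSE around the true mean of Y, and the textbook variance of the
# stratified mean of Z under SRSWOR within strata.
empirical_mse = np.mean((estimates - y.mean()) ** 2)
theoretical_var = np.sum(W_h**2 * (1 - n_h / N_h) * S_zh**2 / n_h)
print(n_h, empirical_mse, theoretical_var)
```

Repeating the run for `n` in 30, 60, 150 and 300 gives one row per sample size; the empirical and theoretical values should agree closely, up to Monte Carlo error.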

Numerical Example
In this study, we use the data set earlier considered by Sousa et al. (2014). In this data set, the variable of interest Y is the purchase orders in 2010 and the auxiliary variable X is the turnover of the enterprises. We consider three strata: the first contains enterprises with less than 10 million of turnover, the second those with between 10 and less than 30 million, and the third those with 30 million or more.
The scrambling variable S is taken to be normally distributed with mean zero and standard deviation equal to 10% of the standard deviation of X. The reported scrambled response on Y is given by $Z = Y + S$ (the purchase order value plus a random quantity). Table 3 presents the empirical and theoretical MSE estimates and the PRE of the various estimators in the stratified sample. We estimate the empirical MSE using 5000 samples of size n selected from the population.
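The empirical comparison can be illustrated with a short Python sketch on synthetic data. Nothing below reproduces the enterprise data set: the lognormal turnover, the linear model for purchase orders, the total sample size, and the proportional allocation are all assumptions. The three estimators are the stratified mean of $Z$ and the textbook combined ratio and combined regression estimators applied to $Z$, used here as stand-ins for the comparison estimators; the proposed family itself is not implemented.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic stand-in for the enterprise data (hypothetical values throughout).
N = 3000
x = rng.lognormal(mean=2.5, sigma=0.9, size=N)           # turnover, in millions
y = 5.0 + 0.4 * x + rng.normal(0.0, 2.0, size=N)         # purchase orders
z = y + rng.normal(0.0, 0.10 * x.std(ddof=1), size=N)    # scrambled report

# Three strata: turnover < 10, 10 to < 30, and >= 30 million.
stratum = np.digitize(x, [10.0, 30.0])
strata = [np.flatnonzero(stratum == h) for h in range(3)]
N_h = np.array([len(idx) for idx in strata])
W_h = N_h / N
X_bar = x.mean()                                          # known auxiliary mean

# Proportional allocation (assumed); rounding may shift the total by a unit.
n = 150
n_h = np.rint(n * W_h).astype(int)


def one_sample(rng):
    """Return (stratified mean, combined ratio, combined regression) for one sample."""
    zbar, xbar = np.empty(3), np.empty(3)
    s_zx, s_xx = np.empty(3), np.empty(3)
    for h, idx in enumerate(strata):
        pick = rng.choice(idx, size=n_h[h], replace=False)
        zs, xs = z[pick], x[pick]
        zbar[h], xbar[h] = zs.mean(), xs.mean()
        c = np.cov(zs, xs)                                # 2x2 sample covariance
        s_zx[h], s_xx[h] = c[0, 1], c[1, 1]
    z_st, x_st = W_h @ zbar, W_h @ xbar
    lam = W_h**2 * (1 - n_h / N_h) / n_h
    b_c = (lam @ s_zx) / (lam @ s_xx)                     # combined slope
    return z_st, z_st * X_bar / x_st, z_st + b_c * (X_bar - x_st)


reps = 5000
est = np.array([one_sample(rng) for _ in range(reps)])
mse = ((est - y.mean()) ** 2).mean(axis=0)
pre = 100.0 * mse[0] / mse                                # relative to stratified mean
print(dict(zip(["mean", "ratio", "regression"], zip(mse, pre))))
```

Because $Z$ inherits the strong linear relationship between $Y$ and $X$, the regression-type estimator should show a clearly smaller empirical MSE than the plain stratified mean in this synthetic setting.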

Sampling information:
According to the MSE and PRE results in Table 3, the proposed combined general family of estimators is considerably more efficient than the existing estimators, i.e., the usual stratified random sample mean estimator and the ratio and regression estimators of Sousa et al. (2014) for a sensitive variable in stratified sampling.

Conclusion
We can conclude from this study that mean estimation of a sensitive variable using a non-sensitive auxiliary variable can be improved in stratified random sampling based on the randomized response technique. Our simulation study and numerical example show that the proposed combined family of estimators is more efficient than the existing estimators, i.e., the usual stratified random sample mean estimator and the ratio and regression estimators of Sousa et al. (2014) for a sensitive variable in stratified sampling. Also, there is no additional loss of privacy compared with an additively scrambled RRT model.
we obtain the following estimator ... (3.6)
By using (3.6), for different values of ...

Tables 1 and 2 give the empirical and theoretical MSEs for the various estimators, based on the first-order approximation. We estimate the empirical MSE using 5000 samples of size n, taking the average over all the observed values. We use the following expression to find the percent relative efficiency (PRE): ... Tables 2 and 3 below give the empirical and theoretical MSEs and PREs for the competing estimators. The MSE and PRE results show that the proposed combined general family of estimators performs better than the existing estimators. The use of auxiliary information provides a gain for a stratified random sample, and the proposed estimators become more efficient as $\rho_{XY}$ increases.
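As an illustration of how PRE figures of this kind are commonly computed, the Python sketch below assumes the usual definition, PRE = 100 × MSE(reference) / MSE(estimator), with the stratified mean as the reference estimator. Both the definition and the MSE values are assumptions made for the example; the function name `percent_relative_efficiency` is hypothetical.

```python
def percent_relative_efficiency(mse_reference, mse_estimator):
    """PRE of an estimator relative to a reference estimator, in percent."""
    if mse_reference <= 0 or mse_estimator <= 0:
        raise ValueError("MSE values must be positive")
    return 100.0 * mse_reference / mse_estimator


# Hypothetical MSE values, for illustration only (not taken from the tables).
mse = {"stratified mean": 0.40, "ratio": 0.25, "regression": 0.20}
pre = {name: percent_relative_efficiency(mse["stratified mean"], value)
       for name, value in mse.items()}
print(pre)
```

Under this convention the reference estimator always has a PRE of 100, and a value above 100 indicates an estimator with smaller MSE than the reference.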