An Efficient Compromised Imputation Method for Estimating Population Mean
1 Association of Indian Universities, New Delhi, India
1. INTRODUCTION
Imputation means replacing a missing value with another value based on a reasonable estimate.
Information on the related auxiliary variable is generally used to recreate the
missing values for completing datasets. Incomplete data are usually categorized
into three response mechanisms (Little and Rubin, 2002): Missing Completely at Random (MCAR),
Missing at Random (MAR), and Missing Not at Random (MNAR or NMAR). Under MCAR, missing
data are randomly distributed across the variable and unrelated to other
variables. Under MAR, missing data are not randomly distributed,
but the missingness is accounted for by other observed variables. Under MNAR,
missing data differ systematically from the observed values. Of these
three mechanisms, the present study assumes MCAR.
Auxiliary
information is important for survey practitioners, as it is used to improve
the performance of estimation methods. It may be used at the design stage or at the
estimation stage of the survey to obtain a more efficient estimator. At the
estimation stage, ratio, product, and regression methods are traditionally used. Bahl
and Tuteja (1991) introduced
exponential ratio and product estimators for estimating the population mean. Many
modifications of these methods have been proposed to date. For handling
missing data on the study variable, several extensions and developments have been
proposed in the literature. Singh (2003) suggested
product estimation for imputation. Shakti Prasad (2018) adapted the exponential
product-type estimator given by Bahl and Tuteja (1991) and
proposed exponential estimators for imputation. Kadilar and Cingi (2008) investigated some ratio-type imputation
methods and proposed three new estimators to overcome the problem of
missing data. Diana and Perri (2010)
proposed three regression-type estimators that were
more efficient than those of Kadilar and Cingi (2008).
The
present article suggests a general ratio-product exponential-type method of
imputation and, accordingly, proposes three estimators that use different amounts
of available auxiliary information, as utilized by Ahmad
et al. (2006), Kadilar and Cingi (2008), and Diana and Perri (2010). The proposed methods are
then compared with the traditional imputation procedures. The proposed estimators
turn out to be more efficient than the usual ratio, product, regression, and
exponential methods for handling missing observations when estimating the population
mean.
Given a finite population [...]
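The MCAR assumption adopted above can be illustrated with a small simulation. The sketch below is not taken from the paper: the function name `draw_sample_with_mcar`, the population size, and the sizes `n` and `r` are assumptions chosen only for illustration. It draws a simple random sample without replacement and then selects the responding units completely at random within that sample, so that non-response is unrelated to the study or auxiliary variable.

```python
import random

def draw_sample_with_mcar(pop_size, n, r, seed=0):
    """Draw an SRSWOR sample of n unit labels, then pick r responders at random.

    Hypothetical helper for illustration: choosing the responders uniformly
    at random within the sample mimics MCAR non-response.
    """
    if not 0 < r <= n <= pop_size:
        raise ValueError("need 0 < r <= n <= pop_size")
    rng = random.Random(seed)
    sample = rng.sample(range(pop_size), n)    # SRSWOR from the population
    responders = set(rng.sample(sample, r))    # units that respond (MCAR)
    nonresponders = [i for i in sample if i not in responders]
    return sample, responders, nonresponders

sample, responders, nonresponders = draw_sample_with_mcar(pop_size=1000, n=50, r=35)
print(len(sample), len(responders), len(nonresponders))
```

Fixing the number of responders `r` rather than drawing a response indicator per unit is a design choice made here so that sample and response sizes are controlled exactly.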
2. Some existing methods of imputation
1) The mean method of imputation replaces each missing observation with the
mean of the observations available on the responding units. The mean of the
completed sample, which serves as the estimator of the population mean, then
coincides with the mean of the responding units.
2) The ratio method of imputation uses information on one auxiliary
variable [...], where [...]. This gives the resulting estimator [...]. The MSE of [...].
It is noted that, in the presence of missing data, the
availability of information on the auxiliary variable [...].
3) Diana and Perri (2010) proposed three estimators using different regression-type
methods of imputation, such that the imputed data are given by [...]. For these
methods, the resulting estimators are [...].
They proved that the suggested estimators are more efficient than the Kadilar and Cingi (2008) estimators.
3. The proposed Estimator
[...] With the above imputation method, the resulting estimator of the population mean [...]
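For comparison with the proposed estimator, the classical mean and ratio methods of imputation reviewed in Section 2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the toy data, and the requirement that the auxiliary variable `x` is observed for every sampled unit are all assumptions, and the ratio coefficient is taken in its usual textbook form (sum of responding `y` over sum of responding `x`).

```python
from statistics import mean

def mean_imputation(y, responded):
    """Replace each missing y by the mean of the responding units."""
    y_bar_r = mean(v for v, ok in zip(y, responded) if ok)
    return [v if ok else y_bar_r for v, ok in zip(y, responded)]

def ratio_imputation(y, x, responded):
    """Replace each missing y by b_hat * x, with b_hat = sum_R(y) / sum_R(x)."""
    sum_y = sum(v for v, ok in zip(y, responded) if ok)
    sum_x = sum(u for u, ok in zip(x, responded) if ok)
    b_hat = sum_y / sum_x
    return [v if ok else b_hat * u for v, u, ok in zip(y, x, responded)]

# Toy sample (made-up numbers); None marks a missing response on y.
x = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
y = [21.0, None, 29.0, 37.0, None, 52.0]
responded = [v is not None for v in y]

y_mean = mean_imputation(y, responded)
y_ratio = ratio_imputation(y, x, responded)

print(mean(y_mean))   # point estimate after mean imputation
print(mean(y_ratio))  # point estimate after ratio imputation
```

In this sketch the mean of the mean-imputed data equals the response mean, and the mean of the ratio-imputed data equals the response mean of `y` multiplied by the ratio of the full-sample mean of `x` to the response mean of `x`.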
4. First Degree Approximation to the Bias
To derive the bias and MSE expressions of the proposed estimator [...]. Thus, we have [...]. The expectations of these [...], and under simple random sampling without replacement, [...], where [...]. Now, representing (2.1) in terms of [...]. We assume that the sample is large enough to make [...].
Theorem 2.1. The conditional bias, up to the first order of approximation, of the estimator [...], where [...].
Proof: From (2.2) we have [...].
Taking expectations on both sides, we obtain the bias of [...]
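Because the bias in Theorem 2.1 is a first-order, large-sample approximation, a Monte Carlo check is a natural companion to the algebra. The sketch below is only an illustration of that idea under stated assumptions: it uses the classical ratio imputation estimator as a stand-in (not the proposed estimator), a synthetic population, and hypothetical function names, and it estimates the empirical bias and MSE under simple random sampling without replacement with MCAR non-response.

```python
import random
from statistics import mean

def ratio_imputation_estimate(y, x, sample, responders):
    """Stand-in point estimate: y_bar_r * x_bar_n / x_bar_r."""
    y_bar_r = mean(y[i] for i in responders)
    x_bar_r = mean(x[i] for i in responders)
    x_bar_n = mean(x[i] for i in sample)
    return y_bar_r * x_bar_n / x_bar_r

def monte_carlo_bias_mse(y, x, n, r, reps=1000, seed=0):
    """Empirical bias and MSE under SRSWOR with MCAR non-response."""
    rng = random.Random(seed)
    pop_mean = mean(y)
    estimates = []
    for _ in range(reps):
        sample = rng.sample(range(len(y)), n)   # SRSWOR of n units
        responders = rng.sample(sample, r)      # MCAR: random subset responds
        estimates.append(ratio_imputation_estimate(y, x, sample, responders))
    bias = mean(estimates) - pop_mean
    mse = mean((e - pop_mean) ** 2 for e in estimates)
    return bias, mse

# Synthetic population (assumed): y roughly proportional to x plus noise.
pop_rng = random.Random(42)
x_pop = [pop_rng.uniform(10, 50) for _ in range(2000)]
y_pop = [2.0 * u + pop_rng.gauss(0, 5) for u in x_pop]

bias, mse = monte_carlo_bias_mse(y_pop, x_pop, n=200, r=120)
print(f"empirical bias = {bias:.4f}, empirical MSE = {mse:.4f}")
```

Swapping `ratio_imputation_estimate` for another estimator of the same signature would let the same loop compare empirical bias and MSE across imputation methods.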