Optimal Sample Allocation in the Presence of Non-response

2014-08-03

Assume a stratified simple random sampling design. The population is split into $H$ strata, and the total sample size $n$ is fixed. The aim is to find the optimal sample allocation $n_1, n_2, \ldots, n_h, \ldots, n_H$, where $\sum_{h=1}^H n_h = n$ and the variance of the estimator, $V \left( \hat{Y} \right)$, is minimised.

Some population information is available: the population size in each stratum, $N_h$; the population standard deviation of the variable $y_i$ in each stratum, $S_h$; and the expected response rate in each stratum, $R_h$.

The optimal sample allocation in the presence of non-response can be computed as $n_h = n \cdot \frac{\frac{N_h S_h}{\sqrt{R_h}}}{\sum_{j=1}^H \frac{N_j S_j}{\sqrt{R_j}}}$.
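As a minimal sketch, the formula can be evaluated directly in Python. The function name `allocate` and the stratum figures below are hypothetical, chosen only for illustration, and the result is left unrounded (rounding to integer sample sizes is a separate step not covered here).

```python
import math


def allocate(n, N, S, R):
    """Split the total sample size n across strata in proportion
    to N_h * S_h / sqrt(R_h)."""
    weights = [Nh * Sh / math.sqrt(Rh) for Nh, Sh, Rh in zip(N, S, R)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical strata, made up for illustration only
N = [1000, 2000, 500]   # population sizes N_h
S = [10.0, 5.0, 20.0]   # standard deviations S_h
R = [0.9, 0.6, 0.75]    # expected response rates R_h

n_h = allocate(300, N, S, R)
print(n_h, sum(n_h))
```

All else being equal, a stratum with a lower expected response rate receives a larger share of the sample.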

I do not have a proof, but I have run some numerical calculations to test the optimality of this allocation. The R code for the calculations is available here.
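A numerical check of this kind could look roughly like the following Python sketch. It is not the R code referred to above. It assumes a variance model that is not stated in the post: the expected number of respondents in stratum $h$ is $n_h R_h$, and they behave like a simple random sample of that size, so that $V \left( \hat{Y} \right) = \sum_{h=1}^H N_h^2 S_h^2 \left( \frac{1}{n_h R_h} - \frac{1}{N_h} \right)$. The stratum figures and helper names are hypothetical.

```python
import math
import random


def allocate(n, N, S, R):
    """Allocation proportional to N_h * S_h / sqrt(R_h)."""
    weights = [Nh * Sh / math.sqrt(Rh) for Nh, Sh, Rh in zip(N, S, R)]
    total = sum(weights)
    return [n * w / total for w in weights]


def variance(n_h, N, S, R):
    """Assumed variance model: n_h * R_h expected respondents per stratum,
    treated as a simple random sample of that size."""
    return sum(
        Nh ** 2 * Sh ** 2 * (1.0 / (nh * Rh) - 1.0 / Nh)
        for nh, Nh, Sh, Rh in zip(n_h, N, S, R)
    )


# Hypothetical strata, made up for illustration only
N = [1000, 2000, 500]
S = [10.0, 5.0, 20.0]
R = [0.9, 0.6, 0.75]
n = 300

best = allocate(n, N, S, R)
v_best = variance(best, N, S, R)

random.seed(1)
checked = 0
better = 0
for _ in range(1000):
    # Perturb the allocation while keeping the total sample size fixed
    delta = [random.uniform(-5, 5) for _ in N]
    shift = sum(delta) / len(delta)
    candidate = [b + d - shift for b, d in zip(best, delta)]
    if min(candidate) <= 0:
        continue  # skip infeasible allocations
    checked += 1
    # Small relative tolerance guards against floating-point noise
    if variance(candidate, N, S, R) < v_best * (1 - 1e-12):
        better += 1

print("candidates checked:", checked)
print("candidates with smaller variance:", better)
```

If the allocation is optimal under the assumed model, no perturbed candidate should give a smaller variance. A check like this supports the formula but does not replace a proof.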

The presented allocation reduces to the well-known Neyman allocation [1] when the expected response rate is the same in every stratum.
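To see why, set $R_h = R$ for all $h$. The common factor $\frac{1}{\sqrt{R}}$ then cancels between the numerator and the denominator:

$n_h = n \cdot \frac{\frac{N_h S_h}{\sqrt{R}}}{\sum_{j=1}^H \frac{N_j S_j}{\sqrt{R}}} = n \cdot \frac{N_h S_h}{\sum_{j=1}^H N_j S_j}$

The right-hand side is the Neyman allocation, which does not depend on the response rate.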

[1] Neyman, Jerzy (1937). Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability. Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences 236 (767): 333–380