Combining cluster sampling and link-tracing sampling to estimate the size of a hidden population: Asymptotic properties of the estimators

Journal of Statistical Theory and Practice

Abstract

Félix-Medina and Thompson proposed a variant of link-tracing sampling to estimate the size of a hidden population, such as drug users or sex workers. In their variant, a sampling frame of sites where the members of the population tend to gather is constructed. The frame is not assumed to cover the whole population, but only a portion of it. A simple random sample of sites is selected; the people in the sampled sites are identified and asked to name other members of the population, who are added to the sample. Those authors proposed maximum likelihood estimators (MLEs) of the population size derived from two components: a multinomial model for the numbers of people found in the sampled sites, and a model in which the probability that a person is named by any element in a particular sampled site (the link-probability) does not depend on the named person, that is, the link-probabilities are homogeneous. Later, Félix-Medina et al. proposed unconditional and conditional MLEs of the population size, derived from a model that takes into account the heterogeneity of the link-probabilities. In this work we consider this sampling design and set conditions on a general model for the link-probabilities that guarantee the consistency and asymptotic normality of the estimators of the population size and of the estimators of the parameters of the model for the link-probabilities. We show that the unconditional and conditional MLEs of the population size are consistent, that they have different asymptotic normal distributions, and that the unconditional ones are more efficient than the conditional ones.
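To make the sampling design described in the abstract concrete, the following is a minimal simulation sketch in Python. It is not the authors' code and it does not compute any of the MLEs. It only generates data of the kind the design would produce under the homogeneous link-probability assumption. The function name `simulate_design`, all parameter names and default values, and the use of a single common link-probability for every sampled site are assumptions made for illustration.

```python
import numpy as np


def simulate_design(tau=1000, frame_fraction=0.6, n_sites=50,
                    n_sampled=10, link_prob=0.05, seed=0):
    """Simulate one draw from a cluster + link-tracing design (illustrative only).

    Assumptions (not taken from the paper): people 0..tau1-1 are covered by
    the frame, each covered person is assigned to a site uniformly at random,
    and every sampled site names each person independently with the same
    probability `link_prob` (homogeneous link-probabilities).
    """
    rng = np.random.default_rng(seed)

    # Part of the population covered by the sampling frame.
    tau1 = int(round(tau * frame_fraction))

    # Assign each covered person to one of the sites in the frame.
    site_of = rng.integers(0, n_sites, size=tau1)

    # Simple random sample of sites, without replacement.
    sampled_sites = rng.choice(n_sites, size=n_sampled, replace=False)

    # Numbers of people found in each sampled site.
    site_sizes = np.bincount(site_of, minlength=n_sites)[sampled_sites]

    # Initial sample: everyone who belongs to a sampled site.
    in_initial = np.zeros(tau, dtype=bool)
    in_initial[:tau1] = np.isin(site_of, sampled_sites)

    # named[j, i] is True if person j is named by sampled site i.
    named = rng.random((tau, n_sampled)) < link_prob

    # People added by link-tracing: named by at least one sampled site
    # and not already in the initial sample.
    added = named.any(axis=1) & ~in_initial

    return {
        "site_sizes": site_sizes,
        "n_initial": int(in_initial.sum()),
        "n_added_in_frame": int(added[:tau1].sum()),
        "n_added_outside_frame": int(added[tau1:].sum()),
    }


if __name__ == "__main__":
    print(simulate_design())
```

Running the function with different seeds gives a feel for how many people outside the frame are reached only through nominations, which is the information the estimators use to account for the uncovered part of the population.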


References

  • Agresti, A. 2002. Categorical data analysis, 2nd ed. New York, NY: Wiley.

  • Birch, M. W. 1964. A new proof of the Pearson-Fisher theorem. Annals of Mathematical Statistics 35:817–24. doi:10.1214/aoms/1177703581.

  • Bishop, Y. M. M., S. E. Fienberg, and P. W. Holland. 1975. Discrete multivariate analysis: Theory and practice. Cambridge, MA: MIT Press.

  • Coull, B. A., and A. Agresti. 1999. The use of mixed logit models to reflect heterogeneity in capture-recapture studies. Biometrics 55:294–301. doi:10.1111/biom.1999.55.issue-1.

  • Ding, Y. 1996. On the asymptotic normality of multinomial population size estimators with application to backcalculation of AIDS epidemic. Biometrika 83:695–99. doi:10.1093/biomet/83.3.695.

  • Farcomeni, A., and L. Tardella. 2012. Identifiability and inferential issues in capture-recapture experiments with heterogeneous detection probabilities. Electronic Journal of Statistics 6:2602–26. doi:10.1214/12-EJS758.

  • Félix-Medina, M. H., P. E. Monjardin, and A. N. Aceves-Castro. 2015. Combining link-tracing sampling and cluster sampling to estimate the size of a hidden population in presence of heterogeneous link-probabilities. Survey Methodology 41:349–76.

  • Félix-Medina, M. H., and S. K. Thompson. 2004. Combining cluster sampling and link-tracing sampling to estimate the size of hidden populations. Journal of Official Statistics 20:19–38.

  • Feller, W. 1968. An introduction to probability theory and its applications, Vol. 1, 3rd ed. New York, NY: Wiley.

  • Fewster, R. M., and P. E. Jupp. 2009. Inference on population size in binomial detectability models. Biometrika 96:805–20. doi:10.1093/biomet/asp051.

  • Harville, D. A. 1997. Matrix algebra from a statistician’s perspective. New York, NY: Springer.

  • Holzmann, H., A. Munk, and W. Zucchini. 2006. On identifiability in capture-recapture models. Biometrics 62:934–36. doi:10.1111/j.1541-0420.2006.00637_1.x.

  • Johnston, L. G., and K. Sabin. 2010. Sampling hard-to-reach populations with respondent driven sampling. Methodological Innovations Online 5 (2):38.1–48. doi:10.4256/mio.2010.0017.

  • Kalton, G. 2009. Methods for oversampling rare populations in social surveys. Survey Methodology 35:125–41.

  • Link, W. A. 2003. Nonidentifiability of population size from capture-recapture data with heterogeneous detection probabilities. Biometrics 59:1123–30. doi:10.1111/biom.2003.59.issue-4.

  • Magnani, R. K., K. Sabin, T. Saidel, and D. Heckathorn. 2005. Review of sampling hard-to-reach populations for HIV surveillance. AIDS 19:S67–S72. doi:10.1097/01.aids.0000172879.20628.e1.

  • Rao, C. R. 1958. Maximum likelihood estimation for the multinomial distribution with infinite number of cells. Sankhyā: The Indian Journal of Statistics 20:211–18.

  • Rao, C. R. 1973. Linear statistical inference and its applications, 2nd ed. New York, NY: Wiley.

  • Sanathanan, L. 1972. Estimating the size of a multinomial population. Annals of Mathematical Statistics 43:142–52. doi:10.1214/aoms/1177692709.

  • Serfling, R. J. 1980. Approximation theorems of mathematical statistics. New York, NY: Wiley.

  • Serfling, R. J. 2011. Asymptotic relative efficiency in estimation. In International encyclopedia of statistical science, ed. M. Lovric, 68–72. Berlin, Germany: Springer.

  • Spreen, M. 1992. Rare populations, hidden populations and link-tracing designs: What and why? Bulletin de Méthodologie Sociologique 36:34–58. doi:10.1177/075910639203600103.

  • Thompson, S. K., and O. Frank. 2000. Model-based estimation with link-tracing sampling designs. Survey Methodology 26:87–98.

  • Varadhan, S. R. S. 2008. Large deviations. Annals of Probability 36:397–419. doi:10.1214/07-AOP348.

Funding

This work was supported by Universidad Autónoma de Sinaloa: PIFI-2013-25-73-1.4.3-8.

Author information

Corresponding author

Correspondence to Martín H. Félix-Medina.

About this article

Cite this article

Félix-Medina, M.H. Combining cluster sampling and link-tracing sampling to estimate the size of a hidden population: Asymptotic properties of the estimators. J Stat Theory Pract 12, 463–496 (2018). https://doi.org/10.1080/15598608.2017.1405374

  • DOI: https://doi.org/10.1080/15598608.2017.1405374
