Population synthesis for urban resident modeling using deep generative models

Original Article, Neural Computing and Applications

Abstract

The impact of new real estate developments is strongly associated with their target population distribution, that is, the characteristics that define a population, such as household composition, income, and socio-demographics, conditioned on characteristics of the development itself, such as dwelling typology, price, location, and floor level. This paper presents a machine learning-based method to model the population distribution of upcoming developments of new buildings within larger neighborhood/condo settings. We use a real data set from Ecopark Township, a real estate development project in Hanoi, Vietnam, and study two machine learning algorithms from the deep generative models literature to create a population of synthetic agents: the conditional variational auto-encoder (CVAE) and the conditional generative adversarial network (CGAN). A large experimental study shows that the CVAE outperforms both the empirical distribution, a non-trivial baseline model, and the CGAN in estimating the population distribution of new real estate development projects.
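
For orientation, a conditional VAE is usually trained by maximizing a lower bound on the conditional log-likelihood. The form below is the standard objective from the conditional generative modeling literature (Sohn et al. 2015), not necessarily the exact loss used in this paper. The symbols are assumptions introduced here for illustration: $x$ denotes an agent's population attributes, $c$ the characteristics of the development, $z$ a latent variable, and $\theta$, $\phi$ the decoder and encoder parameters.

```latex
\log p_\theta(x \mid c) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x, c)}\!\left[\log p_\theta(x \mid z, c)\right]
\;-\;
\mathrm{KL}\!\left(q_\phi(z \mid x, c)\,\middle\|\,p_\theta(z \mid c)\right)
```

Under this formulation, synthetic agents for a new development would be drawn by sampling $z$ from the prior $p_\theta(z \mid c)$ and decoding with $p_\theta(x \mid z, c)$.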

Author information

Corresponding author

Correspondence to Sergio Garrido.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Partial joints in the extended application data set

See Fig. 8.

Fig. 8. Performance of the partial joints in the Extended application data set. From left to right: (1) the bivariate distribution of age and nationality, (2) the trivariate distribution of age, nationality, and prior home district, and (3) the trivariate distribution of age, prior home district, and investor. Each scatter plot compares the partial joint distribution of the agents sampled for the Extended application set against that of the real agents in the same set. Both axes show normalized bin frequencies.
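
A comparison of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the helper names `partial_joint` and `scatter_points`, the attribute keys, and the toy agents are all hypothetical. Each agent is a dictionary of already-binned attributes, a partial joint is the normalized frequency of each value combination over a chosen subset of attributes, and every bin contributes one (real, synthetic) point to the scatter plot.

```python
from collections import Counter


def partial_joint(agents, attrs):
    """Return the normalized frequency of each value combination of `attrs`."""
    counts = Counter(tuple(agent[k] for k in attrs) for agent in agents)
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()}


def scatter_points(real, synthetic, attrs):
    """Return one (real frequency, synthetic frequency) pair per bin.

    Bins seen in only one of the two populations get frequency 0.0
    in the other, so they show up on an axis of the scatter plot.
    """
    p = partial_joint(real, attrs)
    q = partial_joint(synthetic, attrs)
    bins = sorted(set(p) | set(q))
    return [(p.get(b, 0.0), q.get(b, 0.0)) for b in bins]


# Hypothetical toy populations; values are illustrative only.
real = [
    {"age": "30-39", "nationality": "VN"},
    {"age": "30-39", "nationality": "VN"},
    {"age": "40-49", "nationality": "VN"},
    {"age": "40-49", "nationality": "KR"},
]
synthetic = [
    {"age": "30-39", "nationality": "VN"},
    {"age": "40-49", "nationality": "VN"},
    {"age": "40-49", "nationality": "VN"},
    {"age": "40-49", "nationality": "KR"},
]

# Bivariate partial joint over age and nationality.
points = scatter_points(real, synthetic, ["age", "nationality"])
```

Points lying on the diagonal correspond to bins whose frequency the synthetic population reproduces exactly; the further a point sits from the diagonal, the larger the mismatch for that attribute combination.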

About this article

Cite this article

Johnsen, M., Brandt, O., Garrido, S. et al. Population synthesis for urban resident modeling using deep generative models. Neural Comput & Applic 34, 4677–4692 (2022). https://doi.org/10.1007/s00521-021-06622-2

  • DOI: https://doi.org/10.1007/s00521-021-06622-2
