Facade Layout Completion with Long Short-Term Memory Networks

  • Conference paper
  • In: Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2021)

Abstract

In a workflow for creating 3D city models, facades of buildings can be reconstructed from oblique aerial images for which the extrinsic and intrinsic parameters are known. If the wall planes have already been determined, e.g., based on airborne laser scanning point clouds, facade textures can be computed by applying a perspective transform. Given these images, doors and windows can be detected and then added to the 3D model. In this study, the “Scaled YOLOv4” neural network is applied to detect facade objects. However, due to occlusions and artifacts from perspective correction, in general not all windows and doors are detected. This makes it necessary to automatically continue the pattern of facade objects into occluded or distorted areas. To this end, we propose a new approach based on recurrent neural networks. In addition to applying the Multi-Dimensional Long Short-Term Memory network and the Quasi Recurrent Neural Network, we also use a novel architecture, the Rotated Multi-Dimensional Long Short-Term Memory network. This architecture combines four two-dimensional Multi-Dimensional Long Short-Term Memory networks on rotated images. Independent of the 3D city model workflow, the three networks were additionally tested on the Graz50 dataset, for which the Rotated Multi-Dimensional Long Short-Term Memory network delivered better results than the other two networks. The facade texture regions in which windows and doors are added to the set of initially detected facade objects are likely to be occluded or distorted. Before equipping 3D models with these textures, inpainting should be applied to these regions, which then serve as automatically obtained inpainting masks.
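
The Rotated Multi-Dimensional LSTM described above combines recurrent passes over four rotated copies of the facade grid and fuses their outputs. The following Python sketch illustrates only that rotation-and-fusion idea; it is not the authors' implementation (their code is linked in the notes below). PyTorch is an assumed framework choice, a simple row-wise LSTM stands in for a true multi-dimensional LSTM cell, and the class names, layer sizes, and the toy occupancy grid are illustrative assumptions.

```python
# Minimal sketch of the four-rotation fusion idea, under the assumptions above.
import torch
import torch.nn as nn


class DirectionalPass(nn.Module):
    """Stand-in for one MD-LSTM direction: scans every grid row left to right."""

    def __init__(self, in_ch: int, hidden: int):
        super().__init__()
        self.lstm = nn.LSTM(in_ch, hidden, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, H, W, in_ch) -> treat every row as a sequence of length W
        b, h, w, c = x.shape
        out, _ = self.lstm(x.reshape(b * h, w, c))   # (b*h, W, hidden)
        return out.reshape(b, h, w, -1)


class RotatedMDLSTM(nn.Module):
    """Four directional passes on 0/90/180/270 degree rotations, fused per cell."""

    def __init__(self, in_ch: int, hidden: int, out_ch: int):
        super().__init__()
        self.passes = nn.ModuleList([DirectionalPass(in_ch, hidden) for _ in range(4)])
        self.fuse = nn.Conv2d(4 * hidden, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = []
        for k, branch in enumerate(self.passes):
            rotated = torch.rot90(x, k, dims=(1, 2))         # rotate the H/W plane
            out = branch(rotated)
            feats.append(torch.rot90(out, -k, dims=(1, 2)))  # undo the rotation
        fused = torch.cat(feats, dim=-1).permute(0, 3, 1, 2)  # NHWC -> NCHW
        return torch.sigmoid(self.fuse(fused))                 # per-cell object score


# Toy usage: a 32x32 occupancy grid marking detected windows/doors (1 channel).
grid = torch.rand(1, 32, 32, 1)
model = RotatedMDLSTM(in_ch=1, hidden=16, out_ch=1)
completed = model(grid)   # (1, 1, 32, 32) map of predicted facade-object cells
```

Cells predicted as facade objects but absent from the initial detections would then mark the occluded or distorted texture regions that serve as inpainting masks.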


Notes

  1. https://cityjson.org/specs/ (accessed: 2023/01/21 09:34:06).

  2. https://github.com/SimonHensel/LSTM-Facade-Completion.

References

  1. Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283. USENIX Association, Savannah (2016)

  2. Arjovsky, M., Chintala, S., Bottou, L.: Wasserstein generative adversarial networks. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 70, pp. 214–223. PMLR (2017)

  3. Bradbury, J., Merity, S., Xiong, C., Socher, R.: Quasi-recurrent neural networks. arXiv:1611.01576 (2016)

  4. Chen, J., Yi, J.S.K., Kahoush, M., Cho, E.S., Cho, Y.K.: Point cloud scene completion of obstructed building facades with generative adversarial inpainting. Sensors 20(18), 5029 (2020)

  5. Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 833–851. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_49

  6. Dai, D., Riemenschneider, H., Schmitt, G., Van Gool, L.: Example-based facade texture synthesis. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1065–1072 (2013)

  7. Dehbi, Y., Staat, C., Mandtler, L., Plümer, L., et al.: Incremental refinement of facade models with attribute grammar from 3D point clouds. ISPRS Ann. Photogrammetry Remote Sens. Spat. Inf. Sci. 3, 311 (2016)

  8. Goodfellow, I., et al.: Generative adversarial nets. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems 27, pp. 2672–2680. Curran Associates, Inc. (2014)

  9. Graves, A., Fernández, S., Schmidhuber, J.: Multi-dimensional recurrent neural networks. CoRR (2007)

  10. Gröger, G., Kolbe, T.H., Czerwinski, A.: OpenGIS CityGML Implementation Specification (City Geography Markup Language). Open Geospatial Consortium Inc., OGC (2007)

  11. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2961–2969 (2017)

  12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)

  13. Hensel, S., Goebbels, S., Kada, M.: Facade reconstruction for textured LoD2 CityGML models based on deep learning and mixed integer linear programming. ISPRS Ann. Photogrammetry Remote Sens. Spat. Inf. Sci. IV-2/W5, 37–44 (2019). https://doi.org/10.5194/isprs-annals-IV-2-W5-37-2019

  14. Hensel, S., Goebbels, S., Kada, M.: LSTM architectures for facade structure completion. In: Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 1: GRAPP, pp. 15–24. INSTICC, SciTePress (2021). https://doi.org/10.5220/0010194400150024

  15. Hu, H., Wang, L., Zhang, M., Ding, Y., Zhu, Q.: Fast and regularized reconstruction of building facades from street-view images using binary integer programming. ISPRS Ann. Photogrammetry Remote Sens. Spat. Inf. Sci. V-2-2020, 365–371 (2020). https://doi.org/10.5194/isprs-annals-V-2-2020-365-2020

  16. Huang, J.B., Kang, S.B., Ahuja, N., Kopf, J.: Image completion using planar structure guidance. ACM Trans. Graph. (TOG) 33(4), 1–10 (2014)

  17. Kalchbrenner, N., Danihelka, I., Graves, A.: Grid long short-term memory. arXiv:1507.01526 (2015)

  18. Kottler, B., Bulatov, D., Zhang, X.: Context-aware patch-based method for façade inpainting. In: Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 1: GRAPP, pp. 210–218 (2020)

  19. Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988 (2017)

  20. Mehra, S., Dogra, A., Goyal, B., Sharma, A.M., Chandra, R.: From textural inpainting to deep generative models: an extensive survey of image inpainting techniques. J. Comput. Sci. 16(1), 35–49 (2020)

  21. Mtibaa, F., Nguyen, K.K., Azam, M., Papachristou, A., Venne, J.S., Cheriet, M.: LSTM-based indoor air temperature prediction framework for HVAC systems in smart buildings. Neural Comput. Appl. 32, 1–17 (2020)

  22. Nazeri, K., Ng, E., Joseph, T., Qureshi, F.Z., Ebrahimi, M.: EdgeConnect: generative image inpainting with adversarial edge learning. arXiv:1901.00212 (2019)

  23. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  24. Riemenschneider, H., et al.: Irregular lattices for complex shape grammar facade parsing. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1640–1647 (2012)

  25. Salehinejad, H., Sankar, S., Barfett, J., Colak, E., Valaee, S.: Recent advances in recurrent neural networks. arXiv:1801.01078 (2017)

  26. Sherstinsky, A.: Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D 404, 132306 (2020)

  27. Tan, M., Pang, R., Le, Q.V.: EfficientDet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781–10790 (2020)

  28. Teboul, O., Kokkinos, I., Simon, L., Koutsourakis, P., Paragios, N.: Shape grammar parsing via reinforcement learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2273–2280. IEEE (2011)

  29. Tobin, J., Fong, R., Ray, A., Schneider, J., Zaremba, W., Abbeel, P.: Domain randomization for transferring deep neural networks from simulation to the real world. In: Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 23–30 (2017)

  30. Tyleček, R., Šára, R.: Spatial pattern templates for recognition of objects with regular structure. In: Weickert, J., Hein, M., Schiele, B. (eds.) GCPR 2013. LNCS, vol. 8142, pp. 364–374. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40602-7_39

  31. Wang, C.Y., Bochkovskiy, A., Liao, H.Y.M.: Scaled-YOLOv4: scaling cross stage partial network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13029–13038 (2021)

  32. Wonka, P., Wimmer, M., Sillion, F., Ribarsky, W.: Instant architecture. ACM Trans. Graph. (TOG) 22(3), 669–677 (2003)

  33. Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5505–5514 (2018)

  34. Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

  35. Yu, T., et al.: Region normalization for image inpainting. Proc. AAAI Conf. Artif. Intell. 34(07), 12733–12740 (2020). https://doi.org/10.1609/aaai.v34i07.6967

  36. Zhang, D., Wang, D.: Relation classification: CNN or RNN? In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds.) ICCPOL/NLPCC 2016. LNCS (LNAI), vol. 10102, pp. 665–675. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-50496-4_60

Author information

Corresponding author

Correspondence to Simon Hensel.

Copyright information

© 2023 Springer Nature Switzerland AG

About this paper


Cite this paper

Hensel, S., Goebbels, S., Kada, M. (2023). Facade Layout Completion with Long Short-Term Memory Networks. In: de Sousa, A.A., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2021. Communications in Computer and Information Science, vol 1691. Springer, Cham. https://doi.org/10.1007/978-3-031-25477-2_2

  • DOI: https://doi.org/10.1007/978-3-031-25477-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25476-5

  • Online ISBN: 978-3-031-25477-2

  • eBook Packages: Computer Science (R0)
