Domain Generalization for Multimodal Disaster Tweet Classification

  • Conference paper
Proceedings of the 13th International Conference on Computer Engineering and Networks (CENet 2023)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1125)


Abstract

Disaster-related information on social media platforms facilitates situational awareness of disaster events. Benefiting from the development of multimodal deep learning, it has become easy to extract and fuse information from different modalities of social media data. However, existing multimodal models are developed on limited disaster datasets and have problems in generalization. The lack of generalization is mainly manifested by significant performance degradation on disaster events outside the dataset. One key reason for overfitting might be that the models combine class information too closely with specific disaster events during training. In multimodal social media applications, existing models are rarely optimized for the representation of event domains. In this paper, we treat different disaster events as separate domains and propose a new domain generalization method for multimodal representation learning. Our method takes multimodal features as the sum of domain-specific features and class-specific features. The proposed model obtains domain-invariant representations through feature separation and reconstruction. In addition, the multimodal information exchange is realized by the cross-modal interaction module, and highly generalized class representations are finally obtained. Experimental results show that the proposed method improves generalization performance on the CrisisMMD benchmark.
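The additive view of features described in the abstract (a multimodal feature as the sum of a domain-specific part and a class-specific part) can be illustrated with a toy sketch. This is not the authors' model: learned feature separation is replaced here by plain per-event mean subtraction, and the function name `split_features`, the feature values, and the event labels are all hypothetical.

```python
from collections import defaultdict

def split_features(features, domain_ids):
    """Split each vector into a per-event mean (domain part) and a residual (class part)."""
    groups = defaultdict(list)
    for vec, d in zip(features, domain_ids):
        groups[d].append(vec)
    # Per-event mean vector: an assumed stand-in for the domain-specific component.
    means = {d: [sum(col) / len(vecs) for col in zip(*vecs)]
             for d, vecs in groups.items()}
    domain_part = [means[d] for d in domain_ids]
    # Residual after removing the event mean: stand-in for the class-specific component.
    class_part = [[x - m for x, m in zip(vec, means[d])]
                  for vec, d in zip(features, domain_ids)]
    return domain_part, class_part

# Hypothetical fused features: 4 tweets, 2 dimensions, 2 disaster events.
features = [[1.0, 2.0], [3.0, 4.0], [11.0, 12.0], [13.0, 14.0]]
domain_ids = ["hurricane", "hurricane", "earthquake", "earthquake"]

domain_part, class_part = split_features(features, domain_ids)

# Reconstruction: summing the two parts recovers the original features.
reconstruction = [[dp + cp for dp, cp in zip(dv, cv)]
                  for dv, cv in zip(domain_part, class_part)]
```

In this toy data the two events differ only by a constant offset, so the residual (class) parts coincide across events once the event mean is removed. A learned model would presumably replace the mean with trained encoders and a reconstruction loss; the chapter preview does not give those details.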



Acknowledgments

This work was supported in part by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 2022D01B187).

Author information

Corresponding author

Correspondence to Zhiguo Wang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yu, C., Yin, K., Wang, Z. (2024). Domain Generalization for Multimodal Disaster Tweet Classification. In: Zhang, Y., Qi, L., Liu, Q., Yin, G., Liu, X. (eds) Proceedings of the 13th International Conference on Computer Engineering and Networks. CENet 2023. Lecture Notes in Electrical Engineering, vol 1125. Springer, Singapore. https://doi.org/10.1007/978-981-99-9239-3_28

  • DOI: https://doi.org/10.1007/978-981-99-9239-3_28

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-9238-6

  • Online ISBN: 978-981-99-9239-3

  • eBook Packages: Engineering, Engineering (R0)
