Abstract
Disaster-related information on social media platforms supports situational awareness during disaster events. Advances in multimodal deep learning have made it practical to extract and fuse information from the different modalities of social media data. However, existing multimodal models are developed on limited disaster datasets and generalize poorly: their performance degrades significantly on disaster events outside the training dataset. One key reason for this overfitting might be that the models tie class information too closely to specific disaster events during training. Existing models for multimodal social media applications are rarely optimized for the representation of event domains. In this paper, we treat different disaster events as separate domains and propose a new domain generalization method for multimodal representation learning. Our method models multimodal features as the sum of domain-specific features and class-specific features. The proposed model obtains domain-invariant representations through feature separation and reconstruction. In addition, a cross-modal interaction module exchanges information between modalities, yielding highly generalized class representations. Experimental results show that the proposed method improves generalization performance on the CrisisMMD benchmark.
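To make the "events as domains" framing concrete, the following minimal Python sketch shows one common way generalization to unseen events can be measured: hold out every event once and train on the rest. This is an illustration under stated assumptions, not the evaluation protocol of the paper, which the abstract does not specify. The record layout, the event names, and the function `leave_one_event_out` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical record layout: one tweet with its event (domain) and class label.
Sample = Dict[str, str]


def leave_one_event_out(
    samples: List[Sample],
) -> Iterator[Tuple[str, List[Sample], List[Sample]]]:
    """Yield (held_out_event, train, test), holding out each event once.

    Each disaster event is treated as a separate domain, so the test split
    contains only samples from an event never seen during training.
    """
    events = sorted({s["event"] for s in samples})
    for held_out in events:
        train = [s for s in samples if s["event"] != held_out]
        test = [s for s in samples if s["event"] == held_out]
        yield held_out, train, test


# Toy data with placeholder event names and texts (assumptions, not real data).
samples: List[Sample] = [
    {"event": "event_a", "text": "tweet 1", "label": "informative"},
    {"event": "event_a", "text": "tweet 2", "label": "not_informative"},
    {"event": "event_b", "text": "tweet 3", "label": "informative"},
    {"event": "event_c", "text": "tweet 4", "label": "not_informative"},
]

splits = list(leave_one_event_out(samples))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

A model that binds class information to specific events would be expected to score well on a random split of such data but poorly on the held-out event, which is the degradation the abstract describes.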
Acknowledgments
This work was supported in part by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 2022D01B187).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, C., Yin, K., Wang, Z. (2024). Domain Generalization for Multimodal Disaster Tweet Classification. In: Zhang, Y., Qi, L., Liu, Q., Yin, G., Liu, X. (eds) Proceedings of the 13th International Conference on Computer Engineering and Networks. CENet 2023. Lecture Notes in Electrical Engineering, vol 1125. Springer, Singapore. https://doi.org/10.1007/978-981-99-9239-3_28
DOI: https://doi.org/10.1007/978-981-99-9239-3_28
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-9238-6
Online ISBN: 978-981-99-9239-3
eBook Packages: Engineering, Engineering (R0)