
Context-aware cross feature attentive network for click-through rate predictions

Published in: Applied Intelligence

Abstract

Click-through rate (CTR) prediction aims to estimate the likelihood that a user will interact with an item, and it has gained significant attention in areas such as online advertising and e-commerce. Existing studies have verified that feature interactions play a crucial role in CTR prediction, highlighting the need for efficient modeling of these interactions. However, most existing CTR approaches tend to overlook specific feature characteristics, relying instead on deep neural networks or advanced attention mechanisms to learn meaningful feature interactions. In real-world scenarios, features can be categorized into groups based on prior information, which motivates the explicit consideration of interactions between groups of features. For example, the unique context of an item often correlates strongly with a particular user, and a specific item often has a strong relationship with a particular user demographic. An efficient model therefore requires an appropriate inductive bias to learn these relationships. To address this issue, we present a Context-aware Cross Feature Attentive Network (CCFAN) that explicitly models the associations between items and users. We categorize input variables into four groups: user, item, user context, and item context, which allows the model to explicitly learn significant interactions between (user)-(item context) and (item)-(user context) pairs. These interactions are learned by a multi-head self-attention network comprising user-item interaction and cross-feature interaction modules. To demonstrate the effectiveness of CCFAN, we conduct experiments on two public benchmark datasets, MovieLens1M and Frappe, and one real-world dataset from an educational service provider, WJTB. The experimental results show that CCFAN not only outperforms previous state-of-the-art CTR methods but also offers a high degree of explainability.
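The grouped cross-attention idea summarized in the abstract can be illustrated with a minimal, hypothetical sketch in PyTorch. The module and parameter names below are illustrative assumptions and do not correspond to the authors' implementation; the sketch only shows how field embeddings from the four groups could attend across groups (user to item context, item to user context) before producing a click probability.

```python
# Minimal, hypothetical sketch of cross-group attention for CTR prediction.
# Not the authors' CCFAN code; names and dimensions are illustrative only.
import torch
import torch.nn as nn


class CrossGroupAttention(nn.Module):
    def __init__(self, embed_dim: int = 32, num_heads: int = 4):
        super().__init__()
        # One attention block per direction: user -> item context, item -> user context.
        self.user_to_item_ctx = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.item_to_user_ctx = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(2 * embed_dim, 1)

    def forward(self, user, item, user_ctx, item_ctx):
        # Each input: (batch, num_fields_in_group, embed_dim) field embeddings.
        u_attn, _ = self.user_to_item_ctx(query=user, key=item_ctx, value=item_ctx)
        i_attn, _ = self.item_to_user_ctx(query=item, key=user_ctx, value=user_ctx)
        # Pool the attended field representations and predict the click probability.
        pooled = torch.cat([u_attn.mean(dim=1), i_attn.mean(dim=1)], dim=-1)
        return torch.sigmoid(self.classifier(pooled)).squeeze(-1)


# Toy usage with random field embeddings: batch of 8, 3 fields per group, dim 32.
model = CrossGroupAttention()
groups = [torch.randn(8, 3, 32) for _ in range(4)]
print(model(*groups).shape)  # torch.Size([8])
```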



Data Availability and Access

The MovieLens1M and Frappe datasets are publicly available at https://grouplens.org/datasets/movielens/1m/ and https://www.baltrunas.info/context-aware, respectively. The WJTB dataset is not publicly available due to privacy and confidentiality concerns but is available from the authors upon reasonable request and with the permission of Woongjin ThinkBig Co., Ltd.

Code

The code is not publicly available due to confidentiality concerns but is available for research purposes from the authors upon reasonable request and with the permission of Woongjin ThinkBig Co., Ltd.

Notes

  1. https://grouplens.org/datasets/movielens/1m/

  2. https://www.baltrunas.info/context-aware

  3. Distance calculations were performed using Euclidean, cosine, and Manhattan distance, all of which yielded similar results.
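For note 3, the three distance metrics mentioned can be computed on embedding vectors as in the toy sketch below. This is an illustrative check only, assuming NumPy and SciPy; it is not the authors' analysis code.

```python
# Illustrative comparison of Euclidean, cosine, and Manhattan distance on two
# random embedding vectors (toy sketch, not the authors' analysis code).
import numpy as np
from scipy.spatial.distance import euclidean, cosine, cityblock

a, b = np.random.rand(32), np.random.rand(32)
print(euclidean(a, b), cosine(a, b), cityblock(a, b))
```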


Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT and the Ministry of Education) (RS-2024-00352184 and NRF-2019R1A6A1A03032119).

Author information

Contributions

All authors contributed to the conception of the presented idea. S.L. performed data collection, experiments, and analysis. S.L. and S.H. wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Sangheum Hwang.

Ethics declarations

Competing Interests

The authors declare that they have no competing interests.

Ethical and Informed Consent for Data Used

Not applicable

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lee, S., Hwang, S. Context-aware cross feature attentive network for click-through rate predictions. Appl Intell (2024). https://doi.org/10.1007/s10489-024-05659-9

