Abstract
To protect user privacy in data analysis, a state-of-the-art strategy is differential privacy, in which calibrated noise is injected into the true analysis output. The noise masks individuals' sensitive information contained in the dataset. Determining the amount of noise is a key challenge, however: too much noise destroys data utility, while too little increases privacy risk. Although previous work has designed mechanisms to protect data privacy in different scenarios, most existing studies assume uniform privacy concerns for all individuals. Consequently, applying an equal amount of noise to all individuals under-protects some users while over-protecting others. To address this issue, we propose a self-adaptive approach for privacy concern detection based on user personality. Our experimental studies demonstrate its effectiveness in providing suitable personalized privacy protection for cold-start users (i.e., users whose privacy-concern information is absent from the training data).
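The trade-off between noise and utility described above can be made concrete with the standard Laplace mechanism, where the noise scale is the query sensitivity divided by the privacy budget epsilon. The following is a minimal sketch, not the authors' method: the mapping `EPSILON_BY_CONCERN`, its values, and the function names are hypothetical, and perturbing each user's value separately is a simplification chosen only to show how a smaller epsilon (higher privacy concern) yields larger noise.

```python
import random

# Hypothetical mapping from a predicted privacy-concern level to a privacy
# budget epsilon; these values are illustrative, not taken from the paper.
EPSILON_BY_CONCERN = {"high": 0.1, "medium": 1.0, "low": 10.0}


def laplace_noise(sensitivity, epsilon, rng):
    """Draw one sample from Laplace(0, sensitivity / epsilon)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_value(true_value, concern, sensitivity=1.0, rng=None):
    """Perturb a value with noise sized by the user's concern level."""
    rng = rng or random.Random()
    epsilon = EPSILON_BY_CONCERN[concern]
    return true_value + laplace_noise(sensitivity, epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    for concern in ("high", "medium", "low"):
        print(concern, round(noisy_value(42.0, concern, rng=rng), 3))
```

Under these assumed budgets, a user with "high" concern receives noise with scale 10, while a user with "low" concern receives noise with scale 0.1, so a single uniform epsilon would necessarily be wrong for at least one of them.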
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Vu, XS., Jiang, L. (2023). Self-adaptive Privacy Concern Detection for User-Generated Content. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13396. Springer, Cham. https://doi.org/10.1007/978-3-031-23793-5_14
DOI: https://doi.org/10.1007/978-3-031-23793-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23792-8
Online ISBN: 978-3-031-23793-5
eBook Packages: Computer Science, Computer Science (R0)