Abstract
We present an efficient protocol for the privacy-preserving, distributed learning of decision-tree classifiers. Our protocol allows a user to construct a classifier on a database held by a remote server without learning any additional information about the records held in the database. The server does not learn anything about the constructed classifier, not even the user’s choice of feature and class attributes.
Our protocol uses several novel techniques to enable oblivious classifier construction. We evaluate a prototype implementation and show that it performs efficiently in practical scenarios.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Brickell, J., Shmatikov, V. (2009). Privacy-Preserving Classifier Learning. In: Dingledine, R., Golle, P. (eds) Financial Cryptography and Data Security. FC 2009. Lecture Notes in Computer Science, vol 5628. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03549-4_8
DOI: https://doi.org/10.1007/978-3-642-03549-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03548-7
Online ISBN: 978-3-642-03549-4
eBook Packages: Computer Science (R0)