Security-Aware Database Migration Planning

Constraints

Abstract

Database migration is an important problem faced by companies dealing with big data. Migration is not only a costly procedure but also one that involves serious security risks. For some institutions, the primary focus is on reducing the cost of the migration operation, which manifests itself in application testing. For other institutions, minimizing security risks is the most important goal, especially if the data involved is of a sensitive nature. In the literature, the database migration problem has been studied from a test cost minimization perspective. In this paper, we focus on an orthogonal measure, i.e., security risk minimization. We associate security with the number of shifts needed to complete the migration task. Ideally, we want to complete the migration in as few shifts as possible, so that the risk of data exposure is minimized. We provide a formal framework for studying the database migration problem from the perspective of security risk minimization (shift minimization) and establish the computational complexities of several models within this framework. For the NP-hard models, we develop memetic algorithms that produce solutions within \(10\%\) and \(7\%\) of the optimal in \(95\%\) of the instances, in under 8 and 82 seconds, respectively.
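To make the shift-minimization objective concrete, the following is a minimal sketch under strong assumptions that the abstract does not state: each database has a single integer size, each shift has a fixed capacity, and there are no dependencies between databases. The function name `plan_shifts` and the sample data are hypothetical. The heuristic shown is plain first-fit decreasing from the bin packing literature, not the memetic algorithms developed in the paper.

```python
from typing import Dict, List


def plan_shifts(sizes: Dict[str, int], capacity: int) -> List[List[str]]:
    """Assign databases to shifts with first-fit decreasing.

    Assumption (not from the paper): a shift can move any set of
    databases whose total size does not exceed `capacity`.
    """
    shifts: List[List[str]] = []   # databases migrated in each shift
    remaining: List[int] = []      # unused capacity of each shift

    # Largest databases first; ties are broken by name for determinism.
    for name in sorted(sizes, key=lambda n: (-sizes[n], n)):
        size = sizes[name]
        if size > capacity:
            raise ValueError(f"{name} does not fit in a single shift")
        for i, free in enumerate(remaining):
            if size <= free:
                # Place the database in the first shift with enough room.
                shifts[i].append(name)
                remaining[i] -= size
                break
        else:
            # No existing shift has room, so open a new one.
            shifts.append([name])
            remaining.append(capacity - size)
    return shifts


if __name__ == "__main__":
    # Hypothetical instance: five databases, shift capacity 10.
    plan = plan_shifts({"a": 6, "b": 5, "c": 4, "d": 3, "e": 2}, capacity=10)
    print(len(plan), plan)
```

Under these assumptions, fewer shifts means fewer windows in which data is exposed, which is the quantity the abstract proposes to minimize. A greedy heuristic like this one gives no optimality guarantee in general.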



Notes

  1. https://drive.google.com/file/d/1uKWlqw4XbKNY4JXYGUlJLcQfAag4_xLY/view?usp=share_link

References

  1. Acikalin, U.U., & Caskurlu, B. (2022). Multilevel memetic hypergraph partitioning with greedy recombination. In GECCO ’22: Genetic and Evolutionary Computation Conference, Companion Volume, Boston, Massachusetts, USA, July 9 - 13, 2022, pp. 168–171. ACM

  2. Acikalin, U.U., Caskurlu, B., Wojciechowski, P., & Subramani, K. (2021). New results on test-cost minimization in database migration. In International Symposium on Algorithmic Aspects of Cloud Computing, pp. 38–55, Springer

  3. Azeroual, O., & Jha, M. (2021). Without data quality, there is no data migration. Big Data and Cognitive Computing, 5(20), 24.

  4. Barhate, S., & Dhore, M. (2015). Data migration issues in cloud computing: a survey. International Journal of Electronics, Communication and Soft Computing Science & Engineering (IJECSCSE), 360

  5. Chang, J., Gabow, H. N., & Khuller, S. (2014). A model for minimizing active processor time. Algorithmica, 70(3), 368–405.

  6. Chatterjee, A., & Segev, A. (1991). Data manipulation in heterogeneous databases. SIGMOD Record, 20(4), 64–68.

  7. Chon, H. D., Agrawal, D., & El Abbadi, A. (2002). Data management for moving objects. IEEE Data Eng. Bull., 25(2), 41–47.

  8. Chu, G., Stuckey, P.J., Schutt, A., Ehlers, T., Gange, G., & Francis, K. (2020). Chuffed, a lazy clause generation solver. https://github.com/chuffed/chuffed

  9. Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms, Third Edition (3rd ed.). The MIT Press.

  10. Dell’Amico, M., Díaz, J. C. D., & Iori, M. (2012). The bin packing problem with precedence constraints. Operations Research, 60(6), 1491–1504.

  11. Drumm, C., Schmitt, M., Do, H. H., & Rahm, E. (2007). Quickmig: automatic schema matching for data migration projects. In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM 2007, Lisbon, Portugal, November 6-10, 2007, pp. 107–116

  12. Epstein, L., Favrholdt, L. M., & Levin, A. (2011). Online variable-sized bin packing with conflicts. Discrete Optimization, 8(2), 333–343.

  13. Epstein, L., & Levin, A. (2008). An APTAS for generalized cost variable-sized bin packing. SIAM Journal on Computing, 38(1), 411–428.

  14. Epstein, L., & Levin, A. (2008). On bin packing with conflicts. SIAM Journal on Optimization, 19(3), 1270–1298.

  15. Even, G., Levi, R., Rawitz, D., Schieber, B., Shahar, S., & Sviridenko, M. (2008). Algorithms for capacitated rectangle stabbing and lot sizing with joint set-up costs. ACM Transactions on Algorithms (TALG), 4(3), 1–17.

  16. Falkenauer, E. (1996). A hybrid grouping genetic algorithm for bin packing. Journal of Heuristics, 2(1), 5–30.

  17. Falkenauer, E., & Delchambre, A. (1992). A genetic algorithm for bin packing and line balancing. In: Proceedings of the 1992 IEEE International Conference on Robotics and Automation, Nice, France, May 12-14, 1992, IEEE Computer Society, pp. 1186–1192

  18. Ferrandina, F., Meyer, T., Zicari, R., Ferran, G., & Madec, J. (1995). Schema and database evolution in the O2 object database system. In: VLDB’95, Proceedings of 21th International Conference on Very Large Data Bases, September 11-15, 1995, Zurich, Switzerland, pp. 170–181.

  19. Friesen, D. K., & Langston, M. A. (1986). Variable sized bin packing. SIAM journal on computing, 15(1), 222–230.

  20. Gandhi, R., Halldórsson, M. M., Kortsarz, G., & Shachnai, H. (2004). Improved results for data migration and open shop scheduling. In: Automata, Languages and Programming: 31st International Colloquium, ICALP 2004, Turku, Finland, July 12-16, 2004. Proceedings, pp. 658–669

  21. Gandhi, R., & Mestre, J. (2009). Combinatorial algorithms for data migration to minimize average completion time. Algorithmica, 54(1), 54–71.

  22. Garey, M. R., & Johnson, D. S. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman.

  23. Garey, M. R., Johnson, D. S., Simons, B. B., & Tarjan, R. E. (1981). Scheduling unit-time tasks with arbitrary release times and deadlines. SIAM Journal on Computing, 10(2), 256–269.

  24. Goldman, R., McHugh, J., & Widom, J. (1999). From semistructured data to XML: migrating the Lore data model and query language. In: ACM SIGMOD Workshop on The Web and Databases, WebDB 1999, Philadelphia, Pennsylvania, USA, June 3-4, 1999. Informal Proceedings, pp. 25–30

  25. Golubchik, L., Khuller, S., Kim, Y. A., Shargorodskaya, S., & Wan, Y. J. (2004). Data migration on parallel disks. In: Algorithms - ESA 2004, 12th Annual European Symposium, Bergen, Norway, September 14-17, 2004, Proceedings, pp. 689–701

  26. Hall, J., Hartline, J. D., Karlin, A. R., Saia, J., & Wilkes, J. (2001). On algorithms for efficient data migration. In: Proceedings of the Twelfth Annual Symposium on Discrete Algorithms, January 7-9, 2001, Washington, DC, USA, pp. 620–629

  27. Hirofuchi, T., Ogawa, H., Nakada, H., Itoh, S., & Sekiguchi, S. (2009). A live storage migration mechanism over WAN for relocatable virtual machine services on clouds. In: 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGrid 2009, Shanghai, China, 18-21 May 2009, pp. 460–465.

  28. Jansen, K. (1999). An approximation scheme for bin packing with conflicts. Journal of combinatorial optimization, 3(4), 363–377.

  29. Jensen, M., Schwenk, J., Gruschka, N., & Iacono, L. L. (2009). On technical security issues in cloud computing. In: IEEE International Conference on Cloud Computing, CLOUD 2009, Bangalore, India, 21-25 September, 2009, pp. 109–116

  30. Karmarkar, N., & Karp, R. M. (1982). An efficient approximation scheme for the one-dimensional bin-packing problem. In: 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982), IEEE, pp. 312–320

  31. Kelarev, A., Seberry, J., Rylands, L., & Yi, X. (2017). Combinatorial algorithms and methods for security of statistical databases related to the work of Mirka Miller. In: Combinatorial Algorithms - 28th International Workshop, IWOCA 2017, Newcastle, NSW, Australia, July 17-21, 2017, Revised Selected Papers, pp. 383–394

  32. Khuller, S., Kim, Y. A., & Wan, Y. J. (2003). Algorithms for data migration with cloning. In: Proceedings of the Twenty-Second ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 9-12, 2003, San Diego, CA, USA, pp. 27–36

  33. Liu, Q., Cheng, H., Tian, T., Wang, Y., Leng, J., Zhao, R., Zhang, H., & Wei, L. (2021). Algorithms for the variable-sized bin packing problem with time windows. Computers & Industrial Engineering, 155, 107175.

  34. Lougee-Heimer, R. (2003). The common optimization interface for operations research: Promoting open-source software in the operations research community. IBM Journal of Research & Development, 47(1), 57–66.

  35. Mens, T., Demeyer, S., Hainaut, J.-L., Cleve, A., Henrard, J., & Hick, J.-M. (2008). Migration of legacy information systems. Software Evolution, 105–138.

  36. Murgolo, F. D. (1987). An efficient approximation scheme for variable-sized bin packing. SIAM Journal on Computing, 16(1), 149–161.

  37. Narayanan, D., Thereska, E., Donnelly, A., Elnikety, S., & Rowstron, A. I. T. (2009). Migrating server storage to SSDs: analysis of tradeoffs. In: Proceedings of the 2009 EuroSys Conference, Nuremberg, Germany, April 1-3, 2009, pp. 145–158

  38. Nethercote, N., Stuckey, P. J., Becket, R., Brand, S., Duck, G. J., & Tack, G. (2007). MiniZinc: Towards a standard CP modelling language. In: Principles and Practice of Constraint Programming–CP 2007: 13th International Conference, CP 2007, Providence, RI, USA, September 23-27, 2007. Proceedings 13, pp. 529–543. Springer

  39. Otto, A., Otto, C., & Scholl, A. (2013). Systematic data generation & test design for solution algorithms on the example of SALBPGen for assembly line balancing. European Journal of Operational Research, 228(1), 33–45.

  40. Patil, S., Roy, S., Augustine, J., Redlich, A., Lodha, S., Vin, H. M., Deshpande, A., Gharote, M. S., & Mehrotra, A. (2010). Minimizing testing overheads in database migration lifecycle. In: COMAD, Citeseer, p. 191

  41. Perron, L., & Furnon, V. (2022). OR-Tools. Google.

  42. Quiroz-Castellanos, M., Cruz-Reyes, L., Torres-Jimenez, J., Gómez, C., Huacuja, H. J. F., & Alvim, A. C. (2015). A grouping genetic algorithm with controlled gene transmission for the bin packing problem. Computers and Operations Research, 55, 52–64.

  43. Quiroz-Castellanos, M., Cruz Reyes, L., Torres-Jiménez, J., Santillán, C. G., Fraire Huacuja, H. J., & Alvim, A. C. F. (2015). A grouping genetic algorithm with controlled gene transmission for the bin packing problem. Computers & OR, 55, 52–64.

  44. Saranya, N., Brindha, R., Aishwariya, N., Kokila, R., Matheswaran, P., & Poongavi, P. (2021). Data migration using ETL workflow. In: 2021 7th International Conference on Advanced Computing & Communication Systems (ICACCS), vol. 1, IEEE, pp. 1661–1664

  45. Scholl, A., Klein, R., & Jürgens, C. (1997). Bison: A fast hybrid procedure for exactly solving the one-dimensional bin packing problem. Computers & OR, 24(7), 627–645.

  46. Schulte, C., Lagerkvist, M., & Tack, G. (2006). Gecode. Software download & online material at the website: http://www.gecode.org, 11–13

  47. Sianipar, J., Sukmana, M., & Meinel, C. (2018). Moving sensitive data against live memory dumping, Spectre and Meltdown attacks. In: 2018 26th International Conference on Systems Engineering (ICSEng), IEEE, pp. 1–8

  48. Singh, A., & Gupta, A. K. (2007). Two heuristics for the one-dimensional bin-packing problem. OR Spectrum, 29(4), 765–781.

  49. Subramani, K., Caskurlu, B., & Acikalin, U. U. (2019). Security-aware database migration planning. In: International Symposium on Algorithmic Aspects of Cloud Computing, Springer, pp. 103–121

  50. Subramani, K., Caskurlu, B., & Velasquez, A. (2018). Minimization of testing costs in capacity-constrained database migration. In: Algorithmic Aspects of Cloud Computing - 4th International Symposium, ALGOCLOUD 2018, Helsinki, Finland, August 20-21, 2018, Revised Selected Papers, pp. 1–12.

  51. Syswerda, G. (1991). A study of reproduction in generational and steady-state genetic algorithms. In: Foundations of genetic algorithms, vol. 1. Elsevier, pp. 94–101

  52. Wang, J., & Lochovsky, F. H. (2003). Data extraction and label assignment for web databases. In: Proceedings of the Twelfth International World Wide Web Conference, WWW 2003, Budapest, Hungary, May 20-24, 2003, pp. 187–196

  53. Wee, T., & Magazine, M. J. (1982). Assembly line balancing as generalized bin packing. Operations Research Letters, 1(2), 56–58.

  54. Wojciechowski, P., Subramani, K., Velasquez, A., & Caskurlu, B. (2021). Algorithmic analysis of priority-based bin packing. In: Conference on Algorithms and Discrete Applied Mathematics, pp. 359–372, Springer

  55. Zhao, X., Lin, Q., Chen, J., Wang, X., Yu, J., & Ming, Z. (2016). Optimizing security and quality of service in a real-time database system using multi-objective genetic algorithm. Expert Syst. Appl., 64, 11–23.

  56. Zuckerman, D. (2006). Linear degree extractors and the inapproximability of max clique and chromatic number. In Proceedings of the thirty-eighth annual ACM symposium on Theory of computing, pp. 681–690

Acknowledgements

This research was supported in part by the Defense Advanced Research Projects Agency through grant HR001123S0001-FP-004.

Author information

Corresponding author

Correspondence to K. Subramani.

Ethics declarations

Conflicts of interest

We declare that we have no conflict of interest.

Additional information

A preliminary version of this work appeared in the proceedings of the 5th International Symposium on Algorithmic Aspects of Cloud Computing (ALGOCLOUD 2019) [49].

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Acikalin, U.U., Caskurlu, B. & Subramani, K. Security-Aware Database Migration Planning. Constraints 28, 472–505 (2023). https://doi.org/10.1007/s10601-023-09351-6

  • DOI: https://doi.org/10.1007/s10601-023-09351-6
