Abstract
Hashing is widely used in storage systems because of its excellent insertion and search performance. However, existing hash designs are poorly suited to block devices: they generate many small random I/Os, which significantly reduces I/O efficiency on such devices. This paper proposes BDCuckoo (Block Device Cuckoo) hash, an I/O-optimized Cuckoo hash for block devices. BDCuckoo reduces the number of I/Os issued during slot detection by restricting the locations in the hash table where an element may be stored. Unlike the traditional Cuckoo hash, which triggers a disk I/O for each detection, BDCuckoo loads all possible slots of the target element into DRAM in a single large disk I/O. The paper also presents a use case in which BDCuckoo optimizes the large-directory index of the EXT4 file system. The evaluation shows that BDCuckoo outperforms the traditional Cuckoo hash on all tested YCSB workloads, with a performance improvement of up to 2.64 times for workload D at a load factor of 0.7. In the use case, the directory index of the BDCuckoo-optimized file system achieves a 1.79 times performance improvement for the stat command and a 1.64 times performance improvement for the rm command.
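The idea of restricting candidate slots so that one large read covers all of them can be illustrated with a small in-memory sketch. The abstract does not give BDCuckoo's actual layout, so everything below is an assumption made for illustration: the class name `BlockCuckoo`, the constants `SLOTS_PER_BLOCK` and `MAX_KICKS`, the choice of two candidate slots per key, and the rule that both candidates (and therefore every displaced entry) stay inside a single block. A Python list stands in for the block device, and `block_reads` counts simulated block I/Os.

```python
import hashlib

SLOTS_PER_BLOCK = 64  # assumed: number of slots covered by one large I/O
MAX_KICKS = 32        # assumed: displacement limit before an insert gives up


def _h(key: str, seed: int) -> int:
    """Seeded 64-bit hash of a key (blake2b with a one-byte salt)."""
    digest = hashlib.blake2b(key.encode(), digest_size=8, salt=bytes([seed])).digest()
    return int.from_bytes(digest, "little")


class BlockCuckoo:
    """Toy cuckoo hash whose candidate slots for a key share one block."""

    def __init__(self, num_blocks: int):
        self.blocks = [[None] * SLOTS_PER_BLOCK for _ in range(num_blocks)]
        self.block_reads = 0  # simulated block-sized read I/Os

    def _block_index(self, key: str) -> int:
        return _h(key, 0) % len(self.blocks)

    def _read_block(self, key: str) -> list:
        # One simulated large I/O brings every candidate slot into memory.
        self.block_reads += 1
        return list(self.blocks[self._block_index(key)])

    @staticmethod
    def _slots(key: str) -> tuple:
        return _h(key, 1) % SLOTS_PER_BLOCK, _h(key, 2) % SLOTS_PER_BLOCK

    def get(self, key: str):
        block = self._read_block(key)
        for s in self._slots(key):
            entry = block[s]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key: str, value) -> bool:
        work = self._read_block(key)  # in-memory copy of the block
        idx = self._block_index(key)
        for s in self._slots(key):    # overwrite an existing key in place
            if work[s] is not None and work[s][0] == key:
                work[s] = (key, value)
                self.blocks[idx] = work
                return True
        entry, avoid = (key, value), None
        for _ in range(MAX_KICKS):
            a, b = self._slots(entry[0])
            for s in (a, b):
                if work[s] is None:
                    work[s] = entry
                    self.blocks[idx] = work  # write the block back once
                    return True
            # Evict an occupant; every key in this block maps back to it,
            # so displacement never needs a second block.
            victim = b if a == avoid else a
            work[victim], entry = entry, work[victim]
            avoid = victim
        return False  # block left unchanged; a real design would need a fallback
```

In this sketch every `get` or `put` costs exactly one simulated block read regardless of how many slots are probed or how many displacements occur, whereas a cuckoo table whose candidate slots are scattered across the device could pay one small random read per probe. The trade-off is that confining a key to one block limits where it can go, so inserts may fail earlier at high load factors.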
Acknowledgments
We thank the reviewers for their insightful feedback, which helped improve this paper. This work was supported by the National Key R&D Program of China (2018YFC1406205), the National Natural Science Foundation of China (No. 61872392, 61832020), Zhejiang Lab (No. 2021KC0AB04), the Key-Area Research and Development Program of Guangdong Province (2019B010107001), the Guangdong Natural Science Foundation (2018B030312002), the Pearl River S&T Nova Program of Guangdong (201906010008), and the Major Program of Guangdong Basic and Applied Research (2019B030302002).
Copyright information
© 2022 IFIP International Federation for Information Processing
About this paper
Cite this paper
Zheng, X., Ma, J., Liu, Y., Chen, Z. (2022). BDCuckoo: an Efficient Cuckoo Hash for Block Device. In: Cérin, C., Qian, D., Gaudiot, JL., Tan, G., Zuckerman, S. (eds) Network and Parallel Computing. NPC 2021. Lecture Notes in Computer Science, vol 13152. Springer, Cham. https://doi.org/10.1007/978-3-030-93571-9_15
DOI: https://doi.org/10.1007/978-3-030-93571-9_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93570-2
Online ISBN: 978-3-030-93571-9
eBook Packages: Computer Science, Computer Science (R0)