Enhanced Disk-Based Databases Towards Improved Hybrid In-Memory Systems
Authors: Samuel Kaspi, Sitalakshmi Venkatraman
Abstract:
In-memory database systems are becoming popular because modern high-end servers offer sufficiently large, affordable RAM and processors with the capacity to manage large in-memory database transactions. While fast and reliable in-memory systems are still being developed to overcome cache misses, CPU/IO bottlenecks and distributed transaction costs, disk-based data stores still serve as the primary persistence layer. In addition, with the recent growth in multi-tenancy cloud applications and the associated security concerns, many organisations weigh the trade-offs and continue to require the fast and reliable transaction processing of disk-based database systems as an available choice. For these organisations, the only way of increasing throughput is to improve the performance of disk-based concurrency control. This warrants a hybrid database system that can selectively apply enhanced disk-based data management within the context of in-memory systems to improve overall throughput. The general view is that in-memory systems substantially outperform disk-based systems. We question this assumption and examine how a modified variation of access invariance, which we call enhanced memory access (EMA), can be used to allow very high levels of concurrency in the pre-fetching of data in disk-based systems. We demonstrate how this pre-fetching in disk-based systems can yield close to in-memory performance, which paves the way for improved hybrid database systems. This paper proposes a novel EMA technique and presents a comparative study between disk-based EMA systems and in-memory systems running on hardware configurations of equivalent power in terms of the number of processors and their speeds. The results of the experiments substantiate that, when used in conjunction with all concurrency control mechanisms, EMA can increase the throughput of disk-based systems to levels quite close to those achieved by in-memory systems.
The promising results of this work show that enhanced disk-based systems can help improve hybrid data management within the broader context of in-memory systems.
Keywords: Concurrency control, disk-based databases, in-memory systems, enhanced memory access (EMA).
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1098924