Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 33093
Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data

Authors: Sana Hamdi, Emna Bouazizi, Sami Faiz

Abstract:

In recent years, real-time spatial applications, like location-aware services and traffic monitoring, have become more and more important. Such applications result dynamic environments where data as well as queries are continuously moving. As a result, there is a tremendous amount of real-time spatial data generated every day. The growth of the data volume seems to outspeed the advance of our computing infrastructure. For instance, in real-time spatial Big Data, users expect to receive the results of each query within a short time period without holding in account the load of the system. But with a huge amount of real-time spatial data generated, the system performance degrades rapidly especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase the performance of the system and simplify data management. But they remain insufficient for real-time spatial Big data; they can’t deal with real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach for real-time spatial Big data named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. We find, firstly, the optimal attribute sequence by the use of Matching algorithm. Then, we propose a new cost model used for database partitioning, for keeping the data amount of each partition more balanced limit and for providing a parallel execution guarantees for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and deals with stream data. It improves the performance of query execution by maximizing the degree of parallel execution. This affects QoS (Quality Of Service) improvement in real-time spatial Big Data especially with a huge volume of stream data. The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable, and that it outperforms comparable algorithms.

Keywords: Real-Time Spatial Big Data, Quality Of Service, Vertical partitioning, Horizontal partitioning, Matching algorithm, Hamming distance, Stream query.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1474944

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1055

References:


[1] A. Jindal J. Dittrich, Relax and let the database do the partitioning online. In International Workshop on Business Intelligence for the Real-Time Enterprise (pp. 65-80). Springer, Berlin, Heidelberg, 2011, September.
[2] C. Curino, E. Jones, Y. Zhang S. Madden, Schism: a workload-driven approach to database replication and partitioning. Proceedings of the VLDB Endowment, 3(1-2), 48-57, 2010.
[3] D. A. Shraddha Phansalkar, Transaction aware vertical partitioning of database (TAVPD) for responsive OLTP applications in cloud data stores. Journal of Theoretical and Applied Information Technology, 59(1), 2014.
[4] D. W. Comer S. Y. Philip, A vertical partitioning algorithm for relational databases. In Data Engineering, 1987 IEEE Third International Conference on (pp. 30-35). IEEE, 1987, February.
[5] D. W. Cornell P. S. Yu, An effective approach to vertical partitioning for physical design of relational databases. IEEE Transactions on Software engineering, 16(2), 248-258, 1990.
[6] G. Karypis V. Kumar, METIS–unstructured graph partitioning and sparse matrix ordering system, version 2.0., 1995.
[7] J. A. Hoffer D. G. Severance, The use of cluster analysis in physical data base design. In Proceedings of the 1st International Conference on Very Large Data Bases (pp. 69-86). ACM, 1975, September.
[8] J. H. Son M. H. Kim, An adaptable vertical partitioning method in distributed systems. Journal of Systems and Software, 73(3), 551-561, 2004.
[9] L. Rodrguez X. Li, A support-based vertical partitioning method for database design. In Electrical Engineering Computing Science and Automatic Control (CCE), 2011 8th International Conference on (pp. 1-6). IEEE, 2011, October.
[10] M. Guo H. Kang, The Implementation of Database Partitioning Based on Streaming Framework. In Web Information Systems and Applications Conference, 2016 13th(pp. 157-162). IEEE, 2016, September.
[11] M.F. Mokbel, X. Xiong, W.G. Aref, S.E. Hambrusch, S. Prabhakar M.A. Hammad, PALACE: a query processor for handling real-time spatio-temporal data streams, In Proceedings of the Thirtieth international conference on Very large data bases-Volume 30, VLDB Endowment, August, pp.1377-1380, 2004.
[12] M. Hammer B. Niamir, A heuristic approach to attribute partitioning. In Proceedings of the 1979 ACM SIGMOD international conference on Management of data (pp. 93-101). ACM, 1979, May.
[13] M. Liroz-Gistau, R. Akbarinia, E. Pacitti, F. Porton P. Valduriez, Dynamic workload-based partitioning for large-scale databases. In International Conference on Database and Expert Systems Applications (pp. 183-190). Springer, Berlin, Heidelberg, 2012, September.
[14] M. V. Bhat A. Haupt, An efficient clustering algorithm. IEEE Transactions on Systems, Man, and Cybernetics, (1), 61-64, 1976.
[15] P. A. Bernstein, I. Cseri, N. Dani, N. Ellis, A. Kalhan, G. Kakivaya, ... T. Talius, Adapting microsoft SQL server for cloud computing. In Data Engineering (ICDE), 2011 IEEE 27th International Conference on (pp. 1255-1263). IEEE, 2011, April.
[16] S. Agrawal, V. Narasayya B. Yang, Integrating vertical and horizontal partitioning into automated physical database design. In Proceedings of the 2004 ACM SIGMOD international conference on Management of data (pp. 359-370). ACM, 2004, June.
[17] S. Ahirrao R. Ingle, Scalable transactions in cloud data stores. Journal of Cloud Computing: Advances and Applications, SpringerOpen, 4:1-14, 2015.
[18] S. Hamdi, E. Bouazizi S. Faiz, A new qos management approach in real-time gis with heterogeneous real-time geospatial data using a feedback control scheduling. In Proceedings of the 19th International Database Engineering Applications Symposium, pages 174179. ACM, 2015.
[19] S. Hamdi, E. Bouazizi S. Faiz, A Speculative Concurrency Control in Real-Time Spatial Big Data Using Real-Time Nested Spatial Transactions and Imprecise Computation. In Computer Systems and Applications (AICCSA), 2017 IEEE/ACS 14th International Conference on (pp. 534-540). IEEE, 2017, October.
[20] S. Das, A. El Abbadi D. Agrawal, ElasTraS: An Elastic Transactional Data Store in the Cloud. HotCloud, 9, 131-142, 2009.
[21] S. Navathe, S. Ceri, G. Wiederhold J. Dou, Vertical partitioning algorithms for database design. ACM Transactions on Database Systems (TODS), 9(4), 680-710, 1984.
[22] S. Navathe M. Ra, Vertical partitioning for database design: a graphical algorithm. In ACM Sigmod Record(Vol. 18, No. 2, pp. 440-450). ACM, 1989, June.
[23] S. Papadomanolakis A. Ailamaki, An integer linear programming approach to database design. In Data Engineering Workshop, 2007 IEEE 23rd International Conference on (pp. 442-449). IEEE, 2007, April.
[24] S. Phansalkar S. Ahirrao, Survey of data partitioning algorithms for big data stores. In Parallel, Distributed and Grid Computing (PDGC), 2016 Fourth International Conference on (pp. 163-168). IEEE, 2016, December.
[25] TPC-DS: Transaction Performance Processing Council for Decision Support-decision Support
[online] http://www.tpc.org/tpcds/
[accessed, April 30, 2014].
[26] W. W. Chu I. T. Ieong, A transaction-based approach to vertical partitioning for relational database systems. IEEE Transactions on Software Engineering, 19(8), 804-812, 1993.
[27] W. Zhao, Y. Cheng F. Rusu, Workload-Driven Vertical Partitioning for Effective Query Processing over Raw Data., 2015.
[28] X. Lin, M. Orlowska Y. Zhang, A graph based cluster approach for vertical partitioning in database design. Data Knowledge Engineering, 11(2), 151-169, 1993.