Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 30184
Approximate Range-Sum Queries over Data Cubes Using Cosine Transform

Authors: Wen-Chi Hou, Cheng Luo, Zhewei Jiang, Feng Yan

Abstract:

In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cells- values. The cosine transform is known to have a good energy compaction property and thus can approximate data distribution functions easily with small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique - the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency, and update easiness.

Keywords: DCT, Data Cube

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1058357

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1385

References:


[1] W. Acharya and P. Gibbons and V. Poosala, Aqua: A Fast Decision Support System Using Approximate Query Answers, 1999, Proc. 25th VLDB Conference.
[2] D. Barbara and M. Sullivan, Quasi-cubes: Exploiting approximation in multi-dimensional databases, 1997, SIGMOD Record, 26, 12-17.
[3] C. Chan and Y. Ioannidis, Hierarchical cubes for range-sum queries, 1999, Proc. VLDB, 675-686.
[4] S. Geffner and D. Agrawal and A. Abbadi and T. Smith, Relative prefix sums: an efficient approach for querying dynamic OLAP Data Cubes, 1999, Proc. ICDE, 328-335.
[5] S. Geffner and D. Agrawal and A. Abbadi, The dynamic data cubes, 2000, Proceeding of International Conference on Extending Database Technology (EDBT), 237-253.
[6] J. Gray and A. Bosworth and A. Layman and H. Pirahesh, Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals, 1996, Proc. ICDE Conference.
[7] C. Ho and R. Agrawal and N. Megiddo and R. Srikant, Range queries in OLAP data cubes, 1997, Proc. ACM SIGMOD Conference, 73-88.
[8] L. Lakshmanan and J. Pei and J. Han, Quotient cube: How to summarize the semantics of a data cube, 2002, Proc. 28th VLDB Conference, 528- 539.
[9] J. Lee and D. Kim and C. Chung, Multi-dimensional Selectivity Estimation Using Compressed Histogram Information, 1999, ACM SIGMOD, 205-214.
[10] S. Li and S. Wang, Semi-closed cube: An effective approach to trading off data cube size and query response time, Journal of Computer Science and Technology, 20(3), 367-372.
[11] Y. Matias and J. Vitter and M. Wang, Wavelet-based histograms for selectivity estimation, 1998, ACM SIGMOD Conference, 448-459.
[12] Y. Matias and J. Vitter and M. Wang, Dynamic Maintenance of Wavelet- Based Histograms, 2000, Proc 26th VLDB Conference, 101-110.
[13] Y. Nievergelt, Wavelets Made Easy, 1999, Birkhauser.
[14] Y. Sismanis and N. Roussoupoulos and A. Deligiannakis and Y. Kotidis, Dwarf: Shrinking the petacube, 2002, Proc. ACM SIGMOD Conference, 464-475.
[15] J. Vitter and M. Wang and B. Lyer, Data cube approximation and histograms via wavelets, 1998, Proc. CIKM, 96-104.
[16] W. Wang and J. L. Feng, Condensed cube: An effective approach to reducing data cube size, 2002, Proceedings of the 18th International Conference on Data Engineering.
[17] G. Zipf, Human behavior and the principle of least effort, 1949, Addison- Wesley.
[18] TPC, TPC benchmark D, decision support, 1995.
[19] BC, http://www.bls.census.gov/sipp/ ftp.html#sipp04, 2004.