Abstract
ROLAP (Relational OLAP) and MOLAP (Multidimensional OLAP) are two opposing techniques for building On-line Analytical Processing (OLAP) systems. MOLAP offers good query performance, while ROLAP is based on mature RDBMS technologies. Many data warehouses contain sparse but clustered multidimensional data, which neither ROLAP nor MOLAP handles efficiently and scalably. We propose a dense-region-based OLAP (DROLAP) approach which surpasses both ROLAP and MOLAP in space efficiency and query performance. DROLAP takes the best of ROLAP and MOLAP and combines them to support fast queries and high storage utilization. The core of building a DROLAP system lies in the mining of dense regions in a data cube, for which we have developed an efficient index-based algorithm, EDEM. Extensive performance studies consistently show that the DROLAP approach is superior to both MOLAP and ROLAP in handling sparse but clustered multidimensional data. Moreover, our EDEM algorithm is efficient and effective in identifying dense regions.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cheung, D.W., Zhou, B., Kao, B., Hu, K., Lee, S.D. (1999). DROLAP - A Dense-Region Based Approach to On-Line Analytical Processing. In: Bench-Capon, T.J., Soda, G., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 1999. Lecture Notes in Computer Science, vol 1677. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48309-8_71
DOI: https://doi.org/10.1007/3-540-48309-8_71
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66448-2
Online ISBN: 978-3-540-48309-0
eBook Packages: Springer Book Archive