Using Perspective Schemata to Model the ETL Process

Authors: Valeria M. Pequeno, Joao Carlos G. M. Pires

Abstract:

Data Warehouses (DWs) are repositories that contain the unified history of an enterprise for decision support. Data must be Extracted from information sources, Transformed and integrated, and then Loaded (ETL) into the DW using ETL tools. These tools focus on data movement and use models only as a means to that end. From a conceptual viewpoint, the authors aim to improve the ETL process in two ways: 1) by making the compatibility between models explicit in a declarative fashion, using correspondence assertions, and 2) by identifying the instances in different sources that represent the same real-world entity. This paper presents an overview of the proposed framework for modeling the ETL process, which is based on a reference model and perspective schemata. This approach gives the designer a better understanding of the semantics associated with the ETL process.
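
The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' formalism: the `Correspondence` class, the source names (`crm`, `billing`), the attribute names, and the choice of `tax_id` as the matching key are all hypothetical, chosen only to show how declarative attribute correspondences to a reference model might be stated, and how instances from different sources might then be recognized as the same entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """Hypothetical assertion: a source attribute corresponds to a reference-model attribute."""
    source: str
    source_attr: str
    reference_attr: str


# Illustrative assertions only; the sources and attributes are assumptions.
ASSERTIONS = [
    Correspondence("crm", "cust_name", "name"),
    Correspondence("crm", "tax_no", "tax_id"),
    Correspondence("billing", "client", "name"),
    Correspondence("billing", "vat", "tax_id"),
]


def to_reference(source, record):
    """Rename a source record's attributes into reference-model terms.

    Attributes with no assertion for this source are dropped.
    """
    mapping = {a.source_attr: a.reference_attr for a in ASSERTIONS if a.source == source}
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def match_instances(records, key="tax_id"):
    """Group reference-model records that share the same key value."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec)
    return groups


crm = to_reference("crm", {"cust_name": "Ana Silva", "tax_no": "PT123"})
billing = to_reference("billing", {"client": "A. Silva", "vat": "PT123", "extra": 1})
matched = match_instances([crm, billing])
```

Here the two source records end up in the same group because they agree on the assumed key once both are expressed in reference-model terms. A real framework would likely need richer matching rules than exact key equality.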

Keywords: conceptual data model, correspondence assertions, data warehouse, data integration, ETL process, object relational database.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1071870

