A Keyword-Based Filtering Technique of Document-Centric XML using NFA Representation
Abstract: XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on the publish/subscribe model focus on highly structured data marked up with XML tags. These techniques filter data-centric XML documents efficiently but are not effective at filtering the element contents of document-centric XML. In this paper, we propose an extended XPath specification that includes the special matching character '%', as used in the LIKE operation of SQL, to overcome the difficulty of writing queries that adequately filter element contents under the previous XPath specification. We also present Pfilter, a novel technique for filtering a collection of document-centric XML documents that exploits the extended XPath specification. We report several performance studies that evaluate its efficiency and scalability in terms of multi-query processing time (MQPT).
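To make the role of the '%' matching character concrete, the following is a minimal Python sketch of LIKE-style matching over element content, simulated as a small NFA in the spirit of the NFA representation named in the title. It is an illustration only and is not the Pfilter algorithm: the function names `closure` and `like_match`, the subscriber names, and the sample patterns are all hypothetical, and matching is assumed to be case-sensitive with '%' standing for any (possibly empty) run of characters.

```python
def closure(pattern, states):
    """Extend a state set with states reachable by letting '%' match nothing."""
    result = set(states)
    stack = list(states)
    while stack:
        i = stack.pop()
        if i < len(pattern) and pattern[i] == "%" and i + 1 not in result:
            result.add(i + 1)
            stack.append(i + 1)
    return result


def like_match(pattern, text):
    """Return True if text matches pattern, where '%' matches any run of characters.

    NFA state i means "the first i pattern characters have been matched";
    state len(pattern) is the accepting state.
    """
    states = closure(pattern, {0})
    for ch in text:
        nxt = set()
        for i in states:
            if i == len(pattern):
                continue  # accepting state has no outgoing transitions
            if pattern[i] == "%":
                nxt.add(i)  # '%' absorbs this character and stays put
            elif pattern[i] == ch:
                nxt.add(i + 1)  # literal character matched, advance
        states = closure(pattern, nxt)
        if not states:
            return False  # no live state left, the match cannot succeed
    return len(pattern) in states


# Hypothetical subscriptions: each subscriber registers one content pattern.
subscriptions = {"alice": "%XML%", "bob": "XML%", "carol": "%filtering"}
content = "A keyword-based approach to XML filtering"
matched = sorted(name for name, pat in subscriptions.items()
                 if like_match(pat, content))
print(matched)  # ['alice', 'carol']
```

Here each pattern is evaluated independently against one piece of element content. A multi-query filter of the kind the abstract describes would presumably share work across subscriptions rather than looping over them, which this sketch does not attempt.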
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1079240