Digital Library

Title:      PROPOSAL AND EVALUATION OF A TECHNIQUE OF DISCOVERING XML STRUCTURES FOR EFFICIENT RETRIEVAL
Author(s):      Hiroshi Ishikawa, Hajime Takekawa, Kaoru Katayama
ISBN:      972-8924-19-4
Editors:      Pedro Isaías, Miguel Baptista Nunes and Inmaculada J. Martínez
Year:      2006
Edition:      V I, 2
Keywords:      XML, schema discovery, database, data mining, query
Type:      Full Paper
First Page:      142
Last Page:      153
Language:      English
Paper Abstract:      We propose an adaptable approach to discovering database schemas for well-formed XML data such as EDI, news, and digital libraries, which we interchange, filter, or download for future retrieval and analysis. The generated schemas usually consist of more than one table. Our approach uses statistics of the XML data to control the number of tables into which the data are divided, so that the total cost of processing queries is reduced. To reduce the number of tables, we generate schemas appropriate for complex data such as text formatting tags and child elements with a small maximum number of occurrences. To this end, we introduce three functions, NULL expectation, Large Leaf Fields, and Large Child Fields, for controlling the number of tables. To validate our approach, we evaluated typical XML queries over both the generated schemas and, as an alternative approach, normalized schemas, and measured and compared their costs. We describe the method for discovering appropriate schemas and its evaluation in detail.
   
