Title:
|
INDEXING AND QUERYING CONTENT AND STRUCTURE OF XML DOCUMENTS ACCORDING TO THE VECTOR SPACE MODEL |
Author(s):
|
Jacques Le Maitre |
ISBN:
|
972-8924-02-X |
Editors:
|
Pedro Isaías and Miguel Baptista Nunes |
Year:
|
2005 |
Edition:
|
2 |
Keywords:
|
XML, Information Retrieval, Vector Space Model, XQuery, Fuzzy Queries. |
Type:
|
Short Paper |
First Page:
|
353 |
Last Page:
|
358 |
Language:
|
English |
Paper Abstract:
|
This paper presents a method for indexing and querying the content and structure of XML documents according to the vector space model. Indexing is performed in three steps: (i) choosing the content elements, i.e. those that carry the semantic content of the documents; (ii) associating with each terminal content element a vector whose components are the weights of the indexing terms; and (iii) propagating these vectors bottom-up along the ancestors of the terminal content elements. Querying is performed with an extension of XQuery. The extension adds a vscore function, which returns the degree of similarity between a query vector and a content element vector, and integrates NEXI, a subset of XPath defined in the framework of the INEX initiative, to which we have given a fuzzy semantics. |
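The three indexing steps and the vscore function can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not give the weighting scheme, the propagation rule, or the similarity measure, so the sketch assumes raw term frequencies as weights, summation of child vectors for propagation, and cosine similarity for vscore. The set `CONTENT_TAGS`, the helper `index`, and the sample document are hypothetical.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Step (i): hypothetical choice of content elements for the sample document.
CONTENT_TAGS = {"article", "section", "title", "para"}

def index(elem, vectors):
    """Compute the term vector of elem and record it in vectors[elem]."""
    children = [c for c in elem if c.tag in CONTENT_TAGS]
    if not children:
        # Step (ii): terminal content element.
        # Assumption: the weight of a term is its raw frequency in the text.
        vec = Counter((elem.text or "").lower().split())
    else:
        # Step (iii): bottom-up propagation.
        # Assumption: an ancestor's vector is the sum of its children's vectors.
        vec = Counter()
        for child in children:
            vec.update(index(child, vectors))
    vectors[elem] = vec
    return vec

def vscore(query, vec):
    """Similarity between a query vector and an element vector.

    Assumption: cosine similarity; the abstract does not name the measure.
    """
    dot = sum(w * vec.get(term, 0) for term, w in query.items())
    norm = math.sqrt(sum(w * w for w in query.values())) * math.sqrt(
        sum(w * w for w in vec.values())
    )
    return dot / norm if norm else 0.0

doc = ET.fromstring(
    "<article><title>xml retrieval</title>"
    "<section><para>vector space model</para>"
    "<para>fuzzy xml queries</para></section>"
    "</article>"
)

vectors = {}
index(doc, vectors)

# Rank every content element, at any depth, against the query.
query = Counter(["xml", "queries"])
ranked = sorted(vectors, key=lambda e: vscore(query, vectors[e]), reverse=True)
for elem in ranked:
    print(elem.tag, round(vscore(query, vectors[elem]), 3))
```

Because every ancestor carries a propagated vector, the same scoring function ranks a paragraph, a section, or the whole article. That is what lets a query address both content and structure.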