Digital Library

Title:      XML DOCUMENTS CLUSTERING BASED ON STRUCTURAL SIMILARITY
Author(s):      Ali Aïtelhadj, Fatiha Souam, Mohamed Mezghiche
ISBN:      978-972-8924-93-5
Editors:      Pedro Isaías, Bebo White and Miguel Baptista Nunes
Year:      2009
Edition:      1
Keywords:      Clustering, structurally similar, hierarchical context, tree, threshold.
Type:      Full Paper
First Page:      559
Last Page:      566
Language:      English
Paper Abstract:      In this paper we develop a clustering method for XML documents. Our approach has two steps. We first automatically extract the structure of each XML document to be classified. The extracted structure then serves as the representation model used to classify the corresponding document. Our methodology groups similarly structured XML documents into clusters in order to reduce the response time and raise the accuracy of the search engine. It rests on the idea that XML documents sharing similar structures are more likely to match the structural part of the same query; in XML applications, queries may have both a content part and a structure part. The tree structures of XML documents are matched by computing their similarities. Finally, we tested our clustering algorithm on both real and synthetic data. The results demonstrate the relevance of our approach.
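
The two steps described in the abstract (structure extraction, then threshold-based grouping by structural similarity) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify: it assumes that a document's structure can be summarized as the set of its root-to-node tag paths, swaps in Jaccard similarity as the similarity measure, and uses a simple single-pass clustering. The names `structure_paths`, `jaccard`, `cluster`, and the threshold value 0.5 are all hypothetical.

```python
import xml.etree.ElementTree as ET


def structure_paths(xml_text):
    """Step 1 (assumed representation): the set of root-to-node tag paths.

    Text content and attributes are ignored; only the tag hierarchy is kept.
    """
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def jaccard(a, b):
    """Assumed similarity measure: |a & b| / |a | b| over two path sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(docs, threshold=0.5):
    """Step 2 (assumed strategy): single-pass, threshold-based clustering.

    Each document joins the first cluster whose representative (the path set
    of that cluster's first member) is at least `threshold` similar;
    otherwise it starts a new cluster. Returns lists of document indices.
    """
    clusters = []  # list of (representative path set, member indices)
    for i, text in enumerate(docs):
        paths = structure_paths(text)
        for representative, members in clusters:
            if jaccard(representative, paths) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((paths, [i]))
    return [members for _, members in clusters]


docs = [
    "<book><title>A</title><author>X</author></book>",
    "<book><title>B</title><author>Y</author><year>2009</year></book>",
    "<article><heading>C</heading><body><p>text</p></body></article>",
]
print(cluster(docs))  # [[0, 1], [2]]
```

Here the two `book` documents share three of four distinct paths (similarity 0.75) and fall into one cluster, while the `article` document shares none and forms its own. A search engine could then match the structural part of a query against one representative per cluster rather than against every document.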
   
