Title: WTMATCHER: AN APPROACH TO DETERMINE WEB TABLES SIMILARITY
Author(s): Filipe Roberto Silva, Ronaldo dos Santos Mello
ISBN: 978-989-8533-24-1
Editors: Pedro Isaías and Bebo White
Year: 2014
Edition: Single
Keywords: Web tables, Matching, Synonyms, Similarity
Type: Full Paper
First Page: 115
Last Page: 122
Language: English
Paper Abstract:
The Web is a huge information source. Large amounts of data are published daily, and a large part of it is available as HTML tables. Several works have proposed approaches to extract and integrate Web table content in order to make it more accessible for human consumption. However, this is a complex task and remains an open issue, given that Web tables do not follow a single representation pattern. In addition, the use of synonyms and abbreviations makes it hard to compare table content. We therefore propose a new approach to determine the similarity between Web tables that is able to deal with distinct structures and synonymous terms. Related works do not deal with both problems at the same time. Preliminary experiments show that our solution is promising.