Digital Library

Title:      STATISTICAL WORD SENSE DISAMBIGUATION THROUGH UNSUPERVISED SENSE MARKER ENRICHMENT ON LARGE FREE TEXT CORPUSES
Author(s):      Shahzad Khan, Kenan Azam
ISBN:      972-99353-0-0
Editors:      Pedro Isaías and Nitya Karmakar
Year:      2004
Edition:      1
Keywords:      Computational Linguistics, Information Processing, Word-Sense Disambiguation, Semantic Ontology.
Type:      Full Paper
First Page:      543
Last Page:      550
Language:      English
Paper Abstract:      Word sense ambiguity is known to have a destructive effect on the performance of information retrieval and linguistic systems. The problem arises from the inherently polysemous nature of natural languages, in which one word can have multiple meanings, or senses. This is not a problem for humans, but mapping a word to its correct sense is a daunting task for a retrieval system. This paper describes two disambiguation methodologies, based on contemporary techniques, that seek to enrich text with sense meta-information by identifying the correct sense of an ambiguous noun in a document. The research draws on contemporary statistical disambiguation methodologies and attempts to make them more effective through a novel weighting scheme that is simpler than the complex schemes used by other disambiguation algorithms. It builds on two recent groundbreaking research results: that words tend to have one sense per document and one sense per collocation. In the experiments, the set of senses for each polysemous word is taken from the WordNet 1.7 repository. However, the methodologies are general and apply to any concept repository built on a generalization/specialization framework. The two methodologies are compared with each other, and the results establish that this approach improves the disambiguation process. The paper also proposes a strategy for using the disambiguation methodology to enhance relevance feedback and information retrieval performance.
   
