Digital Library


 
Title:      IMPROVING ARABIC INFORMATION RETRIEVAL: INTRODUCING THE LOCAL STEM
Author(s):      Eiman Tamah Al-Shammari
ISBN:      978-972-8939-31-1
Editors:      Piet Kommers, Tomayess Issa and Pedro Isaías
Year:      2010
Edition:      Single
Keywords:      Pre-processing, Stemming, Algorithm, Arabic, Information Retrieval
Type:      Full Paper
First Page:      139
Last Page:      146
Language:      English
Paper Abstract:      Stemming is a fundamental pre-processing step for textual data that precedes the tasks of Information Retrieval (IR), text mining, and natural language processing (NLP). The goal of stemming is to standardize words by reducing each word to its base form (root or stem). However, simply removing a word's suffix can cause stemming errors such as under-stemming or over-stemming. Sophisticated stemmers tend to stem documents weakly while relying on computationally expensive approaches such as dictionary lookup. This paper presents the “Educated Text Stemmer - Local Stem” (ETS-LS), a novel dictionary-free, content-based stemmer, and adopts it as an indexing mechanism to study its contribution to improving monolingual Arabic IR. The experimental results showed that the approach significantly improved IR effectiveness.
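
The over-stemming and under-stemming errors named in the abstract can be illustrated with a toy suffix stripper. This is only a minimal sketch: it uses English words for readability, and the suffix list and the `naive_stem` function are hypothetical. It is not the ETS-LS algorithm, whose details the abstract does not give.

```python
# Hypothetical toy suffix list, chosen only to illustrate the two error types.
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Over-stemming: two unrelated words collapse to the same stem.
over_stemmed = naive_stem("news") == naive_stem("new")

# Under-stemming: two forms of the same word keep different stems.
under_stemmed = naive_stem("running") != naive_stem("run")

print(over_stemmed, under_stemmed)
```

Here "news" and "new" are conflated although they are unrelated, while "running" and "run" are left apart although they share a base. A dictionary lookup could catch both cases, at the computational cost the abstract mentions.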
   
