Digital Library

Title:      PBT: PERSIAN PART OF SPEECH BRILL TAGGER
Author(s):      Habib Karbasian, Parisa Rashidi
ISBN:      978-972-8924-56-0
Editors:      Nuno Guimarães and Pedro Isaías
Year:      2008
Edition:      Single
Keywords:      Persian Part of Speech Tagger, Tagging, Brill Tagger, Corpus
Type:      Short Paper
First Page:      348
Last Page:      352
Language:      English
Paper Abstract:      Persian is a language widely spoken in Iran and in neighboring countries such as Afghanistan and Tajikistan. Interest in processing and retrieving Persian (Farsi) text has recently been increasing in these countries and around the world. Since tagging is a preprocessing step for natural language processing, we have tried to ease this path for Persian by applying a rule-based tagging method to the language. In this paper, we describe initial findings in the development of a Persian part-of-speech tagger based on the Brill tagger. Because Persian is both morphologically and structurally complex, we used two different sets of rules: a lexical rule set and a contextual rule set. The tagger has been tested on five Persian test collections. We have used it to extract a lexicon and to tag sample texts from the corpus. A tag set of 40 basic syntactic and morphological Persian tags was used in these experiments. So far the results have been encouraging, with about 95% accuracy on the sample texts.