REGULAR LANGUAGE INFERENCE FOR DOMAIN SPECIFIC NAMED ENTITY RECOGNITION

Falk Brauer; Robert Rieger; Wojciech Barczynski; Adrian Mocan

IADIS International Conference WWW/Internet - ICWI

IADIS International Conference WWW/Internet 2009

Document Info

Title:	REGULAR LANGUAGE INFERENCE FOR DOMAIN SPECIFIC NAMED ENTITY RECOGNITION
Author(s):	Falk Brauer, Robert Rieger, Wojciech Barczynski, Adrian Mocan
ISBN:	978-972-8924-93-5
Editors:	Pedro Isaías, Bebo White and Miguel Baptista Nunes
Year:	2009
Edition:	1
Keywords:	Regular Languages, Grammatical Inference, Named Entity Extraction, Information Extraction, Information Retrieval
Type:	Full Paper
First Page:	543
Last Page:	550
Language:	English
Paper Abstract:	Named Entity Recognition (NER) is one of the most important techniques in Information Extraction (IE) from unstructured documents. Regular expressions are still the first choice for detecting domain-specific entities that follow a special syntax, such as product names, in text. In many NER scenarios, a small but representative number of entities stored in a structured form is available or can be acquired. Experienced developers can use such data to create and test regular expressions. However, creating such specific rules manually for a certain domain is a complex and time-consuming task. In this paper, we introduce an approach for automated rule generation for NER based on example instances of entities. We present an implementation that automatically identifies patterns in small sets of example instances, tunes these patterns to achieve high precision and recall, and applies them for information extraction. The generated rules are not tuned to a specific training corpus and do not require a homogeneous document structure. The evaluation of our prototype shows very good results for three target domains of interest, with an average F-measure of about 90%.
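
To make the idea of inferring a rule from example instances concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify; it is a toy generalization that assumes entities can be described by runs of coarse character classes (uppercase, lowercase, digit, literal punctuation). The function names `char_class`, `generalize`, and `infer_pattern`, and the example product codes, are hypothetical.

```python
import re
from itertools import groupby


def char_class(ch):
    """Map one character to a coarse regex class (an illustrative choice)."""
    if ch.isascii() and ch.isdigit():
        return r"\d"
    if ch.isascii() and ch.isupper():
        return "[A-Z]"
    if ch.isascii() and ch.islower():
        return "[a-z]"
    return re.escape(ch)  # punctuation and other characters stay literal


def generalize(example):
    """Collapse an example into (class, run_length) pairs."""
    return [(cls, len(list(run))) for cls, run in groupby(example, key=char_class)]


def infer_pattern(examples):
    """Build one regex from examples, one alternative per class sequence."""
    if not examples:
        raise ValueError("need at least one example instance")
    shapes = {}  # class sequence -> list of run-length lists
    for ex in examples:
        tokens = generalize(ex)
        key = tuple(cls for cls, _ in tokens)
        shapes.setdefault(key, []).append([n for _, n in tokens])
    alternatives = []
    for key, length_lists in shapes.items():
        parts = []
        for cls, counts in zip(key, zip(*length_lists)):
            lo, hi = min(counts), max(counts)
            if lo == hi == 1:
                quant = ""
            elif lo == hi:
                quant = "{%d}" % lo
            else:
                quant = "{%d,%d}" % (lo, hi)  # widen to the observed range
            parts.append(cls + quant)
        alternatives.append("".join(parts))
    # Word boundaries assume entities start and end with a word character.
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b")


pattern = infer_pattern(["XR-2000", "XR-350", "ZT-4100"])
matches = pattern.findall("The XR-2000 replaced the older ZT-99 and the AB-1234 unit.")
```

In this sketch the three examples share one shape (two uppercase letters, a hyphen, three or four digits), so the inferred rule also matches the unseen code `AB-1234` while rejecting `ZT-99`. A real system would additionally need the tuning step the abstract mentions to balance precision and recall.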
