Title:
|
ILLEGAL TEXT DETECTION FROM ELECTRONIC BULLETIN BOARDS |
Author(s):
|
Shogo Ichinose, Takashi Yukawa |
ISBN:
|
978-972-8939-73-1 |
Editors:
|
Miguel Baptista Nunes, Guo Chao Peng, Jörg Roth, Hans Weghorn and Pedro Isaías |
Year:
|
2012 |
Edition:
|
Single |
Keywords:
|
Consumer Generated Media, Illegal Text Detection, Bayesian Filtering, SVM |
Type:
|
Short Paper |
First Page:
|
100 |
Last Page:
|
104 |
Language:
|
English |
Paper Abstract:
|
Illegal information on consumer-generated media has become a matter of public concern. Addressing this problem requires automatic extraction of illegal messages. To clarify whether standard solutions to the classification problem are effective in this case, an illegal text extraction system using general machine learning techniques was constructed and evaluated by extracting illegal texts from electronic bulletin boards. The authors applied a support vector machine and three Bayesian filtering methods: the Paul Graham method, the Robinson method, and the Robinson-Fisher method. The results showed that the system using the Robinson-Fisher method performed best, with an F-measure of 73.1%. |