Title: TOPIC PAGE MINING BASED ON PHRASERANK FOR ADVERTISEMENT IMAGE
Author(s): Jian Sun, Siyuan Chen, Yingju Xia, Jun Sun
ISBN: 978-989-8533-09-8
Editors: Bebo White and Pedro Isaías
Year: 2012
Edition: Single
Keywords: PhraseRank, OCR, validation, link analysis, convergence
Type: Short Paper
First Page: 425
Last Page: 430
Language: English
Paper Abstract:
In real life, natural scene advertisement (ad) images help customers find related products or services. However, the text and visual information in those images is so limited that more related information must be extracted from the Internet. This paper presents a novel topic page mining method for natural scene ad images. Based on the Optical Character Recognition (OCR) results from ad images, candidate key web pages are retrieved through search engines. Web pages highly related to the ad images are then adaptively chosen by clustering and matching. A new algorithm, PhraseRank, which extracts key topic-related phrases from OCR results and web pages, is proposed to improve page mining accuracy. Experiments on the collected datasets show the effectiveness of the proposed method.