Title:
|
HEURISTIC AND AI APPROACH TO OPTIMIZE PLAGIARISM DETECTION TOOL USING A PUBLIC SEARCH ENGINE |
Author(s):
|
Ondrej Vesely, Jan Kolomaznik, Tomas Foltynek |
ISBN:
|
978-989-8533-09-8 |
Editors:
|
Bebo White and Pedro Isaías |
Year:
|
2012 |
Edition:
|
Single |
Keywords:
|
Plagiarism detection, search engines, information retrieval, neural networks |
Type:
|
Short Paper |
First Page:
|
399 |
Last Page:
|
403 |
Language:
|
English |
Paper Abstract:
|
The paper reports our experience with methods for efficiently populating a database of possible plagiarism sources. Each document is checked for potential plagiarism using a public search engine. To maximize both the relevance of the results and the speed of examination, the fragments of the source documents have to be chosen very carefully. We tried a naive approach, a heuristic approach, and neural networks to optimize the number of queries sent to the public search engine. We found that a neural network is of no use without a bigram or trigram frequency dictionary, which indicates that context is important for querying. The most efficient way to speed up the matching is to learn to estimate the plagiarism probability for each part of the document and to use that estimate when building the queries for the search engine. |
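The idea in the abstract of scoring each part of a document and querying only the most promising parts can be illustrated with a minimal sketch. This is not the authors' method: the rarity-based score below is only a simple stand-in for a learned plagiarism-probability estimate, and the names `split_fragments`, `fragment_score`, `select_queries`, the fixed fragment size, and the toy bigram dictionary are all assumptions made for illustration.

```python
from collections import Counter


def split_fragments(text, size=8):
    """Split text into consecutive fragments of `size` lowercase words."""
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words), size)]


def fragment_score(tokens, bigram_freq):
    """Score a fragment by how rare its bigrams are (hypothetical heuristic).

    Bigrams that are frequent in the dictionary contribute little;
    unseen bigrams contribute 1.0. The result is the mean contribution.
    """
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    return sum(1.0 / (1 + bigram_freq.get(p, 0)) for p in pairs) / len(pairs)


def select_queries(text, bigram_freq, max_queries=2, size=8):
    """Return at most `max_queries` fragments with the highest scores."""
    fragments = split_fragments(text, size)
    ranked = sorted(
        fragments,
        key=lambda f: fragment_score(f, bigram_freq),
        reverse=True,
    )
    return [" ".join(f) for f in ranked[:max_queries]]


# Toy bigram frequency dictionary (assumed data, not from the paper).
bigram_freq = Counter({("the", "cat"): 100, ("cat", "sat"): 100})
queries = select_queries(
    "the cat sat zebra quantum flux", bigram_freq, max_queries=1, size=3
)
```

Capping the number of selected fragments with `max_queries` is what limits the number of requests sent to the search engine; a trained model could replace `fragment_score` without changing the rest of the sketch.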