SEQUENTIAL PATTERN MINING WITH APPROXIMATED CONSTRAINTS

Cláudia Antunes; Arlindo L. Oliveira

IADIS International Conference Applied Computing - AC

IADIS International Conference Applied Computing 2004

Document Info

Title:	SEQUENTIAL PATTERN MINING WITH APPROXIMATED CONSTRAINTS
Author(s):	Cláudia Antunes, Arlindo L. Oliveira
ISBN:	972-98947-3-6
Editors:	Nuno Guimarães and Pedro Isaías
Year:	2004
Edition:	Single
Keywords:	Data Mining, Pattern Mining, Sequential Pattern Mining, Constraints, Constraint Relaxations, Deterministic Finite Automata.
Type:	Full Paper
First Page:	1131
Last Page:	1138
Language:	English
Paper Abstract:	The lack of focus that characterizes unsupervised pattern mining in sequential data is one of the major limitations of this approach. It stems from the inherently large number of rules likely to be discovered in all but the most trivial sets of sequences. Several authors have promoted the use of constraints to reduce that number, but those constraints bring the mining task close to a hypothesis-testing task. In this paper, we propose the use of constraint approximations to guide the mining process, reducing the number of discovered patterns without compromising the prime goal of data mining: to discover unknown information. We show that existing algorithms that use regular languages as constraints can be used with minor adaptations. We propose a simple algorithm (ε-accepts) that checks whether a sequence is approximately accepted by a given regular language.
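The abstract does not define what "approximately accepted" means, so the following is only a minimal sketch under a stated assumption: a sequence is taken to be approximately accepted when at most ε insertions, deletions, or substitutions turn it into a sequence accepted by a deterministic finite automaton for the regular language. The function names, the dictionary encoding of the automaton, and the example language are all hypothetical, and this is not the paper's ε-accepts algorithm.

```python
import heapq

def min_edit_cost(seq, delta, start, accepting):
    """Fewest insertions, deletions and substitutions that turn `seq` into
    a string accepted by the DFA, or None if the DFA accepts nothing.

    ASSUMPTION: edit distance is the approximation measure; the abstract
    does not say which measure the paper uses.
    `delta` maps (state, symbol) -> state; missing entries mean rejection.
    """
    n = len(seq)
    best = {(0, start): 0}
    heap = [(0, 0, start)]  # (cost so far, symbols consumed, DFA state)
    while heap:
        cost, i, q = heapq.heappop(heap)
        if cost > best.get((i, q), float("inf")):
            continue  # stale heap entry
        if i == n and q in accepting:
            return cost  # Dijkstra: the first goal popped is the cheapest
        moves = []
        if i < n:
            moves.append((cost + 1, i + 1, q))  # delete seq[i]
        for (p, sym), r in delta.items():
            if p != q:
                continue
            if i < n:
                # follow the transition: free on a match, cost 1 otherwise
                moves.append((cost + (sym != seq[i]), i + 1, r))
            moves.append((cost + 1, i, r))  # insert sym
        for c, j, r in moves:
            if c < best.get((j, r), float("inf")):
                best[(j, r)] = c
                heapq.heappush(heap, (c, j, r))
    return None

def epsilon_accepts(seq, delta, start, accepting, eps):
    """True if `seq` is within `eps` edits of some accepted string."""
    cost = min_edit_cost(seq, delta, start, accepting)
    return cost is not None and cost <= eps

# Hypothetical example language: a b* c
DELTA = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
START, ACCEPTING = 0, {2}
```

With this example automaton, `epsilon_accepts("abb", DELTA, START, ACCEPTING, 1)` holds because appending `c` yields an accepted sequence, whereas `"bbb"` needs two edits and is rejected for ε = 1. Setting ε = 0 reduces the check to ordinary acceptance, which is how a relaxed constraint of this kind could stand in for an exact regular-language constraint.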
