Title:
|
PREPROCESSING STRATEGY IN WEB-MINING:
RECOMMENDED OR INEVITABLE? |
Author(s):
|
Dror Ben-Ami |
ISBN:
|
978-989-8533-87-6 |
Editors:
|
Miguel Baptista Nunes, Pedro Isaías, Philip Powell, Pascal Ravesteijn and Guido Ongena |
Year:
|
2019 |
Edition:
|
Single |
Keywords:
|
Internet-society, Data/Web Mining, Preprocessing, Machine Learning, I2K (Information to Knowledge) |
Type:
|
Reflection |
First Page:
|
313 |
Last Page:
|
316 |
Language:
|
English |
Paper Abstract:
|
The behavior of web-browsing users is a fascinating subject, especially from a socio-technological perspective.
Companies and learning organizations invest substantial human resources, effort and capital in following their users'
behavior and trying to build user profiles. Knowing, understanding and predicting users' tangible and intangible
behavior helps these organizations focus on users' needs and interests. In this way, companies and learning
organizations can generate significant intellectual capital and later use this asset efficiently and effectively.
This intellectual capital can be directed toward business-based and knowledge-based activities, such as targeted
newsletters, targeted markets, crystallizing pricing policies and much more. Users' profiles are usually categorized
into different classifications, such as social aspects, personality traits, purchase behavior, cognitive behavior,
and more. The raw data for this research was collected from web users. Around one hundred thousand internet-society
users from one of the OECD countries formed the research population. The research examines users' behavior
on the net by collecting a wide range of data elements over a period of a few months and then applying advanced data
analysis tools and techniques. The case study relies on a real project, conducted from July 2018 to February 2019.
Its purpose was to recover and re-analyze the collected data after the original staff had applied improper and incorrect
analysis procedures. The paper deals with the missteps that can happen during the data preprocessing steps in
data-intensive projects and tries to understand the consequences those early missteps can have on the end result.
Specifically, the paper recounts those missteps based on real experience during a web mining project. The paper presents
a guide to the procedures and decisions that should be taken to avoid, or at least minimize, critical mistakes during the
data preprocessing step. |