Title:
|
FULL MODEL SELECTION IN HUGE DATASETS UNDER THE MAPREDUCE PARADIGM |
Author(s):
|
Angel Díaz-Pacheco, Jesús A. Gonzalez-Bernal, Carlos Alberto Reyes-García and Hugo Jair Escalante-Balderas |
ISBN:
|
978-989-8533-66-1 |
Editors:
|
Yingcai Xiao and Ajith P. Abraham |
Year:
|
2017 |
Edition:
|
Single |
Keywords:
|
Big Data, Model Selection, Machine learning |
Type:
|
Full Paper |
First Page:
|
239 |
Last Page:
|
246 |
Language:
|
English |
Paper Abstract:
|
The analysis of large amounts of data has become an important task in science and business, and it has led to the emergence of the Big Data paradigm. This paradigm owes its name to data objects too large to be processed by standard hardware and algorithms. Many data analysis tasks involve the use of machine learning techniques during the model creation step. The goal of a predictive model is to achieve the highest possible accuracy when predicting new samples, and for this reason there is high interest in selecting the most suitable algorithm for a specific dataset. This task is known as model selection; it has been widely studied in datasets of common size but poorly explored in the Big Data context. As an effort to explore this direction, this work proposes an algorithm for model selection in very large datasets. |