FULL MODEL SELECTION IN HUGE DATASETS UNDER THE MAPREDUCE PARADIGM

Document Info

Title:	FULL MODEL SELECTION IN HUGE DATASETS UNDER THE MAPREDUCE PARADIGM
Author(s):	Angel Díaz-Pacheco, Jesús A. Gonzalez-Bernal, Carlos Alberto Reyes-García and Hugo Jair Escalante-Balderas
ISBN:	978-989-8533-66-1
Editors:	Yingcai Xiao and Ajith P. Abraham
Year:	2017
Edition:	Single
Keywords:	Big Data, Model Selection, Machine learning
Type:	Full Paper
First Page:	239
Last Page:	246
Language:	English
Paper Abstract:	The analysis of large amounts of data has become an important task in science and business and has led to the emergence of the Big Data paradigm. This paradigm owes its name to data objects too large to be processed by standard hardware and algorithms. Many data analysis tasks involve the use of machine learning techniques during the model creation step. The goal of a predictive model is to achieve the highest possible accuracy when predicting new samples, and for this reason there is high interest in selecting the most suitable algorithm for a specific dataset. This task is known as model selection; it has been widely studied in datasets of common size but poorly explored in the Big Data context. As an effort to explore this direction, this work proposes an algorithm for model selection in very large datasets.
