Title:
|
A STUDY ON BIG DATA FRAMEWORKS AND MACHINE LEARNING TOOL KITS |
Author(s):
|
Imad Sassi and Samir Anter |
ISBN:
|
978-989-8533-92-0 |
Editors:
|
Ajith P. Abraham and Jörg Roth |
Year:
|
2019 |
Edition:
|
Single |
Keywords:
|
Big Data, Hadoop, Spark, Machine Learning, Deep Learning |
Type:
|
Full Paper |
First Page:
|
61 |
Last Page:
|
68 |
Language:
|
English |
Paper Abstract:
|
Big Data refers to extremely large volumes of structured and unstructured data, gathered from a wide range of sources,
which often require fast processing and real-time analysis. In this context, the performance of traditional techniques
is limited, and new technologies, called Big Data technologies, have emerged to handle these large quantities of data.
The characteristics of Big Data make exploring such data, a process called Big Data Analytics, a difficult task.
One of the important challenges of Big Data is to find new technologies or to improve and extend the existing platforms,
infrastructures and standard techniques for managing it. The Hadoop/MapReduce paradigm and the Spark framework are among
the most prominent solutions for large-scale parallel distributed data processing, alongside Machine Learning techniques,
in particular Deep Learning, for performing powerful statistical and predictive analysis. In this paper, we first give
an overview, a classification and a comparison of the main Big Data technologies. Then, we focus on Machine Learning
platforms and libraries, especially those for Deep Learning. The results show that Spark is a general-purpose
computation engine, owing to its highly general solutions. |