from job optimization to physical data organization such as data layouts and indexes. Throughout this tutorial, we will highlight the similarities and differences between Hadoop MapReduce and Parallel DBMS. Furthermore, we will point out unresolved research problems and open issues.

1. INTRODUCTION

Nowadays, dealing with datasets on the order of terabytes or even petabytes is a reality [24, 23, 19]. Therefore, processing such big datasets in an efficient way is a clear need for many users…