Big Data Machine Learning

Hadoop, MapReduce and More – Part 1

I had been stalling on learning Hadoop for some time. I finally got some free time and realized that Hadoop may not be so difficult after all. What I understood in the end is that Hadoop basically comprises three elements:

  • A file system
  • Map-Reduce
  • Its many individual components

Let’s go through each of them one by one.

1. Hadoop as a File System:

One of the main things that Hadoop provides is cheap data storage. Under the hood, the Hadoop system takes a file, cuts it into chunks and stores those chunks at different places in a cluster. Suppose you have a really big file on your local system and you want that file to be:

  • On the cloud for easy access
  • Processable in human time

Then the one tool you can look to is Hadoop.
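
To make the chunking idea concrete, here is a minimal Python sketch of the splitting step. This is only an illustration, not Hadoop's actual code; the 64 MB figure is the classic HDFS default block size.

BLOCK_SIZE = 64 * 1024 * 1024  # classic HDFS default block size

def split_into_blocks(path, block_size=BLOCK_SIZE):
    # Read the file in fixed-size chunks, the way HDFS carves a file
    # into blocks before scattering them across the cluster's machines
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            yield chunk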

The following assumes that you already have Hadoop installed on the Amazon cluster you are working on.

Start the Hadoop Cluster:

You need to run the following commands to start the Hadoop cluster (adjust the path for the location of your Hadoop installation directory):

cd /usr/local/hadoop/
bin/start-all.sh
jps
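# jps lists the running Java processes; after start-all.sh you should
# see daemons such as NameNode, DataNode, JobTracker and TaskTracker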

Adding a file to HDFS: Every HDFS command starts with hadoop fs, and the rest of it works like UNIX syntax. To add a file “purchases.txt” to HDFS:

hadoop fs -put purchases.txt /usr/purchases.txt
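
A few more everyday commands follow the same UNIX-style pattern (all standard hadoop fs subcommands):

hadoop fs -ls /usr                     # list files in an HDFS directory
hadoop fs -cat /usr/purchases.txt      # print a file's contents
hadoop fs -get /usr/purchases.txt .    # copy a file back to the local system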

2. Hadoop for Map-Reduce:

MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster.
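
Before bringing Hadoop into it, here is a minimal pure-Python sketch of the model, using a few made-up sales records: the map step emits key-value pairs, a sort stands in for the shuffle, and the reduce step aggregates per key.

from itertools import groupby
from operator import itemgetter

# Map: turn raw "store,sale,time" records into (store, sale) pairs
records = ["A,300,12:00", "B,234,1:00", "A,123,1:00"]
mapped = [(r.split(",")[0], float(r.split(",")[1])) for r in records]

# Shuffle: group the pairs by key (Hadoop does this between map and reduce)
mapped.sort(key=itemgetter(0))

# Reduce: aggregate the values for each key
for store, pairs in groupby(mapped, key=itemgetter(0)):
    print("{0},{1}".format(store, sum(v for _, v in pairs)))
# prints A,423.0 then B,234.0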

While Hadoop is implemented in Java, you can use almost any language to do map-reduce in Hadoop via Hadoop Streaming. Suppose you have a big file containing the name of a store and its sales for each hour, and you want to find the total sales per store using map-reduce. Let's write sample code for that:

InputFile

A,300,12:00
B,234,1:00
C,234,2:00
D,123,3:00
A,123,1:00
B,346,2:00

Mapper.py

#!/usr/bin/env python
import sys

def mapper():
    # Read "store,value,time" records from stdin and emit "store,value" pairs
    for line in sys.stdin:
        data = line.strip().split(",")
        storeName, value, time = data
        print("{0},{1}".format(storeName, value))

if __name__ == "__main__":
    mapper()

Reducer.py

#!/usr/bin/env python
import sys

def reducer():
    # Input arrives sorted by key (Hadoop's shuffle guarantees this),
    # so each store's sales can be totalled in a single pass
    salesTotal = 0
    oldKey = None
    for line in sys.stdin:
        data = line.strip().split(",")
        # Adding a little bit of defensive programming
        if len(data) != 2:
            continue
        curKey, curVal = data
        if oldKey and oldKey != curKey:
            print("{0},{1}".format(oldKey, salesTotal))
            salesTotal = 0
        oldKey = curKey
        salesTotal += float(curVal)
    if oldKey is not None:
        print("{0},{1}".format(oldKey, salesTotal))

if __name__ == "__main__":
    reducer()

Testing the pipeline locally in the shell using pipes (sort plays the role of Hadoop's shuffle phase):

cat textfile.txt | ./mapper.py | sort | ./reducer.py
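
On the sample input above, this prints (the totals come out as floats because the reducer accumulates with float()):

A,423.0
B,580.0
C,234.0
D,123.0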

Running the program as a map-reduce job using Hadoop Streaming:

hadoop jar contrib/streaming/hadoop-*streaming*.jar \
-file mapper.py -mapper mapper.py \
-file reducer.py -reducer reducer.py \
-input /inputfile -output /outputfile
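# Note: both scripts must be executable (chmod +x mapper.py reducer.py)
# and carry the #!/usr/bin/env python shebang so Hadoop can launch them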

3. Hadoop Components:

Now if you have been following Hadoop you might have heard about Apache, Cloudera, Hortonworks etc. All of these are Hadoop vendors who provide Hadoop along with its components. I will talk about one of the main components of the Hadoop ecosystem here: Hive.

So what exactly is Hive? Hive is a SQL-like interface to map-reduce queries. So if you don't understand all the hocus-pocus of map-reduce but know SQL, you can do map-reduce via Hive. Seems promising? It is. While the syntax is mostly SQL, it is still a little different, and there are some quirks we need to understand to work with Hive.

First of all, let's open the Hive command prompt: for that you just have to type “hive”, and voila, you are in. Here are some general commands:

show databases;      -- see all databases
use databasename;    -- switch to a particular database
show tables;         -- see all tables in the current database
describe tablename;  -- show the schema of a table

Creating an external table and loading data into it (with an external table, dropping the table later does not delete the underlying data):

CREATE EXTERNAL TABLE IF NOT EXISTS BXDataSet
(ISBN STRING, BookTitle STRING, ImageURL STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ';' STORED AS TEXTFILE;
LOAD DATA INPATH '/user/book.csv' OVERWRITE INTO TABLE BXDataSet;

The query commands work the same way as in SQL. You can do all the usual selects and group bys, and Hive will automatically convert them into map-reduce jobs:

select * from tablename;
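
As an example, the per-store sales total we computed with the mapper and reducer above could be written in one line, against a hypothetical salesTable with storeName and sales columns; Hive compiles the GROUP BY into a map-reduce job behind the scenes:

select storeName, sum(sales) from salesTable group by storeName;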

Stay tuned for Part 2, where we will talk about another component of Hadoop: Pig.
