Hadoop 1.2.1 Installation and Configuration on Multiple Nodes
There are a few changes one has to make to go from a single-node setup to a multi-node setup. First, you need to complete the single-node setup up to the DFS formatting step.
There are five main steps to move from a single-node to a multi-node setup:
STEPS:
- SSH COPY ID to all nodes
- Configure masters and slaves
- Configure CORE-SITE.XML and MAPRED-SITE.XML
- Format DFS
- START-ALL.SH
Now I am going to explain these steps in detail:
Step-1 SSH COPY ID to all nodes:
From the NAME NODE, we need to generate an SSH key and distribute it to all the SLAVE NODES and also to the SECONDARY NAME NODE (if any).
Command:
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hadoop@coed159
Here “hadoop” is a user name and “coed159” is a system name; change them according to your setup.
When asked to confirm the host fingerprint, answer yes.
Do the same for all DATA NODES and for SECONDARY NAME NODE (if any)
Check whether the key was copied successfully:
ssh coed159
It should not ask for a password.
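The generate-and-distribute sub-steps above can be sketched as a small loop run on the NAME NODE; the hostnames are from my cluster, so substitute your own:

```shell
# Generate an RSA key pair on the NAME NODE if one does not exist yet
# (-P "" gives an empty passphrase so logins are non-interactive).
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -P "" -f "$HOME/.ssh/id_rsa"

# Copy the public key to every DATA NODE (and the SECONDARY NAME NODE, if any).
for node in coed159 coed160 coed162 coed163; do
  ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "hadoop@$node"
done
```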
Step-2 Configure masters and slaves:
We need to do this on the NAME NODE alone (not on the DATA NODES or the SECONDARY NAME NODE).
Go to the NAME NODE.
Command:
cd /usr/local/hadoop/conf
Find the two files masters and slaves:
masters is for the NAME NODE and SECONDARY NAME NODE
slaves is for the DATA NODES
Command:
sudo nano /usr/local/hadoop/conf/masters
By default it contains ‘localhost’; change it to the name of the NAME NODE (coed161 in my case).
Ctrl + o to save
Enter
Ctrl + x to exit
sudo nano /usr/local/hadoop/conf/slaves
By default it contains ‘localhost’; change it to contain the names of all DATA NODES, one per line. In my case:
coed159
coed160
coed162
coed163
Ctrl + o to save
Enter
Ctrl + x to exit
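If you prefer not to edit interactively, the same two files can be written non-interactively; a sketch using my hostnames (replace with your own):

```shell
# masters: the NAME NODE (and SECONDARY NAME NODE, if any), one per line
sudo tee /usr/local/hadoop/conf/masters >/dev/null <<'EOF'
coed161
EOF

# slaves: one DATA NODE hostname per line
sudo tee /usr/local/hadoop/conf/slaves >/dev/null <<'EOF'
coed159
coed160
coed162
coed163
EOF
```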
Step-3 Configure CORE-SITE.XML and MAPRED-SITE.XML
Go to the SLAVE NODES and the SECONDARY NAME NODE; we need to make them point to the master.
Command:
sudo nano /usr/local/hadoop/conf/core-site.xml
Check whether the ‘fs.default.name’ property is pointing to the NAME NODE (coed161 in my case); if it is pointing to localhost:10001, replace localhost with coed161.
Ctrl + o to save
Enter
Ctrl + x to exit
Same way for MAPRED-SITE.XML
Command:
sudo nano /usr/local/hadoop/conf/mapred-site.xml
Check whether the ‘mapred.job.tracker’ property is pointing to the JOB TRACKER / NAME NODE (coed161 in my case).
If it is ‘localhost:10002’, update it to ‘coed161:10002’.
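For reference, after the edits the two properties should look roughly like this (the property names are the standard Hadoop 1.x ones; the hostname and ports are from my setup, so use yours):

```xml
<!-- core-site.xml: fs.default.name must point at the NAME NODE -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://coed161:10001</value>
</property>

<!-- mapred-site.xml: mapred.job.tracker must point at the JOB TRACKER -->
<property>
  <name>mapred.job.tracker</name>
  <value>coed161:10002</value>
</property>
```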
Remove the localhost entries from the /etc/hosts file.
Command:
sudo nano /etc/hosts
Remove the localhost and 127.0.0.1 entries.
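After the removal, /etc/hosts on every node should still resolve all cluster hostnames to their real IP addresses; a sketch with placeholder addresses (substitute the actual IPs from your network):

```
192.168.1.61  coed161
192.168.1.59  coed159
192.168.1.60  coed160
192.168.1.62  coed162
192.168.1.63  coed163
```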
Step-4 Format DFS:
If you are converting an existing single-node installation, you must delete /usr/local/hadoop/tmp and then recreate it on all the nodes, and then format it from the NAME NODE alone. Skip ahead to the formatting step if you haven’t formatted your HDFS during the single-node setup.
Command:
To remove the directory:
sudo rm -r /usr/local/hadoop/tmp
Create the tmp directory:
sudo mkdir /usr/local/hadoop/tmp
Change ownership of the tmp and hadoop directories:
sudo chown hadoop /usr/local/hadoop/tmp
sudo chown hadoop /usr/local/hadoop
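The remove/create/chown sequence has to be repeated on every node. From the NAME NODE it can be sketched as a loop over SSH (this assumes the hadoop user can run sudo on each node; hostnames are mine):

```shell
# Reset the Hadoop tmp directory on each DATA NODE over SSH
# (-t allocates a terminal so sudo can prompt for a password if needed);
# run the same three commands locally on the NAME NODE as well.
for node in coed159 coed160 coed162 coed163; do
  ssh -t "hadoop@$node" \
    "sudo rm -rf /usr/local/hadoop/tmp &&
     sudo mkdir /usr/local/hadoop/tmp &&
     sudo chown hadoop /usr/local/hadoop/tmp"
done
```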
Format NAME NODE
hadoop namenode -format
Check for the ‘successfully formatted’ message.
Step-5 START-ALL.SH
To start the multi-node Hadoop cluster, we have to run this command from the NAME NODE; it starts the respective services on all the NODES.
Command:
start-all.sh
jps
Check each system separately to see which JVMs are running on it.
Check the number of live nodes in the web GUI (it will take a few minutes).
stop-all.sh
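To check the daemons without logging in to each machine, jps can be run over SSH; on the NAME NODE you should see the NameNode and JobTracker processes, and on each DATA NODE a DataNode and a TaskTracker (the HDFS web GUI is at http://coed161:50070 by default in Hadoop 1.x). Hostnames below are from my cluster:

```shell
# Local check on the NAME NODE
jps

# Remote check on every DATA NODE
for node in coed159 coed160 coed162 coed163; do
  echo "== $node =="
  ssh "hadoop@$node" jps
done
```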
For any queries you can write in a comment or mail me at: “brijeshbmehta@gmail.com”
Courtesy: Mr. Anand Kumar, NIT, Trichy