Install Hadoop 2.6 on Ubuntu 14.04 as Single-Node Cluster
sudo apt-get update
First, check whether IPv6 is disabled; a value of 0 means it is still enabled (Hadoop does not support IPv6, so it is commonly disabled on Hadoop hosts):
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
0
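If the command above prints 0, IPv6 is still enabled. One common way to disable it system-wide on Ubuntu, assuming you want IPv6 off for the whole machine, is to append the following to /etc/sysctl.conf:

```
# Append to /etc/sysctl.conf, then apply with `sudo sysctl -p` (or reboot)
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
```

Re-running the cat command afterwards should print 1.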
Installing Java
The Hadoop framework is written in Java, so a working JDK is required.
sudo apt-get install openjdk-7-jdk
update-alternatives --config java
There is only one alternative in link group java (providing /usr/bin/java): /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
Nothing to configure.
Adding a dedicated Hadoop user
sudo addgroup hadoop
Adding group `hadoop' (GID 1000) ...
Done.
sudo adduser --ingroup hadoop hduser
Adding user `hduser' ...
Adding new user `hduser' (1000) with group `hadoop' ...
Creating home directory `/home/hduser' ...
Copying files from `/etc/skel' ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for hduser
Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is the information correct? [Y/n] y
Create and Set Up SSH Keys
Hadoop uses SSH to manage its nodes, so the hduser account needs passwordless, key-based SSH access to localhost.
sudo apt-get install ssh
su hduser
ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
39:ac:ed:aa:cd:b0:34:7f:97:31:c2:18:e2:1c:cf:ea hduser@server1
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| |
| o .. . |
| o = +S |
| o +oo.o |
| +.. .. + |
| ..B .. o |
| .E.=o.. |
+-----------------+
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
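If the login below still prompts for a password, file permissions are a common cause: sshd's StrictModes setting (enabled by default) silently ignores keys stored in group- or world-accessible files. Tightening them is harmless and worth doing up front:

```shell
# sshd ignores authorized_keys if the directory or file permissions are
# too open (StrictModes, on by default); restrict them to the owner.
mkdir -p ~/.ssh
touch ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
```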
ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is 7f:f0:56:ee:e4:f1:f6:1e:04:30:2d:f8:e8:e7:f4:8e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Password:
Welcome to Ubuntu 14.04.3 LTS (GNU/Linux 3.13.0-74-generic x86_64)
* Documentation: https://help.ubuntu.com/
The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.
Install Hadoop
cd /usr/local/src
wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar -xzvf hadoop-2.6.0.tar.gz
sudo adduser hduser sudo
Adding user `hduser' to group `sudo' ...
Adding user hduser to group sudo
Done.
sudo su hduser
sudo mkdir /usr/local/hadoop
sudo mv hadoop-2.6.0/* /usr/local/hadoop
sudo chown -R hduser:hadoop /usr/local/hadoop
Setup Configuration Files
The following files will have to be modified to complete the Hadoop setup:
~/.bashrc
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml (created by copying mapred-site.xml.template)
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
vi ~/.bashrc
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
source ~/.bashrc
javac -version
javac 1.7.0_95
which javac
/usr/bin/javac
readlink -f /usr/bin/javac
/usr/lib/jvm/java-7-openjdk-amd64/bin/javac
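As a side note, the readlink output above shows how JAVA_HOME could be derived instead of hard-coded; in a shell it is just a suffix strip. A small sketch using the path resolved above:

```shell
# Strip the trailing /bin/javac from the resolved compiler path to get the
# JDK root. On a live system this would be:
#   export JAVA_HOME=$(readlink -f /usr/bin/javac | sed 's:/bin/javac::')
javac_path=/usr/lib/jvm/java-7-openjdk-amd64/bin/javac
JAVA_HOME="${javac_path%/bin/javac}"   # shell suffix removal
echo "$JAVA_HOME"                      # prints /usr/lib/jvm/java-7-openjdk-amd64
```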
vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp
vi /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
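One note on the snippet above: fs.default.name still works in Hadoop 2.6 but is a deprecated key. The non-deprecated equivalent carries the same value:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:54310</value>
</property>
```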
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
vi /usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
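Note that mapred.job.tracker is the old MRv1 (JobTracker) property. On Hadoop 2, MapReduce jobs normally run on YARN instead, which is selected with the following property in mapred-site.xml (shown here as an alternative, not part of the original walkthrough):

```xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```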
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_store
vi /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
Format the New Hadoop Filesystem
Reboot the machine (or log out and back in) so the environment changes in ~/.bashrc take effect, then format HDFS. Formatting erases everything in the HDFS store, so run it only once, when first setting up the cluster.
cd /usr/local/hadoop/bin
su hduser
hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
16/03/14 22:52:05 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = server1.ussg.com/119.81.98.115
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.0
STARTUP_MSG: classpath = /usr/local/hadoop/etc/hadoop:....
............................................................
...........................................................
16/03/14 22:52:06 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at server1.ussg.com/119.81.98.115
************************************************************/
Starting Hadoop
cd /usr/local/hadoop/sbin
sudo su hduser
hduser@ubuntu-hadoop:/usr/local/hadoop/sbin$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
16/03/15 13:01:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-ubuntu-hadoop.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-ubuntu-hadoop.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is 42:68:e7:3d:d7:9d:ba:97:d3:9d:cf:1f:f3:c1:df:82.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-ubuntu-hadoop.out
16/03/15 13:01:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-ubuntu-hadoop.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-ubuntu-hadoop.out
jps
1665 DataNode
2040 ResourceManager
2175 NodeManager
1532 NameNode
2212 Jps
1895 SecondaryNameNode
netstat -plten | grep java
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 127.0.0.1:38612 0.0.0.0:* LISTEN 1001 16153 1665/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 1001 15012 1532/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 1001 16150 1665/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 1001 16558 1665/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 1001 16563 1665/java
tcp 0 0 127.0.0.1:54310 0.0.0.0:* LISTEN 1001 15484 1532/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 1001 17713 1895/java
tcp6 0 0 :::49138 :::* LISTEN 1001 21511 2175/java
tcp6 0 0 :::8088 :::* LISTEN 1001 21523 2040/java
tcp6 0 0 :::8030 :::* LISTEN 1001 18902 2040/java
tcp6 0 0 :::8031 :::* LISTEN 1001 18895 2040/java
tcp6 0 0 :::8032 :::* LISTEN 1001 21507 2040/java
tcp6 0 0 :::8033 :::* LISTEN 1001 21536 2040/java
tcp6 0 0 :::8040 :::* LISTEN 1001 21518 2175/java
tcp6 0 0 :::8042 :::* LISTEN 1001 21522 2175/java
Stopping Hadoop
hduser@ubuntu-hadoop:/usr/local/hadoop/sbin$ stop-all.sh
This script is Deprecated. Instead use stop-dfs.sh and stop-yarn.sh
16/03/15 13:03:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: stopping datanode
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
16/03/15 13:04:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
stopping yarn daemons
stopping resourcemanager
localhost: stopping nodemanager
no proxyserver to stop
Hadoop Web Interfaces
NameNode
http://localhost:50070
DataNode
http://localhost:50075
SecondaryNameNode
http://localhost:50090
References:
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
https://www.youtube.com/watch?v=SaVFs_iDMPo