Environment:
All three systems run CentOS 6.0.
192.168.255.128 server01
192.168.255.130 server02
192.168.255.131 server03
1. Set up the hosts file on server01
- vi /etc/hosts
- # Add the following hostname-to-IP mappings
- # hadoop
- 192.168.255.128 server01
- 192.168.255.130 server02
- 192.168.255.131 server03
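To confirm that name resolution works, a quick check (assuming ICMP is allowed between the hosts):
- ping -c 1 server02
- ping -c 1 server03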
2. Install Java on server01
- yum groupinstall "Additional Development"
- # or
- yum install java-1.6.0-openjdk-devel
在"Additional Development"组总包含了openjdk的环境。
3. Configure the Java environment on server01
- vi /etc/profile.d/java.sh
- # hadoop
- export JAVA_HOME="/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64"
- export PATH=$PATH:$JAVA_HOME/bin
- export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
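To apply the variables in the current shell without re-logging in, and to confirm the path is correct:
- source /etc/profile.d/java.sh
- echo $JAVA_HOME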
4. Verify the JDK on server01
- java -version
- java version "1.6.0_22"
- OpenJDK Runtime Environment (IcedTea6 1.10.4) (rhel-1.42.1.10.4.el6_2-x86_64)
- OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
5. Add a hadoop user on server01 and set its password
- useradd hadoop
- passwd hadoop
6. Create the data directories
- mkdir -p /data/hadoop/{tmp,hdfs/{name,data}}
- chown -R hadoop:hadoop /data/hadoop
The directories are owned by the hadoop user.
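The layout can be verified with find (output order may vary):
- find /data/hadoop -type d
- /data/hadoop
- /data/hadoop/tmp
- /data/hadoop/hdfs
- /data/hadoop/hdfs/name
- /data/hadoop/hdfs/data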
7. Create the .ssh directory and key pair on server01, and add environment variables
- su - hadoop
- ssh-keygen -t rsa
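ssh-keygen prompts for a key file and passphrase; accept the default file and leave the passphrase empty so the later logins are non-interactive. The same thing as a one-shot sketch:
- ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa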
Add environment variables for the hadoop user:
- vi .bashrc
- # hadoop
- export HADOOP_HOME=/usr/local/hadoop
- export PATH=$PATH:$HADOOP_HOME/bin
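Apply the variables to the current session:
- source ~/.bashrc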
8. Repeat the steps above on server02 and server03
You can use Xshell's "to all sessions" feature to run the commands on all three servers at once.
9. Configure key-based SSH access from server01 to server01, server02, and server03
a. Configure server01 access to itself
- su - hadoop
- cd .ssh
- # Copy the contents of server01's public key
- vi id_rsa.pub
- ...
- # Create authorized_keys2
- vi authorized_keys2
- # Paste the contents of server01's id_rsa.pub file
- chmod 0400 authorized_keys2
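Instead of pasting in vi, the key can be appended directly (an equivalent sketch; the chmod above still applies):
- cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2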
Test the login:
- whoami
- hadoop
- ssh hadoop@server01
b. Configure server01 access to server02
- su - hadoop
- cd .ssh
- vi authorized_keys2
- # Paste the contents of server01's id_rsa.pub file
- chmod 0400 authorized_keys2
Test access from server01 to server02:
- whoami
- hadoop
- ssh hadoop@server02
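The same setup for server02 and server03 can also be done remotely from server01 in one line each (a sketch, assuming password authentication is still enabled on the target):
- cat ~/.ssh/id_rsa.pub | ssh hadoop@server02 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys2 && chmod 0400 ~/.ssh/authorized_keys2'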
c. Configure server01 access to server03
Repeat step b for server03.
10. Download the Hadoop package on server01 (note: these steps only need to be performed on server01; the installation is later copied to server02 and server03 via scp)
- wget http://mirror.bjtu.edu.cn/apache/hadoop/common/hadoop-0.20.205.0/hadoop-0.20.205.0.tar.gz -P /usr/local/src
Extract and install:
- cd /usr/local/src
- tar zxvf hadoop-0.20.205.0.tar.gz
- mv hadoop-0.20.205.0 /usr/local/hadoop
- chown -R hadoop:hadoop /usr/local/hadoop
11. Edit the Hadoop configuration files on server01
- su - hadoop
- whoami
- # Confirm you are the hadoop user
- hadoop
- cd /usr/local/hadoop
a. Set the Hadoop environment variables
- vi conf/hadoop-env.sh
- export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
The configuration is split across core-site.xml, hdfs-site.xml, and mapred-site.xml.
b. Configure core-site.xml
- vi conf/core-site.xml
- <configuration>
- <property>
- <name>fs.default.name</name>
- <value>hdfs://server01:9000</value>
- </property>
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/data/hadoop/tmp</value>
- </property>
- </configuration>
c. Configure hdfs-site.xml
- vi conf/hdfs-site.xml
- <configuration>
- <property>
- <name>dfs.replication</name>
- <value>3</value>
- </property>
- <property>
- <name>dfs.name.dir</name>
- <value>/data/hadoop/hdfs/name</value>
- </property>
- <property>
- <name>dfs.data.dir</name>
- <value>/data/hadoop/hdfs/data</value>
- </property>
- </configuration>
d. Configure mapred-site.xml
- vi conf/mapred-site.xml
- <configuration>
- <property>
- <name>mapred.job.tracker</name>
- <value>server01:9001</value>
- </property>
- </configuration>
e. Set up the masters and slaves files
- vi conf/masters
- server01
- vi conf/slaves
- server02
- server03
Again, the steps above only need to be performed on server01. (The masters file lists the host running the SecondaryNameNode; the slaves file lists the DataNode/TaskTracker hosts.)
12. Install the Hadoop package on server02 and server03
- cd /usr/local/
- tar czvf ~/hadoop.tar.gz hadoop/
- # Copy the archive to the home directories on server02 and server03
- scp ~/hadoop.tar.gz hadoop@server02:~/
- scp ~/hadoop.tar.gz hadoop@server03:~/
Extract and install:
- ssh hadoop@server02 (or server03)
- tar zxvf hadoop.tar.gz
- su -
- mv /home/hadoop/hadoop /usr/local
- chown -R hadoop:hadoop /usr/local/hadoop
13. Start Hadoop as the hadoop user on server01
- start-all.sh
Startup messages are printed for each daemon. Since HDFS has not been initialized yet, format the NameNode:
- hadoop namenode -format
Then restart Hadoop:
- stop-all.sh
- start-all.sh
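To confirm the daemons are running, jps lists the Java processes; on server01 something like the following is expected (process IDs will differ):
- jps
- 2354 NameNode
- 2480 SecondaryNameNode
- 2562 JobTracker
- 2671 Jps
On server02 and server03, jps should show DataNode and TaskTracker instead.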
14. Test
- hadoop dfs -mkdir test
- hadoop dfs -copyFromLocal $HADOOP_HOME/conf test
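To verify the upload, list the directory in HDFS; the copied conf directory should appear:
- hadoop dfs -ls test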
15. Web access
Set up a port tunnel from your local machine, then visit:
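A minimal tunnel sketch using SSH local port forwarding, run on your local machine (the login user is an assumption; any account on server01 works):
- ssh -L 50030:server01:50030 -L 50070:server01:50070 hadoop@192.168.255.128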
http://localhost:50030
Hadoop Map/Reduce Administration (the JobTracker UI)
http://localhost:50070
NameNode status (the HDFS UI)