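Hadoop 3.2.1: Building Pseudo-Distributed and Fully Distributed Environments
Tags (space-separated): Big Data Operations column
- 1: Environment preparation
- 2: Install Hadoop 3.2.1
- 3: Run a wordcount test
- 4: Build and test a distributed Hadoop cluster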
1: Environment preparation
1.1 System environment
OS: CentOS 7.5 x64
Hostnames:
cat /etc/hosts
------
192.168.11.192 fat01.flyfish.com
192.168.11.195 fat02.flyfish.com
192.168.11.197 fat03.flyfish.com
-------
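The configuration files later in this guide refer to the nodes by hostname, so each node needs all three entries in /etc/hosts. A minimal sketch of a check follows. The function name `check_hosts` is hypothetical, and it only tests that each name appears in the file as a whole word, not that the address next to it is correct.

```shell
# check_hosts FILE NAME... : print every hostname that does not appear in FILE.
# Hypothetical helper for illustration; not part of Hadoop or CentOS.
check_hosts() {
  local file="$1" missing=0 name
  shift
  for name in "$@"; do
    # -w: match whole words only, -F: treat the name as a fixed string
    if ! grep -qwF -- "$name" "$file"; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run on each node):
# check_hosts /etc/hosts fat01.flyfish.com fat02.flyfish.com fat03.flyfish.com
```

The function returns a non-zero status when any name is missing, so it can gate the later steps in a script.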
JDK version: jdk1.8.0_181
Hadoop version: hadoop 3.2.1
1.2 Install JDK 1.8.0_181
rpm -ivh jdk-8u181-linux-x64.rpm
vim /etc/profile
---
## JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64
export CLASSPATH=$JAVA_HOME/jre/lib:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
---
source /etc/profile
java -version
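Hadoop's launch scripts depend on JAVA_HOME, so it helps to confirm that the directory really contains an executable `bin/java` before going further. The sketch below does only that; `check_java_home` is a hypothetical name chosen for this example.

```shell
# check_java_home DIR : succeed only if DIR/bin/java exists and is executable.
# Hypothetical helper for illustration.
check_java_home() {
  if [ -x "$1/bin/java" ]; then
    echo "ok: $1/bin/java"
  else
    echo "no executable java under: $1" >&2
    return 1
  fi
}

# Example, after `source /etc/profile`:
# check_java_home "$JAVA_HOME"
```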
2: Install Hadoop 3.2.1 and configure pseudo-distributed mode
2.1 Download Hadoop 3.2.1
Download:
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
Extract:
tar -zxvf hadoop-3.2.1.tar.gz
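A mirror download can be checked against the SHA-512 digest that Apache publishes with each release before the archive is unpacked. The following is a sketch under assumptions: `verify_sha512` is a hypothetical helper, and the expected digest has to come from the official release page, since this article does not give it.

```shell
# verify_sha512 FILE EXPECTED_HEX : succeed only if FILE's SHA-512 digest
# equals EXPECTED_HEX (compared case-insensitively).
# Hypothetical helper; the expected value must be supplied by the reader.
verify_sha512() {
  local actual expected
  actual=$(sha512sum "$1" | awk '{ print $1 }')
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ "$actual" = "$expected" ]
}

# Example (the digest is a placeholder, not the real value):
# verify_sha512 hadoop-3.2.1.tar.gz "<digest from the release page>" \
#   && tar -zxvf hadoop-3.2.1.tar.gz
```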
2.2 Set up the Hadoop installation directory
mkdir -p /opt/bigdata ## install Hadoop under /opt/bigdata
mv hadoop-3.2.1 /opt/bigdata/hadoop
2.3 Edit the Hadoop configuration files
2.3.1 Edit core-site.xml
cd /opt/bigdata/hadoop/etc/hadoop/
vim core-site.xml
-----
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/bigdata/hadoop/data</value>
<description>hadoop_temp</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://fat01.flyfish.com:8020</value>
<description>hdfs_derect</description>
</property>
</configuration>
----
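After editing a site file it is useful to read a value back and confirm it is what was intended. The sketch below pulls one property out of a file laid out like the one above, with `<name>` and `<value>` each on their own line. `get_prop` is a hypothetical helper and not a real XML parser. On a working installation, `bin/hdfs getconf -confKey fs.defaultFS` reports the effective value instead (`fs.default.name` is the older, deprecated name for the same setting).

```shell
# get_prop FILE NAME : print the <value> that follows <name>NAME</name>.
# Hypothetical helper; assumes one tag per line, as in the files shown here.
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1; next }
    found && match($0, /<value>.*<\/value>/) {
      # strip the 7-character <value> and the 8-character </value>
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Example:
# get_prop /opt/bigdata/hadoop/etc/hadoop/core-site.xml fs.default.name
```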
2.3.2 Edit hdfs-site.xml
vim hdfs-site.xml
---
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>num</description>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>fat01.flyfish.com:50070</value>
</property>
</configuration>
---
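Each `<property>` element is expected to carry exactly one `<name>` and one `<value>`, and a property that was pasted without its own wrapper is easy to miss by eye. A rough sketch of a sanity check follows. `check_properties` is a hypothetical helper that counts lines rather than parsing XML, so it assumes one tag per line.

```shell
# check_properties FILE : compare the number of <property>, <name> and <value>
# lines in a Hadoop site file. Hypothetical helper; counts lines, not elements.
check_properties() {
  local props names values
  # grep -c exits non-zero when the count is 0, hence the "|| true"
  props=$(grep -c '<property>' "$1" || true)
  names=$(grep -c '<name>' "$1" || true)
  values=$(grep -c '<value>' "$1" || true)
  if [ "$props" -eq "$names" ] && [ "$props" -eq "$values" ]; then
    echo "ok: $props properties"
  else
    echo "mismatch: $props <property>, $names <name>, $values <value>"
    return 1
  fi
}

# Example:
# check_properties /opt/bigdata/hadoop/etc/hadoop/hdfs-site.xml
```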
2.3.3 Edit mapred-site.xml
vim mapred-site.xml
----
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>fat01.flyfish.com:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/opt/bigdata/hadoop</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/opt/bigdata/hadoop</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/opt/bigdata/hadoop</value>
</property>
</configuration>
-----
2.3.4 Edit yarn-site.xml
vim yarn-site.xml
----
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
-----
2.4 Set JAVA_HOME for the Hadoop scripts
echo "export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64" >> hadoop-env.sh
echo "export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64" >> mapred-env.sh
echo "export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64" >> yarn-env.sh
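Because `>>` appends unconditionally, running these commands a second time leaves duplicate lines in each file. A small sketch of an idempotent variant is shown here; `append_once` is a hypothetical helper name.

```shell
# append_once LINE FILE : append LINE to FILE unless an identical line exists.
# Hypothetical helper for illustration.
append_once() {
  touch "$2"
  # -x: whole-line match, -F: fixed string, -q: quiet
  grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Example, from /opt/bigdata/hadoop/etc/hadoop:
# for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
#   append_once "export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64" "$f"
# done
```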
2.5 Format the NameNode file system:
cd /opt/bigdata/hadoop
bin/hdfs namenode -format
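Formatting is meant to happen once. Formatting again gives the NameNode a new cluster ID, and DataNodes that still hold data from the old one will not join. The sketch below guards against that by looking for the NameNode's VERSION file. It assumes `dfs.namenode.name.dir` is left at its default of `${hadoop.tmp.dir}/dfs/name`, which matches the configuration above. `namenode_formatted` is a hypothetical helper.

```shell
# namenode_formatted TMP_DIR : succeed if a formatted NameNode directory
# already exists under TMP_DIR (the value of hadoop.tmp.dir).
# Hypothetical helper; assumes the default dfs.namenode.name.dir layout.
namenode_formatted() {
  [ -f "$1/dfs/name/current/VERSION" ]
}

# Example, from /opt/bigdata/hadoop:
# if namenode_formatted /opt/bigdata/hadoop/data; then
#   echo "NameNode already formatted, skipping"
# else
#   bin/hdfs namenode -format
# fi
```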
2.6 Start HDFS
bin/hdfs --daemon start namenode
bin/hdfs --daemon start datanode
Open in a browser:
http://192.168.11.192:50070
2.7 Start YARN
bin/yarn --daemon start resourcemanager
bin/yarn --daemon start nodemanager
2.8 Start the history server
bin/mapred --daemon start historyserver
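With all five daemons started, `jps` should list NameNode, DataNode, ResourceManager, NodeManager and JobHistoryServer. A sketch of an automated check follows. `missing_daemons` is a hypothetical helper that reads `jps` output on standard input and prints each expected name it does not find.

```shell
# missing_daemons NAME... : read `jps` output on stdin and print each NAME
# that is not running. Hypothetical helper for illustration.
missing_daemons() {
  local listing name
  listing=$(cat)
  for name in "$@"; do
    # compare the second column exactly, so NameNode does not match
    # SecondaryNameNode
    printf '%s\n' "$listing" \
      | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
      || echo "$name"
  done
}

# Example:
# jps | missing_daemons NameNode DataNode ResourceManager NodeManager JobHistoryServer
```

An empty result means every listed daemon is up.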
3: Run a wordcount test
cd /opt/bigdata/hadoop
bin/hdfs dfs -mkdir /input
bin/hdfs dfs -put word.txt /input
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /input /output1
bin/hdfs dfs -get /output1
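The example job splits each line on whitespace and writes one `word<TAB>count` line per distinct word, sorted by word. For a small plain-ASCII input the same counts can be produced locally and compared with the downloaded `output1/part-r-00000`. This is a sketch for cross-checking, not a reimplementation of the MapReduce job, and `local_wordcount` is a hypothetical helper.

```shell
# local_wordcount FILE : count whitespace-separated words in FILE and print
# "word<TAB>count", sorted in byte order. Hypothetical helper for illustration.
local_wordcount() {
  tr -s '[:space:]' '\n' < "$1" \
    | awk 'NF' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Example, after `bin/hdfs dfs -get /output1`:
# local_wordcount word.txt > expected.txt
# diff expected.txt output1/part-r-00000
```

The job refuses to start if the output directory already exists, so a second run needs either a different output path or `bin/hdfs dfs -rm -r /output1` first.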
4: Build a distributed Hadoop cluster
4.1 Distributed setup
This section builds on the pseudo-distributed deployment above.
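Role assignment (as set by the configuration files below):
fat01.flyfish.com: NameNode, DataNode, NodeManager
fat02.flyfish.com: SecondaryNameNode, ResourceManager, JobHistoryServer, DataNode, NodeManager
fat03.flyfish.com: DataNode, NodeManager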
cd /opt/bigdata/hadoop/etc/hadoop/
vim core-site.xml
----
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/bigdata/hadoop/data</value>
<description>hadoop_temp</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://fat01.flyfish.com:8020</value>
<description>hdfs_derect</description>
</property>
</configuration>
----
Edit hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.http-address</name>
<value>fat01.flyfish.com:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>fat02.flyfish.com:50090</value>
</property>
</configuration>
Edit mapred-site.xml
vim mapred-site.xml
-----
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>fat02.flyfish.com:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>fat02.flyfish.com:19888</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/opt/bigdata/hadoop</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/opt/bigdata/hadoop</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/opt/bigdata/hadoop</value>
</property>
</configuration>
-----
Edit yarn-site.xml
vim yarn-site.xml
----
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>fat02.flyfish.com</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
----
Edit hadoop-env.sh:
vim hadoop-env.sh
---
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64
export HADOOP_PID_DIR=/opt/bigdata/hadoop/data/tmp
export HADOOP_SECURE_DN_PID_DIR=/opt/bigdata/hadoop/data/tmp
----
Edit mapred-env.sh:
vim mapred-env.sh
----
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64
export HADOOP_MAPRED_PID_DIR=/opt/bigdata/hadoop/data/tmp
----
Edit yarn-env.sh:
vim yarn-env.sh
----
export JAVA_HOME=/usr/java/jdk1.8.0_181-amd64
-----
Edit the workers file
vim workers
---
fat01.flyfish.com
fat02.flyfish.com
fat03.flyfish.com
---
Sync to all nodes
cd /opt/
tar -zcvf bigdata.tar.gz bigdata
scp bigdata.tar.gz [email protected]:/opt/
scp bigdata.tar.gz [email protected]:/opt/
Then extract bigdata.tar.gz on each node.
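The copy and extract steps can be scripted once the list of nodes is in a file with one hostname per line. The sketch below only prints the commands so they can be reviewed first. It assumes the two copy targets are the root accounts on the other two hosts and that root can reach them over SSH. `sync_cmds` is a hypothetical helper.

```shell
# sync_cmds HOSTS_FILE SELF ARCHIVE : print the scp and ssh commands needed to
# copy ARCHIVE to /opt/ on every host in HOSTS_FILE except SELF and unpack it.
# Hypothetical helper; assumes root login on each host. Nothing is executed.
sync_cmds() {
  local host
  while IFS= read -r host || [ -n "$host" ]; do
    if [ -z "$host" ] || [ "$host" = "$2" ]; then
      continue
    fi
    echo "scp $3 root@$host:/opt/"
    echo "ssh root@$host 'cd /opt && tar -zxf $(basename "$3")'"
  done < "$1"
}

# Example, on fat01.flyfish.com:
# sync_cmds /opt/bigdata/hadoop/etc/hadoop/workers fat01.flyfish.com /opt/bigdata.tar.gz
```

Once the printed commands look right, the output can be piped to `sh` to run them.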
Delete the data left over from the earlier single-node format:
cd /opt/bigdata/hadoop/data && rm -rf ./*
4.2 Format the NameNode
Run on fat01.flyfish.com:
cd /opt/bigdata/hadoop/bin
./hdfs namenode -format
Before starting the daemons, decide which user will launch them; this setup uses root.
cd /opt/bigdata/hadoop/sbin
Add the following variables to start-dfs.sh and stop-dfs.sh:
----
#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
----
vim start-yarn.sh
---
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
----
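Hadoop 3's shell scripts also pick up these `*_USER` variables from etc/hadoop/hadoop-env.sh, which covers the start and stop scripts together without patching each one. A sketch of that alternative follows. `add_daemon_users` is a hypothetical helper, and it leaves out the two secure-DataNode variables shown above.

```shell
# add_daemon_users ENV_FILE USER : add the per-daemon user variables to a
# hadoop-env.sh style file, skipping any variable that is already exported.
# Hypothetical helper for illustration.
add_daemon_users() {
  local var
  for var in HDFS_NAMENODE_USER HDFS_DATANODE_USER HDFS_SECONDARYNAMENODE_USER \
             YARN_RESOURCEMANAGER_USER YARN_NODEMANAGER_USER; do
    grep -q "^export $var=" "$1" 2>/dev/null || echo "export $var=$2" >> "$1"
  done
}

# Example:
# add_daemon_users /opt/bigdata/hadoop/etc/hadoop/hadoop-env.sh root
```

With this approach the edited hadoop-env.sh has to be present on every node, like the other configuration files.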
4.3 Start the roles
Start HDFS
Run on fat01.flyfish.com:
cd /opt/bigdata/hadoop/sbin/
./start-dfs.sh
Open the HDFS web UI:
http://192.168.11.192:50070
Start YARN
Run on fat02.flyfish.com:
cd /opt/bigdata/hadoop/sbin/
./start-yarn.sh
Start the history server
/opt/bigdata/hadoop/bin/mapred --daemon start historyserver
Open the YARN web UI:
http://192.168.11.195:8088
Open the JobHistory Server page:
http://192.168.11.195:19888
Compare the running roles on each node:
fat01.flyfish.com
jps
fat02.flyfish.com
jps
fat03.flyfish.com
jps
The distributed Hadoop environment is now set up.
4.4 Run a test job
cd /opt/bigdata/hadoop
bin/hdfs dfs -mkdir /input
bin/hdfs dfs -put word.txt /input
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /input /output1
bin/hdfs dfs -ls /input
bin/hdfs dfs -cat /output1/part-r-00000