Building a High-Availability Hadoop Environment: Configuring and Starting ZooKeeper, HDFS, and YARN

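Contents

    • Preparing the software packages and the hadoop user
    • Passwordless SSH between the machines (typing a password for every file transfer is tedious)
    • Deploying ZooKeeper
  • Hadoop configuration
    • core-site
    • hdfs-site
    • slaves
    • mapred-site
    • yarn-site
  • Starting ZooKeeper, HDFS, and YARN
    • Starting Hadoop
    • Viewing the web UIs
    • Cluster start and stop order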

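Preparing the software packages and the hadoop user

This walkthrough uses three Alibaba Cloud hosts. Commands shown without a hostname in the prompt are run on all three machines.

On all three machines, create a hadoop user to own the high-availability environment, and put the installation packages under software.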

[[email protected] ~]# useradd hadoop
[[email protected] ~]# su - hadoop
[[email protected] ~]$ mkdir software app data lib source
[[email protected] ~]$ ll
total 20
drwxrwxr-x 2 hadoop hadoop 4096 Nov 26 16:30 app         installed software
drwxrwxr-x 2 hadoop hadoop 4096 Nov 26 16:30 data        test data
drwxrwxr-x 2 hadoop hadoop 4096 Nov 26 16:30 lib         dependencies
drwxrwxr-x 2 hadoop hadoop 4096 Nov 26 16:30 software    installation packages
drwxrwxr-x 2 hadoop hadoop 4096 Nov 26 16:30 source      source code

           

Next, upload the packages downloaded on Windows to Linux. Uploading uses the rz command, which has to be installed as root.

[[email protected] ~]# yum install -y lrzsz
[[email protected] ~]$ rz
[[email protected] ~]$ mv hadoop-2.6.0-cdh5.7.0.tar.gz jdk-8u45-linux-x64.gz zookeeper-3.4.6.tar.gz ./software/

           

The other machines need the same packages. First check the IP addresses of the other two machines:

[[email protected] ~]$ hostname -i
172.26.165.126
[[email protected] ~]$ hostname
hadoop002

# Upload to that IP as root; if no user is given, scp uses the user running the command (hadoop here)
[[email protected] software]$ scp * [email protected]:/home/hadoop/software/
# Upload to hadoop003
[[email protected] software]$ scp * [email protected]:/home/hadoop/software/
           

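The packages on all three machines are owned by root; change the owner to hadoop.

exit   # back to root
# Change the owner and group of the packages
chown -R hadoop:hadoop /home/hadoop/software/*
# Clear the screen
clear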
           

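Configure /etc/hosts

[[email protected] ~]# vi /etc/hosts
# The result, shown as a screenshot in the original post, is one file that lists the IP-to-hostname mapping of all three machines.
# Then copy it to the other two machines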
[[email protected] ~]# scp /etc/hosts 172.26.165.126:/etc/hosts
[[email protected] ~]# scp /etc/hosts 172.26.165.128:/etc/hosts
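
The mapping step can be sketched as a small helper that appends an entry only when the hostname is not already listed. The function name `add_host_entry` and the 192.0.2.x addresses in the usage comment are illustrative assumptions, not the article's real values.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless NAME already appears as a hostname
# on a non-comment line (hypothetical helper, not part of any tool).
add_host_entry() {
  if awk -v n="$3" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$1"; then
    return 0   # already present, leave the file untouched
  fi
  printf '%s %s\n' "$2" "$3" >> "$1"
}

# Example (run as root; placeholder addresses):
#   add_host_entry /etc/hosts 192.0.2.11 hadoop001
#   add_host_entry /etc/hosts 192.0.2.12 hadoop002
#   add_host_entry /etc/hosts 192.0.2.13 hadoop003
```

Because the helper skips names that are already present, it can be rerun safely before copying the file to the other machines.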
           

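Passwordless SSH between the machines (typing a password for every file transfer is tedious)

su - hadoop
 
rm -rf .ssh
# Generate a key pair on each of the three machines
ssh-keygen

# Go to the key directory
 cd .ssh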
[[email protected] .ssh]$ ll
total 8
-rw------- 1 hadoop hadoop 1671 Nov 26 18:24 id_rsa
-rw-r--r-- 1 hadoop hadoop  398 Nov 26 18:24 id_rsa.pub

# Use hadoop001 as the primary machine; from each of the other two machines, send its public key file to hadoop001
 [[email protected] .ssh]$ scp id_rsa.pub [email protected]:/home/hadoop/.ssh/id_rsa.pub2
[[email protected] .ssh]$ scp id_rsa.pub [email protected]:/home/hadoop/.ssh/id_rsa.pub3

[[email protected] .ssh]$ ll
total 16
-rw------- 1 hadoop hadoop 1671 Nov 26 18:24 id_rsa
-rw-r--r-- 1 hadoop hadoop  398 Nov 26 18:24 id_rsa.pub
-rw-r--r-- 1 root   root    398 Nov 26 18:44 id_rsa.pub2
-rw-r--r-- 1 root   root    398 Nov 26 18:45 id_rsa.pub3

# Combine the three public keys into a single authorized_keys file
[[email protected] .ssh]$ cat id_rsa.pub >> authorized_keys
[[email protected] .ssh]$ cat id_rsa.pub2 >> authorized_keys
[[email protected] .ssh]$ cat id_rsa.pub3 >> authorized_keys

# Copy the combined authorized_keys file to the other two machines
[[email protected] .ssh]$ scp authorized_keys [email protected]:/home/hadoop/.ssh/
[[email protected] .ssh]$ scp authorized_keys [email protected]:/home/hadoop/.ssh/

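# Fix the owner and group
exit   # back to the root user
chown -R hadoop:hadoop /home/hadoop/.ssh/*
chown -R hadoop:hadoop /home/hadoop/.ssh
su - hadoop
cd .ssh
# Restrict the permissions of authorized_keys on all three machines
chmod 600 authorized_keys

# Verify the mutual trust: each command logs in to that machine and runs date there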
 ssh hadoop001 date
 ssh hadoop002 date
 ssh hadoop003 date
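
The key-merging steps above can be sketched as one helper that appends each public key once and then applies the same `chmod 600`. The function name `merge_pubkeys` is a hypothetical helper, and the sketch assumes each `.pub` file holds a single key line.

```shell
# merge_pubkeys DEST KEYFILE...
# Append every KEYFILE to DEST unless its line is already there,
# then restrict DEST to the owner, as sshd expects.
merge_pubkeys() {
  dest=$1
  shift
  touch "$dest"
  for key in "$@"; do
    # -x whole-line match, -F fixed strings, -f read the pattern from the key file
    grep -q -x -F -f "$key" "$dest" || cat "$key" >> "$dest"
  done
  chmod 600 "$dest"
}

# Example, run in ~/.ssh on hadoop001 after the other keys have arrived:
#   merge_pubkeys authorized_keys id_rsa.pub id_rsa.pub2 id_rsa.pub3
```

Unlike repeated `cat ... >>`, rerunning this does not duplicate keys.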
           

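Deploying Java

exit   # back to the root user

# Create the directory that will hold Java, then extract the JDK into it
 mkdir /usr/java
 tar -xzvf /home/hadoop/software/jdk-8u45-linux-x64.gz -C /usr/java
 
# Remember to fix the owner and group of the extracted JDK
 [[email protected] java]# chown -R root:root /usr/java/jdk1.8.0_45

# Configure the Java environment variables
 vi /etc/profile
 
 #env
export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$JAVA_HOME/bin:$PATH

# Then reload the profile and check the version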
[[email protected] java]# source /etc/profile
[[email protected] java]# java -version
           

Extracting Hadoop and ZooKeeper

su - hadoop
cd software
tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz -C ../app/
tar -xzvf zookeeper-3.4.6.tar.gz -C ../app/

           

Setting up the Hadoop environment variables and directories

cd   # back to the home directory
vi .bash_profile

export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper-3.4.6
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$PATH

source .bash_profile

# If this cd works, the variables are set correctly
cd $HADOOP_HOME

# Create a few directories
mkdir $HADOOP_HOME/data && mkdir $HADOOP_HOME/logs && mkdir $HADOOP_HOME/tmp

# Hadoop temporary directory
chmod -R 777 $HADOOP_HOME/tmp

           

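Deploying ZooKeeper

cd $ZOOKEEPER_HOME
cd conf
cp zoo_sample.cfg zoo.cfg

[[email protected] conf]$ vi zoo.cfg

# dataDir holds ZooKeeper's snapshots and the myid file (and, unless dataLogDir is set, the transaction logs)
dataDir=/home/hadoop/app/zookeeper-3.4.6/data
# Members of the ZooKeeper ensemble. In server.1 the 1 is the server id and must match the myid file created below. Ports 2888 (followers connecting to the leader) and 3888 (leader election) are for traffic between ZooKeeper servers; the port used in core-site.xml (2181) is the client port for external components.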
server.1=hadoop001:2888:3888
server.2=hadoop002:2888:3888
server.3=hadoop003:2888:3888

[[email protected] conf]$ scp zoo.cfg hadoop002:/home/hadoop/app/zookeeper-3.4.6/conf/
[[email protected] conf]$ scp zoo.cfg hadoop003:/home/hadoop/app/zookeeper-3.4.6/conf/

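# Matching zoo.cfg above, give each machine its ZooKeeper id
 cd ../
 mkdir data
 touch data/myid
# Keep a space before >: "echo 1>data/myid" is parsed as a redirect of file descriptor 1 and writes an empty line instead of the id
# Run only the matching line on each machine: 1 on hadoop001, 2 on hadoop002, 3 on hadoop003
 [[email protected] zookeeper-3.4.6]$ echo 1 >data/myid
 [[email protected] zookeeper-3.4.6]$ echo 2 >data/myid
 [[email protected] zookeeper-3.4.6]$ echo 3 >data/myid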
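
Since each id has to agree with the `server.N` line for that host, one way to avoid typing the wrong number is to derive it from zoo.cfg. This is a minimal sketch; `myid_for_host` is a hypothetical helper, and it assumes hostnames without regex metacharacters, as with hadoop001 to hadoop003.

```shell
# myid_for_host CFG HOST
# Print N from the "server.N=HOST:..." line of a zoo.cfg file.
# Prints nothing when HOST is not listed.
myid_for_host() {
  sed -n "s/^server\.\([0-9][0-9]*\)=$2:.*/\1/p" "$1"
}

# Example, run in the ZooKeeper directory on each machine:
#   myid_for_host conf/zoo.cfg "$(hostname)" > data/myid
```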
           

Hadoop configuration

cd $HADOOP_HOME/etc/hadoop

# The Java environment that Hadoop depends on
[[email protected] hadoop]$ vi hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.8.0_45

[[email protected] hadoop]$ scp hadoop-env.sh hadoop002:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop
[[email protected] hadoop]$ scp hadoop-env.sh hadoop003:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop

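# Delete the existing files first
rm -f slaves core-site.xml hdfs-site.xml yarn-site.xml
# Then upload the five files (slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml) with rz on every machine; their contents follow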
           

core-site

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
	<!-- YARN needs fs.defaultFS to specify the NameNode URI -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://ruozeclusterg5</value>
        </property>
        <!--============================== Trash ======================================= -->
        <property>
                <!-- Trash: interval in minutes at which the checkpointer running on the NameNode creates a checkpoint from the Current folder; the default 0 means use the value of fs.trash.interval -->
                <name>fs.trash.checkpoint.interval</name>
                <value>0</value>
        </property>
        <property>
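                <!-- Trash: number of minutes after which checkpoints under .Trash are deleted; the server-side setting takes precedence over the client's; the default 0 disables trash -->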
                <name>fs.trash.interval</name>
                <value>1440</value>
        </property>

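         <!-- Hadoop temporary directory. hadoop.tmp.dir is a base setting that many other paths derive from. If hdfs-site.xml does not set the NameNode and DataNode storage locations, they default to locations under this path -->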
        <property>   
                <name>hadoop.tmp.dir</name>
                <value>/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/tmp</value>
        </property>

         <!-- ZooKeeper addresses -->
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
        </property>
         <!-- ZooKeeper session timeout, in milliseconds -->
        <property>
                <name>ha.zookeeper.session-timeout.ms</name>
                <value>2000</value>
        </property>

        <property>
           <name>hadoop.proxyuser.hadoop.hosts</name>
           <value>*</value> 
        </property> 
        <property> 
            <name>hadoop.proxyuser.hadoop.groups</name> 
            <value>*</value> 
       </property> 


      <property>
		  <name>io.compression.codecs</name>
		  <value>org.apache.hadoop.io.compress.GzipCodec,
			org.apache.hadoop.io.compress.DefaultCodec,
			org.apache.hadoop.io.compress.BZip2Codec,
			org.apache.hadoop.io.compress.SnappyCodec
		  </value>
      </property>
</configuration>
           

hdfs-site

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
	<!-- HDFS superuser group -->
	<property>
		<name>dfs.permissions.superusergroup</name>
		<value>hadoop</value>
	</property>

	<!-- Enable WebHDFS -->
	<property>
		<name>dfs.webhdfs.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/name</value>
		<description>Local directory where the NameNode stores the name table (fsimage); adjust as needed</description>
	</property>
	<property>
		<name>dfs.namenode.edits.dir</name>
		<value>${dfs.namenode.name.dir}</value>
		<description>Local directory where the NameNode stores the transaction files (edits); adjust as needed</description>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/data</value>
		<description>Local directory where the DataNode stores blocks; adjust as needed</description>
	</property>
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
	<!-- Block size 256 MB (default 128 MB) -->
	<property>
		<name>dfs.blocksize</name>
		<value>268435456</value>
	</property>
	<!--======================================================================= -->
	<!-- HDFS high-availability configuration -->
	<!-- The HDFS nameservice is ruozeclusterg5; it must match the one in core-site.xml -->
	<property>
		<name>dfs.nameservices</name>
		<value>ruozeclusterg5</value>
	</property>
	<property>
		<!-- NameNode IDs; this version supports at most two NameNodes -->
		<name>dfs.ha.namenodes.ruozeclusterg5</name>
		<value>nn1,nn2</value>
	</property>

	<!-- HDFS HA: dfs.namenode.rpc-address.[nameservice ID].[NameNode ID] is the RPC address -->
	<property>
		<name>dfs.namenode.rpc-address.ruozeclusterg5.nn1</name>
		<value>hadoop001:8020</value>
	</property>
	<property>
		<name>dfs.namenode.rpc-address.ruozeclusterg5.nn2</name>
		<value>hadoop002:8020</value>
	</property>

	<!-- HDFS HA: dfs.namenode.http-address.[nameservice ID].[NameNode ID] is the HTTP address -->
	<property>
		<name>dfs.namenode.http-address.ruozeclusterg5.nn1</name>
		<value>hadoop001:50070</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.ruozeclusterg5.nn2</name>
		<value>hadoop002:50070</value>
	</property>

	<!--================== NameNode editlog synchronization ============================================ -->
	<!-- Ensures the data can be recovered -->
	<property>
		<name>dfs.journalnode.http-address</name>
		<value>0.0.0.0:8480</value>
	</property>
	<property>
		<name>dfs.journalnode.rpc-address</name>
		<value>0.0.0.0:8485</value>
	</property>
	<property>
		<!-- JournalNode server addresses; QuorumJournalManager uses them to store the editlog -->
		<!-- Format: qjournal://<host1:port1>;<host2:port2>;<host3:port3>/<journalId>; the port is the one in dfs.journalnode.rpc-address -->
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/ruozeclusterg5</value>
	</property>

	<property>
		<!-- Directory where the JournalNode stores its data -->
		<name>dfs.journalnode.edits.dir</name>
		<value>/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/data/dfs/jn</value>
	</property>
	<!--================== DataNode and client access to the active NameNode ============================================ -->
	<property>
		<!-- How DataNodes and clients connecting to the NameNode find the active NameNode -->
		<!-- Implementation class for automatic failover -->
		<name>dfs.client.failover.proxy.provider.ruozeclusterg5</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!--==================Namenode fencing:=============================================== -->
	<!-- After a failover, keep the stopped NameNode from coming back and leaving two active services -->
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>sshfence</value>
	</property>
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/home/hadoop/.ssh/id_rsa</value>
	</property>
	<property>
		<!-- Number of milliseconds after which fencing is considered failed -->
		<name>dfs.ha.fencing.ssh.connect-timeout</name>
		<value>30000</value>
	</property>

	<!--==================NameNode auto failover base ZKFC and Zookeeper====================== -->
	<!-- Enable ZooKeeper-based automatic failover -->
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
	<!-- List of DataNodes permitted to connect to the NameNode -->
	 <property>
	   <name>dfs.hosts</name>
	   <value>/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop/slaves</value>
	 </property>
</configuration>
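
Because the nameservice in hdfs-site.xml has to agree with fs.defaultFS in core-site.xml, a quick consistency check can catch a typo before the first start. This is a minimal sketch: `get_prop` and `check_nameservice` are hypothetical helpers, and the awk parsing assumes each `<value>` sits on one line after its `<name>`, as in the files above. It is not a general XML parser.

```shell
# get_prop FILE NAME
# Print the <value> that follows <name>NAME</name> in a Hadoop XML file.
get_prop() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # strip the 7-character <value> and 8-character </value> tags
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# check_nameservice CORE_SITE HDFS_SITE
# Succeeds when fs.defaultFS is hdfs:// followed by dfs.nameservices.
check_nameservice() {
  fs=$(get_prop "$1" fs.defaultFS)
  ns=$(get_prop "$2" dfs.nameservices)
  if [ "$fs" = "hdfs://$ns" ]; then
    echo "OK: $ns"
  else
    echo "MISMATCH: fs.defaultFS=$fs dfs.nameservices=$ns"
    return 1
  fi
}

# Example, run in $HADOOP_HOME/etc/hadoop:
#   check_nameservice core-site.xml hdfs-site.xml
```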
           

slaves

hadoop001
hadoop002
hadoop003

           

YARN

mapred-site

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
	<!-- Configuration for MapReduce applications -->
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
	<!-- JobHistory Server ============================================================== -->
	<!-- MapReduce JobHistory Server address; default port 10020 -->
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>hadoop001:10020</value>
	</property>
	<!-- MapReduce JobHistory Server web UI address; default port 19888 -->
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>hadoop001:19888</value>
	</property>

<!-- Compress the map-side output with Snappy -->
  <property>
      <name>mapreduce.map.output.compress</name> 
      <value>true</value>
  </property>
              
  <property>
      <name>mapreduce.map.output.compress.codec</name> 
      <value>org.apache.hadoop.io.compress.SnappyCodec</value>
   </property>

</configuration>
           

yarn-site

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
	<!-- NodeManager configuration ================================================= -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<property>
		<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
		<value>org.apache.hadoop.mapred.ShuffleHandler</value>
	</property>
	<property>
		<name>yarn.nodemanager.localizer.address</name>
		<value>0.0.0.0:23344</value>
		<description>Address where the localizer IPC is.</description>
	</property>
	<property>
		<name>yarn.nodemanager.webapp.address</name>
		<value>0.0.0.0:23999</value>
		<description>NM Webapp address.</description>
	</property>

	<!-- HA configuration =============================================================== -->
	<!-- Resource Manager Configs -->
	<property>
		<name>yarn.resourcemanager.connect.retry-interval.ms</name>
		<value>2000</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
	<!-- Enable embedded automatic failover. It starts with the HA setup and works with ZKRMStateStore to handle fencing -->
	<property>
		<name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
		<value>true</value>
	</property>
	<!-- Cluster name, so that the HA election applies to the right cluster -->
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>yarn-cluster</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>


    <!-- Optional: the active and standby RM nodes can each name their own id here
	<property>
		 <name>yarn.resourcemanager.ha.id</name>
		 <value>rm2</value>
	 </property>
	 -->

	<property>
		<name>yarn.resourcemanager.scheduler.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
	</property>
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
		<value>5000</value>
	</property>
	<!-- ZKRMStateStore configuration -->
	<property>
		<name>yarn.resourcemanager.store.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
	</property>
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
	</property>
	<property>
		<name>yarn.resourcemanager.zk.state-store.address</name>
		<value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
	</property>
	<!-- RPC address that clients use to reach the RM (applications manager interface) -->
	<property>
		<name>yarn.resourcemanager.address.rm1</name>
		<value>hadoop001:23140</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address.rm2</name>
		<value>hadoop002:23140</value>
	</property>
	<!-- RPC address that ApplicationMasters use to reach the RM (scheduler interface) -->
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm1</name>
		<value>hadoop001:23130</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address.rm2</name>
		<value>hadoop002:23130</value>
	</property>
	<!-- RM admin interface -->
	<property>
		<name>yarn.resourcemanager.admin.address.rm1</name>
		<value>hadoop001:23141</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address.rm2</name>
		<value>hadoop002:23141</value>
	</property>
	<!-- RPC address that NodeManagers use to reach the RM -->
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
		<value>hadoop001:23125</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
		<value>hadoop002:23125</value>
	</property>
	<!-- RM web application address -->
	<property>
		<name>yarn.resourcemanager.webapp.address.rm1</name>
		<value>hadoop001:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm2</name>
		<value>hadoop002:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.https.address.rm1</name>
		<value>hadoop001:23189</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.https.address.rm2</name>
		<value>hadoop002:23189</value>
	</property>

	<property>
	   <name>yarn.log-aggregation-enable</name>
	   <value>true</value>
	</property>
	<property>
		 <name>yarn.log.server.url</name>
		 <value>http://hadoop001:19888/jobhistory/logs</value>
	</property>


	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>2048</value>
	</property>
	<property>
		<name>yarn.scheduler.minimum-allocation-mb</name>
		<value>1024</value>
		<description>Minimum memory a single task can request; default 1024 MB</description>
	 </property>

  
  <property>
	<name>yarn.scheduler.maximum-allocation-mb</name>
	<value>2048</value>
	<description>Maximum memory a single task can request; default 8192 MB</description>
  </property>

   <property>
       <name>yarn.nodemanager.resource.cpu-vcores</name>
       <value>2</value>
    </property>

</configuration>
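
The memory settings above bound how many containers one NodeManager can host. The arithmetic is sketched below with a hypothetical helper, `containers_per_node`; it considers memory only and ignores vcores and any rounding the scheduler applies to requests.

```shell
# containers_per_node NODE_MB CONTAINER_MB
# Print how many containers of CONTAINER_MB fit into NODE_MB of
# NodeManager memory (integer division, memory only).
containers_per_node() {
  echo $(( $1 / $2 ))
}

# With yarn.nodemanager.resource.memory-mb = 2048:
#   containers_per_node 2048 1024   # containers at the minimum allocation
#   containers_per_node 2048 2048   # containers at the maximum allocation
```

With these values a node holds two minimum-size containers or a single maximum-size one, so a job whose ApplicationMaster and tasks each ask for 2048 MB gets one container per node at a time.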
           

Starting ZooKeeper, HDFS, and YARN

Start ZooKeeper first

$ZOOKEEPER_HOME/bin/zkServer.sh start
zkServer.sh status
# Two followers and one leader across the three machines means it worked
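
The leader and follower check can be sketched as a filter over the `Mode:` lines that `zkServer.sh status` prints. `summarize_zk_modes` is a hypothetical helper, and the ssh loop in the usage comment assumes the remote shell can find Java after sourcing /etc/profile.

```shell
# summarize_zk_modes
# Read the output of zkServer.sh status from several servers on stdin
# and count the roles reported on "Mode: ..." lines.
summarize_zk_modes() {
  awk '/^Mode: / { count[$2]++ }
       END { printf "leader=%d follower=%d\n", count["leader"], count["follower"] }'
}

# Example (hedged: depends on the remote environment being set up):
#   for h in hadoop001 hadoop002 hadoop003; do
#     ssh "$h" 'source /etc/profile; /home/hadoop/app/zookeeper-3.4.6/bin/zkServer.sh status' 2>/dev/null
#   done | summarize_zk_modes
```

A healthy three-node ensemble should report `leader=1 follower=2`.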
           

Start the JournalNodes

cd app/hadoop-2.6.0-cdh5.7.0
sbin/hadoop-daemon.sh start journalnode

[[email protected] hadoop-2.6.0-cdh5.7.0]$ jps
2899 JournalNode
2950 Jps
2782 QuorumPeerMain    # the ZooKeeper process


           

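Starting Hadoop

# The first start needs a format. Of the two NameNodes, format only one
 [[email protected] hadoop-2.6.0-cdh5.7.0]$ hadoop namenode -format
# Then copy the formatted data directory (which holds the NameNode and DataNode directories) to the second NameNode's machine so that the NameNode metadata is in sync
 [[email protected] hadoop-2.6.0-cdh5.7.0]$ scp -r data hadoop002:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
 
# Initialize the ZKFC state in ZooKeeper. Do this on hadoop001 only, because the one nameservice already covers the HDFS addresses of both hadoop001 and hadoop002
 [[email protected] hadoop-2.6.0-cdh5.7.0]$ hdfs zkfc -formatZK

Successfully created /hadoop-ha/ruozeclusterg5 in ZK.

# Start HDFS
 [[email protected] hadoop-2.6.0-cdh5.7.0]$ start-dfs.sh

# This fails: the slaves file is in DOS format (CRLF line endings, as written on Windows) and has to be converted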
           
[[email protected] hadoop-2.6.0-cdh5.7.0]$ stop-dfs.sh
 
# Install the conversion tool (yum needs root), then convert slaves in $HADOOP_HOME/etc/hadoop
yum install -y dos2unix
dos2unix slaves
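
To confirm whether a file such as slaves still has DOS line endings, or to strip them where dos2unix is not installed, a sketch with standard tools is shown here. `has_crlf` and `strip_cr` are hypothetical helpers; `strip_cr` removes every carriage return in the file, which is fine for a list of hostnames.

```shell
# has_crlf FILE: succeed when FILE contains a carriage return.
has_crlf() {
  grep -q "$(printf '\r')" "$1"
}

# strip_cr FILE: rewrite FILE in place without carriage returns.
strip_cr() {
  tmp=$(mktemp) || return 1
  tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

# Example, run in $HADOOP_HOME/etc/hadoop:
#   has_crlf slaves && strip_cr slaves
```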
 
# Start again, minding the start order
[[email protected] hadoop]$ start-dfs.sh
18/11/27 10:18:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop001 hadoop002]
hadoop001: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop001.out
hadoop002: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
hadoop002: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
hadoop001: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop001.out
hadoop003: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop003.out
Starting journal nodes [hadoop001 hadoop002 hadoop003]
hadoop002: starting journalnode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-journalnode-hadoop002.out
hadoop001: starting journalnode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-journalnode-hadoop001.out
hadoop003: starting journalnode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-journalnode-hadoop003.out
18/11/27 10:18:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [hadoop001 hadoop002]
hadoop002: starting zkfc, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-zkfc-hadoop002.out
hadoop001: starting zkfc, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-zkfc-hadoop001.out

# Start YARN
 [[email protected] hadoop]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-hadoop001.out
hadoop002: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop002.out
hadoop003: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop003.out
hadoop001: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop001.out

# The second ResourceManager has to be started by hand
 [[email protected] hadoop-2.6.0-cdh5.7.0]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-hadoop002.out
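
After these steps each machine should be running a known set of daemons, which jps can confirm. The sketch below reports expected process names missing from jps output; `missing_daemons` is a hypothetical helper, and the per-host lists in the usage comment are inferred from the configuration and startup logs above.

```shell
# missing_daemons "NAME NAME ..."
# Read jps output on stdin and print each expected process name
# that does not appear in its second column.
missing_daemons() {
  jps_out=$(cat)
  for name in $1; do
    printf '%s\n' "$jps_out" \
      | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' \
      || printf '%s\n' "$name"
  done
}

# Example on hadoop001 or hadoop002 (NameNode and ResourceManager hosts):
#   jps | missing_daemons "QuorumPeerMain JournalNode NameNode DataNode DFSZKFailoverController ResourceManager NodeManager"
# Example on hadoop003:
#   jps | missing_daemons "QuorumPeerMain JournalNode DataNode NodeManager"
```

No output means every expected daemon was found.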
           

Viewing the web UIs

First configure the inbound and outbound security group rules of the cloud hosts.

With the rules in place, the pages can be opened in a browser.
# Use the public IP
 hadoop
 http://47.92.250.235:50070
 
 yarn
 http://47.92.250.235:8088 (active)
 http://47.92.250.236:8088/cluster/cluster (standby)
  
# Start the JobHistory server, because YARN itself keeps only a limited number of records
 [[email protected] hadoop]$ $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver

# The JobHistory web UI is on port 19888
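
The active and standby roles shown in the web UIs can also be queried from the command line with `hdfs haadmin -getServiceState` and `yarn rmadmin -getServiceState`. The wrapper below is a minimal sketch; `show_ha_states` is a hypothetical helper that assumes the ids nn1, nn2, rm1, and rm2 from the configuration files above and that the cluster is running.

```shell
# show_ha_states
# Print the HA state reported for both NameNodes and both ResourceManagers.
show_ha_states() {
  for id in nn1 nn2; do
    printf '%s: %s\n' "$id" "$(hdfs haadmin -getServiceState "$id")"
  done
  for id in rm1 rm2; do
    printf '%s: %s\n' "$id" "$(yarn rmadmin -getServiceState "$id")"
  done
}

# Example, run as the hadoop user on any cluster node:
#   show_ha_states
```

Exactly one of each pair should report active.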
 
           

Cluster start and stop order

Start
 zkServer.sh start    # on all three machines
 [[email protected] sbin]# start-dfs.sh
 [[email protected] sbin]# start-yarn.sh
 [[email protected] sbin]# yarn-daemon.sh start resourcemanager    # on the second ResourceManager host (hadoop002)
 [[email protected] ~]# $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver


Stop
 [[email protected] sbin]# stop-yarn.sh
 [[email protected] sbin]# yarn-daemon.sh stop resourcemanager    # on the second ResourceManager host (hadoop002)
 [[email protected] sbin]# stop-dfs.sh
 zkServer.sh stop    # on all three machines
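
The ordering above can be captured in a small wrapper so the steps are not run out of sequence. This is a sketch under assumptions: `step`, `cluster_start`, and `cluster_stop` are hypothetical helpers; the host for each step is inferred from the startup logs earlier in the article; and running for real assumes the remote non-interactive shell has the Hadoop and ZooKeeper scripts on PATH, which the .bash_profile setup does not guarantee. Setting DRY_RUN prints the plan instead of executing it.

```shell
# step HOST COMMAND...
# Print the step when DRY_RUN is set, otherwise run it on HOST over ssh.
step() {
  host=$1
  shift
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$host: $*"
  else
    ssh "$host" "$*"
  fi
}

cluster_start() {
  for h in hadoop001 hadoop002 hadoop003; do step "$h" zkServer.sh start; done
  step hadoop001 start-dfs.sh
  step hadoop001 start-yarn.sh
  step hadoop002 yarn-daemon.sh start resourcemanager
  step hadoop001 mr-jobhistory-daemon.sh start historyserver
}

cluster_stop() {
  step hadoop001 stop-yarn.sh
  step hadoop002 yarn-daemon.sh stop resourcemanager
  step hadoop001 stop-dfs.sh
  for h in hadoop001 hadoop002 hadoop003; do step "$h" zkServer.sh stop; done
}

# Example: preview the order without touching the cluster
#   DRY_RUN=1; cluster_start
```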
           
