1. ZooKeeper concepts
ZooKeeper is an open-source coordination service for distributed applications.
ZooKeeper is used to guarantee transactional consistency of data across a cluster.
ZooKeeper roles and characteristics:
Leader: accepts proposal requests from all Followers, coordinates and initiates the voting on proposals, and handles internal data exchange with all Followers.
Follower: serves clients directly and votes on proposals, while also exchanging data with the Leader.
Observer: serves clients directly but does not vote on proposals; it also exchanges data with the Leader.
ZooKeeper cluster lab
1.1 Install ZooKeeper
[root@hadoop-0001 ~]# tar -xf hadoop/zookeeper-3.4.13.tar.gz
[root@hadoop-0001 ~]# mv zookeeper-3.4.13 /usr/local/zookeeper
[root@hadoop-0001 ~]# cd /usr/local/zookeeper/conf
[root@hadoop-0001 conf]# cp zoo_sample.cfg zoo.cfg
[root@hadoop-0001 conf]# vim zoo.cfg
server.1=hadoop-0002:2888:3888
server.2=hadoop-0003:2888:3888
server.3=hadoop-0004:2888:3888
server.4=hadoop-0001:2888:3888:observer
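The server.N lines shown here are only part of the configuration; zoo.cfg also needs the base settings. A minimal sketch of the full file (the base values below are the common defaults, not taken from the original notes; dataDir matches the /tmp/zookeeper directory used in this lab):

```
tickTime=2000          # base time unit in milliseconds
initLimit=10           # ticks a follower may take to connect and sync with the leader
syncLimit=5            # ticks a follower may lag behind before being dropped
dataDir=/tmp/zookeeper # must exist on every node; also holds the myid file
clientPort=2181        # port that clients connect to
server.1=hadoop-0002:2888:3888
server.2=hadoop-0003:2888:3888
server.3=hadoop-0004:2888:3888
server.4=hadoop-0001:2888:3888:observer
```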
[root@hadoop-0001 ~]# ansible-playbook 111.yml    // sync the zookeeper configuration to the other nodes
[root@hadoop-0001 ~]# mkdir /tmp/zookeeper    // the dataDir specified in zoo.cfg
[root@hadoop-0001 ~]# ansible node -m shell -a 'mkdir /tmp/zookeeper'
Create the myid file; the id must match the server.(id) number for that host in the configuration file.
[root@hadoop-0001 ~]# echo 4 >/tmp/zookeeper/myid
[root@hadoop-0001 ~]# ssh hadoop-0002 'echo 1 >/tmp/zookeeper/myid'
[root@hadoop-0001 ~]# ssh hadoop-0003 'echo 2 >/tmp/zookeeper/myid'
[root@hadoop-0001 ~]# ssh hadoop-0004 'echo 3 >/tmp/zookeeper/myid'
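The per-host ssh commands follow a fixed host-to-id mapping taken from the server.N entries in zoo.cfg. As a sketch, the mapping can be generated in a loop instead of typing each command; the helper below is hypothetical and only prints the commands rather than running them:

```shell
#!/bin/sh
# Hypothetical helper: look up the myid that matches the server.N
# entries in zoo.cfg for a given hostname.
myid_for() {
  case "$1" in
    hadoop-0002) echo 1 ;;
    hadoop-0003) echo 2 ;;
    hadoop-0004) echo 3 ;;
    hadoop-0001) echo 4 ;;
  esac
}

# Print the ssh commands (remove the leading echo to actually run them):
for host in hadoop-0002 hadoop-0003 hadoop-0004 hadoop-0001; do
  echo "ssh $host \"echo $(myid_for "$host") >/tmp/zookeeper/myid\""
done
```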
Start the service and check its status
[root@hadoop-0001 ~]# /usr/local/zookeeper/bin/zkServer.sh start
[root@hadoop-0001 ~]# ansible node -m shell -a '/usr/local/zookeeper/bin/zkServer.sh start'
[root@hadoop-0001 ~]# ansible node -m shell -a '/usr/local/zookeeper/bin/zkServer.sh status'
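zkServer.sh status prints a `Mode: leader|follower|observer` line for each node; with the zoo.cfg above you should see one leader, two followers, and one observer. A small sketch for extracting just the mode from that output (the parsing helper is hypothetical; the `Mode:` line is what zkServer.sh 3.4.x actually prints):

```shell
#!/bin/sh
# Extract the value of the "Mode: ..." line from zkServer.sh status
# output supplied on stdin.
zk_mode() {
  awk -F': ' '/^Mode:/ {print $2}'
}

# Typical use on a live node:
#   /usr/local/zookeeper/bin/zkServer.sh status 2>/dev/null | zk_mode
# Example with captured output:
printf 'ZooKeeper JMX enabled by default\nMode: follower\n' | zk_mode   # -> follower
```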
2. Kafka
2.1 What is Kafka
Kafka is a distributed publish/subscribe message queue. In this version (2.1.0) it stores broker and topic metadata in ZooKeeper, which is why the ZooKeeper cluster above is a prerequisite.
2.2 Kafka cluster lab
Build a Kafka cluster on top of the ZooKeeper cluster:
create a topic
simulate a producer publishing messages
simulate a consumer receiving messages
2.2.1 Install Kafka
[root@hadoop-0001 ~]# tar -xf hadoop/kafka_2.12-2.1.0.tgz
[root@hadoop-0001 ~]# mv kafka_2.12-2.1.0 /usr/local/kafka
[root@hadoop-0001 ~]# vim /usr/local/kafka/config/server.properties
...
broker.id=4    // range 1-255; every broker's id must be unique
...
[root@hadoop-0001 ~]# for i in 71 72 73; do rsync -aSH --delete /usr/local/kafka 192.168.1.$i:/usr/local/; done    // copy to the other machines
Edit server.properties on the other machines as well, keeping each broker.id unique.
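Editing server.properties by hand on every machine is error-prone; a sed one-liner can rewrite the broker.id instead. A sketch on a local throwaway copy (the sample file contents and the id values are illustrative, and GNU sed's in-place flag is assumed):

```shell
#!/bin/sh
# Demonstrate rewriting broker.id in a sample server.properties copy.
cfg=$(mktemp)
printf 'broker.id=4\nzookeeper.connect=hadoop-0002:2181\n' > "$cfg"

set_broker_id() {
  # $1 = new id, $2 = properties file (GNU sed -i assumed)
  sed -i "s/^broker\.id=.*/broker.id=$1/" "$2"
}

set_broker_id 2 "$cfg"
grep '^broker.id=' "$cfg"   # -> broker.id=2
rm -f "$cfg"
```

On the real cluster the same function could be run over ssh against /usr/local/kafka/config/server.properties, giving each of the hosts copied to above a distinct id.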
2.2.2 Start the Kafka cluster and verify it
Start Kafka on hadoop-0002, hadoop-0003 and hadoop-0004:
[root@hadoop-0001 ~]# ansible node -m shell -a '/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties'
[root@hadoop-0002 ~]# jps    // on each node, a Kafka process should be listed
[root@hadoop-0001 ~]# /usr/local/kafka/bin/kafka-topics.sh --create --partitions 1 --replication-factor 1 --zookeeper hadoop-0003:2181 --topic aa    // create a topic
[root@hadoop-0001 ~]# /usr/local/kafka/bin/kafka-console-producer.sh \
--broker-list hadoop-0002:9092 --topic aa    // simulate a producer and write some data
[root@hadoop-0001 ~]# /usr/local/kafka/bin/kafka-console-consumer.sh \
--bootstrap-server hadoop-0003:9092 --topic aa    // simulate a consumer; the messages arrive here immediately
3. Hadoop high availability
3.1 Stop all services; they were already stopped after the Kafka lab was finished.
3.2 Add a new machine, namenode2; together with the existing NameNode this provides high availability.
Copy the Hadoop configuration files over, sync /etc/hosts, and distribute the SSH private key (deployed with ansible).
[root@hadoop-0001 ~]# rm -rf /var/hadoop/*    // delete /var/hadoop/* on every host
[root@hadoop-0001 ~]# ansible node -m shell -a 'rm -rf /var/hadoop/*'
3.3 Configure core-site
[root@hadoop-0001 ~]# vim /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://nsd1911</value>
<description>use file system</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/hadoop</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-0002:2181,hadoop-0003:2181,hadoop-0004:2181</value>
</property>
<property>
<name>hadoop.proxyuser.nfsuser.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.nfsuser.hosts</name>
<value>*</value>
</property>
</configuration>
3.4 Configure hdfs-site
[root@hadoop-0001 ~]# vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>nsd1911</value>
</property>
<property>
<name>dfs.ha.namenodes.nsd1911</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nsd1911.nn1</name>
<value>hadoop-0001:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nsd1911.nn2</name>
<value>namenode2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.nsd1911.nn1</name>
<value>hadoop-0001:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.nsd1911.nn2</name>
<value>namenode2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-0002:8485;hadoop-0003:8485;hadoop-0004:8485/nsd1911</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/var/hadoop/journal</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.nsd1911</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
3.5 Configure yarn-site
[root@hadoop-0001 ~]# vim /usr/local/hadoop/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop-0002:2181,hadoop-0003:2181,hadoop-0004:2181</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-ha</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop-0001</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>namenode2</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Sync the configuration to all machines (done with ansible).
3.6 Verify
Make sure the data under /var/hadoop/* has been deleted on every machine.
Initialization (the JournalNodes must be running before the NameNode is formatted, because with a qjournal shared edits directory the format also initializes the JournalNodes):
[root@hadoop-0001 hadoop]# ./bin/hdfs zkfc -formatZK
[root@hadoop-0001 hadoop]# ansible node -m shell -a '/usr/local/hadoop/sbin/hadoop-daemon.sh start journalnode'
[root@hadoop-0001 hadoop]# ./bin/hdfs namenode -format
[root@hadoop-0001 hadoop]# scp -r /var/hadoop/dfs/ 192.168.1.76:/var/hadoop/    // copy the formatted metadata to namenode2
[root@hadoop-0001 hadoop]# ./bin/hdfs namenode -initializeSharedEdits
[root@hadoop-0001 hadoop]# ansible node -m shell -a '/usr/local/hadoop/sbin/hadoop-daemon.sh stop journalnode'
Start the cluster
[root@hadoop-0001 hadoop]# ./sbin/start-dfs.sh
[root@hadoop-0001 hadoop]# ./sbin/start-yarn.sh
[root@namenode2 hadoop]# ./sbin/yarn-daemon.sh start resourcemanager    // start-yarn.sh only starts the local ResourceManager, so start the second one on namenode2 by hand
Check the status
[root@hadoop-0001 hadoop]# ./bin/hdfs haadmin -getServiceState nn2
[root@hadoop-0001 hadoop]# ./bin/hdfs haadmin -getServiceState nn1
[root@hadoop-0001 hadoop]# ./bin/yarn rmadmin -getServiceState rm1
[root@hadoop-0001 hadoop]# ./bin/yarn rmadmin -getServiceState rm2
[root@hadoop-0001 hadoop]# ./bin/hdfs dfsadmin -report    // should list three DataNodes
[root@hadoop-0001 hadoop]# ./bin/yarn node -list    // should list three NodeManagers
[root@hadoop-0001 hadoop]# ./bin/hadoop fs -mkdir /input    // access the cluster file system
[root@hadoop-0001 hadoop]# ./bin/hadoop fs -ls /
[root@hadoop-0001 hadoop]# ./sbin/hadoop-daemon.sh stop namenode    // stop the active NameNode; the standby should take over as Active (failover test)
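After stopping the Active NameNode, the two -getServiceState calls should swap answers. A small sketch for picking the active node out of collected state output (the helper is hypothetical; it reads `name state` pairs such as those produced by looping over haadmin/rmadmin results):

```shell
#!/bin/sh
# Print the names whose reported HA state is "active".
# Input: lines of "<name> <state>", e.g. collected with
#   for nn in nn1 nn2; do echo "$nn $(./bin/hdfs haadmin -getServiceState $nn)"; done
pick_active() {
  awk '$2 == "active" {print $1}'
}

printf 'nn1 standby\nnn2 active\n' | pick_active   # -> nn2
```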