Cloudera Manager recovery: restoring an accidentally deleted cloudera-scm-agent

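A while ago I was experimenting with Cloudera Manager on a test cluster and accidentally deleted cloudera-scm-agent. The cause: when I uninstalled httpd, I did not notice that cloudera-scm-agent depends on the http service, so the uninstall removed cloudera-scm-agent along with it. At the time I reinstalled cloudera-manager-agent and tried again and again, but CM simply could not discover the host. Since it was only a test cluster, I gave up and reinstalled Cloudera Manager from scratch.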

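Thinking it over afterwards, this did not make sense. In a distributed cluster, when a worker node goes down and is then restored, it should be able to reconnect to the master node by IP or hostname; a full reinstall should not be necessary. Could the problem be in the IP or hostname the agent uses to connect?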

Later I read through this node's cloudera-scm-agent.log carefully and found that it really was an IP problem:

[13/Sep/2020 05:01:33 +0800] 22503 MainThread agent        ERROR    Heartbeating to localhost:7182 failed.
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1390, in _send_heartbeat
    self.cfg.master_port)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.7/httplib.py", line 833, in connect
    self.timeout, self.source_address)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[13/Sep/2020 05:01:55 +0800] 22503 MainThread heartbeat_tracker INFO     HB stats (seconds): num:1 LIFE_MIN:0.00 min:0.00 mean:0.00 max:0.00 LIFE_MAX:0.00
           

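When cloudera-scm-agent is started on its own after the reinstall, it sends its heartbeat to localhost:7182 rather than to the IP of the server, and nothing is listening there on a worker node, hence the "Connection refused".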
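Before editing anything, it can help to confirm which server the agent is actually configured to contact. The sketch below only reads the config file; show_cm_server is a hypothetical helper name of my own, and the default path is the config.ini location used in this post. My guess is that the reinstalled package shipped a config pointing at localhost, but the log alone does not prove that, so it is worth checking.

```shell
#!/usr/bin/env bash
# Print the server_host / server_port settings the agent will use.
# show_cm_server is a hypothetical helper, not part of Cloudera's tooling.
show_cm_server() {
    # Default to the config path used in this post; pass another path to override.
    local config="${1:-/etc/cloudera-scm-agent/config.ini}"
    # Match only uncommented parameter lines that start at column 0.
    grep -E '^server_(host|port)[[:space:]]*=' "$config"
}
```

Calling show_cm_server with no argument reads /etc/cloudera-scm-agent/config.ini. If the host it prints is localhost on a node that does not run cloudera-scm-server, that matches the failed heartbeat in the log.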

So the fix is to change the cloudera-scm-server address that cloudera-scm-agent is configured to connect to:

[[email protected] cloudera-scm-agent]# vim /etc/cloudera-scm-agent/config.ini
# Configuration file for cloudera-scm-agent.
# Please note that this file supports multi-line values.  Multi-line
# values are indicated by indenting following lines with a space.
#
# If you have whitespace in front of a parameter name, it will be
# read as a continuation of the previous parameter value.  Please
# be careful not to leave spaces in front of parameter names.
#
# To check if this file has spaces in front of parameters names
# you can do a grep like this:
#  grep '^[[:blank:]]' /etc/cloudera-scm-agent/config.ini
[General]
# Hostname of the CM server.
server_host=192.168.0.171
# Port that the CM server is listening on.
server_port=7182
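If several nodes need the same change, the edit can also be done without opening vim. This is a minimal sketch, assuming GNU sed (for -i with a backup suffix) and a server_host line that starts at column 0; set_cm_server_host is a hypothetical helper name, and 192.168.0.171 in the usage note is simply the server IP from my cluster.

```shell
#!/usr/bin/env bash
# Rewrite server_host in the agent config in place, keeping a .bak copy.
# set_cm_server_host is a hypothetical helper, not part of Cloudera's tooling.
set_cm_server_host() {
    local host="$1"
    local config="${2:-/etc/cloudera-scm-agent/config.ini}"

    if [ -z "$host" ]; then
        echo "usage: set_cm_server_host <server-host> [config-file]" >&2
        return 1
    fi

    # Replace the whole server_host line; the original file is saved as <config>.bak.
    sed -i.bak -E "s/^server_host[[:space:]]*=.*/server_host=${host}/" "$config"

    # Print the resulting line so the change can be checked by eye.
    grep -E '^server_host=' "$config"
}
```

For example, set_cm_server_host 192.168.0.171 rewrites the default config file. Afterwards the grep '^[[:blank:]]' check suggested in the file's own comments is still worth running, because an indented parameter would be read as a continuation of the previous value.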
           

Then restart cloudera-scm-agent and the host is picked up again:

systemctl restart cloudera-scm-agent
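To confirm the restart actually helped, I would look at the tail of the agent log for heartbeat failures. The helper below is a rough sketch: recent_failed_targets is a hypothetical name, the match string is taken from the error line quoted earlier, and the default log path /var/log/cloudera-scm-agent/cloudera-scm-agent.log is an assumption about where the agent writes its log, so adjust it if your install differs.

```shell
#!/usr/bin/env bash
# List the host:port targets that recent heartbeats failed to reach.
# recent_failed_targets is a hypothetical helper; the default log path is an assumption.
recent_failed_targets() {
    local log="${1:-/var/log/cloudera-scm-agent/cloudera-scm-agent.log}"
    local lines="${2:-200}"

    # Inspect only the last N lines so failures from before the restart age out.
    tail -n "$lines" "$log" \
        | grep -oE 'Heartbeating to [^ ]+ failed' \
        | awk '{print $3}' \
        | sort -u
}
```

Empty output means no failed heartbeat appears in the inspected window. If localhost:7182 still shows up in fresh log lines, the agent is probably still reading the old server_host value.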