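Scenario:
A colleague made a mistake and accidentally deleted the agent on one of the nodes.
Solution:
1. Compare the installed packages on a healthy agent node with those on the node where the agent was deleted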
yum list installed |grep cloudera
2. Comparing against the healthy server shows that the only missing package is cloudera-manager-agent.x86_64
3. List all cloudera-manager components available in the yum repository:
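The comparison in steps 1 and 2 can also be scripted instead of done by eye. The following is a minimal sketch, assuming the package lists from both hosts have already been saved to text files with one package name per line (for example with `rpm -qa --queryformat '%{NAME}.%{ARCH}\n' | grep cloudera`). The function name `missing_packages` and the file names are hypothetical.

```shell
# Print packages that appear in the healthy list but not in the broken list.
# Both arguments are files containing one package name per line.
missing_packages() {
    healthy_list=$1
    broken_list=$2
    tmp_dir=$(mktemp -d) || return 1
    # comm requires sorted input, so sort (and de-duplicate) both lists first.
    sort -u "$healthy_list" > "$tmp_dir/healthy"
    sort -u "$broken_list" > "$tmp_dir/broken"
    # -23 hides lines unique to the second file and lines common to both,
    # leaving only the lines unique to the first (healthy) file.
    comm -23 "$tmp_dir/healthy" "$tmp_dir/broken"
    rm -rf "$tmp_dir"
}
```

Usage would look like `missing_packages healthy-node.txt broken-node.txt`. Comparing names only, without versions, keeps a version mismatch between the two hosts from showing up as a missing package.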
yum search cloudera-manager
The results include cloudera-manager-agent.x86_64, which is the agent package that was deleted.
4. Install cloudera-manager-agent.x86_64:
yum install cloudera-manager-agent.x86_64
5. Start the agent on the node
systemctl start cloudera-scm-agent
Startup fails. The log shows that the agent is trying to connect to localhost:7182 rather than to the IP of the server:
[14/Jan/2020 14:02:14 +0000] 10599 MainThread agent ERROR Heartbeating to localhost:7182 failed.
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1390, in _send_heartbeat
    self.cfg.master_port)
  File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.7/httplib.py", line 833, in connect
    self.timeout, self.source_address)
  File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[14/Jan/2020 14:02:14 +0000] 10599 MainThread heartbeat_tracker INFO HB stats (seconds): num:1 LIFE_MIN:0.00 min:0.00 mean:0.00 max:0.00 LIFE_MAX:0.00
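To confirm which address the agent is heartbeating to without reading the whole log, the target can be pulled out of the error lines. This is a rough sketch that assumes the log lines have the same shape as the excerpt above. The function name `failed_heartbeat_targets` is hypothetical, and the log location is not stated here (it is commonly /var/log/cloudera-scm-agent/cloudera-scm-agent.log, but treat that path as an assumption).

```shell
# Print each distinct host:port that the agent failed to heartbeat to,
# as recorded in the given agent log file.
failed_heartbeat_targets() {
    log_file=$1
    # Match lines such as
    #   "... agent ERROR Heartbeating to localhost:7182 failed."
    # and keep only the host:port part; other lines are ignored.
    sed -n 's/.*Heartbeating to \([^ ]*\) failed\..*/\1/p' "$log_file" | sort -u
}
```

If the output is `localhost:7182` on a node that is not the Cloudera Manager server, the agent configuration is pointing at the wrong host.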
6. Edit the cloudera-scm-agent configuration so that it points to the cloudera-scm-server host
vi /etc/cloudera-scm-agent/config.ini
# Set this to the IP of the Cloudera Manager (CM) server.
server_host=192.168.2.111
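The edit in step 6 can be made without an interactive editor, which helps if several nodes need the same fix. The following is a sketch only, assuming the file already contains a `server_host=` line as shown above. The function name `set_server_host` is hypothetical, and it is worth backing up config.ini before running anything like this.

```shell
# Set server_host in an agent config file to the given host name or IP.
# If the file has no server_host line, it is left unchanged.
set_server_host() {
    config_file=$1
    new_host=$2
    tmp_file=$(mktemp) || return 1
    # Rewrite only the server_host line; every other line passes through as-is.
    sed "s/^server_host[[:space:]]*=.*/server_host=${new_host}/" "$config_file" > "$tmp_file" || return 1
    # Copy the contents back so the original file keeps its owner and permissions.
    cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
}
```

For the case described here, the call would be `set_server_host /etc/cloudera-scm-agent/config.ini 192.168.2.111`, run as a user allowed to write that file.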
7. Restart the agent service
systemctl restart cloudera-scm-agent