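Hadoop Cluster Deployment Tutorial, Part 6
Hadoop Cluster Deployment Tutorial (Continued)
Chapter 21: Monitoring and Alerting Integration
21.1 Setting Up the Prometheus Monitoring Stack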
- Exporter deployment:

    # Deploy the HDFS exporter (confirm the release URL and version before downloading)
    wget https://github.com/prometheus/hdfs_exporter/releases/download/v1.1.6/hdfs_exporter-1.1.6.linux-amd64.tar.gz
    tar -xzf hdfs_exporter-1.1.6.linux-amd64.tar.gz
    nohup ./hdfs_exporter --namenode.address=master:9870 &
- Key metrics to monitor:
  - HDFS storage capacity utilization
  - DataNode liveness
  - YARN resource allocation rate
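For a quick spot check of the first metric without an exporter, the usage percentage can be read from the `DFS Used%` line that `hdfs dfsadmin -report` prints. The sketch below assumes that line format and that the first match is the cluster-wide summary; the function name `parse_dfs_used_percent` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the cluster-wide "DFS Used%" value from
# `hdfs dfsadmin -report` output supplied on stdin.
parse_dfs_used_percent() {
    # Assumption: the first "DFS Used%" line is the cluster summary;
    # later matches belong to individual DataNodes, so stop after one.
    awk -F': ' '/^DFS Used%/ { gsub(/%/, "", $2); print $2; exit }'
}
```

On a cluster node this would be used as `hdfs dfsadmin -report | parse_dfs_used_percent`.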
21.2 Grafana Visualization Setup
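- Importing the dashboard template:

    # The Hadoop template (ID: 12239) is imported by its ID in the Grafana UI;
    # the command below only installs the pie chart panel plugin
    grafana-cli plugins install grafana-piechart-panel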
- Example alert rule:

    # alert_rules.yml
    groups:
      - name: HDFS-Alerts
        rules:
          - alert: HDFSStorageCritical
            expr: hdfs_capacity_used_percent > 90
            for: 5m
            labels:
              severity: critical
Chapter 22: Backup and Disaster Recovery
22.1 Metadata Backup Strategy
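- NameNode metadata backup:

    # Fetch the latest fsimage as a checkpoint backup
    hdfs dfsadmin -fetchImage /backup/namenode/latest.fsimage
    # Periodically roll the edit log[^1] (closes the current segment and starts a new one)
    hdfs dfsadmin -rollEdits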
- Automated backup script:

    #!/bin/bash
    BACKUP_DIR="/backup/$(date +%Y%m%d)"
    mkdir -p "$BACKUP_DIR"
    hdfs dfsadmin -fetchImage "$BACKUP_DIR"
    scp -r "$BACKUP_DIR" secondary_nn:/remote_backup/
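The backup script copies the directory off-host without checking that the fetch produced anything. A minimal sketch of two helpers that could guard the `scp` step follows. The function names are hypothetical, and the check assumes `-fetchImage` writes a file named `fsimage_<txid>` into the target directory.

```shell
#!/bin/bash
# Hypothetical helpers; the /backup root and YYYYMMDD stamp mirror the script above.
backup_dir_for_date() {
    # $1: date stamp in YYYYMMDD form
    printf '/backup/%s\n' "$1"
}

# Succeed only if the directory holds at least one non-empty fsimage_* file.
# Assumption: the fetched image keeps the NameNode's fsimage_<txid> file name.
has_fsimage() {
    local dir="$1" f
    for f in "$dir"/fsimage_*; do
        [ -s "$f" ] && return 0
    done
    return 1
}
```

In the script this might read `has_fsimage "$BACKUP_DIR" || exit 1` just before the `scp` line, so an empty directory is never shipped as a backup.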
22.2 Data Recovery Procedure
- Disaster recovery steps:

    graph TD
        A[Stop cluster services] --> B[Restore fsimage]
        B --> C[Apply edits log]
        C --> D[Start NameNode]
        D --> E[Verify data integrity]
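The last step, verifying data integrity, can be scripted by checking the summary line of `hdfs fsck`. The sketch below assumes the summary contains the phrase `is HEALTHY` for a healthy filesystem; the function name `fsck_is_healthy` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read `hdfs fsck` output on stdin and succeed only if
# the summary reports the filesystem as HEALTHY.
fsck_is_healthy() {
    grep -q "is HEALTHY"
}
```

After the NameNode is back up, `hdfs fsck / | fsck_is_healthy` returns a non-zero status if the expected summary line is missing, which makes it usable as a gate in a recovery runbook.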
Chapter 23: Performance Tuning in Practice
23.1 MapReduce Parameter Tuning
- Memory sizing formula:

    mapreduce.map.memory.mb = min(
        yarn.nodemanager.resource.memory-mb / containers-per-node,
        8192  # 8 GB, a rule-of-thumb upper bound
    )
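To make the formula concrete, here is a small sketch that evaluates it in MB. The function name `map_memory_mb` and the sample node sizes are illustrative assumptions, not values from a real cluster.

```shell
#!/bin/bash
# Hypothetical helper: apply the min() rule of thumb above, in MB.
# $1: yarn.nodemanager.resource.memory-mb   $2: containers per node
map_memory_mb() {
    local cap_mb=8192                      # 8 GB cap from the formula
    local per_container=$(( $1 / $2 ))     # integer division
    if [ "$per_container" -lt "$cap_mb" ]; then
        echo "$per_container"
    else
        echo "$cap_mb"
    fi
}

map_memory_mb 65536 16   # 64 GB node, 16 containers: prints 4096
map_memory_mb 65536 4    # 64 GB node, 4 containers: capped, prints 8192
```

With few containers per node the raw quotient exceeds the cap, so the cap decides the result; with many containers the quotient does.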
- Shuffle phase tuning:

    <!-- mapred-site.xml -->
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>512</value> <!-- larger sort buffer -->
    </property>
    <property>
      <name>mapreduce.reduce.shuffle.parallelcopies</name>
      <value>20</value> <!-- more parallel copy threads -->
    </property>
23.2 Hardware-Level Recommendations
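- Disk configuration:

| Disk type | Use case | RAID level |
|-----------|----------|------------|
| SSD | JournalNode | RAID1 |
| HDD | DataNode storage | JBOD |

- Network topology optimization:

    # Configure rack awareness
    /etc/hadoop/conf/topology.sh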
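Hadoop locates the rack-awareness script through the `net.topology.script.file.name` property in `core-site.xml`, calls it with one or more IP addresses or hostnames as arguments, and reads one rack path per argument from standard output. The tutorial does not show the script's contents, so the following is only a sketch of what `topology.sh` might look like; the subnets and rack names are hypothetical.

```shell
#!/bin/bash
# Sketch of a rack-awareness script. The subnet-to-rack mapping below is
# hypothetical and must be replaced with the real network layout.
resolve_rack() {
    case "$1" in
        10.0.1.*) echo "/dc1/rack1" ;;
        10.0.2.*) echo "/dc1/rack2" ;;
        *)        echo "/default-rack" ;;   # fallback for unknown hosts
    esac
}

# Emit one rack path per argument, in the order the arguments were given.
for host in "$@"; do
    resolve_rack "$host"
done
```

The script needs to be executable by the user running the NameNode, and a fallback branch keeps unknown hosts from producing empty output.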
Chapter 24: Version Migration Guide
24.1 Rolling Upgrade Procedure
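- Compatibility checklist:

    # Check the installed HDFS version
    hdfs version
    # Review storage types in the cluster report
    hdfs dfsadmin -report | grep 'Storage type'
    # Check native library availability
    hadoop checknative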
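As part of the checklist it can help to compare the running version with the target version before starting. The sketch below assumes the first line of `hadoop version` has the form `Hadoop X.Y.Z`; both function names are hypothetical, and a major-version mismatch is treated only as a signal for closer review, not as proof of incompatibility.

```shell
#!/bin/bash
# Hypothetical helper: extract "X.Y.Z" from `hadoop version` output on stdin.
parse_hadoop_version() {
    awk '/^Hadoop [0-9]/ { print $2; exit }'
}

# Hypothetical helper: succeed if two version strings share a major version.
# $1: current version   $2: target version
same_major_version() {
    [ "${1%%.*}" = "${2%%.*}" ]
}
```

For example, `same_major_version "$(hadoop version | parse_hadoop_version)" 3.3.6 || echo "major version change: review release notes"`.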
- Phased upgrade steps:

    # Phase 1: upgrade the client/tool nodes
    sudo yum upgrade hadoop-client
    # Phase 2: upgrade the DataNodes (-y because pdsh runs non-interactively)
    pdsh -w datanode[1-10] "sudo yum upgrade -y hadoop-hdfs-datanode"
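The `pdsh` command in phase 2 targets all ten DataNodes at once. For an upgrade that is rolling in the strict sense, the same command can be issued one host at a time. The sketch below only prints the commands so they can be reviewed first; the function name `plan_datanode_upgrade` is hypothetical, and the `datanodeN` host names follow the pattern used above.

```shell
#!/bin/bash
# Hypothetical helper: print (do not execute) one upgrade command per DataNode.
# $1: first host index   $2: last host index
plan_datanode_upgrade() {
    local i="$1" last="$2"
    while [ "$i" -le "$last" ]; do
        # -y keeps yum from waiting for confirmation in a non-interactive shell
        printf 'pdsh -w datanode%s "sudo yum upgrade -y hadoop-hdfs-datanode"\n' "$i"
        i=$(( i + 1 ))
    done
}
```

Running `plan_datanode_upgrade 1 10` lists ten commands. Executing them in sequence, with a health check between hosts, limits how many DataNodes are offline at any moment.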
24.2 Rollback Mechanism
- Version rollback:

    # Stop the service
    systemctl stop hadoop-yarn-resourcemanager
    # Downgrade the package
    yum downgrade hadoop-3.3.1 -y
    # Restore the configuration
    cp /backup/core-site.xml /etc/hadoop/conf/