2023-09-12 14:00 to 2023-09-13 20:06
Contents
00. Software Versions
01. Deploying Hadoop on Alibaba Cloud Servers
1.1. Modify the Four Configuration Files
1.1.1. core-site.xml
1.1.2. hdfs-site.xml
1.1.3. mapred-site.xml
1.1.4. yarn-site.xml
1.2. Modify /etc/hosts and the System Environment Variables
1.2.1. Modify the Hostname Resolution File /etc/hosts
1.2.2. Modify the System Environment Variables in /etc/profile.d/my_env.sh
02. Deploying Elasticsearch on Alibaba Cloud Servers
2.1. Identical Steps on All Three Nodes
2.2. Modify the elasticsearch.yml File
00. Software Versions
Environment and software versions:
- CentOS 7
- jdk-1.8
- hadoop-3.3.4
- elasticsearch-7.17.6
01. Deploying Hadoop on Alibaba Cloud Servers
Install hadoop-3.3.4 by following the Atguigu tutorial (尚硅谷大数据技术之Hadoop.docx).
1.1. Modify the Four Configuration Files
All four files live under /opt/module/hadoop/hadoop-3.3.4/etc/hadoop.
1.1.1. core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- NameNode address -->
    <property><name>fs.defaultFS</name><value>hdfs://bd1:8020</value></property>
    <!-- Hadoop data storage directory -->
    <property><name>hadoop.tmp.dir</name><value>/opt/module/hadoop/hadoop-3.3.4/data</value></property>
    <!-- Static user for HDFS web UI login -->
    <property><name>hadoop.http.staticuser.user</name><value>xxh</value></property>
    <!-- Hosts from which the xxh (superuser) proxy user may connect -->
    <property><name>hadoop.proxyuser.xxh.hosts</name><value>*</value></property>
    <!-- Groups the xxh (superuser) proxy user may impersonate -->
    <property><name>hadoop.proxyuser.xxh.groups</name><value>*</value></property>
    <!-- Users the xxh (superuser) proxy user may impersonate -->
    <property><name>hadoop.proxyuser.xxh.users</name><value>*</value></property>
</configuration>
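A quick way to confirm the NameNode address is picked up, once the environment variables from 1.2.2 are in place (a minimal check against the file above):
[root@bd1 ~]# hdfs getconf -confKey fs.defaultFS   # should print hdfs://bd1:8020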
1.1.2. hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- NameNode web UI address -->
    <property><name>dfs.namenode.http-address</name><value>bd1:9870</value></property>
    <!-- Secondary NameNode web UI address -->
    <property><name>dfs.namenode.secondary.http-address</name><value>bd3:9868</value></property>
    <!-- HDFS replication factor -->
    <property><name>dfs.replication</name><value>3</value></property>
    <!-- Disable HDFS file permission checks -->
    <property><name>dfs.permissions</name><value>false</value></property>
</configuration>
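On Alibaba Cloud the web UI ports (9870 on bd1, 9868 on bd3) also have to be opened in the security group. Once HDFS is running (see the start commands after 1.1.4), a minimal reachability check looks like this:
[root@bd1 ~]# curl -s -o /dev/null -w "%{http_code}\n" http://bd1:9870   # NameNode web UI, expect 200
[root@bd1 ~]# curl -s -o /dev/null -w "%{http_code}\n" http://bd3:9868   # SecondaryNameNode web UI, expect 200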
1.1.3. mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- Run MapReduce jobs on YARN -->
    <property><name>mapreduce.framework.name</name><value>yarn</value></property>
    <!-- JobHistory server address -->
    <property><name>mapreduce.jobhistory.address</name><value>bd1:10020</value></property>
    <!-- JobHistory server web UI address -->
    <property><name>mapreduce.jobhistory.webapp.address</name><value>bd1:19888</value></property>
</configuration>
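The JobHistory addresses above only answer once the history server is started on bd1. In Hadoop 3.x that is done with (after the cluster is up):
[root@bd1 ~]# mapred --daemon start historyserver
[root@bd1 ~]# jps | grep JobHistoryServer   # confirm the process is running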
1.1.4. yarn-site.xml
<?xml version="1.0"?>
<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- Use the MapReduce shuffle auxiliary service -->
    <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
    <!-- ResourceManager address -->
    <property><name>yarn.resourcemanager.hostname</name><value>bd2</value></property>
    <!-- Environment variables inherited by containers -->
    <property><name>yarn.nodemanager.env-whitelist</name><value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value></property>
    <!-- Minimum and maximum memory a YARN container may be allocated -->
    <property><name>yarn.scheduler.minimum-allocation-mb</name><value>512</value></property>
    <property><name>yarn.scheduler.maximum-allocation-mb</name><value>4096</value></property>
    <!-- Physical memory the NodeManager may manage -->
    <property><name>yarn.nodemanager.resource.memory-mb</name><value>4096</value></property>
    <!-- Disable YARN physical/virtual memory limit checks -->
    <property><name>yarn.nodemanager.pmem-check-enabled</name><value>false</value></property>
    <property><name>yarn.nodemanager.vmem-check-enabled</name><value>false</value></property>
    <!-- Enable log aggregation -->
    <property><name>yarn.log-aggregation-enable</name><value>true</value></property>
    <!-- Log aggregation server URL -->
    <property><name>yarn.log.server.url</name><value>http://bd1:19888/jobhistory/logs</value></property>
    <!-- Keep aggregated logs for 7 days -->
    <property><name>yarn.log-aggregation.retain-seconds</name><value>604800</value></property>
</configuration>
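With all four files edited and copied to bd2 and bd3, the cluster can be brought up. A minimal first-start sequence, assuming the workers file lists bd1/bd2/bd3 and passwordless SSH is configured as in the Atguigu tutorial:
[root@bd1 ~]# hdfs namenode -format   # only before the very first start
[root@bd1 ~]# start-dfs.sh            # NameNode on bd1, SecondaryNameNode on bd3
[root@bd2 ~]# start-yarn.sh           # ResourceManager is configured on bd2
[root@bd1 ~]# jps                     # each node should show its expected daemons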
1.2. Modify /etc/hosts and the System Environment Variables
1.2.1. Modify the Hostname Resolution File /etc/hosts
[root@bd1 ~]# vim /etc/hosts
# Public (external) IP addresses
x.x.x.x bd1
x.x.x.x bd2
x.x.x.x bd3
# Private (internal) IP addresses (look them up with the ifconfig command)
x.x.x.x bd1
x.x.x.x bd2
x.x.x.x bd3
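After saving, a quick check that the hostnames resolve and the nodes can reach each other:
[root@bd1 ~]# ping -c 3 bd2
[root@bd1 ~]# ping -c 3 bd3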
1.2.2. Modify the System Environment Variables in /etc/profile.d/my_env.sh
[root@bd1 ~]# vim /etc/profile.d/my_env.sh
# HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop/hadoop-3.3.4
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

# Hadoop run-as-user settings (crucial: these let the root user run Hadoop directly)
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root

# JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_212
export PATH=$PATH:$JAVA_HOME/bin

# zookeeper
export ZK_HOME=/opt/module/zookeeper
export PATH=$ZK_HOME/bin:$PATH

# KAFKA_HOME
export KAFKA_HOME=/opt/module/kafka
export PATH=$PATH:$KAFKA_HOME/bin

# extra tools
export PATH=$PATH:/opt/software/tool
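Reload the profile so the new variables take effect in the current shell, then verify:
[root@bd1 ~]# source /etc/profile
[root@bd1 ~]# hadoop version   # should report 3.3.4
[root@bd1 ~]# java -version    # should report 1.8.0_212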
02. Deploying Elasticsearch on Alibaba Cloud Servers
Elasticsearch installation tutorials:
- Linux搭建es集群详细教程(最终版)_es集群搭建_Nick丶Xin的博客-CSDN博客
- Linux安装elk_upward337的博客-CSDN博客
- [2020-04-06T12:57:13,793][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-1] uncaught exce_Lan_Se_Tian_Ma的博客-CSDN博客
2.1. Identical Steps on All Three Nodes
On every server in the three-node cluster:
- Create the es user: useradd es, passwd es
- Install Elasticsearch: tar -zxvf elasticsearch-7.17.6-linux-x86_64.tar.gz -C /opt/module/es/
- Change ownership of the Elasticsearch directory: chown -R es:es /opt/module/es/
- Edit several system configuration files: vi /etc/security/limits.conf, vi /etc/security/limits.d/20-nproc.conf, vi /etc/sysctl.conf (typical values are sketched after this list)
- Edit /opt/module/es/elasticsearch-7.17.6/config/jvm.options.
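The note does not record the exact values for those system files; a typical set based on the standard Elasticsearch requirements (adjust to your own sizing) is:
# /etc/security/limits.conf — raise file-descriptor and process limits for the es user
es soft nofile 65536
es hard nofile 65536
es soft nproc 4096
es hard nproc 4096
# /etc/security/limits.d/20-nproc.conf — CentOS 7 caps nproc at 4096 by default
* soft nproc 4096
# /etc/sysctl.conf — Elasticsearch requires vm.max_map_count >= 262144
vm.max_map_count=262144
# apply the sysctl change with: sysctl -p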
To start Elasticsearch, switch to the es user and launch it in the background on each node with the following commands (a quick sanity check follows):
- [es@bd1 root]$ nohup /opt/module/es/elasticsearch-7.17.6/bin/elasticsearch &  # run elasticsearch in the background
- [es@bd2 root]$ nohup /opt/module/es/elasticsearch-7.17.6/bin/elasticsearch &  # run elasticsearch in the background
- [es@bd3 root]$ nohup /opt/module/es/elasticsearch-7.17.6/bin/elasticsearch &  # run elasticsearch in the background
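To confirm each process came up, watch the startup log or hit the HTTP port (assuming bd1 resolves to the address the node listens on, per the network.host setting in 2.2):
[es@bd1 root]$ tail -f nohup.out       # watch startup messages
[es@bd1 root]$ curl http://bd1:9200    # should return JSON with the node name and version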
2.2. Modify the elasticsearch.yml File
Edit elasticsearch.yml (/opt/module/es/elasticsearch-7.17.6/config/elasticsearch.yml) on every server. The following two parameters must differ on each server:
- node.name: node-1  # node name; must be unique per node
- network.host: <internal IP address>  # each node uses its own private IP
# /opt/module/es/elasticsearch-7.17.6/config/elasticsearch.yml
# Add the following settings to elasticsearch.yml

# Cluster name
cluster.name: cluster-es-7.17.6
# Node name; must be unique per node
node.name: node-1
# Internal IP address; must be unique per node
network.host: <internal IP address>
# Whether this node is eligible to be elected master
node.master: true
node.data: true
# HTTP port
http.port: 9200
# Transport (inter-node communication) port
transport.port: 9300
# Data and log paths
path.data: /opt/module/es/elasticsearch-7.17.6/data
path.logs: /opt/module/es/elasticsearch-7.17.6/logs
# The head plugin needs these two settings enabled
http.cors.allow-origin: "*"
http.cors.enabled: true
http.max_content_length: 200mb
# New in es7.x: needed to elect a master when bootstrapping a new cluster
cluster.initial_master_nodes: ["node-1"]
# New in es7.x: node discovery
discovery.seed_hosts: ["bd1:9300","bd2:9300","bd3:9300"]
gateway.recover_after_nodes: 2
network.tcp.keep_alive: true
network.tcp.no_delay: true
transport.tcp.compress: true
# Number of concurrent shard rebalancing tasks allowed cluster-wide (default 2)
cluster.routing.allocation.cluster_concurrent_rebalance: 16
# Number of concurrent shard recoveries per node when adding/removing nodes or rebalancing
cluster.routing.allocation.node_concurrent_recoveries: 16
# Number of concurrent initial primary-shard recoveries per node (default 4)
cluster.routing.allocation.node_initial_primaries_recoveries: 16
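Once all three nodes are configured and restarted, the cluster state can be checked from any node with the standard Elasticsearch APIs (again assuming bd1 resolves to a reachable address):
[es@bd1 root]$ curl http://bd1:9200/_cat/nodes?v              # should list all three nodes
[es@bd1 root]$ curl http://bd1:9200/_cluster/health?pretty    # status should be green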
😊😘 Keep it up~