Elasticsearch Cluster Configuration: Node Role Separation and Hot-Warm Architecture in Practice

Preface

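This article covers the node role separation you can consider when deploying Elasticsearch (ES) nodes. If you do not plan for it, every node takes on several roles at once (master-eligible, data, coordinating, and so on), which makes later maintenance and scaling harder. The article therefore starts with an introduction to the node roles and then walks through a Hot-Warm architecture, to give you a concrete picture of a role-separated ES cluster. Note that this article is only a starting point: it tells you that the feature exists. After reading it, take any further questions directly to the official Elasticsearch Nodes documentation, which is thorough enough to answer most of them. Many of the role descriptions below are taken from that documentation, lightly reorganized and left in the original English. I first planned to translate them, then decided it was unnecessary, since reading English documentation is routine for programmers.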

One role per node (Elasticsearch 8.14.3)

Node roles

You define a node’s roles by setting node.roles in elasticsearch.yml. If you set node.roles, the node is only assigned the roles you specify. If you don’t set node.roles, the node is assigned the following roles:

  • master
  • data
  • data_content
  • data_hot
  • data_warm
  • data_cold
  • data_frozen
  • ingest
  • ml
  • remote_cluster_client
  • transform

If you set node.roles, ensure you specify every node role your cluster needs. Every cluster requires the following node roles:

  • master
  • (data_content and data_hot) OR data
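The rule above can be checked mechanically before a cluster is started. The following is a minimal sketch, not an Elasticsearch API: the function name `missing_required_roles` and the node names in `cluster` are assumptions made for illustration, and the role lists mirror the ones used later in this article.

```python
from typing import Iterable


def missing_required_roles(nodes: dict[str, Iterable[str]]) -> list[str]:
    """Return a list of problems; an empty list means the rule is satisfied."""
    # Union of every role declared anywhere in the cluster.
    all_roles = {role for roles in nodes.values() for role in roles}
    problems = []
    if "master" not in all_roles:
        problems.append("no node has the master role")
    has_generic_data = "data" in all_roles
    has_content_and_hot = {"data_content", "data_hot"} <= all_roles
    if not (has_generic_data or has_content_and_hot):
        problems.append("need data, or both data_content and data_hot")
    return problems


# Hypothetical layout: one node per role group.
cluster = {
    "es-master-01": ["master"],
    "es-coordinating-only-01": [],
    "es-data-hot-01": ["data_hot"],
    "es-data-content-01": ["data_content"],
}
print(missing_required_roles(cluster))
```

A cluster with only `master` and `data_hot` nodes would be reported as incomplete, because neither `data` nor `data_content` is present.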

Master-eligible node

The master node is responsible for lightweight cluster-wide actions such as:

  • creating or deleting an index
  • tracking which nodes are part of the cluster
  • deciding which shards to allocate to which nodes

It is important for cluster health to have a stable master node.

To create a dedicated master-eligible node, set:

node.roles: [ master ]

Coordinating Node

Requests like search requests or bulk-indexing requests may involve data held on different data nodes. A search request, for example, is executed in two phases which are coordinated by the node which receives the client request — the coordinating node.

  1. In the scatter phase, the coordinating node forwards the request to the data nodes which hold the data. Each data node executes the request locally and returns its results to the coordinating node.
  2. In the gather phase, the coordinating node reduces each data node’s results into a single global result set.
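The two phases can be illustrated with a toy model. This sketch is an assumption-laden simplification: shards are plain Python lists, the score is just a term count rather than Elasticsearch's relevance scoring, and the names `scatter` and `gather` are chosen here for illustration only.

```python
import heapq


def scatter(shards, query_term, size):
    """Scatter phase: every shard scores its own documents and returns its local top hits."""
    partial_results = []
    for shard in shards:
        # Toy score: how often the term occurs in the document text.
        hits = [(doc["text"].count(query_term), doc["id"]) for doc in shard]
        hits = [hit for hit in hits if hit[0] > 0]
        partial_results.append(heapq.nlargest(size, hits))
    return partial_results


def gather(partial_results, size):
    """Gather phase: reduce the per-shard lists into one global top-`size` list."""
    merged = [hit for hits in partial_results for hit in hits]
    return heapq.nlargest(size, merged)


# Two hypothetical shards held by different data nodes.
shards = [
    [{"id": "a", "text": "snow snow crash"}, {"id": "b", "text": "crash"}],
    [{"id": "c", "text": "snow"}, {"id": "d", "text": "snow snow snow"}],
]
top = gather(scatter(shards, "snow", size=2), size=2)
print(top)
```

The gather step holds every shard's partial result in memory at once, which is why a coordinating node needs enough memory and CPU for it.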

Every node is implicitly a coordinating node. This means that a node that has an explicit empty list of roles via node.roles will only act as a coordinating node, which cannot be disabled. As a result, such a node needs to have enough memory and CPU in order to deal with the gather phase.

To create a dedicated coordinating node, set:

node.roles: [ ]

Data Node

Data nodes hold the shards that contain the documents you have indexed. Data nodes handle data related operations like CRUD, search, and aggregations. These operations are I/O-, memory-, and CPU-intensive. It is important to monitor these resources and to add more data nodes if they are overloaded.

In a multi-tier deployment architecture, you use specialized data roles to assign data nodes to specific tiers: data_content, data_hot, data_warm, data_cold, or data_frozen. A node can belong to multiple tiers.

If you want to include a node in all tiers, or if your cluster does not use multiple tiers, then you can use the generic data role.
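The relationship between roles and tiers described above can be sketched as a small lookup. This is an illustrative helper, not part of Elasticsearch; the name `tiers_for` is an assumption, and the generic `data` role is expanded to every tier as the text states.

```python
# The specialized data roles, one per tier.
TIER_ROLES = ("data_content", "data_hot", "data_warm", "data_cold", "data_frozen")


def tiers_for(roles):
    """Return the set of data tiers a node with these roles belongs to."""
    roles = set(roles)
    if "data" in roles:
        # The generic data role places the node in all tiers.
        return set(TIER_ROLES)
    # Otherwise only the explicitly listed tier roles count.
    return roles & set(TIER_ROLES)


print(sorted(tiers_for(["data_hot", "data_content", "ingest"])))
print(sorted(tiers_for(["data"])))
print(sorted(tiers_for([])))
```

A node with an empty role list, or with only `master`, belongs to no tier and therefore holds no index data.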

Generic data node

Generic data nodes are included in all content tiers.

To create a dedicated generic data node, set:

node.roles: [ data ]

Content data node

Content data nodes are part of the content tier. Data stored in the content tier is generally a collection of items such as a product catalog or article archive. Content data typically has long data retention requirements, and you want to be able to retrieve items quickly regardless of how old they are. Content tier nodes are usually optimized for query performance; they prioritize processing power over IO throughput so they can process complex searches and aggregations and return results quickly.

The content tier is required and is often deployed within the same node grouping as the hot tier. System indices and other indices that aren’t part of a data stream are automatically allocated to the content tier.

To create a dedicated content node, set:

node.roles: ["data_content"]

Hot data node

Hot data nodes are part of the hot tier. The hot tier is the Elasticsearch entry point for time series data and holds your most-recent, most-frequently-searched time series data. Nodes in the hot tier need to be fast for both reads and writes, which requires more hardware resources and faster storage (SSDs). For resiliency, indices in the hot tier should be configured to use one or more replicas.

The hot tier is required. New indices that are part of a data stream are automatically allocated to the hot tier.

To create a dedicated hot node, set:

node.roles: [ "data_hot" ]

Warm data node

Warm data nodes are part of the warm tier. Time series data can move to the warm tier once it is being queried less frequently than the recently-indexed data in the hot tier. The warm tier typically holds data from recent weeks. Updates are still allowed, but likely infrequent. Nodes in the warm tier generally don’t need to be as fast as those in the hot tier. For resiliency, indices in the warm tier should be configured to use one or more replicas.

To create a dedicated warm node, set:

node.roles: [ "data_warm" ]

Cold data node

Cold data nodes are part of the cold tier. When you no longer need to search time series data regularly, it can move from the warm tier to the cold tier. While still searchable, this tier is typically optimized for lower storage costs rather than search speed.

For better storage savings, you can keep fully mounted indices of searchable snapshots on the cold tier. Unlike regular indices, these fully mounted indices don’t require replicas for reliability. In the event of a failure, they can recover data from the underlying snapshot instead. This potentially halves the local storage needed for the data. A snapshot repository is required to use fully mounted indices in the cold tier. Fully mounted indices are read-only.

Alternatively, you can use the cold tier to store regular indices with replicas instead of using searchable snapshots. This lets you store older data on less expensive hardware but doesn’t reduce required disk space compared to the warm tier.
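The "potentially halves" claim is simple arithmetic, which the following sketch makes explicit. The function name and the 500 GB figure are assumptions for illustration; the model only counts full copies of the index data and ignores any other overhead.

```python
def local_storage_gb(primary_gb, replicas, fully_mounted):
    """Estimate local disk needed for one index in the cold tier."""
    # A fully mounted index relies on the snapshot for recovery, so no replica copies.
    copies = 1 if fully_mounted else 1 + replicas
    return primary_gb * copies


regular = local_storage_gb(500, replicas=1, fully_mounted=False)
mounted = local_storage_gb(500, replicas=1, fully_mounted=True)
print(regular, mounted)
```

With one replica, the regular index needs two copies on local disk and the fully mounted index needs one, which is where the factor of two comes from.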

To create a dedicated cold node, set:

node.roles: [ "data_cold" ]

Frozen data node

Frozen data nodes are part of the frozen tier. Once data is no longer being queried, or being queried rarely, it may move from the cold tier to the frozen tier where it stays for the rest of its life.

The frozen tier requires a snapshot repository. The frozen tier uses partially mounted indices to store and load data from a snapshot repository. This reduces local storage and operating costs while still letting you search frozen data. Because Elasticsearch must sometimes fetch frozen data from the snapshot repository, searches on the frozen tier are typically slower than on the cold tier.

To create a dedicated frozen node, set:

node.roles: [ "data_frozen" ]

WARNING: Adding too many coordinating only nodes to a cluster can increase the burden on the entire cluster because the elected master node must await acknowledgement of cluster state updates from every node! The benefit of coordinating only nodes should not be overstated — data nodes can happily serve the same purpose.

Ingest Node

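An ingest node pre-processes and transforms documents before they are indexed. It supports ingest pipelines, which you can use to filter, transform, or otherwise modify incoming data.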
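Conceptually, a pipeline is an ordered list of processors applied to each document. The sketch below is a toy model in plain Python and does not call Elasticsearch; the helper names `lowercase`, `drop_if`, and `run_pipeline` are assumptions that only loosely echo the idea of transform and filter steps.

```python
def lowercase(field):
    """Build a processor that lowercases one field (a transform step)."""
    def processor(doc):
        doc[field] = doc[field].lower()
        return doc
    return processor


def drop_if(predicate):
    """Build a processor that discards documents matching the predicate (a filter step)."""
    def processor(doc):
        return None if predicate(doc) else doc
    return processor


def run_pipeline(processors, doc):
    """Apply processors in order; return None as soon as a document is dropped."""
    doc = dict(doc)  # work on a copy so the caller's document is untouched
    for processor in processors:
        doc = processor(doc)
        if doc is None:
            return None
    return doc


pipeline = [drop_if(lambda d: d["page_count"] == 0), lowercase("author")]
kept = run_pipeline(pipeline, {"author": "Neal Stephenson", "page_count": 470})
dropped = run_pipeline(pipeline, {"author": "Unknown", "page_count": 0})
print(kept, dropped)
```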

Hot-Warm architecture in practice

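Deploy 3 master-eligible nodes, 2 coordinating-only nodes, 2 data_content nodes (acting as the warm nodes), and 3 data_hot nodes. This is roughly the minimum configuration if you want to separate node roles.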


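  • Dedicated master-eligible nodes: manage the cluster state
    • Low-spec CPU, RAM, and disk
  • Dedicated data nodes: store data and handle client requests
    • High-spec CPU, RAM, and disk
  • Dedicated ingest nodes: handle data processing
    • High-spec CPU; mid-spec RAM; low-spec disk
  • Dedicated coordinating-only nodes (client nodes)
    • High-spec CPU; high-spec RAM; low-spec disk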

Docker installation script
docker network create elastic

docker run -d ^
  --name es-master-01 ^
  --hostname es-master-01 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9200:9200 ^
  -e node.name=es-master-01 ^
  -e node.roles=["master"] ^
  -e cluster.initial_master_nodes=es-master-01,es-master-02,es-master-03 ^
  -e discovery.seed_hosts=es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-master-01\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-master-02 ^
  --hostname es-master-02 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9201:9200 ^
  -e node.name=es-master-02 ^
  -e node.roles=["master"] ^
  -e cluster.initial_master_nodes=es-master-01,es-master-02,es-master-03 ^
  -e discovery.seed_hosts=es-master-01,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-master-02\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-master-03 ^
  --hostname es-master-03 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9202:9200 ^
  -e node.name=es-master-03 ^
  -e node.roles=["master"] ^
  -e cluster.initial_master_nodes=es-master-01,es-master-02,es-master-03 ^
  -e discovery.seed_hosts=es-master-01,es-master-02 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-master-03\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-coordinating-only-01 ^
  --hostname es-coordinating-only-01 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9210:9200 ^
  -e node.name=es-coordinating-only-01 ^
  -e node.roles=[] ^
  -e discovery.seed_hosts=es-master-01,es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-coordinating-only-01\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-coordinating-only-02 ^
  --hostname es-coordinating-only-02 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9211:9200 ^
  -e node.name=es-coordinating-only-02 ^
  -e node.roles=[] ^
  -e discovery.seed_hosts=es-master-01,es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-coordinating-only-02\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-data-hot-01 ^
  --hostname es-data-hot-01 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9220:9200 ^
  -e node.name=es-data-hot-01 ^
  -e node.roles=["data_hot"] ^
  -e discovery.seed_hosts=es-master-01,es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-data-hot-01\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-data-hot-02 ^
  --hostname es-data-hot-02 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9221:9200 ^
  -e node.name=es-data-hot-02 ^
  -e node.roles=["data_hot"] ^
  -e discovery.seed_hosts=es-master-01,es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-data-hot-02\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-data-hot-03 ^
  --hostname es-data-hot-03 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9222:9200 ^
  -e node.name=es-data-hot-03 ^
  -e node.roles=["data_hot"] ^
  -e discovery.seed_hosts=es-master-01,es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-data-hot-03\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-data-content-01 ^
  --hostname es-data-content-01 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9230:9200 ^
  -e node.name=es-data-content-01 ^
  -e node.roles=["data_content"] ^
  -e discovery.seed_hosts=es-master-01,es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-data-content-01\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3

docker run -d ^
  --name es-data-content-02 ^
  --hostname es-data-content-02 ^
  --restart unless-stopped ^
  --net elastic ^
  -p 9231:9200 ^
  -e node.name=es-data-content-02 ^
  -e node.roles=["data_content"] ^
  -e discovery.seed_hosts=es-master-01,es-master-02,es-master-03 ^
  -e cluster.name=docker-cluster ^
  -e xpack.security.enabled=false ^
  -v C:\Users\wayne\docker\elasticsearch\docker-cluster\es-data-content-02\data:/usr/share/elasticsearch/data ^
  -m 1GB ^
  docker.elastic.co/elasticsearch/elasticsearch:8.14.3
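Because the ten commands differ only in name, role, and port, they could also be generated. The sketch below builds the same argument lists in Python under a few assumptions: the `data_root` parameter and the `/tmp/es-data` path are hypothetical, volume paths use forward slashes, and the order of the `-e` flags differs slightly from the script above.

```python
IMAGE = "docker.elastic.co/elasticsearch/elasticsearch:8.14.3"
MASTERS = ["es-master-01", "es-master-02", "es-master-03"]

# (name prefix, node.roles value, node count, first host port)
GROUPS = [
    ("es-master", '["master"]', 3, 9200),
    ("es-coordinating-only", "[]", 2, 9210),
    ("es-data-hot", '["data_hot"]', 3, 9220),
    ("es-data-content", '["data_content"]', 2, 9230),
]


def docker_run_args(name, roles, host_port, data_root):
    """Build the `docker run` argument list for one node."""
    env = {
        "node.name": name,
        "node.roles": roles,
        # Every node discovers the cluster through the masters (excluding itself).
        "discovery.seed_hosts": ",".join(m for m in MASTERS if m != name),
        "cluster.name": "docker-cluster",
        "xpack.security.enabled": "false",
    }
    if name in MASTERS:
        # Only master-eligible nodes take part in the initial bootstrap.
        env["cluster.initial_master_nodes"] = ",".join(MASTERS)
    args = ["docker", "run", "-d", "--name", name, "--hostname", name,
            "--restart", "unless-stopped", "--net", "elastic",
            "-p", f"{host_port}:9200"]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    args += ["-v", f"{data_root}/{name}/data:/usr/share/elasticsearch/data",
             "-m", "1GB", IMAGE]
    return args


def all_commands(data_root):
    """Build the argument lists for all ten nodes, in the same order as the script."""
    commands = []
    for prefix, roles, count, first_port in GROUPS:
        for i in range(count):
            name = f"{prefix}-{i + 1:02d}"
            commands.append(docker_run_args(name, roles, first_port + i, data_root))
    return commands


for command in all_commands("/tmp/es-data"):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` as-is, which avoids shell quoting issues around the `node.roles` value.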
Create index template

PUT /_index_template/books_template HTTP/1.1
Host: 123.123.8.2:9210
Content-Type: application/json
Content-Length: 887

{
  "index_patterns": ["books"],
  "template": {
    "settings": {
      "index": {
        "number_of_shards": "1",
        "number_of_replicas": "1",
        "routing": {
          "allocation": {
            "include": {
              // write to the data_hot nodes
              "_tier_preference": "data_hot"
            }
          }
        }
      }
    },
    "mappings": {
      "properties": {
        "author": {"type": "text"},
        "page_count": {"type": "integer"},
        "name": {"type": "text"},
        "release_date": {"type": "date"}
      }
    }
  }
}
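The same request can be prepared from a script. This is a minimal sketch using only the Python standard library; the helper names `books_template` and `put_json_request` are assumptions, the host is the one shown in the request above, and the actual send is left commented out because it needs a reachable cluster.

```python
import json
import urllib.request


def books_template():
    """Return the index template body from the request above as a dict."""
    return {
        "index_patterns": ["books"],
        "template": {
            "settings": {
                "index": {
                    "number_of_shards": "1",
                    "number_of_replicas": "1",
                    # Place new books indices on the data_hot nodes.
                    "routing": {"allocation": {"include": {"_tier_preference": "data_hot"}}},
                }
            },
            "mappings": {
                "properties": {
                    "author": {"type": "text"},
                    "page_count": {"type": "integer"},
                    "name": {"type": "text"},
                    "release_date": {"type": "date"},
                }
            },
        },
    }


def put_json_request(url, body):
    """Build (but do not send) a PUT request with a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url, data=data, method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = put_json_request(
    "http://123.123.8.2:9210/_index_template/books_template", books_template()
)
# urllib.request.urlopen(request)  # uncomment to send; Content-Length is filled in automatically
print(request.get_method(), request.full_url)
```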
Add a document

POST /books/_doc HTTP/1.1
Host: 123.123.8.2:9210
Content-Type: application/json
Content-Length: 123

{
  "name": "Snow Crash",
  "author": "Neal Stephenson",
  "release_date": "1992-06-01",
  "page_count": 470
}


Set a tier preference for existing indices (move index to other data node)

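You can use ILM (index lifecycle management) to migrate indices automatically. This section only demonstrates moving the books index manually from the data_hot nodes to the data_content nodes with an HTTP request.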

PUT /books/_settings HTTP/1.1
Host: 123.123.8.2:9210
Content-Type: application/json
Content-Length: 79

{
  "index.routing.allocation.include._tier_preference": "data_content"
}
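After changing the tier preference, one way to confirm the move is to inspect the rows returned by `GET /_cat/shards/books?format=json`. The sketch below only processes such rows and makes no network call; the function name `misplaced_shards` and the sample rows are hypothetical, and it assumes node names start with the role prefix used in this article.

```python
def misplaced_shards(cat_shards_rows, index, node_prefix):
    """List shards of `index` that are not (yet) on a node whose name starts with `node_prefix`."""
    misplaced = []
    for row in cat_shards_rows:
        if row["index"] != index:
            continue
        # Unassigned shards have no node; treat them as misplaced too.
        node = row.get("node") or "UNASSIGNED"
        if not node.startswith(node_prefix):
            misplaced.append((row["shard"], row["prirep"], node))
    return misplaced


# Hypothetical rows in the shape of the _cat/shards JSON output.
rows = [
    {"index": "books", "shard": "0", "prirep": "p", "state": "STARTED", "node": "es-data-content-01"},
    {"index": "books", "shard": "0", "prirep": "r", "state": "STARTED", "node": "es-data-hot-02"},
]
print(misplaced_shards(rows, "books", "es-data-content"))
```

An empty result would indicate that every shard copy of the index has reached the target nodes.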
