WeChat official account: 尚雷的驿站
CSDN: https://blog.csdn.net/shlei5580
墨天轮 (Modb): https://www.modb.pro/u/2436
PGFans: https://www.pgfans.cn/user/home?userId=4159
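Preface
This article describes how I packaged a production ES cluster, copied it to a test environment, and reset the passwords after the original ones stopped working.
Background
The ES cluster we currently operate has been raising alerts frequently. Nodes often drop out of the cluster; when a node rejoins, shards are rebalanced across the nodes, which drives up the load on the ES servers. In addition, an index often turns red because one of its primary shards is not assigned to any node. It fails to recover on its own for a long time, and applications then struggle to write to ES.
I collected the ES logs and the JDK GC logs. After a long investigation from several angles, my analysis is that the problem is probably related to the JDK version the cluster runs on. The ES version is 7.12.1, but it does not use the OpenJDK bundled with ES: other Java applications are deployed on the same servers, so for compatibility and ease of management ES shares jdk1.8.0_202 with them.
JDK 1.8 was released in 2014, many years ago, and JDK 21 has since been released. Moreover, according to advice found online, G1 garbage collection is more efficient than CMS on servers with a lot of memory, yet the JVM is currently configured to use CMS while the production ES servers have 128 GB of memory. The JDK bundled with ES 7.12.1 is AdoptOpenJDK (build 16+36).
I plan to upgrade the JDK used by ES. To be safe, I decided to use the JDK bundled with ES and to switch the garbage collector to G1.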
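Before switching JDKs it helps to know which java binary a node would actually pick up. The sketch below mirrors, as an assumption, the precedence used by the ES 7.12 startup scripts (ES_JAVA_HOME first, then the deprecated JAVA_HOME, then the bundled jdk directory). The function name resolve_es_java is hypothetical and is not part of ES.

```shell
#!/bin/sh
# Hypothetical helper: print the java binary an ES 7.12-style launcher would use.
# Assumed precedence: ES_JAVA_HOME, then JAVA_HOME (deprecated), then <ES_HOME>/jdk.
resolve_es_java() {
  es_home="$1"
  if [ -n "${ES_JAVA_HOME:-}" ]; then
    printf '%s/bin/java\n' "$ES_JAVA_HOME"
  elif [ -n "${JAVA_HOME:-}" ]; then
    printf '%s/bin/java\n' "$JAVA_HOME"
  else
    # Neither variable is set, so the bundled JDK is used.
    printf '%s/jdk/bin/java\n' "$es_home"
  fi
}
```

If this prints a path outside the ES installation directory, the node is not running on the bundled JDK, which matches the advice in the ES startup warning to leave JAVA_HOME unset.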
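Java 8 was released on March 18, 2014. Development code names were dropped starting with Java 8, so there are no official code names from Java 8 onward.
Java 8u201/202 are the last free Oracle JDK 8 releases; Oracle ended free commercial updates on 2019-01-15.
Verification Testing
To make verification easier, I decided to build a test ES cluster that mirrors production, with the same ES and JDK versions. I packaged the production ES, JDK, and Kibana, copied them to the test environment, and unpacked them there. The directory layout in the test environment is identical to production, and the environment variables were configured to match production.
With the preparation done, I started ES on every node of the test environment. When I then started Kibana, however, it reported errors and failed to start properly. The error output was: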
[esuser@xsky-node1 bin]$ ./kibana
log [16:11:56.252] [info][plugins-service] Plugin "osquery" is disabled.
log [16:11:56.351] [warning][config][deprecation] Setting [elasticsearch.username] to "elastic" is deprecated. You should use the "kibana_system" user instead.
log [16:11:56.352] [warning][config][deprecation] Config key [monitoring.cluster_alerts.email_notifications.email_address] will be required for email notifications to work in 8.0."
log [16:11:56.352] [warning][config][deprecation] Setting [monitoring.username] to "elastic" is deprecated. You should use the "kibana_system" user instead.
log [16:11:56.680] [info][plugins-system] Setting up [100] plugins: [taskManager,licensing,globalSearch,globalSearchProviders,banners,code,usageCollection,xpackLegacy,telemetryCollectionManager,telemetry,telemetryCollectionXpack,kibanaUsageCollection,securityOss,share,newsfeed,mapsLegacy,kibanaLegacy,translations,legacyExport,embeddable,uiActionsEnhanced,expressions,charts,esUiShared,bfetch,data,home,observability,console,consoleExtensions,apmOss,searchprofiler,painlessLab,grokdebugger,management,indexPatternManagement,advancedSettings,fileUpload,savedObjects,visualizations,visTypeVislib,visTypeVega,visTypeTimelion,features,licenseManagement,watcher,canvas,visTypeTagcloud,visTypeTable,visTypeMetric,visTypeMarkdown,tileMap,regionMap,visTypeXy,graph,timelion,dashboard,dashboardEnhanced,visualize,visTypeTimeseries,inputControlVis,discover,discoverEnhanced,savedObjectsManagement,spaces,security,savedObjectsTagging,maps,lens,reporting,lists,encryptedSavedObjects,dataEnhanced,dashboardMode,cloud,upgradeAssistant,snapshotRestore,fleet,indexManagement,rollup,remoteClusters,crossClusterReplication,indexLifecycleManagement,enterpriseSearch,beatsManagement,transform,ingestPipelines,eventLog,actions,alerts,triggersActionsUi,stackAlerts,ml,securitySolution,case,infra,monitoring,logstash,apm,uptime]
log [16:11:56.682] [info][plugins][taskManager] TaskManager is identified by the Kibana UUID: 4227be37-8c2a-4580-be72-233bafd4d332
log [16:11:56.909] [warning][config][plugins][security] Generating a random key for xpack.security.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.security.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.
log [16:11:56.909] [warning][config][plugins][security] Session cookies will be transmitted over insecure connections. This is not recommended.
log [16:11:56.958] [warning][config][plugins][reporting] 为 xpack.reporting.encryptionKey 生成随机密钥。为防止会话在重启时失效,请在 kibana.yml 中设置 xpack.reporting.encryptionKey 或使用 bin/kibana-encryption-keys 命令。
log [16:11:56.965] [warning][config][plugins][reporting] Chromium 沙盒提供附加保护层,但不受 Linux CentOS 7.9.2009 OS 支持。自动设置“xpack.reporting.capture.browser.chromium.disableSandbox: true”。
log [16:11:56.966] [warning][encryptedSavedObjects][plugins] Saved objects encryption key is not set. This will severely limit Kibana functionality. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.
log [16:11:56.986] [warning][fleet][plugins] Fleet APIs are disabled because the Encrypted Saved Objects plugin is missing encryption key. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.
log [16:11:57.054] [warning][actions][actions][plugins] APIs are disabled because the Encrypted Saved Objects plugin is missing encryption key. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.
log [16:11:57.066] [warning][alerting][alerts][plugins][plugins] APIs are disabled because the Encrypted Saved Objects plugin is missing encryption key. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.
log [16:11:57.177] [info][monitoring][monitoring][plugins] config sourced from: production cluster
log [16:11:57.398] [info][savedobjects-service] Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations...
log [16:11:57.550] [warning][licensing][plugins] License information could not be obtained from Elasticsearch due to [security_exception] unable to authenticate user [elastic] for REST request [/_xpack?accept_enterprise=true], with { header={ WWW-Authenticate="Basic realm=\"security\" charset=\"UTF-8\"" } } :: {"path":"/_xpack?accept_enterprise=true","statusCode":401,"response":"{\"error\":{\"root_cause\":[{\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [elastic] for REST request [/_xpack?accept_enterprise=true]\",\"header\":{\"WWW-Authenticate\":\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\"}}],\"type\":\"security_exception\",\"reason\":\"unable to authenticate user [elastic] for REST request [/_xpack?accept_enterprise=true]\",\"header\":{\"WWW-Authenticate\":\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\"}},\"status\":401}","wwwAuthenticateDirective":"Basic realm=\"security\" charset=\"UTF-8\""} error
log [16:11:57.554] [warning][monitoring][monitoring][plugins] X-Pack Monitoring Cluster Alerts will not be available: [security_exception] unable to authenticate user [elastic] for REST request [/_xpack?accept_enterprise=true], with { header={ WWW-Authenticate="Basic realm=\"security\" charset=\"UTF-8\"" } }
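I then tried to access one of the ES nodes from a browser with the original username and password and found that I could not log in.
Resetting the Password
Based on the information above, the most likely cause is that I copied the ES environment without the data and related directories, so the passwords were no longer the production ones. The production data and logs directories are too large for me to copy, so I decided to reset the ES passwords in the test environment, using the following approach.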
[esuser@xsky-node1 ~]$ cd deploy/elasticsearch-7.12.1-9300/bin/
[esuser@xsky-node1 bin]$ ./elasticsearch-setup-passwords interactive
Future versions of Elasticsearch will require Java 11; your Java version from [/usr/java/jdk1.8.0_221/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
Failed to determine the health of the cluster running at http://10.110.7.39:9200
Unexpected response code [503] from calling GET http://10.110.7.39:9200/_cluster/health?pretty
Cause: master_not_discovered_exception
It is recommended that you resolve the issues with your cluster before running elasticsearch-setup-passwords.
It is very likely that the password changes will fail when run against an unhealthy cluster.
Do you want to continue with the password setup process [y/N]
ERROR: User cancelled operation
[esuser@xsky-node1 bin]$ ./elasticsearch-setup-passwords interactive
Future versions of Elasticsearch will require Java 11; your Java version from [/usr/java/jdk1.8.0_221/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
Failed to determine the health of the cluster running at http://10.110.7.39:9200
Unexpected response code [503] from calling GET http://10.110.7.39:9200/_cluster/health?pretty
Cause: master_not_discovered_exception
It is recommended that you resolve the issues with your cluster before running elasticsearch-setup-passwords.
It is very likely that the password changes will fail when run against an unhealthy cluster.
Do you want to continue with the password setup process [y/N]y
Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y
Enter password for [elastic]:
Reenter password for [elastic]:
Enter password for [apm_system]:
Reenter password for [apm_system]:
Enter password for [kibana_system]:
Reenter password for [kibana_system]:
Enter password for [logstash_system]:
Reenter password for [logstash_system]:
Enter password for [beats_system]:
Reenter password for [beats_system]:
Enter password for [remote_monitoring_user]:
Reenter password for [remote_monitoring_user]:
Unexpected response code [503] from calling PUT http://10.110.7.39:9200/_security/user/apm_system/_password?pretty
Cause: Cluster state has not been recovered yet, cannot write to the [null] index
Possible next steps:
* Try running this tool again.
* Try running with the --verbose parameter for additional messages.
* Check the elasticsearch logs for additional error details.
* Use the change password API manually.
ERROR: Failed to set password for user [apm_system].
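The output above shows that the password change ultimately failed. My analysis: although I had packaged the production ES cluster and copied it to the test environment, the data directories were not copied, so the nodes started with empty state and the cluster had never been bootstrapped. As a result the nodes could not vote to elect a master, and what was running was not yet a real cluster. This can be fixed as follows: stop the ES process on every node, then set cluster.initial_master_nodes in elasticsearch.yml on one of the nodes. The value of cluster.initial_master_nodes is the name of one of the nodes (its node.name), for example: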
discovery.zen.ping.unicast.hosts: ["10.110.7.39:9300","10.110.7.39:9301","10.110.7.40:9300","10.110.7.40:9301","10.110.7.41:9300","10.110.7.41:9301","10.110.7.42:9300","10.110.7.42:9301"]
cluster.initial_master_nodes: ["node-7.39-9300"]
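The name listed in cluster.initial_master_nodes has to match the node's node.name exactly, so it can be safer to read it from the config file than to type it by hand. The following is a minimal sketch under the assumption that node.name is written on a single line, unquoted or in double quotes. The helper names node_name_of and initial_master_line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of node.name from an elasticsearch.yml-style file.
# Assumes a single-line "node.name: value" entry, optionally wrapped in double quotes.
node_name_of() {
  sed -n 's/^[[:space:]]*node\.name:[[:space:]]*//p' "$1" |
    sed 's/[[:space:]]*$//; s/^"//; s/"$//' |
    head -n 1
}

# Hypothetical helper: print a ready-to-paste bootstrap line for that node.
initial_master_line() {
  printf 'cluster.initial_master_nodes: ["%s"]\n' "$(node_name_of "$1")"
}
```

For example, `initial_master_line config/elasticsearch.yml` prints the line to add on the node chosen for bootstrapping.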
Then restart the nodes one by one. The startup output of the nodes is shown below.
- Node 1
[esuser@xsky-node1 config]$ ../bin/elasticsearch
Future versions of Elasticsearch will require Java 11; your Java version from [/usr/java/jdk1.8.0_221/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
... (part of the output omitted)
[2024-05-06T17:16:23,283][INFO ][o.e.x.s.s.SecurityIndexManager] [node-7.39-9300] security index does not exist, creating [.security-7] with alias [.security]
[2024-05-06T17:16:23,373][INFO ][o.e.c.m.MetadataCreateIndexService] [node-7.39-9300] [.security-7] creating index, cause [api], templates [], shards [1]/[0]
[2024-05-06T17:16:23,391][INFO ][o.e.c.r.a.AllocationService] [node-7.39-9300] updating number_of_replicas to [1] for indices [.security-7]
[2024-05-06T17:16:24,408][INFO ][o.e.c.r.a.AllocationService] [node-7.39-9300] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.security-7][0]]]).
[2024-05-06T17:17:52,114][WARN ][o.e.x.s.a.AuthenticationService] [node-7.39-9300] No authentication credential could be extracted using realms [reserved/reserved,native/native11]. Realms [ldap/ldap1] were skipped because they are not permitted on the current license
- Node 3
[esuser@xsky-node3 bin]$ ./elasticsearch
Future versions of Elasticsearch will require Java 11; your Java version from [/usr/java/jdk1.8.0_221/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
Future versions of Elasticsearch will require Java 11; your Java version from [/usr/java/jdk1.8.0_221/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
[2024-05-06T17:13:55,404][INFO ][o.e.n.Node ] [node-7.40-9300] version[7.12.1], pid[609], build[default/tar/3186837139b9c6b6d23c3200870651f10d3343b7/2021-04-20T20:56:39.040728659Z], OS[Linux/3.10.0-1160.83.1.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_211/25.211-b12]
... (part of the output omitted)
[2024-05-06T17:14:22,605][INFO ][o.e.x.s.a.TokenService ] [node-7.41-9300] refresh keys
[2024-05-06T17:14:22,935][INFO ][o.e.x.s.a.TokenService ] [node-7.41-9300] refreshed keys
[2024-05-06T17:14:22,999][INFO ][o.e.l.LicenseService ] [node-7.41-9300] license [839c87b6-4eea-4e98-908b-a1595ac6561d] mode [basic] - valid
[2024-05-06T17:14:23,001][INFO ][o.e.x.s.s.SecurityStatusChangeListener] [node-7.41-9300] Active license is now [BASIC]; Security is enabled
[2024-05-06T17:14:23,013][WARN ][o.e.x.s.a.AuditTrailService] [node-7.41-9300] Auditing logging is DISABLED because the currently active license [BASIC] does not permit it
[2024-05-06T17:14:23,022][INFO ][o.e.h.AbstractHttpServerTransport] [node-7.41-9300] publish_address {10.110.7.41:9200}, bound_addresses {10.110.7.41:9200}
[2024-05-06T17:14:23,022][INFO ][o.e.n.Node ] [node-7.41-9300] started
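Instead of reading the startup logs, one could also ask a node directly whether the cluster has a master before running the password tool again. The sketch below relies on the status codes seen earlier in this article: 503 for master_not_discovered_exception and 401 for rejected credentials. The helper names and the USER:PASSWORD argument are placeholders, not values from this environment.

```shell
#!/bin/sh
# Hypothetical helper: fetch only the HTTP status code of GET /_cluster/health.
# $1 = host:port, $2 = credentials in user:password form (placeholder).
health_code() {
  curl -s -o /dev/null -w '%{http_code}' -u "$2" "http://$1/_cluster/health"
}

# Hypothetical helper: interpret that status code.
classify_health_code() {
  case "$1" in
    200) echo "master elected" ;;
    401) echo "credentials rejected" ;;   # says nothing about the master
    503) echo "no master yet" ;;
    *)   echo "unexpected: $1" ;;
  esac
}

# Example (not executed here):
# classify_health_code "$(health_code 10.110.7.39:9200 USER:PASSWORD)"
```

A 401 only tells you that the node is reachable and rejected the credentials, so it should not be read as proof that a master exists.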
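After these steps and a restart of ES on every node, a master node was elected. I then ran elasticsearch-setup-passwords interactive again to change the passwords, with the following result: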
[esuser@xsky-node1 bin]$ ./elasticsearch-setup-passwords interactive
Future versions of Elasticsearch will require Java 11; your Java version from [/usr/java/jdk1.8.0_221/jre] does not meet this requirement. Consider switching to a distribution of Elasticsearch with a bundled JDK. If you are already using a distribution with a bundled JDK, ensure the JAVA_HOME environment variable is not set.
Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
You will be prompted to enter passwords as the process progresses.
Please confirm that you would like to continue [y/N]y    -- enter y
Enter password for [elastic]:    -- set the elastic password; the same password is used for every user below
Reenter password for [elastic]:
Enter password for [apm_system]:
Reenter password for [apm_system]:
Enter password for [kibana_system]:
Reenter password for [kibana_system]:
Enter password for [logstash_system]:
Reenter password for [logstash_system]:
Enter password for [beats_system]:
Reenter password for [beats_system]:
Enter password for [remote_monitoring_user]:
Reenter password for [remote_monitoring_user]:
Changed password for user [apm_system]
Changed password for user [kibana_system]
Changed password for user [kibana]
Changed password for user [logstash_system]
Changed password for user [beats_system]
Changed password for user [remote_monitoring_user]
Changed password for user [elastic]
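The failed run earlier suggested using the change password API manually as one possible next step. As a rough sketch, the request it refers to can be assembled as follows; the path is the one shown in the tool's own error message. The helper names are hypothetical, and the body builder assumes the password contains no double quotes or backslashes, since it does no JSON escaping.

```shell
#!/bin/sh
# Hypothetical helper: endpoint of the change password API for one user.
# $1 = host:port, $2 = user name
password_url() {
  printf 'http://%s/_security/user/%s/_password' "$1" "$2"
}

# Hypothetical helper: JSON request body.
# Assumption: the password has no double quotes or backslashes (no escaping is done).
password_body() {
  printf '{"password":"%s"}' "$1"
}

# Example (not executed here); curl prompts for the elastic password:
# curl -u elastic -X PUT -H 'Content-Type: application/json' \
#      "$(password_url 10.110.7.39:9200 kibana_system)" \
#      -d "$(password_body 'NEW_PASSWORD')"
```

This is only useful once a superuser such as elastic can already authenticate, so it complements the interactive tool and does not replace it.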
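Verifying the Result
In a browser, I opened one of the ES nodes at IP:port and logged in as the elastic user with the newly set password.
Starting Kibana again no longer produced the errors above. The startup output was: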
{"type":"log","@timestamp":"2024-05-15T16:17:39+08:00","tags":["info","plugins-service"],"pid":24059,"message":"Plugin \"osquery\" is disabled."}
{"type":"log","@timestamp":"2024-05-15T16:17:39+08:00","tags":["warning","config","deprecation"],"pid":24059,"message":"Setting [elasticsearch.username] to \"elastic\" is deprecated. You should use the \"kibana_system\" user instead."}
...... (part of the output omitted)
{"type":"log","@timestamp":"2024-05-15T16:17:39+08:00","tags":["info","plugins-system"],"pid":24059,"message":"Setting up [100] plugins: [taskManager,licensing,globalSearch,globalSearchProviders,banners,code,usageCollection,xpackLegacy,telemetryCollectionManager,telemetry,telemetryCollectionXpack,kibanaUsageCollection,securityOss,share,newsfeed,mapsLegacy,kibanaLegacy,translations,legacyExport,embeddable,uiActionsEnhanced,esUiShared,expressions,charts,bfetch,data,home,observability,console,consoleExtensions,apmOss,searchprofiler,painlessLab,grokdebugger,management,indexPatternManagement,advancedSettings,fileUpload,savedObjects,visualizations,visTypeTimelion,features,licenseManagement,watcher,canvas,visTypeVega,visTypeVislib,visTypeTagcloud,visTypeTable,visTypeMetric,visTypeMarkdown,tileMap,regionMap,visTypeXy,graph,timelion,dashboard,dashboardEnhanced,visualize,visTypeTimeseries,inputControlVis,discover,discoverEnhanced,savedObjectsManagement,spaces,eventLog,security,savedObjectsTagging,maps,lens,encryptedSavedObjects,actions,alerts,triggersActionsUi,stackAlerts,reporting,lists,dataEnhanced,dashboardMode,cloud,upgradeAssistant,snapshotRestore,fleet,indexManagement,rollup,remoteClusters,crossClusterReplication,indexLifecycleManagement,enterpriseSearch,ml,securitySolution,case,infra,monitoring,logstash,apm,uptime,beatsManagement,transform,ingestPipelines]"}
...... (part of the output omitted)
{"type":"log","@timestamp":"2024-05-15T16:17:39+08:00","tags":["warning","plugins","reporting","config"],"pid":24059,"message":"为 xpack.reporting.encryptionKey 生成随机密钥。为防止会话在重启时失效,请在 kibana.yml 中设置 xpack.reporting.encryptionKey 或使用 bin/kibana-encryption-keys 命令。"}
{"type":"log","@timestamp":"2024-05-15T16:17:39+08:00","tags":["warning","plugins","reporting","config"],"pid":24059,"message":"Chromium 沙盒提供附加保护层,但不受 Linux CentOS 7.9.2009 OS 支持。自动设置“xpack.reporting.capture.browser.chromium.disableSandbox: true”。"}
...... Migration completed after 580ms"}
{"type":"log","@timestamp":"2024-05-15T16:17:41+08:00","tags":["info","plugins-system"],"pid":24059,"message":"Starting [100] plugins: [taskManager,licensing,globalSearch,globalSearchProviders,banners,code,usageCollection,xpackLegacy,telemetryCollectionManager,telemetry,telemetryCollectionXpack,kibanaUsageCollection,securityOss,share,newsfeed,mapsLegacy,kibanaLegacy,translations,legacyExport,embeddable,uiActionsEnhanced,esUiShared,expressions,charts,bfetch,data,home,observability,console,consoleExtensions,apmOss,searchprofiler,painlessLab,grokdebugger,management,indexPatternManagement,advancedSettings,fileUpload,savedObjects,visualizations,visTypeTimelion,features,licenseManagement,watcher,canvas,visTypeVega,visTypeVislib,visTypeTagcloud,visTypeTable,visTypeMetric,visTypeMarkdown,tileMap,regionMap,visTypeXy,graph,timelion,dashboard,dashboardEnhanced,visualize,visTypeTimeseries,inputControlVis,discover,discoverEnhanced,savedObjectsManagement,spaces,eventLog,security,savedObjectsTagging,maps,lens,encryptedSavedObjects,actions,alerts,triggersActionsUi,stackAlerts,reporting,lists,dataEnhanced,dashboardMode,cloud,upgradeAssistant,snapshotRestore,fleet,indexManagement,rollup,remoteClusters,crossClusterReplication,indexLifecycleManagement,enterpriseSearch,ml,securitySolution,case,infra,monitoring,logstash,apm,uptime,beatsManagement,transform,ingestPipelines]"}
{"type":"log","@timestamp":"2024-05-15T16:17:42+08:00","tags":["listening","info"],"pid":24059,"message":"Server running at http://10.110.7.39:5601/path/klog"}
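Summary
The steps above reset the ES account passwords. Finally, I commented out cluster.initial_master_nodes and restarted the ES cluster, so that the master can be elected flexibly from then on. This approach may also help when the passwords of an existing ES cluster have been lost, with one caveat: elasticsearch-setup-passwords relies on the bootstrap password of the elastic user, so it is expected to work only where the built-in passwords have not already been set, as in this freshly initialized test cluster. While searching online for how to reset ES passwords, I noticed that other bloggers use different methods, so the procedure should be adapted to your own environment.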
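Commenting out the bootstrap setting can be scripted. The sketch below assumes the setting sits on a single line in elasticsearch.yml; the function name comment_out_initial_masters is hypothetical. It keeps a .bak copy of the file before rewriting it.

```shell
#!/bin/sh
# Hypothetical helper: comment out cluster.initial_master_nodes in the given file.
# Assumes the setting is on a single line; a backup is kept as <file>.bak.
comment_out_initial_masters() {
  cp "$1" "$1.bak" &&
    sed 's/^\([[:space:]]*\)\(cluster\.initial_master_nodes:\)/\1# \2/' "$1.bak" > "$1"
}

# Example (not executed here):
# comment_out_initial_masters config/elasticsearch.yml
```

Lines that are already commented are left unchanged, because the pattern only matches when the setting is the first non-blank text on the line.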