Replacing ELK with EFK (7.17.3)

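Contents

- 1. Traditional ELK
- 2. EFK
  - 2.1 Install Elasticsearch
  - 2.2 Install Filebeat on the application server
    - 2.2.1. Install: `there is no need to use Docker here`; install with yum or download the package from the official site and start it directly.
    - 2.2.2. Edit the configuration file filebeat-java-logback.yml
    - 2.2.3. Configure `common_log_pipeline` in Elasticsearch to parse logs
- 3. Startup test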

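I recently noticed that the Logstash log collector uses about as much memory as Elasticsearch itself. Logstash is written in Java, and the JVM is a heavy memory consumer. Looking for a way to cut costs, I found that Beats, which is written in Go, can replace Logstash.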

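ELK: application logs are typically sent over HTTP by logback to a Logstash server for centralized processing; Logstash collects and processes them, then forwards them to the Elasticsearch server.
EFK: application logs are typically written to files on the local machine; Filebeat runs on the same machine, collects and processes those files, and sends them to Elasticsearch.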

1. Traditional ELK

Logstash + Elasticsearch + Kibana (ELK) log collection

2. EFK

logback + Filebeat + Elasticsearch + Kibana log collection

2.1 Install Elasticsearch

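This Docker installation applies only to 7.x (7.17 is the last 7.x minor release). Versions 8.0 and later enable security by default, so their setup differs.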

1. Install

# Install Elasticsearch
docker pull elasticsearch:7.17.3
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >> /mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms512m -Xmx512m" \
--restart=always --privileged=true \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.17.3

2. Go to the host directory where elasticsearch.yml is mounted and add the following to the file

http.host: 0.0.0.0
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: Authorization
xpack.security.enabled: true
# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl.enabled: true
# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl.enabled: false

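3. Restart the Elasticsearch container, then open a shell inside it
4. Inside the container, run the following command to set the passwords of the built-in accounts interactively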

./bin/elasticsearch-setup-passwords interactive

5. Restart the Elasticsearch container
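
Once passwords are set, every HTTP request to Elasticsearch has to carry credentials. As a quick sanity check, the following Python sketch builds a Basic-auth request against the cluster root endpoint using only the standard library. The address `http://localhost:9200` and the `elastic` user come from this article; the function names are hypothetical and the password is whatever you chose in step 4.

```python
import base64
import json
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the cluster root endpoint."""
    req = urllib.request.Request(base_url.rstrip("/") + "/")
    req.add_header("Authorization", basic_auth_header(username, password))
    return req


def cluster_info(base_url: str, username: str, password: str) -> dict:
    """Send the request and decode the JSON body; needs a running cluster."""
    req = build_request(base_url, username, password)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


# Usage (assumes the container from section 2.1 is reachable):
# info = cluster_info("http://localhost:9200", "elastic", "<your password>")
# print(info["version"]["number"])
```

A 401 response here usually means the password was not set or the container was not restarted after editing elasticsearch.yml.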

2.2 Install Filebeat on the application server

2.2.1. Install: `there is no need to use Docker here`; install with yum or download the package from the official site and start it directly.

Using Docker for Filebeat is strongly discouraged; the Docker command below is not guaranteed to work without problems.

# Install Filebeat (Docker variant)
# Docker options must come before the image name; everything after the image is passed to Filebeat.
docker run -d --name=filebeat \
--privileged=true \
--restart=always \
-v /mydata/beats/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro \
-v /mydata/beats/lib/docker/containers:/var/lib/docker/containers:ro \
-v /mydata/beats/run/docker.sock:/var/run/docker.sock:ro \
-v /mydata/beats/log/messages:/var/log/messages \
docker.elastic.co/beats/filebeat:7.17.3 \
filebeat -e --strict.perms=false \
-E 'output.elasticsearch.hosts=["elasticsearch:9200"]'

# Load the ingest pipelines
filebeat setup  --pipelines --modules system

2.2.2. Edit the configuration file filebeat-java-logback.yml

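Purpose: 1. set the paths Filebeat harvests; 2. set the output target and which preprocessing to apply.
The following is the official example configuration shipped with 7.17.3 through 8.6, with additions only.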

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

# ============================== Filebeat inputs ===============================

filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input-specific configurations.

# filestream is an input for collecting log messages from files.
- type: filestream
  encoding: utf-8

  # Unique ID among all inputs, an ID is required.
  id: my-filestream-id

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - c:/mydata/filebeat/logs/*.log
    #- /mydata/filebeat/logs/*.log

  # Lines that do NOT start with a yyyy-MM-dd date are appended to the previous line.
  # The filestream input reads multiline settings from `parsers`.
  parsers:
    - multiline:
        type: pattern
        pattern: '^\d{4}\-\d{2}\-\d{2}'
        negate: true
        match: after

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #exclude_lines: ['^DBG']

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  # Line filtering happens after the parsers pipeline. If you would like to filter lines
  # before parsers, use include_message parser.
  #include_lines: ['^ERR', '^WARN']

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #prospector.scanner.exclude_files: ['.gz$']

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: true

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false
setup.template.name: "yqc"      # name of the new template
setup.template.pattern: "yqc-*" # indices the template applies to; here, every index starting with yqc
setup.template.overwrite: true
setup.template.enabled: false
setup.ilm.enabled: false
#index.codec: best_compression
#_source.enabled: false

# ================================== General ===================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboard archive. By default, this URL
# has a value that is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

# =================================== Kibana ===================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify and additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

# =============================== Elastic Cloud ================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
#cloud.id:

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
#cloud.auth:

# ================================== Outputs ===================================

# Configure what output to use when sending the data collected by the beat.

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: [ "localhost:9200" ]
  username: "elastic"
  password: "elastic"
  # pipeline: parsing is done by an Elasticsearch ingest pipeline
  pipeline: "common_log_pipeline"
  encoding: utf-8
  indices:
    - index: "yqc-info-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        message: "INFO"
    - index: "yqc-error-%{[agent.version]}-%{+yyyy.MM.dd}"
      when.contains:
        message: "ERROR"

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

# ------------------------------ Logstash Output -------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

# ================================= Processors =================================
# pipeline is parsing done by Elasticsearch, whereas processors are a feature of Filebeat itself
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

# ================================== Logging ===================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors, use ["*"]. Examples of other selectors are "beat",
# "publisher", "service".
#logging.selectors: ["*"]

# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch outputs are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

# ============================== Instrumentation ===============================

# Instrumentation support for the filebeat.
#instrumentation:
    # Set to true to enable instrumentation of filebeat.
    #enabled: false

    # Environment in which filebeat is running on (eg: staging, production, etc.)
    #environment: ""

    # APM Server hosts to report instrumentation results to.
    #hosts:
    #  - http://localhost:8200

    # API Key for the APM Server(s).
    # If api_key is set then secret_token will be ignored.
    #api_key:

    # Secret token for the APM Server(s).
    #secret_token:

# ================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true
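
The multiline rule is easy to misread, so here is a small Python sketch of what `negate: true` with `match: after` amounts to: a line that does not match the date pattern is appended to the event before it. This is a simplified model for illustration, not Filebeat's implementation; the function name is hypothetical, the sample lines are made up, and limits such as `max_lines` or the flush timeout are ignored.

```python
import re
from typing import Iterable, List

# Equivalent to the pattern used in the configuration.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def merge_multiline(lines: Iterable[str]) -> List[str]:
    """Group raw lines into events.

    A line that starts with a yyyy-MM-dd date opens a new event; any other
    line (for example a stack-trace line) is appended to the current event.
    """
    events: List[str] = []
    for line in lines:
        if EVENT_START.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


# Made-up sample: one ERROR event with a two-line stack trace, then one INFO event.
raw = [
    "2023-09-18 20:34:55.439-ip-[main]-ERROR-c.e.Demo-run-failed-java.lang.IllegalStateException: boom",
    "\tat com.example.Demo.run(Demo.java:12)",
    "\tat com.example.Demo.main(Demo.java:5)",
    "2023-09-18 20:34:56.001-ip-[main]-INFO-c.e.Demo-run-recovered-",
]
for event in merge_multiline(raw):
    print(repr(event))
```

With this grouping a Java stack trace stays attached to the log line that produced it instead of being indexed as separate documents.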

2.2.3. Configure `common_log_pipeline` in Elasticsearch to parse logs

Purpose: we want to customize how the log data is parsed (the default parsing also works). Customization requires the ingest pipeline feature.

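How do we define the format the pipeline parses log data into? The answer is grok syntax. Kibana provides a grok debugger for trying patterns out, and online grok tools exist as well. (Please look up grok syntax on your own.)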

Log output format

    <!-- Log output format -->
    <property name="log.console.pattern" value="%d{yyyy-MM-dd HH:mm:ss.SSS,GMT+8}-%magenta(${IP})-%blue([%thread])-%highlight(%-5level)-%logger{20}-%yellow(%method)-%cyan(%msg)-%red(%exception%n)" />
    <property name="log.file.pattern" value="%d{yyyy-MM-dd HH:mm:ss.SSS,GMT+8}-${ip}-[%thread]-%level-%logger{20}-%method-%msg-%exception%n" />

Sample log line

2023-09-18 20:34:55.439-ip_IS_UNDEFINED-[main]-INFO-o.a.d.s.b.c.e.OverrideDubboConfigApplicationListener-onApplicationEvent-Dubbo Config was overridden by externalized configuration {dubbo.application.logger=slf4j, dubbo.application.metadataType=remote, dubbo.application.name=vector-member, dubbo.application.qos-enable=false, dubbo.config.multiple=true, dubbo.consumer.check=false, dubbo.consumer.version=1.0.0, dubbo.metadata-report.address=nacos://localhost:8848, dubbo.metadata-report.parameters.namespace=410031c2-6c35-40e0-a417-dbd2870e8aaa, dubbo.metadata-report.parameters.password=nacos, dubbo.metadata-report.parameters.username=nacos, dubbo.protocol.name=dubbo, dubbo.protocol.port=-1, dubbo.protocol.serialization=hessian2, dubbo.provider.version=1.0.0, dubbo.registry.address=nacos://localhost:8848, dubbo.registry.check=false, dubbo.registry.parameters.namespace=410031c2-6c35-40e0-a417-dbd2870e8aaa, dubbo.registry.parameters.password=nacos, dubbo.registry.parameters.username=nacos}-

Grok pattern

%{TIMESTAMP_ISO8601:timestamp}-%{DATA:ip}-%{DATA:thread}-%{LOGLEVEL:log_level}-%{DATA:class}-%{GREEDYDATA:method}-%{GREEDYDATA:msg}-%{GREEDYDATA:exception_message}
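
To get a feel for how this pattern splits a line without opening a grok debugger, the Python sketch below approximates it with an ordinary regular expression. The stand-ins are assumptions: `DATA` is modelled as a lazy `.*?`, `GREEDYDATA` as a greedy `.*`, and the timestamp and log-level expressions are simplified versions of the real grok definitions. The sample line is a shortened, made-up variant of the log line above.

```python
import re

# Rough regex stand-ins for the grok patterns (simplified assumptions).
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}"
LOGLEVEL = r"TRACE|DEBUG|INFO|WARN|ERROR|FATAL"
DATA = r".*?"        # lazy, like grok DATA
GREEDYDATA = r".*"   # greedy, like grok GREEDYDATA

LINE = re.compile(
    rf"^(?P<timestamp>{TIMESTAMP})-(?P<ip>{DATA})-(?P<thread>{DATA})"
    rf"-(?P<log_level>{LOGLEVEL})-(?P<class>{DATA})"
    rf"-(?P<method>{GREEDYDATA})-(?P<msg>{GREEDYDATA})"
    rf"-(?P<exception_message>{GREEDYDATA})$"
)


def parse(line: str) -> dict:
    """Return the named fields, or an empty dict when the line does not match."""
    m = LINE.match(line)
    return m.groupdict() if m else {}


# Shortened, made-up variant of the sample log line.
sample = (
    "2023-09-18 20:34:55.439-ip_IS_UNDEFINED-[main]-INFO-"
    "o.a.d.Demo-onApplicationEvent-Dubbo Config was overridden-"
)
fields = parse(sample)
print(fields["thread"], fields["log_level"], fields["method"], fields["msg"])
```

One thing this approximation makes visible: because `method` is captured greedily, a hyphen inside the message (as in `qos-enable=false`) moves the boundary between `method` and `msg`. If that matters for your logs, capturing `method` with `DATA` instead of `GREEDYDATA` is one option worth trying in the grok debugger.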

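The corresponding ingest pipeline is defined below; the named captures in the grok pattern become the fields of the indexed document.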

PUT _ingest/pipeline/common_log_pipeline
{
  "description": "common_log_pipeline",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:timestamp}-%{DATA:ip}-%{DATA:thread}-%{LOGLEVEL:log_level}-%{DATA:class}-%{GREEDYDATA:method}-%{GREEDYDATA:msg}-%{GREEDYDATA:exception_message}"],
        "ignore_failure": true
      }
    },
    { "remove": { "field": "input" } },
    { "remove": { "field": "message" } },
    { "remove": { "field": "agent" } },
    { "remove": { "field": "ecs" } },
    { "remove": { "field": "host" } },
    { "remove": { "field": "log" } }
  ]
}
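
Before shipping real logs, the pipeline can be tried out with Elasticsearch's simulate API (`POST _ingest/pipeline/common_log_pipeline/_simulate`). The Python sketch below only builds the request body. Since a `remove` processor fails when its field is missing (unless `ignore_missing` is set), the test document carries placeholder values for the fields Filebeat normally attaches. The helper name, the placeholder values, and the shortened sample message are hypothetical.

```python
import json

# Fields the pipeline removes besides `message`.
REMOVED_FIELDS = ["input", "agent", "ecs", "host", "log"]


def simulate_body(message: str) -> dict:
    """Build the JSON body for the pipeline simulate request."""
    source = {"message": message}
    # Placeholder values standing in for the metadata Filebeat attaches.
    for name in REMOVED_FIELDS:
        source[name] = {"placeholder": True}
    return {"docs": [{"_source": source}]}


body = simulate_body(
    "2023-09-18 20:34:55.439-ip_IS_UNDEFINED-[main]-INFO-"
    "o.a.d.Demo-onApplicationEvent-Dubbo Config was overridden-"
)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Kibana Dev Tools below the `POST` line; the response shows the document as it would be indexed.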

3. Startup test

Filebeat should run on the same server as the application: it harvests the log files the application writes there and sends them to Elasticsearch.

# linux
./filebeat -e -c filebeat.yml
# windows
filebeat.exe -e -c filebeat.yml
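
After startup, events should appear in indices named according to the `indices` rules in the configuration. As a rough aid for checking, the Python sketch below computes the index name an event would be routed to. It models two documented behaviours of the Elasticsearch output: the first matching rule wins, and `contains` is a plain substring test. The function name is hypothetical, and the date is passed in explicitly, whereas Filebeat derives it from the event timestamp.

```python
from datetime import date
from typing import Optional

# (substring looked for in `message`, index prefix), in the order used in the config.
RULES = [("INFO", "yqc-info"), ("ERROR", "yqc-error")]


def expected_index(message: str, agent_version: str, day: date) -> Optional[str]:
    """Return the index the first matching rule would select, or None."""
    for needle, prefix in RULES:
        if needle in message:
            return f"{prefix}-{agent_version}-{day:%Y.%m.%d}"
    return None  # Filebeat would fall back to its `index` setting here


print(expected_index("...-[main]-ERROR-...", "7.17.3", date(2023, 9, 18)))
# prints yqc-error-7.17.3-2023.09.18
```

Because the match is a substring test on the whole message, an ERROR line whose text happens to contain "INFO" would be routed to the info index under these rules.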