How to use Elasticsearch, Logstash and Kibana to visualise logs in Python in realtime

by Ritvik Khanna

What is logging?

Let’s say you are developing a software product. It works remotely, interacts with different devices, collects data from sensors and provides a service to the user. One day, something goes wrong and the system is not working as expected. It might not be identifying the devices, might not be receiving any data from the sensors, or might have just hit a runtime error due to a bug in the code. How can you know for sure?

Now, imagine if there are checkpoints in the system code where, if the system returns an unexpected result, it simply flags it and notifies the developer. This is the concept of logging.


Logging enables developers to understand what the code is actually doing and how the workflow proceeds. A large part of a software developer’s life is monitoring, troubleshooting and debugging. Logging makes this a much easier and smoother process.

Visualisation of logs

Now, if you are an expert developer who has been developing and creating software for quite a while, then you might think that logging is not a big deal: most of our code already includes Debug.Log('____') statements. Well, that is great, but there are some other aspects of logging we can make use of.

Visualisation of specific logged data has the following benefits:


  • Monitor the operations of the system remotely.

  • Communicate information clearly and efficiently via statistical graphics, plots and information graphics.

  • Extract knowledge from the data visualised in the form of different graphs.

  • Take necessary actions to better the system.

There are a number of ways we can visualise raw data. Several libraries in the Python and R programming languages can help in plotting graphs. You can learn more about it here. But in this post, I am not going to discuss the above-mentioned methods. Have you ever heard about the ELK stack?

ELK stack

E — Elasticsearch, L — Logstash, K — Kibana


Let me give a brief introduction to it. The ELK stack is a collection of three open source tools that help provide realtime insights about data that can be either structured or unstructured. You can search and analyse data using its tools with extreme ease and efficiency.

Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected. Elasticsearch lets you perform and combine many types of searches — structured, unstructured, geo, metric etc. It is built on the Java programming language, which enables Elasticsearch to run on different platforms. It enables users to explore very large amounts of data at very high speed.

Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favourite “stash” (like Elasticsearch). Data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs that pull in events from a multitude of common sources, all at the same time. Easily ingest from your logs, metrics, web applications, data stores, and various AWS services, all in continuous, streaming fashion. Logstash has a pluggable framework featuring over 200 plugins. Mix, match, and orchestrate different inputs, filters, and outputs to work in pipeline harmony.


Kibana is an open source analytics and visualisation platform designed to work with Elasticsearch. You use Kibana to search, view, and interact with data stored in Elasticsearch indices. You can easily perform advanced data analysis and visualise your data in a variety of charts, tables, and maps. Kibana makes it easy to understand large volumes of data. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.


To get a better picture of the workflow and of how the three tools interact with each other, refer to the following diagram:

Implementation

Logging in Python

Here, I chose to explain the implementation of logging in Python because it is the most used language for projects involving communication between multiple machines and the Internet of Things. It’ll help give you an overall idea of how it works.

Python provides a logging system as a part of its standard library, so you can quickly add logging to your application.


import logging
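
As a quick, minimal sketch (not from the original article), a single call to the module imported above is enough to produce output:

# Out of the box, the root logger only emits records of level Warning
# and above, printing them to stderr as "LEVEL:root:message".
logging.warning('Watch out!')  # printed: WARNING:root:Watch out!
logging.info('Just checking')  # not printed at the default level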

In Python, logging can be done at five different levels, each indicating the type of event. They are as follows:

  • Info — Designates informational messages that highlight the progress of the application at a coarse-grained level.

  • Debug — Designates fine-grained informational events that are most useful to debug an application.

  • Warning — Designates potentially harmful situations.

  • Error — Designates error events that might still allow the application to continue running.

  • Critical — Designates very severe error events that will presumably lead the application to abort.

Therefore, depending on the problem that needs to be logged, we use the corresponding level.

Note: Info and Debug do not get logged by default, as only logs of level Warning and above are logged.
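
If you do want Info and Debug messages recorded as well, the threshold can be lowered with basicConfig — a minimal sketch:

import logging

# Lower the root logger's threshold so DEBUG and INFO records are emitted too.
logging.basicConfig(level=logging.DEBUG)
logging.debug('This will now be logged')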

Now, to give an example and create a set of log statements to visualise, I have created a Python script that logs statements in a specific format along with a message.
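
The script itself was embedded in the original article and is not reproduced here; the sketch below is a hypothetical reconstruction, assuming a timestamped "time level message" format and the file name logFile.txt used below (the messages and intervals are made up for illustration):

import logging
import random
import time

# Log every level (including Debug and Info) to logFile.txt in a fixed format.
logging.basicConfig(
    filename='logFile.txt',
    filemode='a',  # append, so logs from repeated runs accumulate
    format='%(asctime)s %(levelname)s %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
    level=logging.DEBUG,
)

# Hypothetical messages, one per level.
events = [
    (logging.debug, 'Debugging the application'),
    (logging.info, 'Application is working fine'),
    (logging.warning, 'Sensor response is slow'),
    (logging.error, 'Failed to receive data from a sensor'),
    (logging.critical, 'Device not identified'),
]

# Emit a random log entry every few seconds until interrupted.
while True:
    log_fn, message = random.choice(events)
    log_fn(message)
    time.sleep(random.randint(1, 5))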

Here, the log statements are appended to a file named logFile.txt in the specified format. I ran the script for three days at different time intervals, creating a file containing logs at random, like below:
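
With the format assumed above, entries in logFile.txt would look something like this (illustrative values, not the author’s actual output):

2019-01-12 09:41:03 INFO Application is working fine
2019-01-12 09:41:07 ERROR Failed to receive data from a sensor
2019-01-12 09:41:10 DEBUG Debugging the application
2019-01-12 09:41:14 CRITICAL Device not identified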

Setting up Elasticsearch, Logstash and Kibana

First, let’s download the three open source tools from their respective links — [elasticsearch], [logstash] and [kibana]. Unzip the files and put all three in the project folder.

Let’s get started.


Step 1 — Set up Kibana and Elasticsearch on the local system. We run Kibana with the following command from Kibana’s bin folder.

bin\kibana

Similarly, Elasticsearch is set up like this:

bin\elasticsearch

Now, in two separate terminals, we can see both of the modules running. To check that the services are running, open localhost:5601 and localhost:9200, the default Kibana and Elasticsearch ports.

After both services are running successfully, we use Logstash and Python programs to parse the raw log data and pipeline it to Elasticsearch, from which Kibana queries data.

Step 2 — Now let’s get on with Logstash. Before starting Logstash, a Logstash configuration file is created, in which the details of the input file, output location, and filter methods are specified.
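
The configuration was embedded as a gist in the original article; the sketch below shows what such a logstash-simple.conf could look like for the log format assumed earlier (the file path and index name are assumptions):

input {
  file {
    # Forward slashes are required in this path, even on Windows.
    path => "C:/project/logFile.txt"
    start_position => "beginning"
  }
}

filter {
  grok {
    # Parse "timestamp level message" lines into structured fields.
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} %{GREEDYDATA:log-message}" }
  }
  date {
    # Use the parsed timestamp as the event's @timestamp.
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "index_name"  # matched later by the index_name* pattern in Kibana
  }
  stdout { codec => rubydebug }
}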

This configuration file plays a major role in the ELK stack. Take a look at the filter { grok { … } } section. This is a Grok filter plugin. Grok is a great way to parse unstructured log data into something structured and queryable. This tool is perfect for syslog logs, apache and other webserver logs, mysql logs, and in general, any log format that is written for humans rather than for computer consumption. The grok pattern in the configuration tells Logstash how to parse each line entry in our log file.

Now save the file in the Logstash folder and start the Logstash service.

bin\logstash -f logstash-simple.conf

To learn more about configuring Logstash, click [here].

Step 3 — After this, the parsed data from the log files will be available in Kibana management at localhost:5601 for creating different visuals and dashboards. To check whether Kibana is receiving any data, in the management tab of Kibana run the following command:

localhost:9200/_cat/indices?v
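
Equivalently, you can hit the same endpoint from a terminal with curl, assuming Elasticsearch is listening on its default port:

curl "localhost:9200/_cat/indices?v"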

Either way, this will display all the indices. For every visualisation, a new Index pattern has to be selected from dev tools, after which various visualisation techniques are used to create a dashboard.

Dashboard Using Kibana

After setting up everything, now it’s time to create graphs in order to visualise the log data.


After opening the Kibana management homepage, we will be asked to create a new index pattern. Enter index_name* in the Index pattern field and select @timestamp in the Time Filter field name dropdown menu.


Now to create graphs, we go to the Visualize tab.


Select a new visualisation, choose a type of graph and an index name, and, depending on your axis requirements, create a graph. We can create a histogram with the count on the y-axis and the log-level keyword or the timestamp on the x-axis.

After creating a few graphs, we can add all the required visualisations and create a Dashboard, like below:


Note — Whenever the logs in the log file get updated or appended to the previous logs, then as long as the three services are running, the data in Elasticsearch and the graphs in Kibana will automatically update according to the new data.

Wrapping up

Logging can be an aid in fighting errors and debugging programs, as opposed to using print statements. The logging module divides messages according to different levels. This results in a better understanding of the code and of how the call flow goes, without interrupting the program.

The visualisation of data is a necessary step in situations where a huge amount of data is generated every single moment. Data-visualisation tools and techniques offer executives and other knowledge workers new approaches to dramatically improve their ability to grasp information hiding in their data. Rapid identification of error logs, easy comprehension of data and highly customisable data visuals are some of the advantages. It is one of the most constructive ways of organising raw data.

For further reference, you can check out the official ELK documentation here — https://www.elastic.co/learn — and the Python logging documentation here — https://docs.python.org/2/library/logging.html

Originally published on freeCodeCamp: https://www.freecodecamp.org/news/how-to-use-elasticsearch-logstash-and-kibana-to-visualise-logs-in-python-in-realtime-acaab281c9de/
