Merge pull request #8461 from qhwdw/tr49

Translated by qhwdw
This commit is contained in:
Xingyu.Wang 2018-04-12 13:44:39 +08:00 committed by GitHub
commit cfc8819111
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 81 additions and 82 deletions

View File

@ -1,82 +0,0 @@
Translating by qhwdw
Reliable IoT event logging with syslog-ng
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/openwires_fromRHT_520_0612LL.png?itok=PqZi55Ab)
For any device connected to the internet or a network, it's essential that you log events so you know what the device is doing and can address any potential problems. Increasingly those devices include Internet of Things (IoT) devices and embedded systems.
One monitoring tool to consider is the open source [syslog-ng][1] application, an enhanced logging daemon with a focus on portability and central log collection. It collects logs from many different sources, processes and filters them, and stores or routes them for further analysis. Most of syslog-ng is written in efficient and highly portable C code. It's suitable for a wide range of scenarios, whether you need something simple enough to run with a really small footprint on underpowered devices or a solution powerful enough to reside in your data center and collect logs from tens of thousands of devices.
You probably noticed the abundance of buzzwords I wrote in that single paragraph. To clarify what this all means, let's go over them, but this time slower and in more depth.
### Logging
First things first. Logging is the recording of events on a computer. On a typical Linux machine, you can find these log messages in the `/var/log` directory. For example, if you log into your machine through SSH, you will find a message similar to this in one of the files:
`Jan 14 11:38:48 ``linux``-0jbu sshd[7716]: Accepted ``publickey`` for root from 127.0.0.1 port 48806 ssh2`
It could be about your CPU running too hot, a document downloaded through HTTP, or just about anything one of your applications considers important.
### syslog-ng
As I wrote above, the syslog-ng application is an enhanced logging daemon with a focus on portability and central log collection. Daemon means syslog-ng is an application running continuously in the background; in this case, it's collecting log messages.
While Linux testing for many of today's applications is limited to x86_64 machines, syslog-ng also works on many BSD and commercial UNIX variants. What is even more important from the embedded/IoT standpoint is that it runs on many different CPU architectures, including 32- and 64-bit ARM, PowerPC, MIPS, and more. (Sometimes I learn about new architectures just by reading about how syslog-ng is used.)
Why is central collection of logs such a big deal? One reason is ease of use, as it creates a single place to check instead of tens or thousands of devices. Another reason is availability—you can check a device's log messages even if the device is unavailable for any reason. A third reason is security; when your device is hacked, checking the logs can uncover traces of the hack.
### Four roles of syslog-ng
Syslog-ng has four major roles: collecting, processing, filtering, and storing log messages.
**Collecting messages:** syslog-ng can collect from a wide variety of [platform-specific sources][2], like `/dev/log`, `journal`, or `sun-streams`. As a central log collector, it speaks both the legacy (`rfc3164`) and the new (`rfc5424`) syslog protocols and all their variants over User Datagram Protocol (UDP), TCP, and encrypted connections. You can also collect log messages (or any kind of text data) from pipes, sockets, files, and even application output.
**Processing log messages:** The possibilities here are almost endless. You can classify, normalize, and structure log messages with built-in parsers. You can even write your own parser in Python if none of the available parsers suits your needs. You can also enrich messages with geolocation data or additional fields based on the message content. Log messages can be reformatted to suit the requirements of the application processing the logs. You can also rewrite log messages—not to falsify messages, of course—for things such as anonymizing log messages as required by many compliance regulations.
**Filtering logs:** There are two main uses for filtering logs: To discard surplus log messages—like debug-level messages—to save on storage, and for log routing—making sure the right logs reach the right destinations. An example of the latter is forwarding all authentication-related messages to a security information and event management (SIEM) system.
**Storing messages:** Traditionally, files were saved locally or sent to a central syslog server; either way, they'd be sent to [flat files][3]. Over the years, syslog-ng began supporting SQL databases, and in the past few years Big Data destinations, including HDFS, Kafka, MongoDB, and Elasticsearch, were added to syslog-ng.
### Message formats
When you look at your log messages under the `/var/log` directory, you will see (like in the SSH message above) most are in the form:
`date + host name + application name + an almost complete English sentence`
Where each application event is described by a different sentence, creating a report based on this data is quite a painful job.
The solution to this mess is to use structured logging. In this case, events are represented as name-value pairs instead of freeform log messages. For example, an SSH login can be described by the application name, source IP address, username, authentication method, and so on.
You can take a structured approach for your log messages right from the beginning. When working with legacy log messages, you can use the different parsers in syslog-ng to turn unstructured (and some of the structured) message formats into name-value pairs. Once you have your logs available as name-value pairs, reporting, alerting, and simply finding the information you are looking for becomes a lot easier.
### Logging IoT
Let's start with a tricky question: Which version of syslog-ng is the most popular? Before you answer, consider these facts: The project started 20 years ago, Red Hat Enterprise Linux EPEL has version 3.5, and the current version is 3.14. When I ask this question during my presentations, the audience members usually suggest the one in their favorite Linux distribution. Surprisingly, the correct answer is version 1.6, almost 15 years old. That's because it is the version included in the Amazon Kindle e-book readers, so it's running on over 100 million devices worldwide. Another example of a consumer device running syslog-ng is the BMW i3 all-electric car.
The Kindle uses syslog-ng to collect all possible information about what the user is doing on the device. On the BMW, syslog-ng does very complex, content-based filtering of log messages, and most it likely records only the most important logs.
Networking and storage are other categories of devices that often use syslog-ng. Networking and storage are other categories of devices that often use syslog-ng. Some better known examples are the Turris Omnia open source Linux router and Synology NAS devices. In most cases, syslog-ng started out as a logging client on these devices, but in some cases it's evolved into a central logging server with a rich web interface.
You can find syslog-ng in industrial devices, as well. It runs on all real-time Linux devices from National Instruments doing measurements and automation. It is also used to collect logs from customer-developed applications. Configuration is done from the command line, but a nice GUI is available to browse the logs.
Finally, there are some large-scale projects, such as cars and airplanes, where syslog-ng runs on both the client and server side. The common theme here is that syslog-ng collects all log and metrics data, sends it to a central cluster of servers where logs are processed, and saves it to one of the supported Big Data destinations where it waits for further analysis.
### Overall benefits for IoT
There are several benefits of using syslog-ng in an IoT environment. First, it delivers high performance and reliable log collection. It also simplifies the architecture, as system and application logs and metrics data can be collected together. Third, it makes it easier to use data, as data is parsed and presented in a ready-to-use format. Finally, efficient routing and filtering by syslog-ng can significantly decrease processing load.
--------------------------------------------------------------------------------
via: https://opensource.com/article/18/3/logging-iot-events-syslog-ng
作者:[Peter Czanik][a]
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
选题:[lujun9972](https://github.com/lujun9972)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://opensource.com/users/czanik
[1]:https://syslog-ng.com/open-source-log-management
[2]:https://syslog-ng.com/documents/html/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/sources.html
[3]:https://en.wikipedia.org/wiki/Flat_file_database

View File

@ -0,0 +1,81 @@
使用 syslog-ng 可靠地记录物联网事件
======
![](https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/openwires_fromRHT_520_0612LL.png?itok=PqZi55Ab)
现在,物联网设备和嵌入式系统越来越多。对于许多连接到因特网或者一个网络的设备来说,记录事件很有必要,因为你需要知道这些设备都做了些什么事情,这样你才能够解决可能出现的问题。
可以考虑去使用的一个监视工具是开源的 [syslog-ng][1] 应用程序它是一个强化的、致力于可移植的、中心化的日志收集守护程序。它可以从许多不同种类的源、进程来收集日志并且可以对这些日志进行过滤、存储、或者路由以便于做进一步的分析。syslog-ng 的大多数代码是用高效率的、高可移植的 C 代码写成的。它能够适用于各种场景,无论你是将它运行在一个处理能力很弱的设备上做一些简单的事情,还是运行在数据中心从成千上万的机器中收集日志的强大应用,它都能够胜任。
你可能注意到在这个段落中,我使用了大量的溢美词汇。为了让你更清晰地了解它,我们来复习一下,但这将花费更多的时间,也了解的更深入一些。
### 日志
首先解释一下日志。日志是记录一台计算机上事件的东西。在一个典型的 Linux 机器上,你可以在 `/var/log` 目录中找到这些信息。例如,如果你通过 SSH 登陆到机器中,你将可以在其中一个日志文件中找到类似于如下内容的信息:
`Jan 14 11:38:48 ``linux``-0jbu sshd[7716]: Accepted ``publickey`` for root from 127.0.0.1 port 48806 ssh2`
它的内容可能是关于你的 CPU 过热,通过 HTTP 下载了一个文档,或者你的应用程序认为重要的任何东西。
### syslog-ng
正如我在上面所写的那样syslog-ng 应用程序是一个强化的、致力于可移植性、和中心化的日志收集守护程序。守护程序的意思是syslog-ng 是一个持续运行在后台的应用程序,用于收集日志信息。
虽然现在大多数应用程序的 Linux 测试是限制在 x86_64 的机器上但是syslog-ng 也可以运行在大多数 BSD 和商业 UNIX 变种版本上的。从嵌入式/物联网的角度来看,这种能够运行在不同的 CPU 架构(包括 32 位和 64 位的 ARM、PowerPC、MIPS 等等)的能力甚至更为重要。(有时候,我通过阅读关于 syslog-ng 如何使用它们来学习新架构)
为什么中心化的日志收集如此重要?其中一个很重要的原因是易于使用,因为它放在一个地方,不用到成百上千的机器上挨个去检查它们的日志。另一个原因是可用性 —— 如果一个设备不论是什么原因导致了它不可用,你可以检查这个设备的日志信息。第三个原因是安全性;当你的设备被黑,检查设备日志可以发现攻击的踪迹。
### syslog-ng 的四种用法
Syslog-ng 有四种主要的用法:收集、处理、过滤、和保存日志信息。
**收集信息:** syslog-ng 能够从各种各样的 [特定平台源][2] 上收集信息,比如 `/dev/log``journal`,或者 `sun-streams`。作为一个中心化的日志收集器,传统的(`rfc3164`)和最新的(`rfc5424`)系统日志协议、以及它们 基于 UDP、TCP、和加密连接的各种变种它都是支持的。你也可以从管道、套接字、文件、甚至应用程序输出来收集日志信息或者各种文本数据
**处理日志信息:** 它的处理能力几乎是无限的。你可以用它内置的解析器来分类、规范、以及结构化日志信息。如果它没有为你提供在你的应用场景中所需要的解析器,你甚至可以用 Python 来自己写一个解析器。你也可以使用地理数据来丰富信息,或者基于信息内容来附加一些字段。日志信息可以按处理它的应用程序所要求的格式进行重新格式化。你也可以重写日志信息 —— 当然了,你不能篡改日志内容 —— 比如在某些情况下,需要满足匿名要求的信息。
**过滤日志:** 过滤日志的用法主要有两种:丢弃不需要保存的日志信息 —— 像调试级别的信息;和路由日志信息—— 确保正确的日志到达正确的目的地。后一种用法的一个例子是转发所有的认证相关的信息到一个安全信息与事件管理系统SIEM
**保存信息:** 传统的做法是,将文件保存在本地或者发送到中心化日志服务器;不论是哪种方式,它们都被发送到一个[平面文件][3]。经过这些年的改进syslog-ng 已经开始支持 SQL 数据库,并且在过去的几年里,大数据目的地,包括 HDFS、Kafka、MongoDB、和 Elasticsearch都被加入到 syslog-ng 的支持中。
### 消息格式
当在你的 `/var/log` 目录中查看消息时,你将看到(如上面的 SSH 信息)大量的消息都是如下格式的内容:
`date + host name + application name + an almost complete English sentence`
在这里的每个应用程序事件都是用不同的语法描述的,基于这些数据去创建一个报告是个痛苦的任务。
解决这种混乱信息的一个方案是使用结构化日志。在这种情况下,事件被表示为键-值对,而不是随意的日志信息。比如,一个 SSH 日志能够按应用程序名字、源 IP 地址、用户名、认证方法等等来描述。
你可以从一开始就对你的日志信息按合适的格式进行结构化处理。当处理传统的日志信息时,你可以在 syslog-ng 中使用不同的解析器,转换非结构化(和部分结构化)的信息为键-值对格式。一旦你的日志信息表示为键-值对,那么,报告、报警、以及简单查找信息将变得很容易。
### 物联网日志
我们从一个棘手的问题开始:哪个版本的 syslog-ng 最流行?在你回答之前,想想如下这些事实:这个项目启动于 20 年以前Red Hat 企业版 Linux EPEL 已经有了 3.5 版,并且当前版本是 3.14。当我在我的演讲中问到这个问题时,观众通常回答他们喜欢的 Linux 发行版。你们绝对想不到的是,正确答案竟然是 1.6 版最流行,这个版本已经有 15 年的历史的。这什么这个版本是最为流行的,因为它是包含在亚马逊 Kindle 阅读器中的版本,它是电子书阅读器,因为它运行在全球范围内超过 1 亿台的设备上。另外一个在消费类设备上运行 syslog-ng 的例子是 BMW i3 电动汽车。
Kindle 使用 syslog-ng 去收集关于用户在这台设备上都做了些什么事情的所有可能的信息。在 BMW 电动汽车上syslog-ng 所做的事情更复杂,基于内容过滤日志信息,并且在大多数情况下,只记录最重要的日志。
使用 syslog-ng 的其它类别设备还有网络和存储。一些比较知名的例子有Turris Omnia 开源 Linux 路由器和群晖 NAS 设备。在大多数案例中syslog-ng 是在设备上作为一个日志客户端来运行,但是在有些案例中,它运行为一个有富 Web 界面的中心日志服务器。
你还可以在一些行业服务中找到 syslog-ng 的身影。它运行在来自美国国家仪器有限公司NI的实时 Linux 设备上,执行测量和自动化任务。它也被用于从定制开发的应用程序中收集日志。从命令行就可以做配置,但是一个漂亮的 GUI 可用于浏览日志。
最后这里还有大量的项目比如汽车和飞机syslog-ng 在它们上面既可以运行为客户端也可以运行为服务端。在这种使用案例中syslog-ng 一般用来收集所有的日志和测量数据,然后发送它们到处理这些日志的中心化服务器集群上,然后保存它们到支持大数据的目的地,以备进一步分析。
### 对物联网的整体益处
在物联网环境中使用 syslog-ng 有几个好处。第一,它的分发性能很高,并且是一个可靠的日志收集器。第二,它的架构也很简单,因此,系统、应用程序日志、以及测量数据可以被一起收集。第三,它使数据易于使用,因为,数据可以被解析和表示为易于使用的格式。最后,通过 syslog-ng 的高效路由和过滤功能,可以显著降低处理程序的负载水平。
--------------------------------------------------------------------------------
via: https://opensource.com/article/18/3/logging-iot-events-syslog-ng
作者:[Peter Czanik][a]
译者:[译者ID](https://github.com/译者ID)
校对:[qhwdw](https://github.com/qhwdw)
选题:[lujun9972](https://github.com/lujun9972)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]:https://opensource.com/users/czanik
[1]:https://syslog-ng.com/open-source-log-management
[2]:https://syslog-ng.com/documents/html/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/sources.html
[3]:https://en.wikipedia.org/wiki/Flat_file_database