Robust and Race-free Server Logging using Named Pipes

If you do any server administration work, you'll have worked with log files. And if your servers need to be reliable, you'll know that log files are a common source of problems, especially when you need to rotate or ship them (which is practically always). In particular, moving files around causes race conditions.

Thankfully, there are better ways. With named pipes, you can have a simple and robust logging framework, with no race conditions, and without patching your servers to support some network logging protocol.

The problems with rotating log files

First, let's talk about the problems. Race conditions are generally a problem with popular file-based logging setups, whether you're rotating logs into archival storage, or shipping them to a remote log processing stack, or whatever. To keep things concrete, though, let me talk about logrotate, just because it's a popular tool.

Say you have a log file at /var/log/foo. It gets pretty big, and you want to process the logs periodically and start with a new, empty file. So you (or your distro maintainers) set up logrotate with various rules about when to rotate the file.

By default, logrotate will rename the file (to something like /var/log/foo.1) and create a new /var/log/foo to write to. That (mostly) works for software that runs intermittently (such as a package manager that does software updates). But it won't do any good if the log file is generated by a long-running server. The server only uses the filename when it opens the file; after that it just keeps writing to its open file descriptor. That means it will keep writing to the old file (now named /var/log/foo.1), and the new /var/log/foo file will stay empty.
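
For reference, a minimal logrotate rule for this default rename-and-create mode might look like the sketch below (the path and schedule are illustrative, not from any particular setup):

/var/log/foo {
    weekly
    rotate 4
    # Rename foo to foo.1 (shifting older archives along),
    # then create a fresh, empty foo
    create
}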

To handle this use-case, logrotate supports another mode: copytruncate. In this mode, instead of renaming, logrotate will copy the contents of /var/log/foo to an archival file, and then truncate the original file to zero length. As long as the server has the log file open in append mode, it will automatically write new logs to the start of the file, without needing to detect the truncation and do a file seek (the kernel handles that).
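
In config terms, that just means swapping the create directive in the earlier sketch for copytruncate:

/var/log/foo {
    weekly
    rotate 4
    # Copy foo's contents to foo.1, then truncate foo in place,
    # so the server can keep using its open file descriptor
    copytruncate
}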

That copytruncate mode creates a race condition, though: any log lines written after the copy but before the truncation get destroyed. Actually, you tend to get the same race condition even with the default move-and-create mode. That's because there's not much point just splitting the logs into multiple files; most systems are configured to do something like compress the old log file, but ultimately you need to delete the old, uncompressed data, which creates the same race as truncating. (In practice, this race isn't so likely for occasional log writers, like package managers, and the delaycompress flag to logrotate makes it rarer, albeit by making the log handling a bit more complicated.)

Some servers, like Nginx, support a modification of the default logrotate mode:

  1. Rename the old file
  2. Create the new file
  3. (New step) Notify the server that it needs to reopen its log file

This works (as long as the logs processor doesn't delete the old file before the server has finished reopening), but it requires special support from the server, and you're out of luck with most software. There's a lot of software out there, and log file handling just isn't interesting enough to get high on the to-do list. This approach also only works for long-running servers.
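
With logrotate, that notification step usually goes into a postrotate hook. Here's a sketch for Nginx, which reopens its log files when it receives SIGUSR1 (the PID file path is the conventional one and may differ on your system):

/var/log/nginx/*.log {
    daily
    rotate 14
    postrotate
        # Step 3: ask the server to reopen its log files
        kill -USR1 "$(cat /var/run/nginx.pid)"
    endscript
}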

I think this is a good point to stop and take a step back. Having multiple processes juggle log files around on disk without any synchronisation is just an inherently painful way to do things. It causes bugs and makes logging stacks complicated (here's just one of many examples). One alternative is to use some network protocol like MQTT or networked syslog, but, realistically, most servers won't support the one you want. And they shouldn't have to: log files are a great interface for log writers.

That's okay, because the *nix “everything is a file” model lets us easily get a file interface on the writer side, with a streaming interface on the reader side.

Named pipes 101

Maybe you've seen pipes in pipelines like this:

$ sort user_log.txt | uniq

The pipe connecting sort and uniq is a temporary, anonymous communication channel that sort writes to and uniq reads from. Named pipes are less common, but they're also communication channels. The only difference is that they persist on the filesystem as if they were files.

Open up a terminal and cd into some temporary working directory. The following creates a named pipe and uses cat to open a writer:

$ mkfifo p
$ # This cat command will sit waiting for input
$ cat > p

Leave that cat command waiting, and open up another terminal in the same directory. In this terminal, start your reader:

$ # This will sit waiting for data to come over the pipe
$ cat p

Now, as you type things into the writer end, you'll see them appear in the reader end. cat uses line buffering in interactive mode, so data gets transferred every time you start a new line.

cat doesn't have to know anything about pipes for this to work; the pipe acts like a file as long as you just naïvely read or write to it. But if you check, you'll see the data isn't stored anywhere: you can pump gigabytes through a pipe without filling up any disk space. Once the data has been read once, it's lost. (You can have multiple readers, but only one will receive any given buffer-load of data.)

Another thing that makes pipes useful for communication is their buffering and blocking. You can start writing before any readers open the pipe, and data gets temporarily buffered inside the kernel until a reader comes along. If the reader starts first, its read will block, waiting for data from the writer. (The writer will also block if the pipe buffer gets full.) If you try the two-terminal experiment again with a regular file, you'll see that the reader cat will eagerly read all the data it can and then exit.
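
You can see that buffering and blocking from a single terminal. In this sketch, the writer starts first, in the background, and simply waits until a reader comes along:

$ # The writer blocks until a reader opens the pipe
$ echo 'Hello from the writer' > p &
$ # The reader drains the pipe, and both commands finish
$ cat p
Hello from the writer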

An annoying problem and a simple solution

Maybe you're seeing how named pipes can help with logging: servers can write to log “files” that are actually named pipes, and a logging stack can read log data directly from the named pipe without letting a single line fall onto the floor. You do whatever you want with the logs, without any racy juggling of files on disk.
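
As a sketch (with illustrative paths and a toy “stack”), the swap can be as simple as replacing the log file with a pipe and pointing any program you like at it:

$ # One-off setup: the log "file" is now a named pipe
$ sudo mkfifo /var/log/foo
$ # A stand-in for a real logging stack: compress every line
$ # the server writes, as it arrives
$ gzip < /var/log/foo > /var/log/foo-archive.gz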

There's one annoying problem: the writer doesn't need a reader to start writing, but if a reader opens the pipe and then closes it, the writer gets a SIGPIPE (“broken pipe”), which will kill it by default. (Try killing the reader cat while typing things into the writer to see what I mean.) Similarly, a reader can read without a writer, but if a writer opens the pipe and then closes it, that will be treated like an end of file. Although the named pipe persists on disk, it isn't a stable communication channel if log writers and log readers can restart (as they will on a real server).

There's a solution that's a bit weird but very simple: multiple processes can open the pipe for reading and writing, and the pipe only closes when all readers or all writers close it. So all we need for a stable logging pipe is a daemon that holds the named pipe open for both reading and writing, without doing any actual reading or writing. I set this up on my personal server last year, and I wrote a tiny, zero-config program to act as my pipe-holding daemon.

It just opens every file in its current working directory for both reading and writing, so I run it from a directory that has symbolic links to every named pipe in my logging stack. The program runs in a loop that ends in a wait() for a SIGHUP. If I ever update the symlinks in the directory, I give the daemon a kill -HUP and it reopens them all. Sure, it could do its own directory watching, but the SIGHUP approach is simple and predictable, and the whole thing works reliably. Thanks to the pipe buffer, log writers and log readers can be shut down and restarted independently, any time, without breakage.
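
My program is trivial, but a rough sketch of the same idea looks like this (assuming bash 4.1+ for the {fd} redirection syntax; this is an illustration, not my actual daemon):

#!/bin/bash
# Hold every file in the current directory open for both reading and
# writing, so the named pipes never lose their last reader or writer.

declare -a fds

open_all() {
    local p
    for p in ./*; do
        # {fd}<> tells bash to pick a free descriptor and open read-write
        exec {fd}<>"$p"
        fds+=("$fd")
    done
}

close_all() {
    local fd
    for fd in "${fds[@]}"; do
        exec {fd}>&-
    done
    fds=()
}

open_all
# On SIGHUP, close and reopen everything (e.g. after the symlinks change)
trap 'close_all; open_all' HUP

# Block forever; each wait returns when a signal interrupts it
while :; do
    sleep 86400 & wait $!
done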

My server uses the s6 supervision suite to manage daemons, so I have s6-log reading from each logging pipe. The bottom part of the s6-log documentation page has some good insights into the problems with popular logging systems, and good ideas about better ways to do things.
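
As an illustration (the paths and the logging-script arguments are just examples), the s6 run script for one of those log readers can be a one-liner:

#!/bin/sh
# Read from the logging pipe (held open by the pipe-holding daemon)
# and let s6-log handle timestamping and size-based archiving
exec s6-log -b n20 s1000000 t /var/log/foo-logs < /var/log/pipes/foo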

Imagine: a world without log rotation

Strictly speaking, named pipes aren't necessary for race-free log processing. The s6 suite encourages writing logs to some file descriptor (like standard error), and letting the supervision suite make sure those file descriptors point to something useful. However, the named pipe approach adds a few benefits:

  • It doesn't require any co-ordination between writer and reader
  • It integrates nicely with the software we have today
  • It gives things meaningful names (rather than /dev/fd/4)

I've worked with companies that spend about as much on their logging stacks as on their serving infrastructure, and, no, “we do logs processing” isn't in their business models. Of course, log rotation and log shipping aren't the only problems to blame, but it feels so wrong that we've made logs so complicated. If you work on any logging system, consider whether you really need to juggle log files around. You could be helping to make the world a better place.


via: https://theartofmachinery.com/2020/10/10/logging_with_named_pipes.html

Author: Simon Arneaud · Topic selection: lujun9972 · Translator: 译者ID · Proofreader: 校对者ID

This article was originally compiled by LCTT and is proudly presented by Linux中国 (Linux.cn).