[Translated] 20190415 Inter-process communication in Linux- Shared storage.md

Signed-off-by: Chang Liu <liuchang011235@163.com>
Chang Liu 2019-04-24 08:30:18 +08:00
parent 0cfeecc593
commit 088ce7f807
2 changed files with 435 additions and 419 deletions


@@ -1,419 +0,0 @@
[#]: collector: (lujun9972)
[#]: translator: (FSSlc)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Inter-process communication in Linux: Shared storage)
[#]: via: (https://opensource.com/article/19/4/interprocess-communication-linux-storage)
[#]: author: (Marty Kalin https://opensource.com/users/mkalindepauledu)
Inter-process communication in Linux: Shared storage
======
Learn how processes synchronize with each other in Linux.
![Filing papers and documents][1]
This is the first article in a series about [interprocess communication][2] (IPC) in Linux. The series uses code examples in C to clarify the following IPC mechanisms:
* Shared files
* Shared memory (with semaphores)
* Pipes (named and unnamed)
* Message queues
* Sockets
* Signals
This article reviews some core concepts before moving on to the first two of these mechanisms: shared files and shared memory.
### Core concepts
A _process_ is a program in execution, and each process has its own address space, which comprises the memory locations that the process is allowed to access. A process has one or more _threads_ of execution, which are sequences of executable instructions: a _single-threaded_ process has just one thread, whereas a _multi-threaded_ process has more than one thread. Threads within a process share various resources, in particular, address space. Accordingly, threads within a process can communicate straightforwardly through shared memory, although some modern languages (e.g., Go) encourage a more disciplined approach such as the use of thread-safe channels. Of interest here is that different processes, by default, do _not_ share memory.
There are various ways to launch processes that then communicate, and two ways dominate in the examples that follow:
* A terminal is used to start one process, and perhaps a different terminal is used to start another.
* The system function **fork** is called within one process (the parent) to spawn another process (the child).
The first examples take the terminal approach. The [code examples][3] are available in a ZIP file on my website.
### Shared files
Programmers are all too familiar with file access, including the many pitfalls (non-existent files, bad file permissions, and so on) that beset the use of files in programs. Nonetheless, shared files may be the most basic IPC mechanism. Consider the relatively simple case in which one process ( _producer_ ) creates and writes to a file, and another process ( _consumer_ ) reads from this same file:
```
writes +-----------+ reads
producer-------->| disk file |<-------consumer
+-----------+
```
The obvious challenge in using this IPC mechanism is that a _race condition_ might arise: the producer and the consumer might access the file at exactly the same time, thereby making the outcome indeterminate. To avoid a race condition, the file must be locked in a way that prevents a conflict between a _write_ operation and any another operation, whether a _read_ or a _write_. The locking API in the standard system library can be summarized as follows:
* A producer should gain an exclusive lock on the file before writing to the file. An _exclusive_ lock can be held by one process at most, which rules out a race condition because no other process can access the file until the lock is released.
* A consumer should gain at least a shared lock on the file before reading from it. Multiple _readers_ can hold a _shared_ lock at the same time, but no _writer_ can access the file while even a single _reader_ holds a shared lock.
A shared lock promotes efficiency. If one process is just reading a file and not changing its contents, there is no reason to prevent other processes from doing the same. Writing, however, clearly demands exclusive access to a file.
The system-call API includes a utility function named **fcntl** that can be used to inspect and manipulate both exclusive and shared locks on a file. The function works through a _file descriptor_, a non-negative integer value that, within a process, identifies a file. (Different file descriptors in different processes may identify the same physical file.) For simpler whole-file locking, Linux also provides the library function **flock**. The first example uses the **fcntl** function to expose API details.
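For comparison, here is a minimal sketch of whole-file locking with **flock**; the file name, the data written, and the error handling are illustrative and not taken from the article's examples.
```
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/file.h>   /* flock */

int main() {
  int fd = open("data.dat", O_RDWR | O_CREAT, 0666);
  if (fd < 0) { perror("open"); exit(EXIT_FAILURE); }

  if (flock(fd, LOCK_EX) < 0)          /* exclusive lock; LOCK_SH would be a shared lock */
    perror("flock LOCK_EX");
  else {
    write(fd, "locked write\n", 13);   /* do the protected work */
    flock(fd, LOCK_UN);                /* release the lock */
  }
  close(fd);                           /* closing the descriptor also releases the lock */
  return 0;
}
```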
#### Example 1. The _producer_ program
```
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#define FileName "data.dat"
#define DataString "Now is the winter of our discontent\nMade glorious summer by this sun of York\n"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1); /* EXIT_FAILURE */
}
int main() {
struct flock lock;
lock.l_type = F_WRLCK; /* read/write (exclusive versus shared) lock */
lock.l_whence = SEEK_SET; /* base for seek offsets */
lock.l_start = 0; /* 1st byte in file */
lock.l_len = 0; /* 0 here means 'until EOF' */
lock.l_pid = getpid(); /* process id */
int fd; /* file descriptor to identify a file within a process */
if ((fd = open(FileName, O_RDWR | O_CREAT, 0666)) < 0) /* -1 signals an error */
report_and_exit("open failed...");
if (fcntl(fd, F_SETLK, &lock) < 0) /** F_SETLK doesn't block, F_SETLKW does **/
report_and_exit("fcntl failed to get lock...");
else {
write(fd, DataString, strlen(DataString)); /* populate data file */
fprintf(stderr, "Process %d has written to data file...\n", lock.l_pid);
}
/* Now release the lock explicitly. */
lock.l_type = F_UNLCK;
if (fcntl(fd, F_SETLK, &lock) < 0)
report_and_exit("explicit unlocking failed...");
close(fd); /* close the file: would unlock if needed */
return 0; /* terminating the process would unlock as well */
}
```
The main steps in the _producer_ program above can be summarized as follows:
* The program declares a variable of type **struct flock**, which represents a lock, and initializes the structure's five fields. The first initialization: `lock.l_type = F_WRLCK; /* exclusive lock */` makes the lock an exclusive (_read-write_) rather than a shared (_read-only_) lock. If the _producer_ gains the lock, then no other process will be able to write or read the file until the _producer_ releases the lock, either explicitly with the appropriate call to **fcntl** or implicitly by closing the file. (When the process terminates, any opened files would be closed automatically, thereby releasing the lock.)
* The program then initializes the remaining fields. The chief effect is that the _entire_ file is to be locked. However, the locking API allows only designated bytes to be locked. For example, if the file contains multiple text records, then a single record (or even part of a record) could be locked and the rest left unlocked. (A short sketch of byte-range locking follows this list.)
* The first call to **fcntl**: `if (fcntl(fd, F_SETLK, &lock) < 0)` tries to lock the file exclusively, checking whether the call succeeded. In general, the **fcntl** function returns **-1** (hence, less than zero) to indicate failure. The second argument **F_SETLK** means that the call to **fcntl** does _not_ block: the function returns immediately, either granting the lock or indicating failure. If the flag **F_SETLKW** (the **W** at the end is for _wait_) were used instead, the call to **fcntl** would block until gaining the lock was possible. In the calls to **fcntl**, the first argument **fd** is the file descriptor, the second argument specifies the action to be taken (in this case, **F_SETLK** for setting the lock), and the third argument is the address of the lock structure (in this case, **&lock**).
* If the _producer_ gains the lock, the program writes two text records to the file.
* After writing to the file, the _producer_ changes the lock structure's **l_type** field to the _unlock_ value: `lock.l_type = F_UNLCK;` and calls **fcntl** to perform the unlocking operation. The program finishes up by closing the file and exiting.
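As noted in the walkthrough, the lock need not cover the whole file. Here is a minimal sketch of locking just a byte range; the 128-byte length and the assumption that _data.dat_ already exists are illustrative, not from the article's code.
```
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main() {
  int fd = open("data.dat", O_RDWR);  /* assumes the file already exists */
  if (fd < 0) { perror("open"); return 1; }

  struct flock lock;
  lock.l_type = F_WRLCK;     /* exclusive lock */
  lock.l_whence = SEEK_SET;  /* offsets measured from the start of the file */
  lock.l_start = 0;          /* first byte of the locked region */
  lock.l_len = 128;          /* lock only the first 128 bytes */
  lock.l_pid = getpid();

  if (fcntl(fd, F_SETLK, &lock) < 0)
    perror("byte-range lock failed");
  else {
    /* ...update the first record here... */
    lock.l_type = F_UNLCK;   /* then release just that region */
    fcntl(fd, F_SETLK, &lock);
  }
  close(fd);
  return 0;
}
```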
#### Example 2. The _consumer_ program
```
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#define FileName "data.dat"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1); /* EXIT_FAILURE */
}
int main() {
struct flock lock;
lock.l_type = F_WRLCK; /* read/write (exclusive) lock */
lock.l_whence = SEEK_SET; /* base for seek offsets */
lock.l_start = 0; /* 1st byte in file */
lock.l_len = 0; /* 0 here means 'until EOF' */
lock.l_pid = getpid(); /* process id */
int fd; /* file descriptor to identify a file within a process */
if ((fd = open(FileName, O_RDONLY)) < 0) /* -1 signals an error */
report_and_exit("open to read failed...");
/* If the file is write-locked, we can't continue. */
fcntl(fd, F_GETLK, &lock); /* sets lock.l_type to F_UNLCK if no write lock */
if (lock.l_type != F_UNLCK)
report_and_exit("file is still write locked...");
lock.l_type = F_RDLCK; /* prevents any writing during the reading */
if (fcntl(fd, F_SETLK, &lock) < 0)
report_and_exit("can't get a read-only lock...");
/* Read the bytes (they happen to be ASCII codes) one at a time. */
int c; /* buffer for read bytes */
while (read(fd, &c, 1) > 0) /* 0 signals EOF */
write(STDOUT_FILENO, &c, 1); /* write one byte to the standard output */
/* Release the lock explicitly. */
lock.l_type = F_UNLCK;
if (fcntl(fd, F_SETLK, &lock) < 0)
report_and_exit("explicit unlocking failed...");
close(fd);
return 0;
}
```
The _consumer_ program is more complicated than necessary to highlight features of the locking API. In particular, the _consumer_ program first checks whether the file is exclusively locked and only then tries to gain a shared lock. The relevant code is:
```
lock.l_type = F_WRLCK;
...
fcntl(fd, F_GETLK, &lock); /* sets lock.l_type to F_UNLCK if no write lock */
if (lock.l_type != F_UNLCK)
report_and_exit("file is still write locked...");
```
The **F_GETLK** operation specified in the **fcntl** call checks for a lock, in this case, an exclusive lock given as **F_WRLCK** in the first statement above. If the specified lock does not exist, then the **fcntl** call automatically changes the lock type field to **F_UNLCK** to indicate this fact. If the file is exclusively locked, the _consumer_ terminates. (A more robust version of the program might have the _consumer_ **sleep** a bit and try again several times.)
If the file is not currently locked, then the _consumer_ tries to gain a shared ( _read-only_ ) lock ( **F_RDLCK** ). To shorten the program, the **F_GETLK** call to **fcntl** could be dropped because the **F_RDLCK** call would fail if a _read-write_ lock already were held by some other process. Recall that a _read-only_ lock does prevent any other process from writing to the file, but allows other processes to read from the file. In short, a _shared_ lock can be held by multiple processes. After gaining a shared lock, the _consumer_ program reads the bytes one at a time from the file, prints the bytes to the standard output, releases the lock, closes the file, and terminates.
Here is the output from the two programs launched from the same terminal with **%** as the command line prompt:
```
% ./producer
Process 29255 has written to data file...
% ./consumer
Now is the winter of our discontent
Made glorious summer by this sun of York
```
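For reference, the two programs build with a plain **gcc** invocation; the source file names here are assumptions:
```
% gcc -o producer producer.c
% gcc -o consumer consumer.c
```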
In this first code example, the data shared through IPC is text: two lines from Shakespeare's play _Richard III_. Yet, the shared file's contents could be voluminous, arbitrary bytes (e.g., a digitized movie), which makes file sharing an impressively flexible IPC mechanism. The downside is that file access is relatively slow, whether the access involves reading or writing. As always, programming comes with tradeoffs. The next example has the upside of IPC through shared memory, rather than shared files, with a corresponding boost in performance.
### Shared memory
Linux systems provide two separate APIs for shared memory: the legacy System V API and the more recent POSIX one. These APIs should never be mixed in a single application, however. A downside of the POSIX approach is that features are still in development and dependent upon the installed kernel version, which impacts code portability. For example, the POSIX API, by default, implements shared memory as a _memory-mapped file_ : for a shared memory segment, the system maintains a _backing file_ with corresponding contents. Shared memory under POSIX can be configured without a backing file, but this may impact portability. My example uses the POSIX API with a backing file, which combines the benefits of memory access (speed) and file storage (persistence).
The shared-memory example has two programs, named _memwriter_ and _memreader_ , and uses a _semaphore_ to coordinate their access to the shared memory. Whenever shared memory comes into the picture with a _writer_ , whether in multi-processing or multi-threading, so does the risk of a memory-based race condition; hence, the semaphore is used to coordinate (synchronize) access to the shared memory.
The _memwriter_ program should be started first in its own terminal. The _memreader_ program then can be started (within a dozen seconds) in its own terminal. The output from the _memreader_ is:
```
This is the way the world ends...
```
Each source file has documentation at the top explaining the link flags to be included during compilation.
Let's start with a review of how semaphores work as a synchronization mechanism. A general semaphore also is called a _counting semaphore_ , as it has a value (typically initialized to zero) that can be incremented. Consider a shop that rents bicycles, with a hundred of them in stock, with a program that clerks use to do the rentals. Every time a bike is rented, the semaphore is incremented by one; when a bike is returned, the semaphore is decremented by one. Rentals can continue until the value hits 100 but then must halt until at least one bike is returned, thereby decrementing the semaphore to 99.
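To make the counting behavior concrete, here is a toy sketch (not part of the article's examples) that uses an unnamed POSIX semaphore from the same **semaphore.h** API; the bike-rental numbers are illustrative, and the program links with **-lpthread**:
```
#include <stdio.h>
#include <semaphore.h>

int main() {
  sem_t rented;                 /* counts bikes currently rented out */
  sem_init(&rented, 0, 0);      /* not shared across processes (0), initial value 0 */

  sem_post(&rented);            /* a bike goes out: count becomes 1 */
  sem_post(&rented);            /* another rental: count becomes 2 */

  int count;
  sem_getvalue(&rented, &count);
  printf("bikes out: %d\n", count);  /* prints 2 */

  sem_wait(&rented);            /* a bike comes back: count drops to 1 */
  sem_getvalue(&rented, &count);
  printf("bikes out: %d\n", count);  /* prints 1 */

  sem_destroy(&rented);
  return 0;
}
```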
A _binary semaphore_ is a special case requiring only two values: 0 and 1. In this situation, a semaphore acts as a _mutex_ : a mutual exclusion construct. The shared-memory example uses a semaphore as a mutex. When the semaphore's value is 0, the _memwriter_ alone can access the shared memory. After writing, this process increments the semaphore's value, thereby allowing the _memreader_ to read the shared memory.
#### Example 3. Source code for the _memwriter_ process
```
/** Compilation: gcc -o memwriter memwriter.c -lrt -lpthread **/
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <semaphore.h>
#include <string.h>
#include "shmem.h"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1);
}
int main() {
int fd = shm_open(BackingFile, /* name from shmem.h */
O_RDWR | O_CREAT, /* read/write, create if needed */
AccessPerms); /* access permissions (0644) */
if (fd < 0) report_and_exit("Can't open shared mem segment...");
ftruncate(fd, ByteSize); /* get the bytes */
caddr_t memptr = mmap(NULL, /* let system pick where to put segment */
ByteSize, /* how many bytes */
PROT_READ | PROT_WRITE, /* access protections */
MAP_SHARED, /* mapping visible to other processes */
fd, /* file descriptor */
0); /* offset: start at 1st byte */
if ((caddr_t) -1 == memptr) report_and_exit("Can't get segment...");
fprintf(stderr, "shared mem address: %p [0..%d]\n", memptr, ByteSize - 1);
fprintf(stderr, "backing file: /dev/shm%s\n", BackingFile);
/* semaphore code to lock the shared mem */
sem_t* semptr = sem_open(SemaphoreName, /* name */
O_CREAT, /* create the semaphore */
AccessPerms, /* protection perms */
0); /* initial value */
if (semptr == SEM_FAILED) report_and_exit("sem_open");
strcpy(memptr, MemContents); /* copy some ASCII bytes to the segment */
/* increment the semaphore so that memreader can read */
if (sem_post(semptr) < 0) report_and_exit("sem_post");
sleep(12); /* give reader a chance */
/* clean up */
munmap(memptr, ByteSize); /* unmap the storage */
close(fd);
sem_close(semptr);
shm_unlink(BackingFile); /* unlink from the backing file */
return 0;
}
```
Here's an overview of how the _memwriter_ and _memreader_ programs communicate through shared memory:
* The _memwriter_ program, shown above, calls the **shm_open** function to get a file descriptor for the backing file that the system coordinates with the shared memory. At this point, no memory has been allocated. The subsequent call to the misleadingly named function **ftruncate**: `ftruncate(fd, ByteSize); /* get the bytes */` allocates **ByteSize** bytes, in this case, a modest 512 bytes. The _memwriter_ and _memreader_ programs access the shared memory only, not the backing file. The system is responsible for synchronizing the shared memory and the backing file.
* The _memwriter_ then calls the **mmap** function:
```
caddr_t memptr = mmap(NULL, /* let system pick where to put segment */
ByteSize, /* how many bytes */
PROT_READ | PROT_WRITE, /* access protections */
MAP_SHARED, /* mapping visible to other processes */
fd, /* file descriptor */
0); /* offset: start at 1st byte */
```
to get a pointer to the shared memory. (The _memreader_ makes a similar call.) The pointer type **caddr_t** is a legacy _core address_ type, essentially an alias for a **char** pointer. The _memwriter_ uses the **memptr** for the later _write_ operation, using the library **strcpy** (string copy) function.
* At this point, the _memwriter_ is ready for writing, but it first creates a semaphore to ensure exclusive access to the shared memory. A race condition would occur if the _memwriter_ were writing while the _memreader_ was reading. If the call to **sem_open** succeeds:
```
sem_t* semptr = sem_open(SemaphoreName, /* name */
O_CREAT, /* create the semaphore */
AccessPerms, /* protection perms */
0); /* initial value */
```
then the writing can proceed. The **SemaphoreName** (any unique non-empty name will do) identifies the semaphore in both the _memwriter_ and the _memreader_. The initial value of zero gives the semaphore's creator, in this case the _memwriter_, the right to proceed, in this case to the _write_ operation.
* After writing, the _memwriter_ increments the semaphore value to 1: `if (sem_post(semptr) < 0)` with a call to the **sem_post** function. Incrementing the semaphore releases the mutex lock and enables the _memreader_ to perform its _read_ operation. For good measure, the _memwriter_ also unmaps the shared memory from the _memwriter_ address space: `munmap(memptr, ByteSize); /* unmap the storage */` This bars the _memwriter_ from further access to the shared memory.
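Both programs include a small header, **shmem.h**, which the article does not list. Based on the names used above, it might look roughly like the sketch below; apart from the 512-byte size, the 0644 permissions, and the _shMemEx_ name mentioned in the text, the values (notably the semaphore name) are assumptions.
```
/** shmem.h: a guess at the shared definitions (not shown in the article) **/
#define BackingFile   "/shMemEx"        /* appears on Linux as /dev/shm/shMemEx */
#define SemaphoreName "/mySemaphore"    /* any unique non-empty name will do */
#define AccessPerms   0644              /* access permissions */
#define ByteSize      512               /* size of the shared segment */
#define MemContents   "This is the way the world ends...\n"
```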
#### Example 4. Source code for the _memreader_ process
```
/** Compilation: gcc -o memreader memreader.c -lrt -lpthread **/
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <semaphore.h>
#include <string.h>
#include "shmem.h"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1);
}
int main() {
int fd = shm_open(BackingFile, O_RDWR, AccessPerms); /* empty to begin */
if (fd < 0) report_and_exit("Can't get file descriptor...");
/* get a pointer to memory */
caddr_t memptr = mmap(NULL, /* let system pick where to put segment */
ByteSize, /* how many bytes */
PROT_READ | PROT_WRITE, /* access protections */
MAP_SHARED, /* mapping visible to other processes */
fd, /* file descriptor */
0); /* offset: start at 1st byte */
if ((caddr_t) -1 == memptr) report_and_exit("Can't access segment...");
/* create a semaphore for mutual exclusion */
sem_t* semptr = sem_open(SemaphoreName, /* name */
O_CREAT, /* create the semaphore */
AccessPerms, /* protection perms */
0); /* initial value */
if (semptr == SEM_FAILED) report_and_exit("sem_open");
/* use semaphore as a mutex (lock) by waiting for writer to increment it */
if (!sem_wait(semptr)) { /* wait until semaphore != 0 */
int i;
for (i = 0; i < strlen(MemContents); i++)
write(STDOUT_FILENO, memptr + i, 1); /* one byte at a time */
sem_post(semptr);
}
/* cleanup */
munmap(memptr, ByteSize);
close(fd);
sem_close(semptr);
shm_unlink(BackingFile); /* remove the backing file */
return 0;
}
```
In both the _memwriter_ and _memreader_ programs, the shared-memory functions of main interest are **shm_open** and **mmap** : on success, the first call returns a file descriptor for the backing file, which the second call then uses to get a pointer to the shared memory segment. The calls to **shm_open** are similar in the two programs except that the _memwriter_ program creates the shared memory, whereas the _memreader_ only accesses this already created memory:
```
int fd = shm_open(BackingFile, O_RDWR | O_CREAT, AccessPerms); /* memwriter */
int fd = shm_open(BackingFile, O_RDWR, AccessPerms); /* memreader */
```
With a file descriptor in hand, the calls to **mmap** are the same:
```
caddr_t memptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
```
The first argument to **mmap** is **NULL** , which means that the system determines where to allocate the memory in virtual address space. It's possible (but tricky) to specify an address instead. The **MAP_SHARED** flag indicates that the allocated memory is shareable among processes, and the last argument (in this case, zero) means that the offset for the shared memory should be the first byte. The **size** argument specifies the number of bytes to be allocated (in this case, 512), and the protection argument indicates that the shared memory can be written and read.
When the _memwriter_ program executes successfully, the system creates and maintains the backing file; on my system, the file is _/dev/shm/shMemEx_ , with _shMemEx_ as my name (given in the header file _shmem.h_ ) for the shared storage. In the current version of the _memwriter_ and _memreader_ programs, the statement:
```
shm_unlink(BackingFile); /* removes backing file */
```
removes the backing file. If the **shm_unlink** statement is omitted, then the backing file persists after the program terminates.
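As a quick check (not part of the article's code), you can list the _tmpfs_ directory while the _memwriter_ is sleeping; the backing file _shMemEx_ should appear there and disappear once **shm_unlink** runs:
```
% ls -l /dev/shm
```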
The _memreader_ , like the _memwriter_ , accesses the semaphore through its name in a call to **sem_open**. But the _memreader_ then goes into a wait state until the _memwriter_ increments the semaphore, whose initial value is 0:
```
if (!sem_wait(semptr)) { /* wait until semaphore != 0 */
```
Once the wait is over, the _memreader_ reads the ASCII bytes from the shared memory, cleans up, and terminates.
The shared-memory API includes operations explicitly to synchronize the shared memory segment and the backing file. These operations have been omitted from the example to reduce clutter and keep the focus on the memory-sharing and semaphore code.
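For completeness, here is a hedged sketch of one such explicit synchronization call, using the names from the _memwriter_ above; it is not part of the example programs:
```
/* Flush the mapped segment to its backing file (not in the article's code). */
if (msync(memptr, ByteSize, MS_SYNC) < 0)
  report_and_exit("msync");
```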
The _memwriter_ and _memreader_ programs are likely to execute without inducing a race condition even if the semaphore code is removed: the _memwriter_ creates the shared memory segment and writes immediately to it; the _memreader_ cannot even access the shared memory until this has been created. However, best practice requires that shared-memory access is synchronized whenever a _write_ operation is in the mix, and the semaphore API is important enough to be highlighted in a code example.
### Wrapping up
The shared-file and shared-memory examples show how processes can communicate through _shared storage_, files in one case and memory segments in the other. The APIs for both approaches are relatively straightforward. Do these approaches have a common downside? Modern applications often deal with streaming data, indeed, with massively large streams of data. Neither the shared-file nor the shared-memory approach is well suited for massive data streams. Channels of one type or another are better suited. Part 2 thus introduces channels and message queues, again with code examples in C.
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/4/interprocess-communication-linux-storage
作者:[Marty Kalin][a]
选题:[lujun9972][b]
译者:[FSSlc](https://github.com/FSSlc)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/mkalindepauledu
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/documents_papers_file_storage_work.png?itok=YlXpAqAJ (Filing papers and documents)
[2]: https://en.wikipedia.org/wiki/Inter-process_communication
[3]: http://condor.depaul.edu/mkalin
[4]: http://www.opengroup.org/onlinepubs/009695399/functions/perror.html
[5]: http://www.opengroup.org/onlinepubs/009695399/functions/exit.html
[6]: http://www.opengroup.org/onlinepubs/009695399/functions/strlen.html
[7]: http://www.opengroup.org/onlinepubs/009695399/functions/fprintf.html
[8]: http://www.opengroup.org/onlinepubs/009695399/functions/strcpy.html


@@ -0,0 +1,435 @@
[#]: collector: (lujun9972)
[#]: translator: (FSSlc)
[#]: reviewer: ( )
[#]: publisher: ( )
[#]: url: ( )
[#]: subject: (Inter-process communication in Linux: Shared storage)
[#]: via: (https://opensource.com/article/19/4/interprocess-communication-linux-storage)
[#]: author: (Marty Kalin https://opensource.com/users/mkalindepauledu)
Linux 下的进程间通信:共享存储
======
学习在 Linux 中进程是如何与其他进程进行同步的。
![Filing papers and documents][1]
本篇是 Linux 下[进程间通信][2]IPC系列的第一篇文章。这个系列将使用 C 语言代码示例来阐明以下 IPC 机制:
* 共享文件
* 共享内存(使用信号量)
* 管道(命名的或非命名的管道)
* 消息队列
* 套接字
* 信号
在聚焦上面提到的共享文件和共享内存这两个机制之前,这篇文章将复习一些核心的概念。
### 核心概念
_进程_ 是运行着的程序,每个进程都有自己的地址空间,即该进程被允许访问的那些内存地址。进程有一个或多个执行 _线程_,而线程是一系列可执行指令的序列:_单线程_ 进程只有一个线程,而 _多线程_ 进程则有多个线程。一个进程中的各个线程共享多种资源,特别是地址空间。因此,一个进程中的线程可以直接通过共享内存来进行通信,尽管某些现代语言(例如 Go)鼓励一种更有章法的方式,例如使用线程安全的通道。这里值得注意的是:默认情况下,不同的进程之间 _不_ 共享内存。
启动进程后进行通信有多种方法,下面所举的例子中主要使用了下面的两种方法:
* 一个终端被用来启动一个进程,另外一个不同的终端被用来启动另一个。
* 在一个进程(父进程)中调用系统函数 **fork**,以此派生出另一个进程(子进程)。
第一个例子采用了上面使用终端的方法。这些[代码示例][3]的 ZIP 压缩包可以从我的网站下载到。
### 共享文件
程序员对文件访问应该都已经很熟悉了,包括许多坑(不存在的文件、糟糕的文件权限等等),这些问题常常困扰着程序中对文件的使用。尽管如此,共享文件可能是最为基础的 IPC 机制了。考虑一下下面这样一个相对简单的例子:其中一个进程(_生产者_)创建并写入一个文件,然后另一个进程(_消费者_)从这个文件中读取内容:
```
writes +-----------+ reads
producer-------->| disk file |<-------consumer
+-----------+
```
在使用这个 IPC 机制时,最明显的挑战是可能会发生 _竞争条件_:生产者和消费者可能恰好在同一时间访问该文件,从而使得输出结果不确定。为了避免竞争条件,必须以某种方式对文件加锁,以防止 _写_ 操作与任何其他操作(无论是 _读_ 还是 _写_)发生冲突。标准系统库中与锁相关的 API 可以总结如下:
* 生产者应该在写入文件之前获得该文件的排斥锁。一个 _排斥_ 锁最多只能被一个进程持有,这样就排除了竞争条件,因为在锁被释放之前没有其他进程可以访问这个文件。
* 消费者应该在从文件中读取内容之前至少获得该文件的共享锁。多个 _reader_ 可以同时持有一个 _共享_ 锁,但只要有一个 _reader_ 持有共享锁,任何 _writer_ 都无法访问该文件。
共享锁可以提升效率。假如一个进程只是读入一个文件的内容,而不去改变它的内容,就没有什么原因阻止其他进程来做同样的事。但如果需要写入内容,则很显然需要文件有排斥锁。
系统调用 API 中包含一个名为 **fcntl** 的实用函数,它可以用来检查和操作文件上的排斥锁和共享锁。该函数通过 _文件描述符_(进程内一个非负整数值)来标识文件。(不同进程中的不同文件描述符可能标识同一个物理文件。)对于更简单的整文件加锁Linux 还提供了库函数 **flock**。第一个例子使用 **fcntl** 函数来展示 API 的细节。
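作为对比,下面是一个使用 **flock** 对整个文件加锁的极简示意;其中的文件名、写入内容和错误处理只是示意,并非文章示例中的代码:
```c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/file.h>   /* flock */

int main() {
  int fd = open("data.dat", O_RDWR | O_CREAT, 0666);
  if (fd < 0) { perror("open"); exit(EXIT_FAILURE); }

  if (flock(fd, LOCK_EX) < 0)          /* 排斥锁LOCK_SH 则是共享锁 */
    perror("flock LOCK_EX");
  else {
    write(fd, "locked write\n", 13);   /* 在锁的保护下进行写入 */
    flock(fd, LOCK_UN);                /* 释放锁 */
  }
  close(fd);                           /* 关闭文件描述符同样会释放锁 */
  return 0;
}
```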
#### 示例 1. _生产者_ 程序
```c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#define FileName "data.dat"
#define DataString "Now is the winter of our discontent\nMade glorious summer by this sun of York\n"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1); /* EXIT_FAILURE */
}
int main() {
struct flock lock;
lock.l_type = F_WRLCK; /* read/write (exclusive versus shared) lock */
lock.l_whence = SEEK_SET; /* base for seek offsets */
lock.l_start = 0; /* 1st byte in file */
lock.l_len = 0; /* 0 here means 'until EOF' */
lock.l_pid = getpid(); /* process id */
int fd; /* file descriptor to identify a file within a process */
if ((fd = open(FileName, O_RDWR | O_CREAT, 0666)) < 0) /* -1 signals an error */
report_and_exit("open failed...");
if (fcntl(fd, F_SETLK, &lock) < 0) /** F_SETLK doesn't block, F_SETLKW does **/
report_and_exit("fcntl failed to get lock...");
else {
write(fd, DataString, strlen(DataString)); /* populate data file */
fprintf(stderr, "Process %d has written to data file...\n", lock.l_pid);
}
/* Now release the lock explicitly. */
lock.l_type = F_UNLCK;
if (fcntl(fd, F_SETLK, &lock) < 0)
report_and_exit("explicit unlocking failed...");
close(fd); /* close the file: would unlock if needed */
return 0; /* terminating the process would unlock as well */
}
```
上面 _生产者_ 程序的主要步骤可以总结如下:
* 这个程序首先声明了一个类型为 **struct flock** 的变量,它代表一个锁,并对它的 5 个域做了初始化。第一个初始化
```c
lock.l_type = F_WRLCK; /* exclusive lock */
```
使得这个锁成为排斥(_read-write_)锁,而不是共享(_read-only_)锁。假如 _生产者_ 获得了这个锁,那么在 _生产者_ 释放这个锁之前(既可以通过相应的 **fcntl** 调用显式地释放,也可以通过关闭文件隐式地释放),其他进程将既不能写入也不能读取这个文件。(当进程终止时,它打开的所有文件都会被自动关闭,从而释放锁。)
* 上面的程序接着初始化其他的域。主要的效果是 _整个_ 文件都将被锁上。不过,这套锁 API 也允许只锁定指定的字节。例如,假如文件包含多个文本记录,则可以只锁住单个记录(甚至一个记录的一部分),而其余部分不加锁。
* 第一次调用 **fcntl**
```c
if (fcntl(fd, F_SETLK, &lock) < 0)
```
尝试排斥性地锁住文件,并检查调用是否成功。一般来说,**fcntl** 函数返回 **-1**(因此小于 0意味着失败。第二个参数 **F_SETLK** 意味着这次 **fcntl** 调用 _不会_ 阻塞:函数会立即返回,要么成功获得锁,要么报告失败。假如改用标志 **F_SETLKW**(末尾的 **W** 代表 _等待wait_),那么对 **fcntl** 的调用将会阻塞,直到能够获得锁为止。在调用 **fcntl** 时,第一个参数 **fd** 是文件描述符,第二个参数指定要采取的动作(在本例中,**F_SETLK** 表示设置锁),第三个参数为锁结构的地址(在本例中是 **&lock**)。
* 假如 _生产者_ 获得了锁,这个程序将向文件写入两个文本记录。
* 在向文件写入内容后,_生产者_ 将锁结构中的 **l_type** 域改为 _unlock_ 值:
```c
lock.l_type = F_UNLCK;
```
并调用 **fcntl** 来执行解锁操作。最后程序关闭了文件并退出。
#### 示例 2. _消费者_ 程序
```c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#define FileName "data.dat"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1); /* EXIT_FAILURE */
}
int main() {
struct flock lock;
lock.l_type = F_WRLCK; /* read/write (exclusive) lock */
lock.l_whence = SEEK_SET; /* base for seek offsets */
lock.l_start = 0; /* 1st byte in file */
lock.l_len = 0; /* 0 here means 'until EOF' */
lock.l_pid = getpid(); /* process id */
int fd; /* file descriptor to identify a file within a process */
if ((fd = open(FileName, O_RDONLY)) < 0) /* -1 signals an error */
report_and_exit("open to read failed...");
/* If the file is write-locked, we can't continue. */
fcntl(fd, F_GETLK, &lock); /* sets lock.l_type to F_UNLCK if no write lock */
if (lock.l_type != F_UNLCK)
report_and_exit("file is still write locked...");
lock.l_type = F_RDLCK; /* prevents any writing during the reading */
if (fcntl(fd, F_SETLK, &lock) < 0)
report_and_exit("can't get a read-only lock...");
/* Read the bytes (they happen to be ASCII codes) one at a time. */
int c; /* buffer for read bytes */
while (read(fd, &c, 1) > 0) /* 0 signals EOF */
write(STDOUT_FILENO, &c, 1); /* write one byte to the standard output */
/* Release the lock explicitly. */
lock.l_type = F_UNLCK;
if (fcntl(fd, F_SETLK, &lock) < 0)
report_and_exit("explicit unlocking failed...");
close(fd);
return 0;
}
```
为了突出锁 API 的一些特性,_消费者_ 程序写得比必要的更复杂一点儿。特别是,_消费者_ 程序首先检查文件是否被排斥锁锁住,然后才尝试获得一个共享锁。相关的代码为:
```
lock.l_type = F_WRLCK;
...
fcntl(fd, F_GETLK, &lock); /* sets lock.l_type to F_UNLCK if no write lock */
if (lock.l_type != F_UNLCK)
report_and_exit("file is still write locked...");
```
**fcntl** 调用中的 **F_GETLK** 操作指定检查是否存在某个锁,在本例中,即上面第一条语句中给出的 **F_WRLCK** 排斥锁。假如指定的锁不存在,那么 **fcntl** 调用会自动把锁类型域改为 **F_UNLCK**,以此表明这一事实。假如文件被排斥锁锁住,那么 _消费者_ 将会终止。(一个更健壮的版本或许应该让 _消费者_ _睡_ 一会儿,然后再尝试几次。)
假如当前文件没有被锁住,那么 _消费者_ 将尝试获取一个共享(_read-only_)锁(**F_RDLCK**)。为了缩短程序,可以去掉对 **fcntl** 的 **F_GETLK** 调用,因为假如其他进程已经持有 _读写_ 锁,**F_RDLCK** 调用本身就会失败。回想一下,_只读_ 锁会阻止其他进程向文件写入,但允许其他进程读取文件。简而言之,_共享_ 锁可以被多个进程同时持有。在获取共享锁之后,_消费者_ 程序从文件中逐个字节地读取数据,把这些字节打印到标准输出,接着释放锁,关闭文件并终止。
下面是在同一个终端中先后运行这两个程序的输出,其中 **%** 为命令行提示符:
```
% ./producer
Process 29255 has written to data file...
% ./consumer
Now is the winter of our discontent
Made glorious summer by this sun of York
```
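作为参考,这两个程序用普通的 **gcc** 命令即可编译(这里假设源文件分别保存为 producer.c 和 consumer.c
```
% gcc -o producer producer.c
% gcc -o consumer consumer.c
```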
在这个代码示例中,通过 IPC 传输的数据是文本:莎士比亚戏剧《理查三世》中的两行台词。然而,共享文件的内容还可以是海量的、任意的字节数据(例如一部数字化电影),这使得文件共享成为一个非常灵活的 IPC 机制。它的缺点是文件访问相对较慢,无论是读还是写。同往常一样,编程总是伴随着折中。下面的例子将通过共享内存而不是共享文件来进行 IPC性能也会相应地有很大提升。
### 共享内存
对于共享内存Linux 系统提供了两套不同的 API传统的 System V API 和较新的 POSIX API。不过这两套 API 不应该在同一个应用中混用。POSIX 方式的一个缺点是其特性仍在发展之中,并且依赖于所安装的内核版本,这会影响代码的可移植性。例如,默认情况下POSIX API 以 _内存映射文件_ 的方式实现共享内存:对于一个共享内存段,系统会维护一个内容与之对应的 _备份文件_。POSIX 的共享内存也可以配置为不使用备份文件,但这可能会影响可移植性。我的例子使用的是带备份文件的 POSIX API这既有内存访问的速度优势又有文件存储的持久性。
下面的共享内存例子包含两个程序,分别名为 _memwriter_ 和 _memreader_,并使用一个 _信号量_ 来协调它们对共享内存的访问。只要共享内存的场景中出现了 _writer_(写者),无论是多进程还是多线程,都有出现基于内存的竞争条件的风险;因此,需要用信号量来协调(同步)对共享内存的访问。
_memwriter_ 程序应当首先在它自己的终端中启动,然后(在十几秒之内)在另一个终端中启动 _memreader_ 程序。_memreader_ 的输出如下:
```
This is the way the world ends...
```
在每个源程序的最上方注释部分都解释了在编译它们时需要添加的链接参数。
首先让我们复习一下信号量是如何作为同步机制工作的。一般的信号量也被叫做 _计数信号量_,因为它带有一个可以递增的值(通常初始化为 0。考虑一家出租自行车的商店库存中有 100 辆自行车,还有一个供店员办理租赁用的程序。每当一辆自行车被租出去,信号量就加 1当一辆自行车被还回来信号量就减 1。在信号量的值达到 100 之前都还可以继续办理租赁,而一旦达到 100就必须暂停业务直到至少有一辆自行车被还回来使信号量减为 99。
_二元信号量_ 是一个特例,它只有两个值: 0 和 1。在这种情况下信号量的表现为 _互斥量_(一个互斥的构造)。下面的共享内存示例将把信号量用作互斥量。当信号量的值为 0 时,只有 _memwriter_ 可以获取共享内存,在写操作完成后,这个进程将增加信号量的值,从而允许 _memreader_ 来读取共享内存。
#### 示例 3. _memwriter_ 进程的源程序
```c
/** Compilation: gcc -o memwriter memwriter.c -lrt -lpthread **/
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <semaphore.h>
#include <string.h>
#include "shmem.h"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1);
}
int main() {
int fd = shm_open(BackingFile, /* name from shmem.h */
O_RDWR | O_CREAT, /* read/write, create if needed */
AccessPerms); /* access permissions (0644) */
if (fd < 0) report_and_exit("Can't open shared mem segment...");
ftruncate(fd, ByteSize); /* get the bytes */
caddr_t memptr = mmap(NULL, /* let system pick where to put segment */
ByteSize, /* how many bytes */
PROT_READ | PROT_WRITE, /* access protections */
MAP_SHARED, /* mapping visible to other processes */
fd, /* file descriptor */
0); /* offset: start at 1st byte */
if ((caddr_t) -1 == memptr) report_and_exit("Can't get segment...");
fprintf(stderr, "shared mem address: %p [0..%d]\n", memptr, ByteSize - 1);
fprintf(stderr, "backing file: /dev/shm%s\n", BackingFile);
/* semaphore code to lock the shared mem */
sem_t* semptr = sem_open(SemaphoreName, /* name */
O_CREAT, /* create the semaphore */
AccessPerms, /* protection perms */
0); /* initial value */
if (semptr == SEM_FAILED) report_and_exit("sem_open");
strcpy(memptr, MemContents); /* copy some ASCII bytes to the segment */
/* increment the semaphore so that memreader can read */
if (sem_post(semptr) < 0) report_and_exit("sem_post");
sleep(12); /* give reader a chance */
/* clean up */
munmap(memptr, ByteSize); /* unmap the storage */
close(fd);
sem_close(semptr);
shm_unlink(BackingFile); /* unlink from the backing file */
return 0;
}
```
下面是 _memwriter_ 和 _memreader_ 程序如何通过共享内存来通信的一个总结:
* 上面展示的 _memwriter_ 程序调用 **shm_open** 函数,得到一个文件描述符,它对应于系统与共享内存保持同步的那个备份文件。此时还没有分配任何内存。接下来调用的是名字容易让人误解的 **ftruncate** 函数:
```c
ftruncate(fd, ByteSize); /* get the bytes */
```
它将分配 **ByteSize** 字节的内存,在本例中是不大的 512 字节。_memwriter_ 和 _memreader_ 程序只访问共享内存本身,而不是备份文件。系统负责保持共享内存和备份文件之间的同步。
* 接着 _memwriter_ 调用 **mmap** 函数:
```c
caddr_t memptr = mmap(NULL, /* let system pick where to put segment */
ByteSize, /* how many bytes */
PROT_READ | PROT_WRITE, /* access protections */
MAP_SHARED, /* mapping visible to other processes */
fd, /* file descriptor */
0); /* offset: start at 1st byte */
```
来获得指向共享内存的指针。(_memreader_ 也会做一次类似的调用。)指针类型 **caddr_t** 以 **c** 开头,它是一个历史遗留的 _core address_(核心地址)类型,本质上是字符指针的别名。_memwriter_ 会在随后的 _写_ 操作中使用这个 **memptr**,写入是通过库函数 **strcpy**(字符串拷贝)完成的。
* 到此为止,_memwriter_ 已经准备好进行写入了,但它首先要创建一个信号量,以确保对共享内存的互斥访问。假如 _memwriter_ 正在写而 _memreader_ 同时在读,就会出现竞争条件。假如对 **sem_open** 的调用成功了:
```c
sem_t* semptr = sem_open(SemaphoreName, /* name */
O_CREAT, /* create the semaphore */
AccessPerms, /* protection perms */
0); /* initial value */
```
那么,写入便可以进行了。**SemaphoreName**(任意一个唯一的非空名称即可)在 _memwriter_ 和 _memreader_ 中标识同一个信号量。初始值 0 把继续往下执行的权利(在本例中即执行 _写_ 操作)交给了信号量的创建者,也就是 _memwriter_。
* 在写操作完成后,_memwriter_ 通过调用 **sem_post** 函数将信号量的值增加到 1
```c
if (sem_post(semptr) < 0) ..
```
增加信号量的值将释放互斥锁,使得 _memreader_ 可以执行它的 _读_ 操作。为了稳妥起见,_memwriter_ 还把共享内存从它自己的地址空间中解除映射:
```c
munmap(memptr, ByteSize); /* unmap the storage */
```
这将使得 _memwriter_ 不能进一步地访问共享内存。
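两个程序都包含了一个小的头文件 _shmem.h_,但文章没有给出它的内容。根据上面代码中用到的名称,它大致可能像下面这样;除了正文中提到的 512 字节大小、0644 权限和 _shMemEx_ 这个名字之外,其余取值(尤其是信号量名称)均为假设:
```c
/** shmem.h一个对共享定义的推测并非文章原文 **/
#define BackingFile   "/shMemEx"        /* 在 Linux 上对应 /dev/shm/shMemEx */
#define SemaphoreName "/mySemaphore"    /* 任意一个唯一的非空名称即可 */
#define AccessPerms   0644              /* 访问权限 */
#define ByteSize      512               /* 共享内存段的大小 */
#define MemContents   "This is the way the world ends...\n"
```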
#### 示例 4. _memreader_ 进程的源代码
```c
/** Compilation: gcc -o memreader memreader.c -lrt -lpthread **/
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <semaphore.h>
#include <string.h>
#include "shmem.h"
void report_and_exit(const char* msg) {
perror(msg);
exit(-1);
}
int main() {
int fd = shm_open(BackingFile, O_RDWR, AccessPerms); /* empty to begin */
if (fd < 0) report_and_exit("Can't get file descriptor...");
/* get a pointer to memory */
caddr_t memptr = mmap(NULL, /* let system pick where to put segment */
ByteSize, /* how many bytes */
PROT_READ | PROT_WRITE, /* access protections */
MAP_SHARED, /* mapping visible to other processes */
fd, /* file descriptor */
0); /* offset: start at 1st byte */
if ((caddr_t) -1 == memptr) report_and_exit("Can't access segment...");
/* create a semaphore for mutual exclusion */
sem_t* semptr = sem_open(SemaphoreName, /* name */
O_CREAT, /* create the semaphore */
AccessPerms, /* protection perms */
0); /* initial value */
if (semptr == SEM_FAILED) report_and_exit("sem_open");
/* use semaphore as a mutex (lock) by waiting for writer to increment it */
if (!sem_wait(semptr)) { /* wait until semaphore != 0 */
int i;
for (i = 0; i < strlen(MemContents); i++)
write(STDOUT_FILENO, memptr + i, 1); /* one byte at a time */
sem_post(semptr);
}
/* cleanup */
munmap(memptr, ByteSize);
close(fd);
sem_close(semptr);
shm_unlink(BackingFile); /* remove the backing file */
return 0;
}
```
在 _memwriter_ 和 _memreader_ 程序中,共享内存方面最值得关注的是 **shm_open** 和 **mmap** 这两个函数:成功时,第一个调用返回备份文件的文件描述符,第二个调用则利用这个文件描述符获得指向共享内存段的指针。两个程序对 **shm_open** 的调用很相似,区别只在于 _memwriter_ 程序创建共享内存,而 _memreader_ 只是访问这块已经创建好的内存:
```c
int fd = shm_open(BackingFile, O_RDWR | O_CREAT, AccessPerms); /* memwriter */
int fd = shm_open(BackingFile, O_RDWR, AccessPerms); /* memreader */
```
拿到文件描述符之后,两个程序对 **mmap** 的调用就是相同的了:
```c
caddr_t memptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
```
**mmap** 的第一个参数为 **NULL**,这意味着由系统来决定在虚拟地址空间的什么位置分配这块内存;也可以指定一个地址(但这比较有技巧性)。**MAP_SHARED** 标志表明分配的内存可以在进程之间共享,最后一个参数(在本例中为 0表示共享内存的偏移量从第一个字节开始。**size** 参数指定要分配的字节数(在本例中是 512而保护参数**PROT_READ | PROT_WRITE**)表明这块共享内存既可读也可写。
_memwriter_ 程序成功执行后,系统会创建并维护备份文件;在我的系统中,该文件为 _/dev/shm/shMemEx_,其中的 _shMemEx_ 是我为这块共享存储起的名字(在头文件 _shmem.h_ 中给定)。在当前版本的 _memwriter_ 和 _memreader_ 程序中,下面的语句:
```c
shm_unlink(BackingFile); /* removes backing file */
```
会移除备份文件。假如省略了这条 **shm_unlink** 调用,那么备份文件在程序终止后依然会保留下来。
_memreader_ 和 _memwriter_ 一样,在调用 **sem_open** 函数时,通过信号量的名字来获取信号量。但 _memreader_ 随后将进入等待状态,直到 _memwriter_ 将初始值为 0 的信号量的值增加。
```c
if (!sem_wait(semptr)) { /* wait until semaphore != 0 */
```
一旦等待结束_memreader_ 将从共享内存中读取 ASCII 数据,然后做些清理工作并终止。
共享内存 API 中还包含可以显式地将共享内存段与备份文件进行同步的操作。示例中省略了这些操作,以免代码显得杂乱,好让我们专注于内存共享和信号量的部分。
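为了完整起见,下面是这类显式同步调用的一个示意(使用上面 _memwriter_ 中的名称,并非示例程序中的代码):
```c
/* 将映射的共享内存段刷新到备份文件(文章示例中并未包含这一调用) */
if (msync(memptr, ByteSize, MS_SYNC) < 0)
  report_and_exit("msync");
```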
即便把信号量相关的代码移除,_memwriter_ 和 _memreader_ 程序也很可能在不引发竞争条件的情况下正常执行:_memwriter_ 创建共享内存段后立即向它写入;而在共享内存段被创建好之前,_memreader_ 根本无法访问它。然而,只要有 _写_ 操作参与其中,最佳实践就是对共享内存的访问进行同步,而且信号量 API 也足够重要,值得在代码示例中专门展示。
### 总结
上面共享文件和共享内存的例子展示了进程如何通过 _共享存储_ 来进行通信:前者通过文件,后者通过内存段。两种方法的 API 都相对直接。它们有什么共同的缺点吗?现代的应用经常需要处理流数据,而且往往是规模非常大的数据流。无论是共享文件还是共享内存,都不太适合处理大规模的流数据;某种类型的通道则更合适一些。因此,这个系列的第二部分将介绍管道和消息队列,同样会用 C 语言的代码示例来辅助讲解。
--------------------------------------------------------------------------------
via: https://opensource.com/article/19/4/interprocess-communication-linux-storage
作者:[Marty Kalin][a]
选题:[lujun9972][b]
译者:[FSSlc](https://github.com/FSSlc)
校对:[校对者ID](https://github.com/校对者ID)
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创编译,[Linux中国](https://linux.cn/) 荣誉推出
[a]: https://opensource.com/users/mkalindepauledu
[b]: https://github.com/lujun9972
[1]: https://opensource.com/sites/default/files/styles/image-full-size/public/lead-images/documents_papers_file_storage_work.png?itok=YlXpAqAJ (Filing papers and documents)
[2]: https://en.wikipedia.org/wiki/Inter-process_communication
[3]: http://condor.depaul.edu/mkalin
[4]: http://www.opengroup.org/onlinepubs/009695399/functions/perror.html
[5]: http://www.opengroup.org/onlinepubs/009695399/functions/exit.html
[6]: http://www.opengroup.org/onlinepubs/009695399/functions/strlen.html
[7]: http://www.opengroup.org/onlinepubs/009695399/functions/fprintf.html
[8]: http://www.opengroup.org/onlinepubs/009695399/functions/strcpy.html