Message ID | 20240325085927.2041034-1-stepnem@smrk.net |
---|---|
State | New |
Series | manual: Drop incorrect statement on PIPE_BUF and blocking writes |
* Štěpán Němec:

> diff --git a/manual/pipe.texi b/manual/pipe.texi
> index 483c40c5c3dd..8a9a275cafe7 100644
> --- a/manual/pipe.texi
> +++ b/manual/pipe.texi
> @@ -312,8 +312,7 @@
>
>  Reading or writing a larger amount of data may not be atomic; for
>  example, output data from other processes sharing the descriptor may be
> -interspersed. Also, once @code{PIPE_BUF} characters have been written,
> -further writes will block until some characters are read.
> +interspersed.

Maybe “further may block” instead? I think the reference to PIPE_BUF
and blocking could still be helpful, except that it's not a guarantee,
as you correctly point out.

Do you have copyright assignment? If not, please add Signed-off-by: in
a second submission of the patch.

Thanks,
Florian
On Mon, 25 Mar 2024 12:46:47 +0100 Florian Weimer wrote:

> * Štěpán Němec:
>
>> diff --git a/manual/pipe.texi b/manual/pipe.texi
>> index 483c40c5c3dd..8a9a275cafe7 100644
>> --- a/manual/pipe.texi
>> +++ b/manual/pipe.texi
>> @@ -312,8 +312,7 @@
>>
>>  Reading or writing a larger amount of data may not be atomic; for
>>  example, output data from other processes sharing the descriptor may be
>> -interspersed. Also, once @code{PIPE_BUF} characters have been written,
>> -further writes will block until some characters are read.
>> +interspersed.
>
> Maybe “further may block” instead? I think the reference to PIPE_BUF
> and blocking could still be helpful, except that it's not a guarantee,
> as you correctly point out.

(Assuming you meant “further writes may block”, i.e., just s/will/may/
in the pre-patch text.)

Ignoring the fact that the sentence seems simply wrong, at least in
environments where the vast majority of glibc installations run (Linux
with the relevant parameters as described in my commit message), I
don't find the sentence particularly helpful, as the section focuses on
_atomicity_, not blocking, so I find the sudden side note on blocking
somewhat out of place here in any case.

And as for your particular suggestion (if I understood it correctly), I
would find that formulation _very_ unhelpful, unless supplemented by
additional details (i.e., under what conditions "may" the blocking
happen; but again, why talk about this at all in a section titled "Pipe
Atomicity"?).

> Do you have copyright assignment?

I do not, and I thought it wasn't necessary for this kind of change.

> If not, please add Signed-off-by: in a second submission
> of the patch.

Will do (if the result of the discussion calls for it, i.e., some
version of my patch turns out acceptable).

Thanks,
Štěpán
On Mon, Mar 25, 2024, at 8:13 AM, Štěpán Němec wrote:

>>> Reading or writing a larger amount of data may not be atomic; for
>>> example, output data from other processes sharing the descriptor may be
>>> -interspersed. Also, once @code{PIPE_BUF} characters have been written,
>>> -further writes will block until some characters are read.
>>> +interspersed.
>>
>> Maybe “further may block” instead? I think the reference to PIPE_BUF
>> and blocking could still be helpful, except that it's not a guarantee,
>> as you correctly point out.

It's not correct to say that a write of 65536 bytes will _never_
block. Rather, the pipe capacity on Linux is (by default) 65536
bytes, and, if nothing is reading, _any write_ that tries to put a
65537th byte into the pipe will block. For example, both of these
will wait 1s before printing "all written":

    { dd if=/dev/zero bs=1 count=1 status=none;
      dd if=/dev/zero bs=65536 count=1 status=none;
      echo 'all written' >&2; } |
      { sleep 1; wc -c; }

    { dd if=/dev/zero bs=1 count=1 status=none;
      dd if=/dev/zero bs=65535 count=1 status=none;
      dd if=/dev/zero bs=1 count=1 status=none;
      echo 'all written' >&2; } |
      { sleep 1; wc -c; }

I agree that it is weird to talk about this in a section that's
nominally about atomicity. But I think we shouldn't be calling the
"no interspersed data from other processes" behavior that we're trying
to describe here "atomicity" at all! Quoting
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html>:

# Write requests to a pipe or FIFO shall be handled in the same way as
# a regular file with the following exceptions:
...
# * Write requests of {PIPE_BUF} bytes or less shall not be
#   interleaved with data from other processes doing writes on the
#   same pipe. Writes of greater than {PIPE_BUF} bytes may have data
#   interleaved, on arbitrary boundaries, with writes by other
#   processes, whether or not the O_NONBLOCK flag of the file status
#   flags is set.

This is a weak statement.
It does *not* guarantee "that nothing else in the system
can observe a state in which it is partially complete," as
the manual currently puts it. Nor does it guarantee anything about
how much a process reading from the pipe will receive if it does a
larger read than the write. (To put that another way, if you
write data packets to a pipe, the reader cannot use the
return value of read() to tell how big the packets were.)

Also, it's not clear to me from what you wrote, whether Linux extends
the no-interleaved-data guarantee to writes larger than PIPE_BUF as
long as they are smaller than the pipe capacity, but if it does, we
should say so only in a way that makes it clear it's not portable to
rely on that.

So I propose the appended revision to pipe.texi instead of what you
proposed. It moves all this discussion to the beginning of the
chapter and explains everything more thoroughly, and hopefully
also correctly.

zw

diff --git a/manual/pipe.texi b/manual/pipe.texi
index 483c40c5c3..92c1733c75 100644
--- a/manual/pipe.texi
+++ b/manual/pipe.texi
@@ -9,30 +9,58 @@ handled in a first-in, first-out (FIFO) order. The pipe has no name; it
 is created for one use and both ends must be inherited from the single
 process which created the pipe.
 
+@cindex FIFO
 @cindex FIFO special file
-A @dfn{FIFO special file} is similar to a pipe, but instead of being an
-anonymous, temporary connection, a FIFO has a name or names like any
-other file. Processes open the FIFO by name in order to communicate
-through it.
+A @dfn{FIFO special file}, commonly shortened to @dfn{FIFO}, is
+similar to a pipe, but instead of being an anonymous, temporary
+connection, a FIFO has a name or names like any other file.
+Processes open the FIFO by name in order to communicate through it.
 
-A pipe or FIFO has to be open at both ends simultaneously. If you read
-from a pipe or FIFO file that doesn't have any processes writing to it
+A pipe or FIFO has to be open at both ends simultaneously. If you
+read from a pipe or FIFO that doesn't have any processes writing to it
 (perhaps because they have all closed the file, or exited), the read
 returns end-of-file. Writing to a pipe or FIFO that doesn't have a
 reading process is treated as an error condition; it generates a
 @code{SIGPIPE} signal, and fails with error code @code{EPIPE} if the
 signal is handled or blocked.
 
-Neither pipes nor FIFO special files allow file positioning. Both
-reading and writing operations happen sequentially; reading from the
-beginning of the file and writing at the end.
+Neither pipes nor FIFOs allow file positioning. Both reading and
+writing operations happen sequentially; reading from the beginning of
+the file and writing at the end.
+
+If two or more processes are writing to the same pipe or FIFO, the
+data written by each process may be interleaved arbitrarily with data
+written by the others. There is only one exception: Each time a
+process makes a call to @code{write}, @code{writev}, or other
+primitive I/O function (@pxref{I/O Primitives}) that writes, in total,
+no more than @code{PIPE_BUF} bytes of data, @emph{that data} will not
+be split by data written by other processes. But data written by
+other processes could appear immediately before or afterward.
+
+@xref{Limits for Files}, for information about the @code{PIPE_BUF}
+parameter. Note that @code{PIPE_BUF} is usually smaller than the
+default buffer size used by I/O on streams (i.e.@: @code{BUFSIZ});
+@xref{Stream Buffering}, for how to control the stream buffer size.
+
+Pipes and FIFOs may have a limit on the amount of data that's been
+written, but not yet read, that they can store. This limit is called
+the @dfn{capacity} of the pipe or FIFO. A write that would overfill
+the pipe---put more data into it than its capacity---will block until
+something reads from the pipe (unless the @code{O_NONBLOCK} flag is
+set; @pxref{Operating Modes}). If the write is smaller than
+@code{PIPE_BUF}, none of the data will enter the pipe until all of it
+can; if the write is larger, there is no guarantee about how much
+data enters the pipe and when.
+
+The capacity must be @emph{at least} @code{PIPE_BUF}. Often it is
+bigger. Some systems provide a way to query what the capacity is,
+or to set it for individual pipes and FIFOs.
 
 @menu
 * Creating a Pipe::             Making a pipe with the @code{pipe} function.
 * Pipe to a Subprocess::        Using a pipe to communicate with a child process.
 * FIFO Special Files::          Making a FIFO special file.
-* Pipe Atomicity::              When pipe (or FIFO) I/O is atomic.
 @end menu
 
 @node Creating a Pipe
@@ -106,6 +134,16 @@ The advantage of using @code{popen} and @code{pclose} is that the
 interface is much simpler and easier to use. But it doesn't offer as much
 flexibility as using the low-level functions directly.
 
+When using pipes to receive data from a subprocess, either with the
+low-level functions or with @code{popen} and @code{pclose}, you must
+make sure to read all the data @emph{before} you wait for the
+subprocess to complete (by calling @code{pclose}, or any of the
+functions described in @pxref{Process Completion}). This is because,
+if the subprocess writes more data than the pipe's capacity, it will
+block until you read some of it. If you're waiting for the subprocess
+to complete, you're not doing any reading, so the subprocess will
+never exit, and you'll never read any data---a deadlock condition.
+
 @deftypefun {FILE *} popen (const char *@var{command}, const char *@var{mode})
 @standards{POSIX.2, stdio.h}
 @standards{SVID, stdio.h}
@@ -299,21 +337,3 @@ The directory that would contain the file resides on a read-only file
 system.
 @end table
 @end deftypefun
-
-@node Pipe Atomicity
-@section Atomicity of Pipe I/O
-
-Reading or writing pipe data is @dfn{atomic} if the size of data written
-is not greater than @code{PIPE_BUF}. This means that the data transfer
-seems to be an instantaneous unit, in that nothing else in the system
-can observe a state in which it is partially complete. Atomic I/O may
-not begin right away (it may need to wait for buffer space or for data),
-but once it does begin it finishes immediately.
-
-Reading or writing a larger amount of data may not be atomic; for
-example, output data from other processes sharing the descriptor may be
-interspersed. Also, once @code{PIPE_BUF} characters have been written,
-further writes will block until some characters are read.
-
-@xref{Limits for Files}, for information about the @code{PIPE_BUF}
-parameter.
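
The distinction drawn above between PIPE_BUF (the no-interleaving
limit) and the pipe's capacity (the blocking limit) can be observed
directly. The following is only a minimal sketch in Python, not part of
the proposed patch; the function name pipe_limits is made up for
illustration. It assumes a POSIX system where os.fpathconf accepts
"PC_PIPE_BUF", and it measures the capacity by filling a fresh pipe
with non-blocking writes of PIPE_BUF bytes until one would block.

```python
import os


def pipe_limits():
    """Return (PIPE_BUF, bytes accepted before a write would block).

    Hypothetical helper for illustration only.
    """
    r, w = os.pipe()
    try:
        # PIPE_BUF as reported for this particular pipe.
        pipe_buf = os.fpathconf(w, "PC_PIPE_BUF")

        # With O_NONBLOCK set, a write that does not fit fails with
        # EAGAIN (BlockingIOError in Python) instead of blocking.
        os.set_blocking(w, False)

        chunk = b"\0" * pipe_buf
        filled = 0
        while True:
            try:
                filled += os.write(w, chunk)
            except BlockingIOError:
                break  # nobody is reading, so the pipe is now full
        return pipe_buf, filled
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    pipe_buf, capacity = pipe_limits()
    print(f"PIPE_BUF = {pipe_buf}, observed capacity = {capacity}")
```

On a default Linux configuration one would expect the observed capacity
to match the 65536 bytes mentioned in the thread and to be larger than
PIPE_BUF, but neither number is portable; the only relationship to rely
on is that the capacity is at least PIPE_BUF.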
On Mon, 25 Mar 2024 12:20:14 -0400 Zack Weinberg wrote:

> On Mon, Mar 25, 2024, at 8:13 AM, Štěpán Němec wrote:
>>>> Reading or writing a larger amount of data may not be atomic; for
>>>> example, output data from other processes sharing the descriptor may be
>>>> -interspersed. Also, once @code{PIPE_BUF} characters have been written,
>>>> -further writes will block until some characters are read.
>>>> +interspersed.
>>>
>>> Maybe “further may block” instead? I think the reference to PIPE_BUF
>>> and blocking could still be helpful, except that it's not a guarantee,
>>> as you correctly point out.
>
> It's not correct to say that a write of 65536 bytes will _never_
> block. Rather, the pipe capacity on Linux is (by default) 65536
> bytes, and, if nothing is reading, _any write_ that tries to put a
> 65537th byte into the pipe will block. For example, both of these
> will wait 1s before printing "all written":
>
> { dd if=/dev/zero bs=1 count=1 status=none;
>   dd if=/dev/zero bs=65536 count=1 status=none;
>   echo 'all written' >&2; } |
>   { sleep 1; wc -c; }
>
> { dd if=/dev/zero bs=1 count=1 status=none;
>   dd if=/dev/zero bs=65535 count=1 status=none;
>   dd if=/dev/zero bs=1 count=1 status=none;
>   echo 'all written' >&2; } |
>   { sleep 1; wc -c; }

This seems correct and perhaps interesting, but how is it relevant? I
did not "say that a write of 65536 bytes will _never_ block". I used a
simple example to illustrate why the statement in the manual about
PIPE_BUF being the factor causing blocking write was incorrect.

> I agree that it is weird to talk about this in a section that's
> nominally about atomicity. But I think we shouldn't be calling the
> "no interspersed data from other processes" behavior that we're trying
> to describe here "atomicity" at all!

Why? The very document you cite below (POSIX write(2)) makes the
_atomic_ ("A write is atomic if the whole amount written in one
operation is not interleaved with data from any other process. [...]
This volume of POSIX.1-2017 does not say whether write requests for
more than {PIPE_BUF} bytes are atomic, but requires that writes of
{PIPE_BUF} or fewer bytes shall be atomic.") vs _blocking_ ("The
effective size of a pipe or FIFO (the maximum amount that can be
written in one operation without blocking) may vary dynamically,
depending on the implementation, so it is not possible to specify a
fixed value for it.") distinction right at the beginning of RATIONALE,
not to mention that the terminology seems well established, and
etymologically fitting (ἄτομος meaning “indivisible”).

> Quoting
> <https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html>:
>
> # Write requests to a pipe or FIFO shall be handled in the same way as
> # a regular file with the following exceptions:
> ...
> # * Write requests of {PIPE_BUF} bytes or less shall not be
> #   interleaved with data from other processes doing writes on the
> #   same pipe. Writes of greater than {PIPE_BUF} bytes may have data
> #   interleaved, on arbitrary boundaries, with writes by other
> #   processes, whether or not the O_NONBLOCK flag of the file status
> #   flags is set.
>
> This is a weak statement.

How so? See here for the definition of "shall" in POSIXspeak:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap01.html#tag_01_05_05

> It does *not* guarantee "that nothing else in the system
> can observe a state in which it is partially complete," as
> the manual currently puts it.

I admit I'm unable to extract much useful meaning from the vague
"nothing else in the system", but if we restrict our perspective to the
two ends of a pipe, "can[not] observe a state in which it is partially
complete" sounds about right, doesn't it?

> Nor does it guarantee anything about how much a process
> reading from the pipe will receive if it does a larger
> read than the write. (To put that another way, if you
> write data packets to a pipe, the reader cannot use the
> return value of read() to tell how big the packets were.)

This seems to be confusing atomicity with blocking again.

> Also, it's not clear to me from what you wrote, whether Linux extends
> the no-interleaved-data guarantee to writes larger than PIPE_BUF as
> long as they are smaller than the pipe capacity,

I don't know about any such guarantee.

> but if it does, we should say so only in a way that makes
> it clear it's not portable to rely on that.
>
> So I propose the appended revision to pipe.texi instead of what you
> proposed. It moves all this discussion to the beginning of the
> chapter and explains everything more thoroughly, and hopefully
> also correctly.

FWIW, I find your proposed text clear, helpful and matching my
understanding, and would welcome it to supersede my patch.

Thanks,
Štěpán
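
The point about read() return values quoted above can be illustrated
with a small sketch. This is an illustration only, written in Python
with a made-up function name (read_after_two_writes); it assumes an
ordinary pipe (not one opened in Linux's O_DIRECT packet mode) and a
single process acting as both writer and reader. Two separate small
writes are made, each well under PIPE_BUF, and then one larger read is
issued.

```python
import os


def read_after_two_writes():
    """Write two small 'packets', then do one larger read.

    Hypothetical helper for illustration only.
    """
    r, w = os.pipe()
    try:
        os.write(w, b"first")    # 5 bytes, far below PIPE_BUF
        os.write(w, b"second")   # 6 bytes, far below PIPE_BUF
        # The pipe is a byte stream: the read may return data from both
        # writes at once, so its length says nothing about packet size.
        return os.read(r, 1024)
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    data = read_after_two_writes()
    print(len(data), data)
```

Each write here is individually protected from interleaving by other
writers, yet the reader sees one undivided run of bytes. A protocol
that needs message boundaries has to encode them in the data itself,
for example with a length prefix or a delimiter.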
diff --git a/manual/pipe.texi b/manual/pipe.texi
index 483c40c5c3dd..8a9a275cafe7 100644
--- a/manual/pipe.texi
+++ b/manual/pipe.texi
@@ -312,8 +312,7 @@
 
 Reading or writing a larger amount of data may not be atomic; for
 example, output data from other processes sharing the descriptor may be
-interspersed. Also, once @code{PIPE_BUF} characters have been written,
-further writes will block until some characters are read.
+interspersed.
 
 @xref{Limits for Files}, for information about the @code{PIPE_BUF}
 parameter.