diff mbox series

[net-next] af_unix: preserve position of fd-associated bytes in stream

Message ID 20190406163325.13125-1-chris@monsan.to
State Rejected
Delegated to: David Miller
Headers show
Series [net-next] af_unix: preserve position of fd-associated bytes in stream | expand

Commit Message

Christopher Monsanto April 6, 2019, 4:33 p.m. UTC
It is currently impossible for the reader of an AF_UNIX stream socket to
fully reconstruct the data sent in the presence of SCM_RIGHTS, without
reading byte-for-byte. This prevents efficiently proxying or providing a
high-level buffering interface to these sockets.

Unfortunately POSIX does not specify the behavior of SCM_RIGHTS beyond the
existence of the constant and in general the semantics of ancillary data
attached to streams isn't on the firmest ground. The following is how these
concepts work on every *nix I am aware of. An SCM_RIGHTS array of file
descriptors is uniquely associated with a single byte in the stream; this
byte can be identified by reading the stream one byte at a time. sendmsg()
associates the given fds with the first byte in the buffer provided. When
recvmsg() returns fds, we know that exactly one fd-associated byte appears
in the buffer; if necessary partial reads are employed to guarantee this
invariant.

The issue in question concerns recvmsg(). Linux allows the fd-associated
byte to appear anywhere within the buffer; the reader has no way of knowing
which one it is. Other *nix systems (at least macOS/Solaris/FreeBSD), make
recvmsg() symmetric with sendmsg() and guarantee that the fd-associated
byte is the first byte in the buffer. The first change of this patch has
Linux adopt the stronger semantics of its peers, which fixes the issue at
hand while also bringing us closer to standardizing SCM_RIGHTS.

The existing implementation enforces the one-fd-associated-byte constraint
with a partial read after any skb with attached fds. The second & third
changes remove this unnecessary constraint, allowing data from subsequent
skbs to be copied to the buffer, as long those skbs do not have fds
attached to them.

Signed-off-by: Christopher Monsanto <chris@monsan.to>
---
 net/unix/af_unix.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

Comments

David Miller April 8, 2019, 11:21 p.m. UTC | #1
From: Christopher Monsanto <chris@monsan.to>
Date: Sat,  6 Apr 2019 12:33:25 -0400

> It is currently impossible for the reader of an AF_UNIX stream socket to
> fully reconstruct the data sent in the presence of SCM_RIGHTS, without
> reading byte-for-byte. This prevents efficiently proxying or providing a
> high-level buffering interface to these sockets.
> 
> Unfortunately POSIX does not specify the behavior of SCM_RIGHTS beyond the
> existence of the constant and in general the semantics of ancillary data
> attached to streams isn't on the firmest ground. The following is how these
> concepts work on every *nix I am aware of. An SCM_RIGHTS array of file
> descriptors is uniquely associated with a single byte in the stream; this
> byte can be identified by reading the stream one byte at a time. sendmsg()
> associates the given fds with the first byte in the buffer provided. When
> recvmsg() returns fds, we know that exactly one fd-associated byte appears
> in the buffer; if necessary partial reads are employed to guarantee this
> invariant.
> 
> The issue in question concerns recvmsg(). Linux allows the fd-associated
> byte to appear anywhere within the buffer; the reader has no way of knowing
> which one it is. Other *nix systems (at least macOS/Solaris/FreeBSD), make
> recvmsg() symmetric with sendmsg() and guarantee that the fd-associated
> byte is the first byte in the buffer. The first change of this patch has
> Linux adopt the stronger semantics of its peers, which fixes the issue at
> hand while also bringing us closer to standardizing SCM_RIGHTS.
> 
> The existing implementation enforces the one-fd-associated-byte constraint
> with a partial read after any skb with attached fds. The second & third
> changes remove this unnecessary constraint, allowing data from subsequent
> skbs to be copied to the buffer, as long those skbs do not have fds
> attached to them.
> 
> Signed-off-by: Christopher Monsanto <chris@monsan.to>

No app can ever, really ever, really universally depend upon this
behavior.

I really have a hard time accepting that we should change something
semantically on this level moving forward, which has lasted more than
2 decades.
diff mbox series

Patch

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index ddb838a1b74c..761837a7da65 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2297,6 +2297,9 @@  static int unix_stream_read_generic(struct unix_stream_read_state *state,
 
 		unix_state_unlock(sk);
 
+		if (UNIXCB(skb).fp && copied > 0)
+			break;
+
 		if (check_creds) {
 			/* Never glue messages from different writers */
 			if (!unix_skb_scm_eq(skb, &scm))
@@ -2356,9 +2359,6 @@  static int unix_stream_read_generic(struct unix_stream_read_state *state,
 
 			skb_unlink(skb, &sk->sk_receive_queue);
 			consume_skb(skb);
-
-			if (scm.fp)
-				break;
 		} else {
 			/* It is questionable, see note in unix_dgram_recvmsg.
 			 */
@@ -2367,9 +2367,6 @@  static int unix_stream_read_generic(struct unix_stream_read_state *state,
 
 			sk_peek_offset_fwd(sk, chunk);
 
-			if (UNIXCB(skb).fp)
-				break;
-
 			skip = 0;
 			last = skb;
 			last_len = skb->len;