diff mbox

net/unix: fix logic about sk_peek_offset

Message ID 1443733536-992-1-git-send-email-avagin@gmail.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Andrei Vagin Oct. 1, 2015, 9:05 p.m. UTC
From: Andrey Vagin <avagin@openvz.org>

Now send with MSG_PEEK can return data from multiple SKBs.

Unfortunately we take into account the peek offset for each skb,
that is wrong. We need to apply the peek offset only once.

In addition, the peek offset should be used only if MSG_PEEK is set.

Cc: "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING
Cc: Eric Dumazet <edumazet@google.com> (commit_signer:1/14=7%)
Cc: Aaron Conole <aconole@bytheb.org>
Fixes: 9f389e35674f ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
Signed-off-by: Andrey Vagin <avagin@openvz.org>
---
 net/unix/af_unix.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

Comments

Andrei Vagin Oct. 2, 2015, 9:04 a.m. UTC | #1
2015-10-02 0:05 GMT+03:00 Andrey Vagin <avagin@gmail.com>:
> From: Andrey Vagin <avagin@openvz.org>
>
> Now send with MSG_PEEK can return data from multiple SKBs.
>
> Unfortunately we take into account the peek offset for each skb,
> that is wrong. We need to apply the peek offset only once.
>
> In addition, the peek offset should be used only if MSG_PEEK is set.

The attached program can be used to reproduce the bug. It cycles in an
infinite loop without this patch, because recv() always returns data.

658   socketpair(PF_LOCAL, SOCK_STREAM, 0, [3, 4]) = 0
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   sendto(3, "Hello, World!", 13, MSG_DONTWAIT, NULL, 0) = 13
658   setsockopt(4, SOL_SOCKET, 0x2a /* SO_??? */, [0], 4) = 0
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   fstat(1, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0
658   ioctl(1, TCGETS, 0x7fff54e57b30)  = -1 ENOTTY (Inappropriate
ioctl for device)
658   mmap(NULL, 4096, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3a86827000
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
658   recvfrom(4, "Hello, World!H", 14, MSG_PEEK|MSG_DONTWAIT, NULL, NULL) = 14
...

>
> Cc: "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING
> Cc: Eric Dumazet <edumazet@google.com> (commit_signer:1/14=7%)
> Cc: Aaron Conole <aconole@bytheb.org>
> Fixes: 9f389e35674f ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
> Signed-off-by: Andrey Vagin <avagin@openvz.org>
> ---
>  net/unix/af_unix.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index ef31b40..94f6582 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -2064,6 +2064,11 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state)
>                 goto out;
>         }
>
> +       if (flags & MSG_PEEK)
> +               skip = sk_peek_offset(sk, flags);
> +       else
> +               skip = 0;
> +
>         do {
>                 int chunk;
>                 struct sk_buff *skb, *last;
> @@ -2112,7 +2117,6 @@ unlock:
>                         break;
>                 }
>
> -               skip = sk_peek_offset(sk, flags);
>                 while (skip >= unix_skb_len(skb)) {
>                         skip -= unix_skb_len(skb);
>                         last = skb;
> @@ -2179,14 +2183,12 @@ unlock:
>                         if (UNIXCB(skb).fp)
>                                 scm.fp = scm_fp_dup(UNIXCB(skb).fp);
>
> -                       if (skip) {
> -                               sk_peek_offset_fwd(sk, chunk);
> -                               skip -= chunk;
> -                       }
> +                       sk_peek_offset_fwd(sk, chunk);
>
>                         if (UNIXCB(skb).fp)
>                                 break;
>
> +                       skip = 0;
>                         last = skb;
>                         last_len = skb->len;
>                         unix_state_lock(sk);
> --
> 2.4.3
>
Aaron Conole Oct. 2, 2015, 12:17 p.m. UTC | #2
Andrey Vagin <avagin@openvz.org> writes:

> 2015-10-02 0:05 GMT+03:00 Andrey Vagin <avagin@gmail.com>:
>> From: Andrey Vagin <avagin@openvz.org>
>>
>> Now send with MSG_PEEK can return data from multiple SKBs.
>>
>> Unfortunately we take into account the peek offset for each skb,
>> that is wrong. We need to apply the peek offset only once.
>>
>> In addition, the peek offset should be used only if MSG_PEEK is set.
<<snip>>
Agreed here, the behavior is off in the peek offset case.

I was unable to apply your patch, for some reason, but will manually
try to apply it and test.

Thanks very much!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrei Vagin Oct. 3, 2015, 7:48 p.m. UTC | #3
2015-10-02 15:17 GMT+03:00 Aaron Conole <aconole@bytheb.org>:
> Andrey Vagin <avagin@openvz.org> writes:
>
>> 2015-10-02 0:05 GMT+03:00 Andrey Vagin <avagin@gmail.com>:
>>> From: Andrey Vagin <avagin@openvz.org>
>>>
>>> Now send with MSG_PEEK can return data from multiple SKBs.
>>>
>>> Unfortunately we take into account the peek offset for each skb,
>>> that is wrong. We need to apply the peek offset only once.
>>>
>>> In addition, the peek offset should be used only if MSG_PEEK is set.
> <<snip>>
> Agreed here, the behavior is off in the peek offset case.
>
> I was unable to apply your patch, for some reason, but will manually
> try to apply it and test.

It's strange. I've checked that this patch can be applied to
davem/net.git and next/linux-next.git,

[avagin@laptop linux-next]$ git checkout net/master
HEAD is now at 36f8daf... Merge tag 'mmc-v4.3-rc3' of
git://git.linaro.org/people/ulf.hansson/mmc
[avagin@laptop linux-next]$ wget https://patchwork.ozlabs.org/patch/525310/mbox/
--2015-10-03 22:46:44--  https://patchwork.ozlabs.org/patch/525310/mbox/
Resolving patchwork.ozlabs.org (patchwork.ozlabs.org)...
103.22.144.67, 2401:3900:2:1::2
Connecting to patchwork.ozlabs.org
(patchwork.ozlabs.org)|103.22.144.67|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘index.html’

index.html                                  [ <=>
                                                      ]   2.06K
--.-KB/s   in 0s

2015-10-03 22:46:47 (122 MB/s) - ‘index.html’ saved [2106]

[avagin@laptop linux-next]$ git am index.html
Applying: net/unix: fix logic about sk_peek_offset

>
> Thanks very much!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Aaron Conole Oct. 4, 2015, 12:50 p.m. UTC | #4
Andrey Vagin <avagin@gmail.com> writes:

> From: Andrey Vagin <avagin@openvz.org>
>
> Now send with MSG_PEEK can return data from multiple SKBs.
>
> Unfortunately we take into account the peek offset for each skb,
> that is wrong. We need to apply the peek offset only once.
>
> In addition, the peek offset should be used only if MSG_PEEK is set.
>
> Cc: "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING
> Cc: Eric Dumazet <edumazet@google.com> (commit_signer:1/14=7%)
> Cc: Aaron Conole <aconole@bytheb.org>
> Fixes: 9f389e35674f ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
> Signed-off-by: Andrey Vagin <avagin@openvz.org>
> ---
Tested-by: Aaron Conole <aconole@bytheb.org>

Thanks again!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 5, 2015, 1:33 p.m. UTC | #5
From: Andrey Vagin <avagin@gmail.com>
Date: Fri,  2 Oct 2015 00:05:36 +0300

> From: Andrey Vagin <avagin@openvz.org>
> 
> Now send with MSG_PEEK can return data from multiple SKBs.
> 
> Unfortunately we take into account the peek offset for each skb,
> that is wrong. We need to apply the peek offset only once.
> 
> In addition, the peek offset should be used only if MSG_PEEK is set.
> 
> Cc: "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING
> Cc: Eric Dumazet <edumazet@google.com> (commit_signer:1/14=7%)
> Cc: Aaron Conole <aconole@bytheb.org>
> Fixes: 9f389e35674f ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
> Signed-off-by: Andrey Vagin <avagin@openvz.org>

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index ef31b40..94f6582 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2064,6 +2064,11 @@  static int unix_stream_read_generic(struct unix_stream_read_state *state)
 		goto out;
 	}
 
+	if (flags & MSG_PEEK)
+		skip = sk_peek_offset(sk, flags);
+	else
+		skip = 0;
+
 	do {
 		int chunk;
 		struct sk_buff *skb, *last;
@@ -2112,7 +2117,6 @@  unlock:
 			break;
 		}
 
-		skip = sk_peek_offset(sk, flags);
 		while (skip >= unix_skb_len(skb)) {
 			skip -= unix_skb_len(skb);
 			last = skb;
@@ -2179,14 +2183,12 @@  unlock:
 			if (UNIXCB(skb).fp)
 				scm.fp = scm_fp_dup(UNIXCB(skb).fp);
 
-			if (skip) {
-				sk_peek_offset_fwd(sk, chunk);
-				skip -= chunk;
-			}
+			sk_peek_offset_fwd(sk, chunk);
 
 			if (UNIXCB(skb).fp)
 				break;
 
+			skip = 0;
 			last = skb;
 			last_len = skb->len;
 			unix_state_lock(sk);