diff mbox series

tcp: refine memory limit test in tcp_fragment()

Message ID 1561403698-391-1-git-send-email-tyhicks@canonical.com
State New
Headers show
Series tcp: refine memory limit test in tcp_fragment() | expand

Commit Message

Tyler Hicks June 24, 2019, 7:14 p.m. UTC
From: Eric Dumazet <edumazet@google.com>

tcp_fragment() might be called for skbs in the write queue.

Memory limits might have been exceeded because tcp_sendmsg() only
checks limits at full skb (64KB) boundaries.

Therefore, we need to make sure tcp_fragment() wont punish applications
that might have setup very low SO_SNDBUF values.

Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Christoph Paasch <cpaasch@apple.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

CVE-2019-11478

(backported from commit b6653b3629e5b88202be3c9abc44713973f5c4b4)
[tyhicks: Don't enforce the limit on the skb that tcp_send_head points
 as that skb has never been sent out. In newer kernels containing commit
 75c119afe14f ("tcp: implement rb-tree based retransmit queue"), where
 there the retransmission queue is separate from the write queue, this
 skb would be in the write queue.
 With the modified check in this backported patch, we run the risk of
 enforcing the memory limit on an skb that is after tcp_send_head in the
 queue yet has never been sent out. However, an inspection of all
 tcp_fragment() call sites finds that this shouldn't occur and the limit
 will only be enforced on skbs that are up for retransmission.]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
---

I've successfully tested this patch using a slightly modified version of
a packetdrill test that was sent to the netdev list. Without this kernel
change, the test hangs. The test successfully completes with this kernel
change.

 net/ipv4/tcp_output.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Tyler Hicks June 24, 2019, 7:17 p.m. UTC | #1
I forgot the proper subject tags. I'll resubmit.

Tyler

On 2019-06-24 19:14:58, Tyler Hicks wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> tcp_fragment() might be called for skbs in the write queue.
> 
> Memory limits might have been exceeded because tcp_sendmsg() only
> checks limits at full skb (64KB) boundaries.
> 
> Therefore, we need to make sure tcp_fragment() wont punish applications
> that might have setup very low SO_SNDBUF values.
> 
> Fixes: f070ef2ac667 ("tcp: tcp_fragment() should apply sane memory limits")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Christoph Paasch <cpaasch@apple.com>
> Tested-by: Christoph Paasch <cpaasch@apple.com>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> CVE-2019-11478
> 
> (backported from commit b6653b3629e5b88202be3c9abc44713973f5c4b4)
> [tyhicks: Don't enforce the limit on the skb that tcp_send_head points
>  as that skb has never been sent out. In newer kernels containing commit
>  75c119afe14f ("tcp: implement rb-tree based retransmit queue"), where
>  there the retransmission queue is separate from the write queue, this
>  skb would be in the write queue.
>  With the modified check in this backported patch, we run the risk of
>  enforcing the memory limit on an skb that is after tcp_send_head in the
>  queue yet has never been sent out. However, an inspection of all
>  tcp_fragment() call sites finds that this shouldn't occur and the limit
>  will only be enforced on skbs that are up for retransmission.]
> Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
> ---
> 
> I've successfully tested this patch using a slightly modified version of
> a packetdrill test that was sent to the netdev list. Without this kernel
> change, the test hangs. The test successfully completes with this kernel
> change.
> 
>  net/ipv4/tcp_output.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index ede265fbf7ba..719d2cc8770c 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1163,7 +1163,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
>  	if (nsize < 0)
>  		nsize = 0;
>  
> -	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
> +	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
> +		     skb != tcp_send_head(sk))) {
>  		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
>  		return -ENOMEM;
>  	}
> -- 
> 2.7.4
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
diff mbox series

Patch

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index ede265fbf7ba..719d2cc8770c 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1163,7 +1163,8 @@  int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
 	if (nsize < 0)
 		nsize = 0;
 
-	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf)) {
+	if (unlikely((sk->sk_wmem_queued >> 1) > sk->sk_sndbuf &&
+		     skb != tcp_send_head(sk))) {
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPWQUEUETOOBIG);
 		return -ENOMEM;
 	}