Message ID | 20190530220121.128798-1-willemdebruijn.kernel@gmail.com |
---|---|
State | Accepted |
Delegated to: | David Miller |
Headers | show |
Series | [net,v2] net: correct zerocopy refcnt with udp MSG_MORE | expand |
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Date: Thu, 30 May 2019 18:01:21 -0400 > From: Willem de Bruijn <willemb@google.com> > > TCP zerocopy takes a uarg reference for every skb, plus one for the > tcp_sendmsg_locked datapath temporarily, to avoid reaching refcnt zero > as it builds, sends and frees skbs inside its inner loop. > > UDP and RAW zerocopy do not send inside the inner loop so do not need > the extra sock_zerocopy_get + sock_zerocopy_put pair. Commit > 52900d22288ed ("udp: elide zerocopy operation in hot path") introduced > extra_uref to pass the initial reference taken in sock_zerocopy_alloc > to the first generated skb. > > But, sock_zerocopy_realloc takes this extra reference at the start of > every call. With MSG_MORE, no new skb may be generated to attach the > extra_uref to, so refcnt is incorrectly 2 with only one skb. > > Do not take the extra ref if uarg && !tcp, which implies MSG_MORE. > Update extra_uref accordingly. > > This conditional assignment triggers a false positive may be used > uninitialized warning, so have to initialize extra_uref at define. > > Changes v1->v2: fix typo in Fixes SHA1 > > Fixes: 52900d22288e7 ("udp: elide zerocopy operation in hot path") > Reported-by: syzbot <syzkaller@googlegroups.com> > Diagnosed-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Willem de Bruijn <willemb@google.com> Applied and queued up for -stable, thanks.
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index e89be62826937..eaad23f9c7b5b 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -1036,7 +1036,11 @@ struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, size_t size, uarg->len++; uarg->bytelen = bytelen; atomic_set(&sk->sk_zckey, ++next); - sock_zerocopy_get(uarg); + + /* no extra ref when appending to datagram (MSG_MORE) */ + if (sk->sk_type == SOCK_STREAM) + sock_zerocopy_get(uarg); + return uarg; } } diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index bfd0ca554977a..8c9189a41b136 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -878,7 +878,7 @@ static int __ip_append_data(struct sock *sk, int csummode = CHECKSUM_NONE; struct rtable *rt = (struct rtable *)cork->dst; unsigned int wmem_alloc_delta = 0; - bool paged, extra_uref; + bool paged, extra_uref = false; u32 tskey = 0; skb = skb_peek_tail(queue); @@ -918,7 +918,7 @@ static int __ip_append_data(struct sock *sk, uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb)); if (!uarg) return -ENOBUFS; - extra_uref = true; + extra_uref = !skb; /* only extra ref if !MSG_MORE */ if (rt->dst.dev->features & NETIF_F_SG && csummode == CHECKSUM_PARTIAL) { paged = true; diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index adef2236abe2e..f9e43323e6673 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1275,7 +1275,7 @@ static int __ip6_append_data(struct sock *sk, int csummode = CHECKSUM_NONE; unsigned int maxnonfragsize, headersize; unsigned int wmem_alloc_delta = 0; - bool paged, extra_uref; + bool paged, extra_uref = false; skb = skb_peek_tail(queue); if (!skb) { @@ -1344,7 +1344,7 @@ static int __ip6_append_data(struct sock *sk, uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb)); if (!uarg) return -ENOBUFS; - extra_uref = true; + extra_uref = !skb; /* only extra ref if !MSG_MORE */ if (rt->dst.dev->features & NETIF_F_SG && csummode == CHECKSUM_PARTIAL) { paged = true;