Message ID | 20181120180018.178537-1-willemdebruijn.kernel@gmail.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net] packet: copy user buffers before orphan or clone | expand |
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Date: Tue, 20 Nov 2018 13:00:18 -0500 > From: Willem de Bruijn <willemb@google.com> > > tpacket_snd sends packets with user pages linked into skb frags. It > notifies that pages can be reused when the skb is released by setting > skb->destructor to tpacket_destruct_skb. > > This can cause data corruption if the skb is orphaned (e.g., on > transmit through veth) or cloned (e.g., on mirror to another psock). > > Create a kernel-private copy of data in these cases, same as tun/tap > zerocopy transmission. Reuse that infrastructure: mark the skb as > SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx). > > Unlike other zerocopy packets, do not set shinfo destructor_arg to > struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify > when the original skb is released and a timestamp is recorded. Do not > change this timestamp behavior. The ubuf_info->callback is not needed > anyway, as no zerocopy notification is expected. > > Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The > resulting value is not a valid ubuf_info pointer, nor a valid > tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this. > > The fix relies on features introduced in commit 52267790ef52 ("sock: > add MSG_ZEROCOPY"), so can be backported as is only to 4.14. > > Tested with from `./in_netns.sh ./txring_overwrite` from > http://github.com/wdebruij/kerneltools/tests > > Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap") > Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com> > Signed-off-by: Willem de Bruijn <willemb@google.com> Applied, and queued up for -stable. Thanks for the backporting notes. Any chance those tests from your kerneltools repo can make their way into selftests? Thanks.
On Fri, Nov 23, 2018 at 2:09 PM David Miller <davem@davemloft.net> wrote: > > From: Willem de Bruijn <willemdebruijn.kernel@gmail.com> > Date: Tue, 20 Nov 2018 13:00:18 -0500 > > > From: Willem de Bruijn <willemb@google.com> > > > > tpacket_snd sends packets with user pages linked into skb frags. It > > notifies that pages can be reused when the skb is released by setting > > skb->destructor to tpacket_destruct_skb. > > > > This can cause data corruption if the skb is orphaned (e.g., on > > transmit through veth) or cloned (e.g., on mirror to another psock). > > > > Create a kernel-private copy of data in these cases, same as tun/tap > > zerocopy transmission. Reuse that infrastructure: mark the skb as > > SKBTX_ZEROCOPY_FRAG, which will trigger copy in skb_orphan_frags(_rx). > > > > Unlike other zerocopy packets, do not set shinfo destructor_arg to > > struct ubuf_info. tpacket_destruct_skb already uses that ptr to notify > > when the original skb is released and a timestamp is recorded. Do not > > change this timestamp behavior. The ubuf_info->callback is not needed > > anyway, as no zerocopy notification is expected. > > > > Mark destructor_arg as not-a-uarg by setting the lower bit to 1. The > > resulting value is not a valid ubuf_info pointer, nor a valid > > tpacket_snd frame address. Add skb_zcopy_.._nouarg helpers for this. > > > > The fix relies on features introduced in commit 52267790ef52 ("sock: > > add MSG_ZEROCOPY"), so can be backported as is only to 4.14. > > > > Tested with from `./in_netns.sh ./txring_overwrite` from > > http://github.com/wdebruij/kerneltools/tests > > > > Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap") > > Reported-by: Anand H. Krishnan <anandhkrishnan@gmail.com> > > Signed-off-by: Willem de Bruijn <willemb@google.com> > > Applied, and queued up for -stable. Thanks for the backporting notes. > > Any chance those tests from your kerneltools repo can make their way > into selftests? Absolutely. I'll send it to net-next once it has the fix.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 0ba687454267..0d1b2c3f127b 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1326,6 +1326,22 @@ static inline void skb_zcopy_set(struct sk_buff *skb, struct ubuf_info *uarg) } } +static inline void skb_zcopy_set_nouarg(struct sk_buff *skb, void *val) +{ + skb_shinfo(skb)->destructor_arg = (void *)((uintptr_t) val | 0x1UL); + skb_shinfo(skb)->tx_flags |= SKBTX_ZEROCOPY_FRAG; +} + +static inline bool skb_zcopy_is_nouarg(struct sk_buff *skb) +{ + return (uintptr_t) skb_shinfo(skb)->destructor_arg & 0x1UL; +} + +static inline void *skb_zcopy_get_nouarg(struct sk_buff *skb) +{ + return (void *)((uintptr_t) skb_shinfo(skb)->destructor_arg & ~0x1UL); +} + /* Release a reference on a zerocopy structure */ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy) { @@ -1335,7 +1351,7 @@ static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy) if (uarg->callback == sock_zerocopy_callback) { uarg->zerocopy = uarg->zerocopy && zerocopy; sock_zerocopy_put(uarg); - } else { + } else if (!skb_zcopy_is_nouarg(skb)) { uarg->callback(uarg, zerocopy); } diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index ec3095f13aae..a74650e98f42 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -2394,7 +2394,7 @@ static void tpacket_destruct_skb(struct sk_buff *skb) void *ph; __u32 ts; - ph = skb_shinfo(skb)->destructor_arg; + ph = skb_zcopy_get_nouarg(skb); packet_dec_pending(&po->tx_ring); ts = __packet_set_timestamp(po, ph, skb); @@ -2461,7 +2461,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb, skb->mark = po->sk.sk_mark; skb->tstamp = sockc->transmit_time; sock_tx_timestamp(&po->sk, sockc->tsflags, &skb_shinfo(skb)->tx_flags); - skb_shinfo(skb)->destructor_arg = ph.raw; + skb_zcopy_set_nouarg(skb, ph.raw); skb_reserve(skb, hlen); skb_reset_network_header(skb);