Message ID | 1456787013-3277-1-git-send-email-bpoirier@suse.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On 01.03.2016 00:03, Benjamin Poirier wrote: > The current reserved_tailroom calculation fails to take hlen and tlen into > account. > > skb: > [__hlen__|__data____________|__tlen___|__extra__] > ^ ^ > head skb_end_offset > > In this representation, hlen + data + tlen is the size passed to alloc_skb. > "extra" is the extra space made available in __alloc_skb because of > rounding up by kmalloc. We can reorder the representation like so: > > [__hlen__|__data____________|__extra__|__tlen___] > ^ ^ > head skb_end_offset > > The maximum space available for ip headers and payload without > fragmentation is min(mtu, data + extra). Therefore, > reserved_tailroom > = data + extra + tlen - min(mtu, data + extra) > = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen) > = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen) > > Compare the second line to the current expression: > reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset) > and we can see that hlen and tlen are not taken into account. > > The min() in the third line can be expanded into: > if mtu < skb_tailroom - tlen: > reserved_tailroom = skb_tailroom - mtu > else: > reserved_tailroom = tlen > > Depending on hlen, tlen, mtu and the number of multicast address records, > the current code may output skbs that have less tailroom than > dev->needed_tailroom or it may output more skbs than needed because not all > space available is used. > > Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs") > Signed-off-by: Benjamin Poirier <bpoirier@suse.com> I like it, thanks a lot! Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
On 03/01/2016 12:03 AM, Benjamin Poirier wrote: [...] > Notes: > Changes v1->v2 > As suggested by Hannes, move the code to an inline helper and express it > using "if" rather than "min". The code is correct, thanks! Therefore: Acked-by: Daniel Borkmann <daniel@iogearbox.net> However, I actually think that v1 was much better/easier as a fix though. :/ Meaning 1) it's likely easier to backport, and 2) that we now need a comment above each skb->reserved_tailroom assignment probably says that min() might perhaps have been more self-documenting ... skb_tailroom_reserve() looks quite generic, but it only makes sense to use in combination with skb_availroom(), which would have been good to put a note to the header comment. Also "the required headroom should already have been reserved before using this function", places one more requirement for usage. If we really want to go that path, maybe rather a skb_setroom() that is coupled with skb_availroom() like: static inline int __skb_tailroom(const struct sk_buff *skb) { return skb->end - skb->tail; } static inline void skb_setroom(struct sk_buff *skb, unsigned int needed_headroom, unsigned int size, unsigned int needed_tailroom) { SKB_LINEAR_ASSERT(skb); skb_reserve(skb, needed_headroom); skb->reserved_tailroom = needed_tailroom; if (size < __skb_tailroom(skb) - needed_tailroom) skb->reserved_tailroom = __skb_tailroom(skb) - size; } Then, skb_tailroom() would internally use __skb_tailroom(), too. And we can also spare us the two unneeded skb_is_nonlinear() checks in our helper which will currently always evaluate to false anyway. ... just a thought. Thanks again, Daniel
On 01.03.2016 11:18, Daniel Borkmann wrote: > On 03/01/2016 12:03 AM, Benjamin Poirier wrote: > [...] >> Notes: >> Changes v1->v2 >> As suggested by Hannes, move the code to an inline helper and >> express it >> using "if" rather than "min". > > The code is correct, thanks! > > Therefore: > > Acked-by: Daniel Borkmann <daniel@iogearbox.net> > > However, I actually think that v1 was much better/easier as a fix > though. :/ > > Meaning 1) it's likely easier to backport, and 2) that we now need a > comment > above each skb->reserved_tailroom assignment probably says that min() might > perhaps have been more self-documenting ... > > skb_tailroom_reserve() looks quite generic, but it only makes sense to > use in > combination with skb_availroom(), which would have been good to put a > note to > the header comment. Also "the required headroom should already have been > reserved before using this function", places one more requirement for > usage. > > If we really want to go that path, maybe rather a skb_setroom() that is > coupled > with skb_availroom() like: > > static inline int __skb_tailroom(const struct sk_buff *skb) > { > return skb->end - skb->tail; > } > > static inline void skb_setroom(struct sk_buff *skb, > unsigned int needed_headroom, > unsigned int size, > unsigned int needed_tailroom) > { > SKB_LINEAR_ASSERT(skb); > > skb_reserve(skb, needed_headroom); > skb->reserved_tailroom = needed_tailroom; > > if (size < __skb_tailroom(skb) - needed_tailroom) > skb->reserved_tailroom = __skb_tailroom(skb) - size; > } > > Then, skb_tailroom() would internally use __skb_tailroom(), too. And we > can also > spare us the two unneeded skb_is_nonlinear() checks in our helper which > will > currently always evaluate to false anyway. > > ... just a thought. I think it is fine. The code will be inlined anyway and probably skb linear assertions will be optimized away. I like the current code in its form: skb_reserve(skb, header); skb_tailroom_reserve(skb, mtu, tailroom); Combined form has too many arguments, so one has to look up the header file again. Bye, Hannes
From: Benjamin Poirier <bpoirier@suse.com> Date: Mon, 29 Feb 2016 15:03:33 -0800 > The current reserved_tailroom calculation fails to take hlen and tlen into > account. ... > Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs") > Signed-off-by: Benjamin Poirier <bpoirier@suse.com> Applied and queued up for -stable, thanks.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 4ce9ff7..d3fcd45 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1985,6 +1985,30 @@ static inline void skb_reserve(struct sk_buff *skb, int len) skb->tail += len; } +/** + * skb_tailroom_reserve - adjust reserved_tailroom + * @skb: buffer to alter + * @mtu: maximum amount of headlen permitted + * @needed_tailroom: minimum amount of reserved_tailroom + * + * Set reserved_tailroom so that headlen can be as large as possible but + * not larger than mtu and tailroom cannot be smaller than + * needed_tailroom. + * The required headroom should already have been reserved before using + * this function. + */ +static inline void skb_tailroom_reserve(struct sk_buff *skb, unsigned int mtu, + unsigned int needed_tailroom) +{ + SKB_LINEAR_ASSERT(skb); + if (mtu < skb_tailroom(skb) - needed_tailroom) + /* use at most mtu */ + skb->reserved_tailroom = skb_tailroom(skb) - mtu; + else + /* use up to all available space */ + skb->reserved_tailroom = needed_tailroom; +} + #define ENCAP_TYPE_ETHER 0 #define ENCAP_TYPE_IPPROTO 1 diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 05e4cba..b3086cf 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -356,9 +356,8 @@ static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu) skb_dst_set(skb, &rt->dst); skb->dev = dev; - skb->reserved_tailroom = skb_end_offset(skb) - - min(mtu, skb_end_offset(skb)); skb_reserve(skb, hlen); + skb_tailroom_reserve(skb, mtu, tlen); skb_reset_network_header(skb); pip = ip_hdr(skb); diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 5ee56d0..d64ee7e 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -1574,9 +1574,8 @@ static struct sk_buff *mld_newpack(struct inet6_dev *idev, unsigned int mtu) return NULL; skb->priority = TC_PRIO_CONTROL; - skb->reserved_tailroom = skb_end_offset(skb) - - min(mtu, skb_end_offset(skb)); skb_reserve(skb, hlen); + skb_tailroom_reserve(skb, mtu, tlen); if (__ipv6_get_lladdr(idev, &addr_buf, IFA_F_TENTATIVE)) { /* <draft-ietf-magma-mld-source-05.txt>:
The current reserved_tailroom calculation fails to take hlen and tlen into account. skb: [__hlen__|__data____________|__tlen___|__extra__] ^ ^ head skb_end_offset In this representation, hlen + data + tlen is the size passed to alloc_skb. "extra" is the extra space made available in __alloc_skb because of rounding up by kmalloc. We can reorder the representation like so: [__hlen__|__data____________|__extra__|__tlen___] ^ ^ head skb_end_offset The maximum space available for ip headers and payload without fragmentation is min(mtu, data + extra). Therefore, reserved_tailroom = data + extra + tlen - min(mtu, data + extra) = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen) = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen) Compare the second line to the current expression: reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset) and we can see that hlen and tlen are not taken into account. The min() in the third line can be expanded into: if mtu < skb_tailroom - tlen: reserved_tailroom = skb_tailroom - mtu else: reserved_tailroom = tlen Depending on hlen, tlen, mtu and the number of multicast address records, the current code may output skbs that have less tailroom than dev->needed_tailroom or it may output more skbs than needed because not all space available is used. Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs") Signed-off-by: Benjamin Poirier <bpoirier@suse.com> --- Notes: Changes v1->v2 As suggested by Hannes, move the code to an inline helper and express it using "if" rather than "min". include/linux/skbuff.h | 24 ++++++++++++++++++++++++ net/ipv4/igmp.c | 3 +-- net/ipv6/mcast.c | 3 +-- 3 files changed, 26 insertions(+), 4 deletions(-)