Message ID | 1349422868.21172.38.camel@edumazet-glaptop |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, 2012-10-05 at 09:41 +0200, Eric Dumazet wrote: > By the way, the commit you pointed has no effect on the reallocation > performed by pskb_expand_head() : The commit has a side effect, because the problem appeared after it was merged (and goes away if I revert it) > int size = nhead + skb_end_offset(skb) + ntail; > > So pskb_expand_head() always assumed the current head is fully used, and > because we have some kmalloc-power-of-two contraints, each time > pskb_expand_head() is called with a non zero (nhead + ntail) we double > the skb->head ksize. That is true, but only after the commit I mentioned. Before that commit, we indeed reallocate skb->head to twice the size, but skb->end is *not* positioned at the end of newly allocated data. So on the next pskb_expand_head(), if head and tail are not big values, the kmalloc() will be of the same size. The commit adds this after allocation: size = SKB_WITH_OVERHEAD(ksize(data)) [...] skb->end = skb->head + size; so on the next pskb_expand_head, we are going to allocate twice the size for sure. > So why are we using skb_end_offset(skb) here is the question. > > I guess it could be (skb_tail_pointer(skb) - skb->head) on some uses. I think your patch is wrong, ntail is not the new tailroom size, it's what missing to the current tailroom size, by adding ntail + nhead + tail_offset we are removing previous tailroom. We cannot shrink the skb that way here I guess, a caller may check needed headroom & tailroom, calls with nhead=1/ntail=0 because only headroom is missing, but after the call tailroom would be less than before the call. Why don't we juste reallocate to this size: MAX(current_alloc_size, nhead + ntail + current_end - current_head)
On Fri, 2012-10-05 at 12:49 +0200, Maxime Bizon wrote: > On Fri, 2012-10-05 at 09:41 +0200, Eric Dumazet wrote: > > > By the way, the commit you pointed has no effect on the reallocation > > performed by pskb_expand_head() : > > The commit has a side effect, because the problem appeared after it was > merged (and goes away if I revert it) > > > int size = nhead + skb_end_offset(skb) + ntail; > > > > So pskb_expand_head() always assumed the current head is fully used, and > > because we have some kmalloc-power-of-two contraints, each time > > pskb_expand_head() is called with a non zero (nhead + ntail) we double > > the skb->head ksize. > > That is true, but only after the commit I mentioned. > > Before that commit, we indeed reallocate skb->head to twice the size, > but skb->end is *not* positioned at the end of newly allocated data. So > on the next pskb_expand_head(), if head and tail are not big values, the > kmalloc() will be of the same size. > > > The commit adds this after allocation: > > size = SKB_WITH_OVERHEAD(ksize(data)) > [...] > skb->end = skb->head + size; > > so on the next pskb_expand_head, we are going to allocate twice the size > for sure. Yes, but the idea of the patch was to _avoid_ next pskb_expand_head() calls... Its defeated because you have a too small NET_SKB_PAD, and skb_recycle() inability to properly detect ans skb is oversized. > > > So why are we using skb_end_offset(skb) here is the question. > > > > I guess it could be (skb_tail_pointer(skb) - skb->head) on some uses. > > I think your patch is wrong, ntail is not the new tailroom size, it's > what missing to the current tailroom size, by adding ntail + nhead + > tail_offset we are removing previous tailroom. > > We cannot shrink the skb that way here I guess, a caller may check > needed headroom & tailroom, calls with nhead=1/ntail=0 because only > headroom is missing, but after the call tailroom would be less than > before the call. > > Why don't we juste reallocate to this size: > > MAX(current_alloc_size, nhead + ntail + current_end - current_head) Hmm, this changes nothing assuming current_end == skb_end_offset(skb) and current_head = skb->head Not sure what you mean. I guess the right answer is to change API, because we have no clue if the callers want some tailroom or not. If they have 'enough' they currently pass 0, so we dont really now how many bytes they really need.. In fact very few call sites need to increase the tailroom, so it's probably very easy to do this change. New convention would be : pass number of needed bytes after current tail, not after current end. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2012-10-05 at 14:22 +0200, Eric Dumazet wrote: > Yes, but the idea of the patch was to _avoid_ next pskb_expand_head() > calls... yes but we cannot be sure of that, the caller may not have a good idea of the headroom needed for the whole lifetime of the skb it's better to think we will reduce number of calls, not avoid them that's why I think doubling the size each time is dangerous, since we silently request bigger and bigger allocations if an skb takes an unoptimized path > Hmm, > > this changes nothing assuming current_end == skb_end_offset(skb) > and current_head = skb->head My idea was to leave skb->end at its last position even if we grow skb->head. Since we have a way to know the current allocation size of skb->head, further pskb_expand_head() calls to request tailroom would just push skb->tail & skb->end together if that fits in current ksize(). I've not looked at recent changes in mainline, since you changed how skb->head is managed, that may be totally impossible. Your proposed changed API change to expand_head will fill this anyway. > New convention would be : pass number of needed bytes after current > tail, not after current end. Fully agree on this
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index cdc2859..dd42c6a 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -1053,11 +1053,22 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail, { int i; u8 *data; - int size = nhead + skb_end_offset(skb) + ntail; + unsigned int tail_offset = skb_tail_pointer(skb) - skb->head; + int size = nhead + ntail; long off; BUG_ON(nhead < 0); + /* callers using nhead == 0 and ntail == 0 want to get a fresh copy, + * so allocate same amount of memory (skb_end_offset) + * For others, they want extra headroom or tailroom against the + * currently used portion of header (skb->head -> skb_tail_pointer) + */ + if (!size) + size = skb_end_offset(skb); + else + size += tail_offset; + if (skb_shared(skb)) BUG(); @@ -1074,7 +1085,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail, /* Copy only real data... and, alas, header. This should be * optimized for the cases when header is void. */ - memcpy(data + nhead, skb->head, skb_tail_pointer(skb) - skb->head); + memcpy(data + nhead, skb->head, tail_offset); memcpy((struct skb_shared_info *)(data + size), skb_shinfo(skb),