Message ID | 20180507180840.3486.67728.stgit@localhost.localdomain |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
On 05/07/2018 11:08 AM, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
>
> This patch allows us to take care of unrolling the first segment and the
> last segment of the loop for processing the segmented skb. Part of the
> motivation for this is that it makes it easier to process the fact that the
> first frame and all of the frames in between should be mostly identical
> in terms of header data, and the last frame has differences in the length
> and partial checksum.
>
> In addition I am dropping the header length calculation since we don't
> really need it for anything but the last frame and it can be easily
> obtained by just pulling the data_len and offset of tail from the transport
> header.
>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

Reviewed-by: Eric Dumazet <edumazet@google.com>
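For readers unfamiliar with the skb layout, the length computation described in the commit message can be sketched in user space. This is only an illustration under stated assumptions: `struct buf`, its field names, and `transport_len()` are hypothetical stand-ins, not the kernel's `struct sk_buff` or its accessors. The idea is that the bytes from the transport header to the end of the linear area, plus the bytes held outside the linear area, give the UDP length of a segment.

```c
#include <stddef.h>

/* Hypothetical, simplified buffer descriptor; NOT the kernel's struct sk_buff. */
struct buf {
	size_t transport_off;	/* offset of the transport (UDP) header */
	size_t tail_off;	/* offset of the end of the linear data */
	size_t data_len;	/* bytes stored outside the linear area */
};

/*
 * Length from the start of the transport header to the end of the data:
 * linear bytes after the transport header plus the non-linear bytes.
 * No separate header-length variable is needed.
 */
static size_t transport_len(const struct buf *b)
{
	return (b->tail_off - b->transport_off) + b->data_len;
}
```

For example, with the transport header at offset 34, a linear area ending at offset 42 (an 8-byte UDP header) and 300 non-linear bytes, `transport_len()` yields 308.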
On Mon, May 7, 2018 at 2:08 PM, Alexander Duyck <alexander.duyck@gmail.com> wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
>
> This patch allows us to take care of unrolling the first segment and the
> last segment of the loop for processing the segmented skb. Part of the
> motivation for this is that it makes it easier to process the fact that the
> first frame and all of the frames in between should be mostly identical
> in terms of header data, and the last frame has differences in the length
> and partial checksum.
>
> In addition I am dropping the header length calculation since we don't
> really need it for anything but the last frame and it can be easily
> obtained by just pulling the data_len and offset of tail from the transport
> header.
>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

I'm not a fan of the more complicated control flow, as I pointed out
before. It only seems to save one assignment to uh from segs.

Both follow-up patches are now more complex, because they need
to add the same code in two locations.
On Mon, May 7, 2018 at 2:57 PM, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> On Mon, May 7, 2018 at 2:08 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>
>> This patch allows us to take care of unrolling the first segment and the
>> last segment of the loop for processing the segmented skb. Part of the
>> motivation for this is that it makes it easier to process the fact that the
>> first frame and all of the frames in between should be mostly identical
>> in terms of header data, and the last frame has differences in the length
>> and partial checksum.
>>
>> In addition I am dropping the header length calculation since we don't
>> really need it for anything but the last frame and it can be easily
>> obtained by just pulling the data_len and offset of tail from the transport
>> header.
>>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>
> I'm not a fan of the more complicated control flow, as I pointed out
> before. It only seems to save one assignment to uh from segs.
>
> Both follow-up patches are now more complex, because they need
> to add the same code in two locations.

With that said, if you feel strongly, I don't object.

The removal of hdrlen and simplification of arguments is definitely
an improvement.
On Mon, May 7, 2018 at 12:54 PM, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> On Mon, May 7, 2018 at 2:57 PM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>> On Mon, May 7, 2018 at 2:08 PM, Alexander Duyck
>> <alexander.duyck@gmail.com> wrote:
>>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>>
>>> This patch allows us to take care of unrolling the first segment and the
>>> last segment of the loop for processing the segmented skb. Part of the
>>> motivation for this is that it makes it easier to process the fact that the
>>> first frame and all of the frames in between should be mostly identical
>>> in terms of header data, and the last frame has differences in the length
>>> and partial checksum.
>>>
>>> In addition I am dropping the header length calculation since we don't
>>> really need it for anything but the last frame and it can be easily
>>> obtained by just pulling the data_len and offset of tail from the transport
>>> header.
>>>
>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>
>> I'm not a fan of the more complicated control flow, as I pointed out
>> before. It only seems to save one assignment to uh from segs.
>>
>> Both follow-up patches are now more complex, because they need
>> to add the same code in two locations.
>
> With that said, if you feel strongly, I don't object.
>
> The removal of hdrlen and simplification of arguments is definitely
> an improvement.

Thanks for being understanding about this.

My preference is to keep the loop unrolled as it is since that way it
is not too different from the way we handle this for TCP so it will
make maintenance of the two easier. Otherwise I have to add a bunch of
conditional checks inside the loop.

The other advantage to unrolling it as I did is that I don't have to
deal with a ton of extra indentation for an if statement inside of a
while loop.

- Alex
On Mon, May 7, 2018 at 3:59 PM, Alexander Duyck <alexander.duyck@gmail.com> wrote:
> On Mon, May 7, 2018 at 12:54 PM, Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>> On Mon, May 7, 2018 at 2:57 PM, Willem de Bruijn
>> <willemdebruijn.kernel@gmail.com> wrote:
>>> On Mon, May 7, 2018 at 2:08 PM, Alexander Duyck
>>> <alexander.duyck@gmail.com> wrote:
>>>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>>>
>>>> This patch allows us to take care of unrolling the first segment and the
>>>> last segment of the loop for processing the segmented skb. Part of the
>>>> motivation for this is that it makes it easier to process the fact that the
>>>> first frame and all of the frames in between should be mostly identical
>>>> in terms of header data, and the last frame has differences in the length
>>>> and partial checksum.
>>>>
>>>> In addition I am dropping the header length calculation since we don't
>>>> really need it for anything but the last frame and it can be easily
>>>> obtained by just pulling the data_len and offset of tail from the transport
>>>> header.
>>>>
>>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

Acked-by: Willem de Bruijn <willemb@google.com>

>>> I'm not a fan of the more complicated control flow, as I pointed out
>>> before. It only seems to save one assignment to uh from segs.
>>>
>>> Both follow-up patches are now more complex, because they need
>>> to add the same code in two locations.
>>
>> With that said, if you feel strongly, I don't object.
>>
>> The removal of hdrlen and simplification of arguments is definitely
>> an improvement.
>
> Thanks for being understanding about this.
>
> My preference is to keep the loop unrolled as it is since that way it
> is not too different from the way we handle this for TCP so it will
> make maintenance of the two easier. Otherwise I have to add a bunch of
> conditional checks inside the loop.
>
> The other advantage to unrolling it as I did is that I don't have to
> deal with a ton of extra indentation for an if statement inside of a
> while loop.

Both good reasons. Thanks a lot for the overall cleanup.
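The two control-flow shapes debated in this thread can be compared in a small user-space sketch. Everything here is an assumption made for illustration: `struct seg`, `fill_checked()`, `fill_peeled()` and `variants_agree()` are hypothetical names, and the only "header field" modelled is a length. Variant A tests for the last segment inside the loop; variant B breaks out of the loop at the last segment and handles it afterwards, which is roughly the shape the patch adopts.

```c
#include <stddef.h>

#define UDP_HDR_LEN 8u	/* size of a UDP header in bytes */

/* Hypothetical stand-in for one segment; NOT the kernel's struct sk_buff. */
struct seg {
	struct seg *next;
	unsigned int payload;	/* payload bytes carried by this segment */
	unsigned int hdr_len;	/* length value written into the header */
};

/* Variant A: single loop with a last-segment check inside the body. */
static void fill_checked(struct seg *segs, unsigned int mss)
{
	struct seg *s;

	for (s = segs; s; s = s->next) {
		unsigned int len = UDP_HDR_LEN + mss;

		if (!s->next)	/* last segment may be shorter */
			len = UDP_HDR_LEN + s->payload;
		s->hdr_len = len;
	}
}

/* Variant B: last segment handled after the loop. Needs a non-empty list. */
static void fill_peeled(struct seg *segs, unsigned int mss)
{
	struct seg *s = segs;
	unsigned int len = UDP_HDR_LEN + mss;

	for (;;) {
		if (!s->next)
			break;
		s->hdr_len = len;	/* identical for all full-size segments */
		s = s->next;
	}
	s->hdr_len = UDP_HDR_LEN + s->payload;	/* last, possibly partial */
}

/* Returns 1 if both variants write the same lengths for a 3-segment list. */
static int variants_agree(void)
{
	struct seg a[3], b[3];
	unsigned int payload[3] = { 1000, 1000, 300 };
	int i;

	for (i = 0; i < 3; i++) {
		a[i].next = (i < 2) ? &a[i + 1] : NULL;
		b[i].next = (i < 2) ? &b[i + 1] : NULL;
		a[i].payload = b[i].payload = payload[i];
		a[i].hdr_len = b[i].hdr_len = 0;
	}
	fill_checked(a, 1000);
	fill_peeled(b, 1000);
	for (i = 0; i < 3; i++)
		if (a[i].hdr_len != b[i].hdr_len)
			return 0;
	return 1;
}
```

Both variants produce the same result. The trade-off raised in the thread is that variant B keeps the per-segment body free of conditionals, while any later per-segment addition may have to be written both inside the loop and after it.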
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 92c182e99ddc..b15c78ac3f23 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -193,7 +193,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 	struct sock *sk = gso_skb->sk;
 	unsigned int sum_truesize = 0;
 	struct sk_buff *segs, *seg;
-	unsigned int hdrlen;
 	struct udphdr *uh;
 	unsigned int mss;
 	__sum16 check;
@@ -203,7 +202,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 	if (gso_skb->len <= sizeof(*uh) + mss)
 		return ERR_PTR(-EINVAL);
 
-	hdrlen = gso_skb->data - skb_mac_header(gso_skb);
 	skb_pull(gso_skb, sizeof(*uh));
 
 	/* clear destructor to avoid skb_segment assigning it to tail */
@@ -216,30 +214,37 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 		return segs;
 	}
 
-	uh = udp_hdr(segs);
+	seg = segs;
+	uh = udp_hdr(seg);
 
 	/* compute checksum adjustment based on old length versus new */
 	newlen = htons(sizeof(*uh) + mss);
 	check = csum16_add(csum16_sub(uh->check, uh->len), newlen);
 
-	for (seg = segs; seg; seg = seg->next) {
-		uh = udp_hdr(seg);
+	for (;;) {
+		seg->destructor = sock_wfree;
+		seg->sk = sk;
+		sum_truesize += seg->truesize;
 
-		/* last packet can be partial gso_size */
-		if (!seg->next) {
-			newlen = htons(seg->len - hdrlen);
-			check = csum16_add(csum16_sub(uh->check, uh->len),
-					   newlen);
-		}
+		if (!seg->next)
+			break;
 
 		uh->len = newlen;
 		uh->check = check;
 
-		seg->destructor = sock_wfree;
-		seg->sk = sk;
-		sum_truesize += seg->truesize;
+		seg = seg->next;
+		uh = udp_hdr(seg);
 	}
 
+	/* last packet can be partial gso_size, account for that in checksum */
+	newlen = htons(skb_tail_pointer(seg) - skb_transport_header(seg) +
+		       seg->data_len);
+	check = csum16_add(csum16_sub(uh->check, uh->len), newlen);
+
+	uh->len = newlen;
+	uh->check = check;
+
+	/* update refcount for the packet */
 	refcount_add(sum_truesize - gso_skb->truesize, &sk->sk_wmem_alloc);
 
 	return segs;
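The checksum adjustment in the diff, `csum16_add(csum16_sub(check, old_len), new_len)`, swaps one 16-bit word out of a ones'-complement sum without re-summing everything. The following is a user-space sketch of that arithmetic, not the kernel code: the helpers reuse the kernel's names but are re-implemented here on plain `uint16_t`, `adjust_len()` and `sum_words()` are hypothetical, the word values are made up, and byte order is ignored (the kernel operates on network-order values via `htons()`).

```c
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement 16-bit add with end-around carry (user-space stand-in). */
static uint16_t csum16_add(uint16_t csum, uint16_t addend)
{
	uint16_t res = (uint16_t)(csum + addend);	/* wraps modulo 2^16 */

	return (uint16_t)(res + (res < addend));	/* fold the carry back in */
}

/* Subtracting a word is adding its bitwise complement. */
static uint16_t csum16_sub(uint16_t csum, uint16_t addend)
{
	return csum16_add(csum, (uint16_t)~addend);
}

/* Hypothetical helper: replace old_len by new_len inside an existing sum. */
static uint16_t adjust_len(uint16_t check, uint16_t old_len, uint16_t new_len)
{
	return csum16_add(csum16_sub(check, old_len), new_len);
}

/* Hypothetical helper: ones'-complement sum of n 16-bit words. */
static uint16_t sum_words(const uint16_t *w, size_t n)
{
	uint16_t s = 0;
	size_t i;

	for (i = 0; i < n; i++)
		s = csum16_add(s, w[i]);
	return s;
}
```

As a usage example with made-up words: summing `{0xC0A8, 0x0001, 0x0011, 3008}` and then calling `adjust_len(sum, 3008, 1008)` gives the same value as summing `{0xC0A8, 0x0001, 0x0011, 1008}` directly. This is why the patch can compute `check` once for all full-size segments and once more for the last, possibly shorter, segment.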