Message ID | 1347868702.26523.79.camel@edumazet-glaptop |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On 09/17/2012 12:58 AM, Eric Dumazet wrote: > On Mon, 2012-09-17 at 07:33 +0000, Dave, Tushar N wrote: >>> -----Original Message----- >>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] >>> On Behalf Of John Fastabend >>> Also wouldn't you want an unlikely() in your patch? >> No because it is quite normal to have packet < ETH_ZLEN. e.g. ARP packets. > ARP packets ? Hardly a performance problem. > > Or make sure all these packets have enough tailroom, or else you are > going to hit the cost of reallocating packets. > > I would better point TCP pure ACK packets, since their size can be 54 > bytes. > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > index cfe6ffe..aefc681 100644 > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -3083,8 +3083,9 @@ void tcp_send_ack(struct sock *sk) > /* We are not putting this on the write queue, so > * tcp_transmit_skb() will set the ownership to this > * sock. > + * Add 64 bytes of tailroom so that some drivers can use skb_pad() > */ > - buff = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC)); > + buff = alloc_skb(MAX_TCP_HEADER + 64, sk_gfp_atomic(sk, GFP_ATOMIC)); > if (buff == NULL) { > inet_csk_schedule_ack(sk); > inet_csk(sk)->icsk_ack.ato = TCP_ATO_MIN; For most systems that extra padding should already be added since alloc_skb will cache line align the buffer anyway. A more general fix might be to make it so that alloc_skb cannot allocate less than 60 byte buffers on systems with a cache line size smaller than 64 bytes. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-09-17 at 13:53 -0700, Alexander Duyck wrote: > On 09/17/2012 12:58 AM, Eric Dumazet wrote: > > On Mon, 2012-09-17 at 07:33 +0000, Dave, Tushar N wrote: > >>> -----Original Message----- > >>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] > >>> On Behalf Of John Fastabend > >>> Also wouldn't you want an unlikely() in your patch? > >> No because it is quite normal to have packet < ETH_ZLEN. e.g. ARP packets. > > ARP packets ? Hardly a performance problem. > > > > Or make sure all these packets have enough tailroom, or else you are > > going to hit the cost of reallocating packets. > > > > I would better point TCP pure ACK packets, since their size can be 54 > > bytes. > > > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > > index cfe6ffe..aefc681 100644 > > --- a/net/ipv4/tcp_output.c > > +++ b/net/ipv4/tcp_output.c > > @@ -3083,8 +3083,9 @@ void tcp_send_ack(struct sock *sk) > > /* We are not putting this on the write queue, so > > * tcp_transmit_skb() will set the ownership to this > > * sock. > > + * Add 64 bytes of tailroom so that some drivers can use skb_pad() > > */ > > - buff = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC)); > > + buff = alloc_skb(MAX_TCP_HEADER + 64, sk_gfp_atomic(sk, GFP_ATOMIC)); > > if (buff == NULL) { > > inet_csk_schedule_ack(sk); > > inet_csk(sk)->icsk_ack.ato = TCP_ATO_MIN; > For most systems that extra padding should already be added since > alloc_skb will cache line align the buffer anyway. > Please define 'most systems' ? > A more general fix might be to make it so that alloc_skb cannot allocate > less than 60 byte buffers on systems with a cache line size smaller than > 64 bytes. Nope, because we do a skb_reserve(skb, MAX_TCP_HEADER) So we might have no bytes available at all after this MAX_TCP_HEADER area. Relying on extra padding in alloc_skb() is hacky anyway, as it depends on external factors (external to TCP stack) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9/17/2012 2:02 PM, Eric Dumazet wrote: > On Mon, 2012-09-17 at 13:53 -0700, Alexander Duyck wrote: >> On 09/17/2012 12:58 AM, Eric Dumazet wrote: >>> On Mon, 2012-09-17 at 07:33 +0000, Dave, Tushar N wrote: >>>>> -----Original Message----- >>>>> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] >>>>> On Behalf Of John Fastabend >>>>> Also wouldn't you want an unlikely() in your patch? >>>> No because it is quite normal to have packet < ETH_ZLEN. e.g. ARP packets. >>> ARP packets ? Hardly a performance problem. >>> >>> Or make sure all these packets have enough tailroom, or else you are >>> going to hit the cost of reallocating packets. >>> >>> I would better point TCP pure ACK packets, since their size can be 54 >>> bytes. >>> >>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c >>> index cfe6ffe..aefc681 100644 >>> --- a/net/ipv4/tcp_output.c >>> +++ b/net/ipv4/tcp_output.c >>> @@ -3083,8 +3083,9 @@ void tcp_send_ack(struct sock *sk) >>> /* We are not putting this on the write queue, so >>> * tcp_transmit_skb() will set the ownership to this >>> * sock. >>> + * Add 64 bytes of tailroom so that some drivers can use skb_pad() >>> */ >>> - buff = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC)); >>> + buff = alloc_skb(MAX_TCP_HEADER + 64, sk_gfp_atomic(sk, GFP_ATOMIC)); >>> if (buff == NULL) { >>> inet_csk_schedule_ack(sk); >>> inet_csk(sk)->icsk_ack.ato = TCP_ATO_MIN; >> For most systems that extra padding should already be added since >> alloc_skb will cache line align the buffer anyway. >> > Please define 'most systems' ? Sorry I misspoke. What I meant to say is that the allocation will be aligned to a slab size. If you take a look at alloc_skb it looks like it is still using __alloc_skb so it is going to add skb_shared_info to the size so at least in the case of most 64 bit systems the total allocation size is going to be larger than 512 and as a result skb->head will be allocated from a 1K slab cache leaving plenty of room for padding to be added later. On 32 bit systems the total size will likely be a little over 256 and get rounded up to 512. The only real thing that bugged me about this is that you were adding 64 when the most you should ever need is 10. That was the only real reason I felt like commenting on it. >> A more general fix might be to make it so that alloc_skb cannot allocate >> less than 60 byte buffers on systems with a cache line size smaller than >> 64 bytes. > Nope, because we do a skb_reserve(skb, MAX_TCP_HEADER) > > So we might have no bytes available at all after this MAX_TCP_HEADER > area. > > Relying on extra padding in alloc_skb() is hacky anyway, as it > depends on external factors (external to TCP stack) That is true, but the fact is there is probably a fair amount of that going on without people even realizing it. As I recall the smallest skb head you can allocate on a 64 bit system currently is something like 128 bytes which comes from the 512 byte slab, the next step up after that is a 640 byte head. Since MAX_TCP_HEADER starts at 160 the likelihood of it not getting at least 16 bytes of padding is pretty low. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Alexander Duyck <alexander.duyck@gmail.com> Date: Mon, 17 Sep 2012 20:01:06 -0700 > Since MAX_TCP_HEADER starts at 160 the likelihood of it not getting > at least 16 bytes of padding is pretty low. I know it's not on many people's radar, but with SLOB it will happen a lot probably. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9/17/2012 8:03 PM, David Miller wrote: > From: Alexander Duyck <alexander.duyck@gmail.com> > Date: Mon, 17 Sep 2012 20:01:06 -0700 > >> Since MAX_TCP_HEADER starts at 160 the likelihood of it not getting >> at least 16 bytes of padding is pretty low. > I know it's not on many people's radar, but with SLOB it will happen > a lot probably. That is true. I hadn't thought about anything other than SLAB/SLUB. It also just occurred to me that there might be some benefit in cache aligning the max header size. It seems like doing something like that should reduce the overall memory footprint and would probably improve performance. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2012-09-17 at 20:27 -0700, Alexander Duyck wrote: > It also just occurred to me that there might be some benefit in cache > aligning the max header size. It seems like doing something like that > should reduce the overall memory footprint and would probably improve > performance. Given that most ACK packets are 66 bytes (14 ethernet + 20 IP + 32 TCP), I am not sure we need to make any tweak on alignment ? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 9/17/2012 10:45 PM, Eric Dumazet wrote: > On Mon, 2012-09-17 at 20:27 -0700, Alexander Duyck wrote: > >> It also just occurred to me that there might be some benefit in cache >> aligning the max header size. It seems like doing something like that >> should reduce the overall memory footprint and would probably improve >> performance. > Given that most ACK packets are 66 bytes (14 ethernet + 20 IP + 32 TCP), > I am not sure we need to make any tweak on alignment ? I'm honestly not sure myself. I will probably spend a few hours tomorrow tweaking a few things to test and see if there is any gain to be had there. The only reason why it occurred to me is that it really isn't too far off from what we did back on the Rx side, except for there we were aligning at the start of the buffer and working our way up. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index cfe6ffe..aefc681 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3083,8 +3083,9 @@ void tcp_send_ack(struct sock *sk) /* We are not putting this on the write queue, so * tcp_transmit_skb() will set the ownership to this * sock. + * Add 64 bytes of tailroom so that some drivers can use skb_pad() */ - buff = alloc_skb(MAX_TCP_HEADER, sk_gfp_atomic(sk, GFP_ATOMIC)); + buff = alloc_skb(MAX_TCP_HEADER + 64, sk_gfp_atomic(sk, GFP_ATOMIC)); if (buff == NULL) { inet_csk_schedule_ack(sk); inet_csk(sk)->icsk_ack.ato = TCP_ATO_MIN;