Message ID | 4FBD546A.1030504@intel.com |
---|---|
State | Awaiting Upstream, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, 2012-05-23 at 14:19 -0700, Alexander Duyck wrote: > On 05/23/2012 10:10 AM, Alexander Duyck wrote: > > On 05/23/2012 09:39 AM, Eric Dumazet wrote: > >> On Wed, 2012-05-23 at 18:12 +0200, Eric Dumazet wrote: > >> > >>> With current driver, a MTU=1500 frame uses : > >>> > >>> sk_buff (256 bytes) > >>> skb->head : 1024 bytes (or more exaclty now : 512 + 384) > >> By the way, NET_SKB_PAD adds 64 bytes so its 64 + 512 + 384 = 960 > > Actually pahole seems to be indicating to me the size of skb_shared_info > > is 320, unless something has changed in the last few days. > > > > When I get a chance I will try to remember to reduce the ixgbe header > > size to 256 which should also help. The only reason it is set to 512 > > was to deal with the fact that the old alloc_skb code wasn't aligning > > the shared info with the end of whatever size was allocated and so the > > 512 was an approximation to make better use of the 1K slab allocation > > back when we still were using hardware packet split. That should help > > to improve the page utilization for the headers since that would > > increase the uses of a page from 4 to 6 for the skb head frag, and it > > would drop truesize by another 256 bytes. > > > > Thanks, > > > > Alex > Here is the patch for review. I have submitted the official patch to Jeff > so that it can go through his tree for testing, validation, and submission > once Dave's tree opens back up. > > --- > > The recent changes to netdev_alloc_skb actually make it so that the size of > the buffer now actually has a more direct input on the truesize. So in > order to make best use of the piece of a page we are allocated I am > reducing the IXGBE_RX_HDR_SIZE to 256 so that our truesize will be reduced > by 256 bytes as well. > > This should result in performance improvements since the number of uses per > page should increase from 4 to 6 in the case of a 4K page. In addition we > should see socket performance improvements due to the truesize dropping > to less than 1K for buffers less than 256 bytes. > > Not-Yet-Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> > --- > > drivers/net/ethernet/intel/ixgbe/ixgbe.h | 15 ++++++++------- > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++-- > 2 files changed, 10 insertions(+), 9 deletions(-) > > > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h > index 402dd66..468e4ab 100644 > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h > @@ -77,17 +77,18 @@ > #define IXGBE_MAX_FCPAUSE 0xFFFF > > /* Supported Rx Buffer Sizes */ > -#define IXGBE_RXBUFFER_512 512 /* Used for packet split */ > +#define IXGBE_RXBUFFER_256 256 /* Used for skb receive header */ > #define IXGBE_MAX_RXBUFFER 16384 /* largest size for a single descriptor */ > > /* > - * NOTE: netdev_alloc_skb reserves up to 64 bytes, NET_IP_ALIGN mans we > - * reserve 2 more, and skb_shared_info adds an additional 384 bytes more, > - * this adds up to 512 bytes of extra data meaning the smallest allocation > - * we could have is 1K. > - * i.e. RXBUFFER_512 --> size-1024 slab > + * NOTE: netdev_alloc_skb reserves up to 64 bytes, NET_IP_ALIGN means we > + * reserve 64 more, and skb_shared_info adds an additional 320 bytes more, > + * this adds up to 448 bytes of extra data. > + * > + * Since netdev_alloc_skb now allocates a page fragment we can use a value > + * of 256 and the resultant skb will have a truesize of 960 or less. > */ > -#define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_512 > +#define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_256 > > #define MAXIMUM_ETHERNET_VLAN_SIZE (ETH_FRAME_LEN + ETH_FCS_LEN + VLAN_HLEN) > > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c > index 7f92e40..f92b31a 100644 > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c > @@ -1520,8 +1520,8 @@ static bool ixgbe_cleanup_headers(struct ixgbe_ring *rx_ring, > * 60 bytes if the skb->len is less than 60 for skb_pad. > */ > pull_len = skb_frag_size(frag); > - if (pull_len > 256) > - pull_len = ixgbe_get_headlen(va, pull_len); > + if (pull_len > IXGBE_RX_HDR_SIZE) > + pull_len = ixgbe_get_headlen(va, IXGBE_RX_HDR_SIZE); > > /* align pull length to size of long to optimize memcpy performance */ > skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long))); > By the way you should reword the comment about NET_IP_ALIGN On x86 NET_IP_ALIGN is 0, so we dont 'reserve 64 bytes more' -> 896 bytes Also, are you sure : srrctl |= (IXGBE_RX_HDR_SIZE << IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) & IXGBE_SRRCTL_BSIZEHDR_MASK; is still needed in ixgbe_configure_srrctl() , since it uses IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF (non packet split) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/23/2012 02:37 PM, Eric Dumazet wrote: > On Wed, 2012-05-23 at 14:19 -0700, Alexander Duyck wrote: >> On 05/23/2012 10:10 AM, Alexander Duyck wrote: >>> On 05/23/2012 09:39 AM, Eric Dumazet wrote: >>>> On Wed, 2012-05-23 at 18:12 +0200, Eric Dumazet wrote: >>>> >>>>> With current driver, a MTU=1500 frame uses : >>>>> >>>>> sk_buff (256 bytes) >>>>> skb->head : 1024 bytes (or more exaclty now : 512 + 384) >>>> By the way, NET_SKB_PAD adds 64 bytes so its 64 + 512 + 384 = 960 >>> Actually pahole seems to be indicating to me the size of skb_shared_info >>> is 320, unless something has changed in the last few days. >>> >>> When I get a chance I will try to remember to reduce the ixgbe header >>> size to 256 which should also help. The only reason it is set to 512 >>> was to deal with the fact that the old alloc_skb code wasn't aligning >>> the shared info with the end of whatever size was allocated and so the >>> 512 was an approximation to make better use of the 1K slab allocation >>> back when we still were using hardware packet split. That should help >>> to improve the page utilization for the headers since that would >>> increase the uses of a page from 4 to 6 for the skb head frag, and it >>> would drop truesize by another 256 bytes. >>> >>> Thanks, >>> >>> Alex >> Here is the patch for review. I have submitted the official patch to Jeff >> so that it can go through his tree for testing, validation, and submission >> once Dave's tree opens back up. >> >> --- >> >> The recent changes to netdev_alloc_skb actually make it so that the size of >> the buffer now actually has a more direct input on the truesize. So in >> order to make best use of the piece of a page we are allocated I am >> reducing the IXGBE_RX_HDR_SIZE to 256 so that our truesize will be reduced >> by 256 bytes as well. >> >> This should result in performance improvements since the number of uses per >> page should increase from 4 to 6 in the case of a 4K page. In addition we >> should see socket performance improvements due to the truesize dropping >> to less than 1K for buffers less than 256 bytes. >> >> Not-Yet-Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> >> --- >> >> drivers/net/ethernet/intel/ixgbe/ixgbe.h | 15 ++++++++------- >> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 ++-- >> 2 files changed, 10 insertions(+), 9 deletions(-) >> >> >> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h >> index 402dd66..468e4ab 100644 >> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h >> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h >> @@ -77,17 +77,18 @@ >> #define IXGBE_MAX_FCPAUSE 0xFFFF >> >> /* Supported Rx Buffer Sizes */ >> -#define IXGBE_RXBUFFER_512 512 /* Used for packet split */ >> +#define IXGBE_RXBUFFER_256 256 /* Used for skb receive header */ >> #define IXGBE_MAX_RXBUFFER 16384 /* largest size for a single descriptor */ >> >> /* >> - * NOTE: netdev_alloc_skb reserves up to 64 bytes, NET_IP_ALIGN mans we >> - * reserve 2 more, and skb_shared_info adds an additional 384 bytes more, >> - * this adds up to 512 bytes of extra data meaning the smallest allocation >> - * we could have is 1K. >> - * i.e. RXBUFFER_512 --> size-1024 slab >> + * NOTE: netdev_alloc_skb reserves up to 64 bytes, NET_IP_ALIGN means we >> + * reserve 64 more, and skb_shared_info adds an additional 320 bytes more, >> + * this adds up to 448 bytes of extra data. >> + * >> + * Since netdev_alloc_skb now allocates a page fragment we can use a value >> + * of 256 and the resultant skb will have a truesize of 960 or less. >> */ >> -#define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_512 >> +#define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_256 >> >> #define MAXIMUM_ETHERNET_VLAN_SIZE (ETH_FRAME_LEN + ETH_FCS_LEN + VLAN_HLEN) >> >> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c >> index 7f92e40..f92b31a 100644 >> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c >> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c >> @@ -1520,8 +1520,8 @@ static bool ixgbe_cleanup_headers(struct ixgbe_ring *rx_ring, >> * 60 bytes if the skb->len is less than 60 for skb_pad. >> */ >> pull_len = skb_frag_size(frag); >> - if (pull_len > 256) >> - pull_len = ixgbe_get_headlen(va, pull_len); >> + if (pull_len > IXGBE_RX_HDR_SIZE) >> + pull_len = ixgbe_get_headlen(va, IXGBE_RX_HDR_SIZE); >> >> /* align pull length to size of long to optimize memcpy performance */ >> skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long))); >> > > By the way you should reword the comment about NET_IP_ALIGN > > On x86 NET_IP_ALIGN is 0, so we dont 'reserve 64 bytes more' On x86 yes, on other platforms that don't clear the value it will add 2 more bytes, and after cache alignment it will likely add another 64. I am really just stating the worst case scenario here. This is why I end the comment with "960 or less". > -> 896 bytes .. or 832 bytes if you really tweak settings and get the sk_buff down to 192 bytes. :-) > Also, are you sure : > > srrctl |= (IXGBE_RX_HDR_SIZE << IXGBE_SRRCTL_BSIZEHDRSIZE_SHIFT) & > IXGBE_SRRCTL_BSIZEHDR_MASK; > > > is still needed in ixgbe_configure_srrctl() , since it uses > IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF (non packet split) That bit is needed for RSC. Basically we have to specify the maximum size of a header, even if we are not using packet split. The value has to be at least 128 or more. Since we are using the value of IXGBE_RX_HDR_SIZE to limit how much header we pull anyway I figure it is probably a good idea just to leave this here. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h index 402dd66..468e4ab 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h @@ -77,17 +77,18 @@ #define IXGBE_MAX_FCPAUSE 0xFFFF /* Supported Rx Buffer Sizes */ -#define IXGBE_RXBUFFER_512 512 /* Used for packet split */ +#define IXGBE_RXBUFFER_256 256 /* Used for skb receive header */ #define IXGBE_MAX_RXBUFFER 16384 /* largest size for a single descriptor */ /* - * NOTE: netdev_alloc_skb reserves up to 64 bytes, NET_IP_ALIGN mans we - * reserve 2 more, and skb_shared_info adds an additional 384 bytes more, - * this adds up to 512 bytes of extra data meaning the smallest allocation - * we could have is 1K. - * i.e. RXBUFFER_512 --> size-1024 slab + * NOTE: netdev_alloc_skb reserves up to 64 bytes, NET_IP_ALIGN means we + * reserve 64 more, and skb_shared_info adds an additional 320 bytes more, + * this adds up to 448 bytes of extra data. + * + * Since netdev_alloc_skb now allocates a page fragment we can use a value + * of 256 and the resultant skb will have a truesize of 960 or less. */ -#define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_512 +#define IXGBE_RX_HDR_SIZE IXGBE_RXBUFFER_256 #define MAXIMUM_ETHERNET_VLAN_SIZE (ETH_FRAME_LEN + ETH_FCS_LEN + VLAN_HLEN) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 7f92e40..f92b31a 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -1520,8 +1520,8 @@ static bool ixgbe_cleanup_headers(struct ixgbe_ring *rx_ring, * 60 bytes if the skb->len is less than 60 for skb_pad. */ pull_len = skb_frag_size(frag); - if (pull_len > 256) - pull_len = ixgbe_get_headlen(va, pull_len); + if (pull_len > IXGBE_RX_HDR_SIZE) + pull_len = ixgbe_get_headlen(va, IXGBE_RX_HDR_SIZE); /* align pull length to size of long to optimize memcpy performance */ skb_copy_to_linear_data(skb, va, ALIGN(pull_len, sizeof(long)));