[RFC,2/2] ixgbe: setup XPS via netif_set_xps()

Message ID 384ee099d617f3d3786a618b11cc10616923ec45.1521124830.git.pabeni@redhat.com
State RFC, archived
Delegated to: David Miller
Series net:setup XPS mapping for each online CPU

Commit Message

Paolo Abeni March 15, 2018, 3:08 p.m. UTC
Before this commit, ixgbe with the default settings lacks an XPS mapping
for CPUs whose id is greater than the number of tx queues.

As a consequence, the xmit path for such CPUs incurs a significant cost
in __netdev_pick_tx, mainly due to skb_tx_hash(), as reported by the perf
tool:

7.55%--netdev_pick_tx
        |
        --6.92%--__netdev_pick_tx
                  |
                  --6.35%--__skb_tx_hash
                            |
                            --5.94%--__skb_get_hash
                                      |
                                      --3.22%--__skb_flow_dissect

in the following scenario:

ethtool -L em1 combined 1
taskset 2 netperf -H 192.168.1.1 -t UDP_STREAM -- -m 1
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992       1   10.00     11497225      0       9.20

After this commit the perf tool reports:

0.85%--__netdev_pick_tx

and netperf reports:

MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.101.1 () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992       1   10.00     12736058      0      10.19

roughly +10% in xmit tput.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 13 +++----------
 3 files changed, 5 insertions(+), 11 deletions(-)
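
The cost described in the commit message can be illustrated with a small
userspace model. This is only a sketch, not the kernel code: struct
xps_model, toy_flow_hash() and pick_tx_queue() are hypothetical names, the
real XPS map holds a list of queues per CPU rather than a single index, and
the toy hash merely stands in for the flow dissection that __skb_get_hash
performs. It assumes that a CPU with an XPS entry returns its queue
directly, while a CPU without one falls back to hashing the flow.

```c
#include <assert.h>
#include <stdint.h>

#define NO_QUEUE (-1)

/* Hypothetical, simplified stand-in for a device's XPS table:
 * cpu_to_queue[cpu] is a tx queue index, or NO_QUEUE if the CPU
 * has no XPS entry.
 */
struct xps_model {
	int nr_cpus;
	int nr_tx_queues;
	const int *cpu_to_queue;
};

/* Toy flow hash; it only marks where the expensive work would happen. */
uint32_t toy_flow_hash(uint32_t saddr, uint32_t daddr,
		       uint16_t sport, uint16_t dport)
{
	uint32_t h = saddr ^ (daddr * 2654435761u);

	h ^= ((uint32_t)sport << 16) | dport;
	h ^= h >> 15;
	return h;
}

/* Pick a tx queue for a packet sent from 'cpu'.
 * *used_hash is set to 1 when the hash fallback was needed, 0 otherwise.
 */
int pick_tx_queue(const struct xps_model *m, int cpu,
		  uint32_t saddr, uint32_t daddr,
		  uint16_t sport, uint16_t dport, int *used_hash)
{
	int queue;

	assert(m->nr_tx_queues > 0);
	assert(cpu >= 0 && cpu < m->nr_cpus);

	/* Cheap path: the CPU has an XPS entry. */
	queue = m->cpu_to_queue[cpu];
	if (queue != NO_QUEUE) {
		*used_hash = 0;
		return queue;
	}

	/* Expensive path: no XPS entry, so hash the flow on every packet. */
	*used_hash = 1;
	return (int)(toy_flow_hash(saddr, daddr, sport, dport) %
		     (uint32_t)m->nr_tx_queues);
}
```

With one tx queue and a table of { 0, NO_QUEUE }, a sender on CPU 0 takes
the cheap path, while a sender on CPU 1 (taskset 2 is CPU mask 0x2) pays for
the hash on every packet only to land on queue 0 anyway.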

Comments

Alexander H Duyck March 15, 2018, 4:43 p.m. UTC | #1
On Thu, Mar 15, 2018 at 8:08 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> [...]

I think we shouldn't be configuring XPS if number of Tx or Rx queues
is less than the number of CPUs, or ATR is not enabled.

Really the XPS bits are only supposed to be used with the ATR
functionality enabled. If we don't have enough queues for a 1:1
mapping we should probably not be programming XPS since ATR isn't
going to function right anyway.

- Alex
Paolo Abeni March 15, 2018, 5:05 p.m. UTC | #2
Hi, 

On Thu, 2018-03-15 at 09:43 -0700, Alexander Duyck wrote:
> On Thu, Mar 15, 2018 at 8:08 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > [...]
> 
> I think we shouldn't be configuring XPS if number of Tx or Rx queues
> is less than the number of CPUs, or ATR is not enabled.

Thank you for the feedback!

Please note that the ixgbe driver currently enables XPS regardless
of the above considerations.

> Really the XPS bits are only really supposed to be used with the ATR
> functionality enabled. If we don't have enough queues for a 1:1
> mapping we should probably not be programming XPS since ATR isn't
> going to function right anyway.

uhm... I don't know the details of ATR, but apparently it is for TCP
only, while the use-case I'm referring to is plain (no tunnel)
unconnected UDP traffic. Am I missing something?

thanks,

Paolo
Alexander H Duyck March 15, 2018, 5:22 p.m. UTC | #3
On Thu, Mar 15, 2018 at 10:05 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> Hi,
>
> On Thu, 2018-03-15 at 09:43 -0700, Alexander Duyck wrote:
>> On Thu, Mar 15, 2018 at 8:08 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>> > [...]
>>
>> I think we shouldn't be configuring XPS if number of Tx or Rx queues
>> is less than the number of CPUs, or ATR is not enabled.
>
> Thank you for the feedback!
>
> Please note the currently the ixgbe driver is enabling XPS regardless
> of the above considerations.
>
>> Really the XPS bits are only really supposed to be used with the ATR
>> functionality enabled. If we don't have enough queues for a 1:1
>> mapping we should probably not be programming XPS since ATR isn't
>> going to function right anyway.
>
> uhm... I don't know the details of ATR, but apparently it is for TCP
> only, while the use-case I'm referring to is plain (no tunnel)
> unconnected UDP traffic. Am I missing something?

No. Basically the ATR/XPS bits are an overreach. The original code had
the driver just using the incoming CPU to select a Tx queue via
ndo_select_queue. I pushed it out to XPS in order to try to get the
drivers to avoid using ndo_select_queue and provide the user with a
way to at least manually disable the Tx side of it.

For now I would say we should not have the driver configuring the XPS
map if it cannot assume a 1:1 mapping. As is, there are a number of
features where having this functionality enabled doesn't make sense.
In those cases we leave cpu as -1 in ixgbe_alloc_q_vector, and leave
the affinity mask as all 0s. It might make sense to just update the
code there in the case of ixgbe so that we don't update the XPS map or
the q_vector->affinity_mask if we cannot achieve a 1:1 mapping. As is,
I would say the code is probably in need of updates, since
ixgbe_alloc_q_vector doesn't handle the case where we might have a
non-linear CPU layout.

ATR is a feature that has been on my list of things to fix sometime in
the near future, but I haven't had the time as I have been pulled into
too many other efforts. Ideally we should be moving away from ATR and
instead looking at doing something like supporting ndo_rx_flow_steer.

Thanks.

- Alex
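
The rule suggested above (only set up a mapping when it can be 1:1, and do
not assume CPU ids are contiguous) could be sketched in userspace roughly
as follows. This is an illustration under stated assumptions, not driver
code: assign_one_to_one(), MAX_CPUS and CPU_NONE are hypothetical names,
and the online CPUs are modelled as a plain 64-bit mask rather than a
kernel cpumask. CPU_NONE plays the role of the cpu = -1 value mentioned for
ixgbe_alloc_q_vector.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64
#define CPU_NONE (-1)

/* Assign the i-th queue to the i-th *online* CPU, walking the mask so
 * that holes in the CPU numbering (a non-linear layout) are skipped.
 *
 * If there are fewer queues than online CPUs, no 1:1 mapping exists:
 * every queue is left as CPU_NONE and 0 is returned. Returns 1 on success.
 */
int assign_one_to_one(uint64_t online_mask, int nr_queues, int queue_cpu[])
{
	int nr_online = 0;
	int cpu, q;

	assert(nr_queues >= 0);

	/* Start from "no affinity" for every queue. */
	for (q = 0; q < nr_queues; q++)
		queue_cpu[q] = CPU_NONE;

	for (cpu = 0; cpu < MAX_CPUS; cpu++)
		if (online_mask & (UINT64_C(1) << cpu))
			nr_online++;

	/* Not enough queues for a 1:1 mapping: leave everything unset. */
	if (nr_online == 0 || nr_queues < nr_online)
		return 0;

	q = 0;
	for (cpu = 0; cpu < MAX_CPUS; cpu++)
		if (online_mask & (UINT64_C(1) << cpu))
			queue_cpu[q++] = cpu;

	return 1;
}
```

For example, with CPUs 0, 2 and 4 online (mask 0x15) and four queues, the
first three queues get CPUs 0, 2 and 4 and the last one stays unassigned;
with only two queues nothing is assigned at all.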

Patch

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index c1e3a0039ea5..04aaecce81d2 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -293,7 +293,6 @@  enum ixgbe_ring_state_t {
 	__IXGBE_RX_CSUM_UDP_ZERO_ERR,
 	__IXGBE_RX_FCOE,
 	__IXGBE_TX_FDIR_INIT_DONE,
-	__IXGBE_TX_XPS_INIT_DONE,
 	__IXGBE_TX_DETECT_HANG,
 	__IXGBE_HANG_CHECK_ARMED,
 	__IXGBE_TX_XDP_RING,
@@ -827,6 +826,7 @@  enum ixgbe_state_t {
 	__IXGBE_PTP_RUNNING,
 	__IXGBE_PTP_TX_IN_PROGRESS,
 	__IXGBE_RESET_REQUESTED,
+	__IXGBE_TX_XPS_INIT_DONE,
 };
 
 struct ixgbe_cb {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index c0e6ab42e0e1..da4e6416e8eb 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -3224,6 +3224,7 @@  static int ixgbe_set_channels(struct net_device *dev,
 
 #endif
 	/* use setup TC to update any traffic class queue mapping */
+	clear_bit(__IXGBE_TX_XPS_INIT_DONE, &adapter->state);
 	return ixgbe_setup_tc(dev, adapter->hw_tcs);
 }
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 85369423452d..5bd45fc737fa 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3566,16 +3566,6 @@  void ixgbe_configure_tx_ring(struct ixgbe_adapter *adapter,
 		ring->atr_sample_rate = 0;
 	}
 
-	/* initialize XPS */
-	if (!test_and_set_bit(__IXGBE_TX_XPS_INIT_DONE, &ring->state)) {
-		struct ixgbe_q_vector *q_vector = ring->q_vector;
-
-		if (q_vector)
-			netif_set_xps_queue(ring->netdev,
-					    &q_vector->affinity_mask,
-					    ring->queue_index);
-	}
-
 	clear_bit(__IXGBE_HANG_CHECK_ARMED, &ring->state);
 
 	/* reinitialize tx_buffer_info */
@@ -5626,6 +5616,9 @@  static void ixgbe_up_complete(struct ixgbe_adapter *adapter)
 	int err;
 	u32 ctrl_ext;
 
+	if (!test_and_set_bit(__IXGBE_TX_XPS_INIT_DONE, &adapter->state))
+		netif_set_xps(adapter->netdev);
+
 	ixgbe_get_hw_control(adapter);
 	ixgbe_setup_gpie(adapter);