mbox series

[jkirsher/next-queue,v2,0/7] Add support for L2 Fwd Offload w/o ndo_select_queue

Message ID 20180612151322.86792.97587.stgit@ahduyck-green-test.jf.intel.com
Headers show
Series Add support for L2 Fwd Offload w/o ndo_select_queue | expand

Message

Duyck, Alexander H June 12, 2018, 3:18 p.m. UTC
This patch series is meant to allow support for the L2 forward offload, aka
MACVLAN offload without the need for using ndo_select_queue.

The existing solution currently requires that we use ndo_select_queue in
the transmit path if we want to associate specific Tx queues with a given
MACVLAN interface. In order to get away from this we need to repurpose the
tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
a means of accessing the queues on the lower device. As a result we cannot
offload a device that is configured as multiqueue, however it doesn't
really make sense to configure a macvlan interfaced as being multiqueue
anyway since it doesn't really have a qdisc of its own in the first place.

I am submitting this as an RFC for the netdev mailing list, and officially
submitting it for testing to Jeff Kirsher's next-queue in order to validate
the ixgbe specific bits.

The big changes in this set are:
  Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
  Disable XPS for single queue devices
  Replace accel_priv with sb_dev in ndo_select_queue
  Add sb_dev parameter to fallback function for ndo_select_queue
  Consolidated ndo_select_queue functions that appeared to be duplicates

v2: Implement generic "select_queue" functions instead of "fallback" functions.
    Tweak last two patches to account for changes in dev_pick_tx_xxx functions.

---

Alexander Duyck (7):
      net-sysfs: Drop support for XPS and traffic_class on single queue device
      net: Add support for subordinate device traffic classes
      ixgbe: Add code to populate and use macvlan tc to Tx queue map
      net: Add support for subordinate traffic classes to netdev_pick_tx
      net: Add generic ndo_select_queue functions
      net: allow ndo_select_queue to pass netdev
      net: allow fallback function to pass netdev


 drivers/infiniband/hw/hfi1/vnic_main.c            |    2 
 drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c |    4 -
 drivers/net/bonding/bond_main.c                   |    3 
 drivers/net/ethernet/amazon/ena/ena_netdev.c      |    5 -
 drivers/net/ethernet/broadcom/bcmsysport.c        |    6 -
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |    6 +
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    3 
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |    5 -
 drivers/net/ethernet/hisilicon/hns/hns_enet.c     |    5 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |   62 ++++++--
 drivers/net/ethernet/lantiq_etop.c                |   10 -
 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |    7 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h      |    3 
 drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3 
 drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    5 -
 drivers/net/ethernet/renesas/ravb_main.c          |    3 
 drivers/net/ethernet/sun/ldmvsw.c                 |    3 
 drivers/net/ethernet/sun/sunvnet.c                |    3 
 drivers/net/ethernet/ti/netcp_core.c              |    9 -
 drivers/net/hyperv/netvsc_drv.c                   |    6 -
 drivers/net/macvlan.c                             |   10 -
 drivers/net/net_failover.c                        |    7 +
 drivers/net/team/team.c                           |    3 
 drivers/net/tun.c                                 |    3 
 drivers/net/wireless/marvell/mwifiex/main.c       |    3 
 drivers/net/xen-netback/interface.c               |    4 -
 drivers/net/xen-netfront.c                        |    3 
 drivers/staging/netlogic/xlr_net.c                |    9 -
 drivers/staging/rtl8188eu/os_dep/os_intfs.c       |    3 
 drivers/staging/rtl8723bs/os_dep/os_intfs.c       |    7 -
 include/linux/netdevice.h                         |   34 ++++-
 net/core/dev.c                                    |  156 ++++++++++++++++++---
 net/core/net-sysfs.c                              |   36 ++++-
 net/mac80211/iface.c                              |    4 -
 net/packet/af_packet.c                            |    7 +
 35 files changed, 312 insertions(+), 130 deletions(-)

--

Comments

Stephen Hemminger June 12, 2018, 5:50 p.m. UTC | #1
On Tue, 12 Jun 2018 11:18:25 -0400
Alexander Duyck <alexander.h.duyck@intel.com> wrote:

> This patch series is meant to allow support for the L2 forward offload, aka
> MACVLAN offload without the need for using ndo_select_queue.
> 
> The existing solution currently requires that we use ndo_select_queue in
> the transmit path if we want to associate specific Tx queues with a given
> MACVLAN interface. In order to get away from this we need to repurpose the
> tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
> a means of accessing the queues on the lower device. As a result we cannot
> offload a device that is configured as multiqueue, however it doesn't
> really make sense to configure a macvlan interfaced as being multiqueue
> anyway since it doesn't really have a qdisc of its own in the first place.
> 
> I am submitting this as an RFC for the netdev mailing list, and officially
> submitting it for testing to Jeff Kirsher's next-queue in order to validate
> the ixgbe specific bits.
> 
> The big changes in this set are:
>   Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
>   Disable XPS for single queue devices
>   Replace accel_priv with sb_dev in ndo_select_queue
>   Add sb_dev parameter to fallback function for ndo_select_queue
>   Consolidated ndo_select_queue functions that appeared to be duplicates
> 
> v2: Implement generic "select_queue" functions instead of "fallback" functions.
>     Tweak last two patches to account for changes in dev_pick_tx_xxx functions.
> 
> ---
> 
> Alexander Duyck (7):
>       net-sysfs: Drop support for XPS and traffic_class on single queue device
>       net: Add support for subordinate device traffic classes
>       ixgbe: Add code to populate and use macvlan tc to Tx queue map
>       net: Add support for subordinate traffic classes to netdev_pick_tx
>       net: Add generic ndo_select_queue functions
>       net: allow ndo_select_queue to pass netdev
>       net: allow fallback function to pass netdev
> 
> 
>  drivers/infiniband/hw/hfi1/vnic_main.c            |    2 
>  drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c |    4 -
>  drivers/net/bonding/bond_main.c                   |    3 
>  drivers/net/ethernet/amazon/ena/ena_netdev.c      |    5 -
>  drivers/net/ethernet/broadcom/bcmsysport.c        |    6 -
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |    6 +
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    3 
>  drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |    5 -
>  drivers/net/ethernet/hisilicon/hns/hns_enet.c     |    5 -
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |   62 ++++++--
>  drivers/net/ethernet/lantiq_etop.c                |   10 -
>  drivers/net/ethernet/mellanox/mlx4/en_tx.c        |    7 +
>  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h      |    3 
>  drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3 
>  drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    5 -
>  drivers/net/ethernet/renesas/ravb_main.c          |    3 
>  drivers/net/ethernet/sun/ldmvsw.c                 |    3 
>  drivers/net/ethernet/sun/sunvnet.c                |    3 
>  drivers/net/ethernet/ti/netcp_core.c              |    9 -
>  drivers/net/hyperv/netvsc_drv.c                   |    6 -
>  drivers/net/macvlan.c                             |   10 -
>  drivers/net/net_failover.c                        |    7 +
>  drivers/net/team/team.c                           |    3 
>  drivers/net/tun.c                                 |    3 
>  drivers/net/wireless/marvell/mwifiex/main.c       |    3 
>  drivers/net/xen-netback/interface.c               |    4 -
>  drivers/net/xen-netfront.c                        |    3 
>  drivers/staging/netlogic/xlr_net.c                |    9 -
>  drivers/staging/rtl8188eu/os_dep/os_intfs.c       |    3 
>  drivers/staging/rtl8723bs/os_dep/os_intfs.c       |    7 -
>  include/linux/netdevice.h                         |   34 ++++-
>  net/core/dev.c                                    |  156 ++++++++++++++++++---
>  net/core/net-sysfs.c                              |   36 ++++-
>  net/mac80211/iface.c                              |    4 -
>  net/packet/af_packet.c                            |    7 +
>  35 files changed, 312 insertions(+), 130 deletions(-)
> 
> --

This makes sense. I thought you were hoping to get rid of select queue in future?
Florian Fainelli June 12, 2018, 5:56 p.m. UTC | #2
On 06/12/2018 08:18 AM, Alexander Duyck wrote:
> This patch series is meant to allow support for the L2 forward offload, aka
> MACVLAN offload without the need for using ndo_select_queue.
> 
> The existing solution currently requires that we use ndo_select_queue in
> the transmit path if we want to associate specific Tx queues with a given
> MACVLAN interface. In order to get away from this we need to repurpose the
> tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
> a means of accessing the queues on the lower device. As a result we cannot
> offload a device that is configured as multiqueue, however it doesn't
> really make sense to configure a macvlan interfaced as being multiqueue
> anyway since it doesn't really have a qdisc of its own in the first place.

Interesting, so at some point I had came up with the following for
mapping queues between the DSA slave network devices and the DSA master
network device (doing the actual transmission). The DSA master network
device driver is just a normal network device driver.

The set-up is as follows: 4 external Ethernet switch ports, each with 8
egress queues and the DSA master (bcmsysport.c), aka CPU Ethernet
controller has 32 output queues, so you can do a 1:1 mapping of those,
that's actually what we want. A subsequent hardware generation only
provides 16 output queues, so we can still do 2:1 mapping.

The implementation is done like this:

- DSA slave network devices are always created after the DSA master
network device so we can leverage that

- a specific notifier is running from the DSA core and tells the DSA
master about the switch position in the tree (position 0 = directly
attached), and the switch port number and a pointer to the slave network
device

- we establish the mapping between the queues within the bcmsysport
driver as a simple array

- when transmitting, DSA slave network devices set a specific queue/port
number within the 16-bits that skb->queue_mapping permits

- this gets re-used by bcmsysport.c to extract the correct queue number
during ndo_select_queue such that the appropriate queue number gets used
and congestion works end-to-end.

The reason why we do that is because there is some out of band HW that
monitors the queue depth of the switch port's egress queue and
back-pressure the Ethernet controller directly when trying to transmit
to a congested queue.

I had initially considered establishing the mapping using tc and some
custom "bind" argument of some kind, but ended-up doing things the way
they are which are more automatic though they leave less configuration
to an user. This has a number of caveats though:

- this is made generic within the context of DSA in that nothing is
switch driver or Ethernet MAC driver specific and the notifier
represents the contract between these two seemingly independent subsystems

- the queue indicated between DSA slave and master is unfortunately
switch driver/controller specific (BRCM_TAG_SET_PORT_QUEUE,
BRCM_TAG_GET_PORT, BRCM_TAG_GET_QUEUE)

What I like about your patchset is the mapping establishment, but as you
will read from my reply in patch 2, I think the (upper) 1:N (lower)
mapping might not work for my specific use case.

Anyhow, not intended to be blocking this, as it seems to be going in the
right direction anyway.

> 
> I am submitting this as an RFC for the netdev mailing list, and officially
> submitting it for testing to Jeff Kirsher's next-queue in order to validate
> the ixgbe specific bits.
> 
> The big changes in this set are:
>   Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
>   Disable XPS for single queue devices
>   Replace accel_priv with sb_dev in ndo_select_queue
>   Add sb_dev parameter to fallback function for ndo_select_queue
>   Consolidated ndo_select_queue functions that appeared to be duplicates

Interesting, turns out I had a possibly similar use case with DSA with
the slave network devices need to select an outgoing queue number for

> 
> v2: Implement generic "select_queue" functions instead of "fallback" functions.
>     Tweak last two patches to account for changes in dev_pick_tx_xxx functions.
> 
> ---
> 
> Alexander Duyck (7):
>       net-sysfs: Drop support for XPS and traffic_class on single queue device
>       net: Add support for subordinate device traffic classes
>       ixgbe: Add code to populate and use macvlan tc to Tx queue map
>       net: Add support for subordinate traffic classes to netdev_pick_tx
>       net: Add generic ndo_select_queue functions
>       net: allow ndo_select_queue to pass netdev
>       net: allow fallback function to pass netdev
> 
> 
>  drivers/infiniband/hw/hfi1/vnic_main.c            |    2 
>  drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c |    4 -
>  drivers/net/bonding/bond_main.c                   |    3 
>  drivers/net/ethernet/amazon/ena/ena_netdev.c      |    5 -
>  drivers/net/ethernet/broadcom/bcmsysport.c        |    6 -
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |    6 +
>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    3 
>  drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |    5 -
>  drivers/net/ethernet/hisilicon/hns/hns_enet.c     |    5 -
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |   62 ++++++--
>  drivers/net/ethernet/lantiq_etop.c                |   10 -
>  drivers/net/ethernet/mellanox/mlx4/en_tx.c        |    7 +
>  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h      |    3 
>  drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3 
>  drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    5 -
>  drivers/net/ethernet/renesas/ravb_main.c          |    3 
>  drivers/net/ethernet/sun/ldmvsw.c                 |    3 
>  drivers/net/ethernet/sun/sunvnet.c                |    3 
>  drivers/net/ethernet/ti/netcp_core.c              |    9 -
>  drivers/net/hyperv/netvsc_drv.c                   |    6 -
>  drivers/net/macvlan.c                             |   10 -
>  drivers/net/net_failover.c                        |    7 +
>  drivers/net/team/team.c                           |    3 
>  drivers/net/tun.c                                 |    3 
>  drivers/net/wireless/marvell/mwifiex/main.c       |    3 
>  drivers/net/xen-netback/interface.c               |    4 -
>  drivers/net/xen-netfront.c                        |    3 
>  drivers/staging/netlogic/xlr_net.c                |    9 -
>  drivers/staging/rtl8188eu/os_dep/os_intfs.c       |    3 
>  drivers/staging/rtl8723bs/os_dep/os_intfs.c       |    7 -
>  include/linux/netdevice.h                         |   34 ++++-
>  net/core/dev.c                                    |  156 ++++++++++++++++++---
>  net/core/net-sysfs.c                              |   36 ++++-
>  net/mac80211/iface.c                              |    4 -
>  net/packet/af_packet.c                            |    7 +
>  35 files changed, 312 insertions(+), 130 deletions(-)
> 
> --
>
Alexander H Duyck June 12, 2018, 10:33 p.m. UTC | #3
On Tue, Jun 12, 2018 at 10:56 AM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 06/12/2018 08:18 AM, Alexander Duyck wrote:
>> This patch series is meant to allow support for the L2 forward offload, aka
>> MACVLAN offload without the need for using ndo_select_queue.
>>
>> The existing solution currently requires that we use ndo_select_queue in
>> the transmit path if we want to associate specific Tx queues with a given
>> MACVLAN interface. In order to get away from this we need to repurpose the
>> tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
>> a means of accessing the queues on the lower device. As a result we cannot
>> offload a device that is configured as multiqueue, however it doesn't
>> really make sense to configure a macvlan interfaced as being multiqueue
>> anyway since it doesn't really have a qdisc of its own in the first place.
>
> Interesting, so at some point I had came up with the following for
> mapping queues between the DSA slave network devices and the DSA master
> network device (doing the actual transmission). The DSA master network
> device driver is just a normal network device driver.
>
> The set-up is as follows: 4 external Ethernet switch ports, each with 8
> egress queues and the DSA master (bcmsysport.c), aka CPU Ethernet
> controller has 32 output queues, so you can do a 1:1 mapping of those,
> that's actually what we want. A subsequent hardware generation only
> provides 16 output queues, so we can still do 2:1 mapping.
>
> The implementation is done like this:
>
> - DSA slave network devices are always created after the DSA master
> network device so we can leverage that
>
> - a specific notifier is running from the DSA core and tells the DSA
> master about the switch position in the tree (position 0 = directly
> attached), and the switch port number and a pointer to the slave network
> device
>
> - we establish the mapping between the queues within the bcmsysport
> driver as a simple array
>
> - when transmitting, DSA slave network devices set a specific queue/port
> number within the 16-bits that skb->queue_mapping permits
>
> - this gets re-used by bcmsysport.c to extract the correct queue number
> during ndo_select_queue such that the appropriate queue number gets used
> and congestion works end-to-end.
>
> The reason why we do that is because there is some out of band HW that
> monitors the queue depth of the switch port's egress queue and
> back-pressure the Ethernet controller directly when trying to transmit
> to a congested queue.
>
> I had initially considered establishing the mapping using tc and some
> custom "bind" argument of some kind, but ended-up doing things the way
> they are which are more automatic though they leave less configuration
> to an user. This has a number of caveats though:
>
> - this is made generic within the context of DSA in that nothing is
> switch driver or Ethernet MAC driver specific and the notifier
> represents the contract between these two seemingly independent subsystems
>
> - the queue indicated between DSA slave and master is unfortunately
> switch driver/controller specific (BRCM_TAG_SET_PORT_QUEUE,
> BRCM_TAG_GET_PORT, BRCM_TAG_GET_QUEUE)
>
> What I like about your patchset is the mapping establishment, but as you
> will read from my reply in patch 2, I think the (upper) 1:N (lower)
> mapping might not work for my specific use case.
>
> Anyhow, not intended to be blocking this, as it seems to be going in the
> right direction anyway.

I think I am still not getting why the 1:N would be an issue. At least
the way I have the code implemented here the lower queues all have a
qdisc associated with them, just not the upper device. Generally I am
using the macvlan as a bump in the wire to take care of filtering for
the bridging mode. If I have to hairpin packets and send them back up
on on of the the upper interfaces I want to do that in software rather
than hardware so I try to take care of it there instead of routing it
through the hardware.

>>
>> I am submitting this as an RFC for the netdev mailing list, and officially
>> submitting it for testing to Jeff Kirsher's next-queue in order to validate
>> the ixgbe specific bits.
>>
>> The big changes in this set are:
>>   Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
>>   Disable XPS for single queue devices
>>   Replace accel_priv with sb_dev in ndo_select_queue
>>   Add sb_dev parameter to fallback function for ndo_select_queue
>>   Consolidated ndo_select_queue functions that appeared to be duplicates
>
> Interesting, turns out I had a possibly similar use case with DSA with
> the slave network devices need to select an outgoing queue number for

I was kind of assuming this could be applied to a number of possible
use cases. As it was I was wondering if maybe we should look at adding
this as an option for just a standard VLAN as we could perform the
same kind of filtering and just deliver the packet directly to the
VLAN interface instead of requiring the extra trip through the stack
after the tag has been stripped.
Alexander H Duyck June 12, 2018, 10:47 p.m. UTC | #4
On Tue, Jun 12, 2018 at 10:50 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Tue, 12 Jun 2018 11:18:25 -0400
> Alexander Duyck <alexander.h.duyck@intel.com> wrote:
>
>> This patch series is meant to allow support for the L2 forward offload, aka
>> MACVLAN offload without the need for using ndo_select_queue.
>>
>> The existing solution currently requires that we use ndo_select_queue in
>> the transmit path if we want to associate specific Tx queues with a given
>> MACVLAN interface. In order to get away from this we need to repurpose the
>> tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
>> a means of accessing the queues on the lower device. As a result we cannot
>> offload a device that is configured as multiqueue, however it doesn't
>> really make sense to configure a macvlan interfaced as being multiqueue
>> anyway since it doesn't really have a qdisc of its own in the first place.
>>
>> I am submitting this as an RFC for the netdev mailing list, and officially
>> submitting it for testing to Jeff Kirsher's next-queue in order to validate
>> the ixgbe specific bits.
>>
>> The big changes in this set are:
>>   Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
>>   Disable XPS for single queue devices
>>   Replace accel_priv with sb_dev in ndo_select_queue
>>   Add sb_dev parameter to fallback function for ndo_select_queue
>>   Consolidated ndo_select_queue functions that appeared to be duplicates
>>
>> v2: Implement generic "select_queue" functions instead of "fallback" functions.
>>     Tweak last two patches to account for changes in dev_pick_tx_xxx functions.
>>
>> ---
>>
>> Alexander Duyck (7):
>>       net-sysfs: Drop support for XPS and traffic_class on single queue device
>>       net: Add support for subordinate device traffic classes
>>       ixgbe: Add code to populate and use macvlan tc to Tx queue map
>>       net: Add support for subordinate traffic classes to netdev_pick_tx
>>       net: Add generic ndo_select_queue functions
>>       net: allow ndo_select_queue to pass netdev
>>       net: allow fallback function to pass netdev
>>
>>
>>  drivers/infiniband/hw/hfi1/vnic_main.c            |    2
>>  drivers/infiniband/ulp/opa_vnic/opa_vnic_netdev.c |    4 -
>>  drivers/net/bonding/bond_main.c                   |    3
>>  drivers/net/ethernet/amazon/ena/ena_netdev.c      |    5 -
>>  drivers/net/ethernet/broadcom/bcmsysport.c        |    6 -
>>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |    6 +
>>  drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h   |    3
>>  drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |    5 -
>>  drivers/net/ethernet/hisilicon/hns/hns_enet.c     |    5 -
>>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |   62 ++++++--
>>  drivers/net/ethernet/lantiq_etop.c                |   10 -
>>  drivers/net/ethernet/mellanox/mlx4/en_tx.c        |    7 +
>>  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h      |    3
>>  drivers/net/ethernet/mellanox/mlx5/core/en.h      |    3
>>  drivers/net/ethernet/mellanox/mlx5/core/en_tx.c   |    5 -
>>  drivers/net/ethernet/renesas/ravb_main.c          |    3
>>  drivers/net/ethernet/sun/ldmvsw.c                 |    3
>>  drivers/net/ethernet/sun/sunvnet.c                |    3
>>  drivers/net/ethernet/ti/netcp_core.c              |    9 -
>>  drivers/net/hyperv/netvsc_drv.c                   |    6 -
>>  drivers/net/macvlan.c                             |   10 -
>>  drivers/net/net_failover.c                        |    7 +
>>  drivers/net/team/team.c                           |    3
>>  drivers/net/tun.c                                 |    3
>>  drivers/net/wireless/marvell/mwifiex/main.c       |    3
>>  drivers/net/xen-netback/interface.c               |    4 -
>>  drivers/net/xen-netfront.c                        |    3
>>  drivers/staging/netlogic/xlr_net.c                |    9 -
>>  drivers/staging/rtl8188eu/os_dep/os_intfs.c       |    3
>>  drivers/staging/rtl8723bs/os_dep/os_intfs.c       |    7 -
>>  include/linux/netdevice.h                         |   34 ++++-
>>  net/core/dev.c                                    |  156 ++++++++++++++++++---
>>  net/core/net-sysfs.c                              |   36 ++++-
>>  net/mac80211/iface.c                              |    4 -
>>  net/packet/af_packet.c                            |    7 +
>>  35 files changed, 312 insertions(+), 130 deletions(-)
>>
>> --
>
> This makes sense. I thought you were hoping to get rid of select queue in future?

That would be nice, however there are still a bunch of corner cases
that are not handled that have been dumped into select queue. For
example in the case of ixgbe the issue is FCoE. There are a number of
other places that are using it as well as I seem to recall netvsc and
bonding both use it to store off the original Rx->Tx queue mapping
when passing through the interface.

For now I figure we can take this one hill at a time and I am just
making it so we don't have to use ndo_select_queue in order to make
vmdq work for macvlan offload.

- Alex
Kirsher, Jeffrey T July 3, 2018, 6:42 p.m. UTC | #5
On Tue, Jun 12, 2018 at 8:18 AM, Alexander Duyck
<alexander.h.duyck@intel.com> wrote:
> This patch series is meant to allow support for the L2 forward offload, aka
> MACVLAN offload without the need for using ndo_select_queue.
>
> The existing solution currently requires that we use ndo_select_queue in
> the transmit path if we want to associate specific Tx queues with a given
> MACVLAN interface. In order to get away from this we need to repurpose the
> tc_to_txq array and XPS pointer for the MACVLAN interface and use those as
> a means of accessing the queues on the lower device. As a result we cannot
> offload a device that is configured as multiqueue, however it doesn't
> really make sense to configure a macvlan interfaced as being multiqueue
> anyway since it doesn't really have a qdisc of its own in the first place.
>
> I am submitting this as an RFC for the netdev mailing list, and officially
> submitting it for testing to Jeff Kirsher's next-queue in order to validate
> the ixgbe specific bits.
>
> The big changes in this set are:
>   Allow lower device to update tc_to_txq and XPS map of offloaded MACVLAN
>   Disable XPS for single queue devices
>   Replace accel_priv with sb_dev in ndo_select_queue
>   Add sb_dev parameter to fallback function for ndo_select_queue
>   Consolidated ndo_select_queue functions that appeared to be duplicates
>
> v2: Implement generic "select_queue" functions instead of "fallback" functions.
>     Tweak last two patches to account for changes in dev_pick_tx_xxx functions.
>
> ---
>
> Alexander Duyck (7):
>       net-sysfs: Drop support for XPS and traffic_class on single queue device
>       net: Add support for subordinate device traffic classes
>       ixgbe: Add code to populate and use macvlan tc to Tx queue map
>       net: Add support for subordinate traffic classes to netdev_pick_tx
>       net: Add generic ndo_select_queue functions
>       net: allow ndo_select_queue to pass netdev
>       net: allow fallback function to pass netdev
>

Alex, there were recent changes to Dave's net-next which caused a
conflict with one or more of your patches.  So I have removed this
series from my tree for now, and will work with you on updating the
series to work with the latest net-next tree.