diff mbox series

[net-next,2/4] bnxt_en: Use NETIF_F_GRO_HW.

Message ID 1512385967-32339-3-git-send-email-michael.chan@broadcom.com
State Changes Requested, archived
Delegated to: David Miller
Headers show
Series Introduce NETIF_F_GRO_HW

Commit Message

Michael Chan Dec. 4, 2017, 11:12 a.m. UTC
Advertise NETIF_F_GRO_HW if hardware GRO is supported.  Turn on or off
hardware GRO based on NETIF_F_GRO_HW.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Or Gerlitz Dec. 4, 2017, 4:35 p.m. UTC | #1
On Mon, Dec 4, 2017 at 1:12 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> Advertise NETIF_F_GRO_HW if hardware GRO is supported.  Turn on or off
> hardware GRO based on NETIF_F_GRO_HW.

So the patch only deals with the advertisement biz.. and I even failed
to find a place where you arm something in the FW... so this just
works OOB? All your buffer && skb logic && FW setup are fully ready
for that? Also, is there nothing you need to set differently on the
SKB for HW GRO completions? Can you explain this a little further, or
is it all in your netdev slides...

Or.
Michael Chan Dec. 4, 2017, 6:11 p.m. UTC | #2
On Mon, Dec 4, 2017 at 8:35 AM, Or Gerlitz <gerlitz.or@gmail.com> wrote:
> On Mon, Dec 4, 2017 at 1:12 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>> Advertise NETIF_F_GRO_HW if hardware GRO is supported.  Turn on or off
>> hardware GRO based on NETIF_F_GRO_HW.
>
> So the patch only deals with the advertisement biz.. and I even failed
> to find a place where you arm something in the FW... so this just
> works OOB? All your buffer && skb logic && FW setup are fully ready
> for that? Also, is there nothing you need to set differently on the
> SKB for HW GRO completions? Can you explain this a little further, or
> is it all in your netdev slides...
>

All the logic is already in these 3 drivers in the tree.  You can see
the additional logic in any of these drivers.  It's just that these
drivers have been using NETIF_F_GRO to turn on this mode in
hardware/firmware.  So these patches are just to switch over to use
the new flag to turn on this mode so that it can be independently
turned on or off.
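The switch Michael describes, keying the hardware aggregation mode off NETIF_F_GRO_HW instead of NETIF_F_GRO, can be sketched as a minimal, self-contained C fragment. The bit positions and the `F_*`/`FLAG_GRO` names here are assumptions made for the sake of a standalone example, not the kernel's real values:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in bits: the real definitions live in
 * include/linux/netdev_features.h and bnxt.h; these names and values
 * are assumptions so the sketch is self-contained. */
#define F_GRO     (1u << 0)  /* software GRO (NETIF_F_GRO) */
#define F_GRO_HW  (1u << 1)  /* hardware GRO (NETIF_F_GRO_HW) */
#define FLAG_GRO  (1u << 0)  /* driver-private "HW aggregation on" bit */

/* Before this series: the hardware aggregation mode keyed off the
 * generic NETIF_F_GRO feature bit. */
static uint32_t tpa_flags_old(uint32_t features)
{
	return (features & F_GRO) ? FLAG_GRO : 0;
}

/* After this series: it keys off NETIF_F_GRO_HW, so software GRO and
 * the hardware offload can be toggled independently. */
static uint32_t tpa_flags_new(uint32_t features)
{
	return (features & F_GRO_HW) ? FLAG_GRO : 0;
}
```

With the new mapping, `ethtool -K eth0 gro on` alone no longer arms the hardware path; only the rx-gro-hw feature bit does.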
Or Gerlitz Dec. 4, 2017, 9:06 p.m. UTC | #3
On Mon, Dec 4, 2017 at 8:11 PM, Michael Chan <michael.chan@broadcom.com> wrote:

> All the logic is already in these 3 drivers in the tree.  You can see
> the additional logic in any of these drivers.  It's just that these
> drivers have been using NETIF_F_GRO to turn on this mode in
> hardware/firmware.

What happens for TCP encapsulated in some UDP tunnel? Does
the HW know how to GRO that?
Eric Dumazet Dec. 4, 2017, 10 p.m. UTC | #4
On Mon, 2017-12-04 at 23:06 +0200, Or Gerlitz wrote:
> On Mon, Dec 4, 2017 at 8:11 PM, Michael Chan <michael.chan@broadcom.com> wrote:
> 
> > All the logic is already in these 3 drivers in the tree.  You can
> > see
> > the additional logic in any of these drivers.  It's just that these
> > drivers have been using NETIF_F_GRO to turn on this mode in
> > hardware/firmware.
> 
> What happens for TCP encapsulated in some UDP tunnel? Does
> the HW know how to GRO that?

Yes it does, at least for common encaps (GRE, SIT, IPIP)
Michael Chan Dec. 5, 2017, 12:07 a.m. UTC | #5
On Mon, Dec 4, 2017 at 2:00 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Mon, 2017-12-04 at 23:06 +0200, Or Gerlitz wrote:
>> On Mon, Dec 4, 2017 at 8:11 PM, Michael Chan <michael.chan@broadcom.com> wrote:
>>
>> > All the logic is already in these 3 drivers in the tree.  You can
>> > see
>> > the additional logic in any of these drivers.  It's just that these
>> > drivers have been using NETIF_F_GRO to turn on this mode in
>> > hardware/firmware.
>>
>> What happens for TCP encapsulated in some UDP tunnel? Does
>> the HW know how to GRO that?
>
> Yes it does, at least for common encaps (GRE, SIT, IPIP)
>

Correct, bnxt_en supports all common encaps.  Basically, if we can do
TSO on the packet, we can do the reverse GRO_HW on the same packet
stream.

As already pointed out, GRO_HW is a subset of GRO.  Packets that
cannot be aggregated in hardware (due to hardware resource limitations
or protocol types that it doesn't handle) can just be passed to the
stack for GRO aggregation.
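The "GRO_HW is a subset of GRO" fallback described here can be pictured as a tiny dispatch sketch: flows the NIC can coalesce are merged in hardware, and everything else is handed to software GRO unchanged. `can_aggregate_in_hw()` is a hypothetical predicate standing in for the NIC's real capability checks, not an API from the driver:

```c
#include <assert.h>

enum rx_path { RX_MERGED_IN_HW, RX_PASSED_TO_SW_GRO };

/* Hypothetical capability check: hardware merges only protocols it can
 * parse (e.g. plain or tunneled TCP), and only while it still has free
 * aggregation contexts. */
static int can_aggregate_in_hw(int is_supported_tcp_flow, int hw_ctx_free)
{
	return is_supported_tcp_flow && hw_ctx_free;
}

static enum rx_path classify_rx(int is_supported_tcp_flow, int hw_ctx_free)
{
	if (can_aggregate_in_hw(is_supported_tcp_flow, hw_ctx_free))
		return RX_MERGED_IN_HW;
	return RX_PASSED_TO_SW_GRO;  /* stack GRO still sees the packet */
}
```

The point of the subset relationship is that there is no correctness cliff: a packet the hardware declines to merge is not dropped from aggregation, it simply takes the software GRO path.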
Marcelo Ricardo Leitner Dec. 5, 2017, 6:10 p.m. UTC | #6
On Mon, Dec 04, 2017 at 04:07:15PM -0800, Michael Chan wrote:
> As already pointed out, GRO_HW is a subset of GRO.  Packets that
> cannot be aggregated in hardware (due to hardware resource limitations
> or protocol types that it doesn't handle) can just be passed to the
> stack for GRO aggregation.

How would the parameters/limits work in this case? I mean, currently
we have the default weight of 64 packets per napi poll cycle, the
budget of 300 per cycle and also the time constraint,
net.core.netdev_budget_usecs.

With GRO_HW, this 64 limit may be exceeded. I'm looking at qede code
and it works by counting each completion as 1 rcv_pkts
(qede_fp.c:1318). So if it now gets 64 packets, that's up to approx.
64*MTU, GRO'ed or not. But with GRO_HW, it seems it may be much more
than that, which may not be fair to other interfaces in the system.
Drivers supporting GRO_HW probably should account for this.

And how can one control how much time a packet may spend on NIC queue
waiting to be GRO'ed? Does it use the coalescing parameters for that?

  Marcelo
Michael Chan Dec. 6, 2017, 9:04 p.m. UTC | #7
On Tue, Dec 5, 2017 at 10:10 AM, Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
> On Mon, Dec 04, 2017 at 04:07:15PM -0800, Michael Chan wrote:
>> As already pointed out, GRO_HW is a subset of GRO.  Packets that
>> cannot be aggregated in hardware (due to hardware resource limitations
>> or protocol types that it doesn't handle) can just be passed to the
>> stack for GRO aggregation.
>
> How would the parameters/limits work in this case? I mean, currently
> we have the default weight of 64 packets per napi poll cycle, the
> budget of 300 per cycle and also the time constraint,
> net.core.netdev_budget_usecs.

Good point.  Currently, it is no different than LRO.  Each aggregated
packet is counted as 1.  With LRO, you don't necessarily know how many
packets were merged.  With GRO_HW, we know, and it's possible to count
the original packets towards the NAPI budget.

>
> With GRO_HW, this 64 limit may be exceeded. I'm looking at qede code
> and it works by counting each completion as 1 rcv_pkts
> (qede_fp.c:1318). So if it now gets 64 packets, that's up to approx.
> 64*MTU, GRO'ed or not. But with GRO_HW, it seems it may be much more
> than that, which may not be fair to other interfaces in the system.
> Drivers supporting GRO_HW probably should account for this.

Right.  We can make this adjustment for GRO_HW in a future patchset.

>
> And how can one control how much time a packet may spend on NIC queue
> waiting to be GRO'ed? Does it use the coalescing parameters for that?
>

The GRO_HW timer is currently not exposed.  It's different from
interrupt coalescing.  It's possible to make this a tunable parameter
in the future.
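The budget-accounting adjustment floated in this exchange could look roughly like the following: charge the NAPI budget with the number of original packets the hardware merged into each completion (which GRO_HW, unlike LRO, can report), instead of charging 1 per completion. This is a self-contained sketch with illustrative names, not the driver's code:

```c
#include <assert.h>

/* budget: remaining NAPI budget in original (pre-merge) packets.
 * segs_per_cmpl[i]: how many wire packets the hardware merged into
 * completion i.  Returns the budget left, clamped at zero. */
static int napi_budget_left(int budget, const int *segs_per_cmpl, int n)
{
	for (int i = 0; i < n && budget > 0; i++)
		budget -= segs_per_cmpl[i];  /* each merged packet counts */
	return budget > 0 ? budget : 0;
}
```

Under this accounting, three completions of 16, 16 and 40 merged segments exhaust a 64-packet budget, whereas per-completion counting would have charged only 3 of 64.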

Patch

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 33c49ad..5d15cdb 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -2751,7 +2751,7 @@  void bnxt_set_tpa_flags(struct bnxt *bp)
 		return;
 	if (bp->dev->features & NETIF_F_LRO)
 		bp->flags |= BNXT_FLAG_LRO;
-	if (bp->dev->features & NETIF_F_GRO)
+	if (bp->dev->features & NETIF_F_GRO_HW)
 		bp->flags |= BNXT_FLAG_GRO;
 }
 
@@ -6817,7 +6817,7 @@  static int bnxt_set_features(struct net_device *dev, netdev_features_t features)
 	bool update_tpa = false;
 
 	flags &= ~BNXT_FLAG_ALL_CONFIG_FEATS;
-	if ((features & NETIF_F_GRO) && !BNXT_CHIP_TYPE_NITRO_A0(bp))
+	if (features & NETIF_F_GRO_HW)
 		flags |= BNXT_FLAG_GRO;
 	if (features & NETIF_F_LRO)
 		flags |= BNXT_FLAG_LRO;
@@ -8093,7 +8093,7 @@  static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 			   NETIF_F_RXCSUM | NETIF_F_GRO;
 
 	if (!BNXT_CHIP_TYPE_NITRO_A0(bp))
-		dev->hw_features |= NETIF_F_LRO;
+		dev->hw_features |= NETIF_F_LRO | NETIF_F_GRO_HW;
 
 	dev->hw_enc_features =
 			NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_SG |
@@ -8107,6 +8107,8 @@  static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	dev->hw_features |= NETIF_F_HW_VLAN_CTAG_RX | NETIF_F_HW_VLAN_CTAG_TX |
 			    NETIF_F_HW_VLAN_STAG_RX | NETIF_F_HW_VLAN_STAG_TX;
 	dev->features |= dev->hw_features | NETIF_F_HIGHDMA;
+	if (dev->features & NETIF_F_GRO_HW)
+		dev->features &= ~NETIF_F_LRO;
 	dev->priv_flags |= IFF_UNICAST_FLT;
 
 #ifdef CONFIG_BNXT_SRIOV