
[v4,iwl-net] i40e: Prevent setting MTU if greater than MFS

Message ID 20240313090719.33627-2-e.velu@criteo.com
State Changes Requested
Delegated to: Anthony Nguyen

Commit Message

Erwan Velu March 13, 2024, 9:07 a.m. UTC
Commit 6871a7de705 ("[intelxl] Use admin queue to set port MAC address
and maximum frame size") from the iPXE project set the MFS to 0x600 = 1536.
See https://github.com/ipxe/ipxe/commit/6871a7de705

At boot time the i40e driver complains about it with
the following message but continues.

	MFS for port 1 has been set below the default: 600

If the MTU size is increased, the driver accepts it, but large packets will
not be processed by the firmware, generating tx_errors. The issue is pretty
silent for users, e.g. doing TCP in such a context generates lots of
retransmissions until the proper window size (below 1500) is used.

To fix this case, it would have been ideal to increase the MFS
via i40e_aqc_opc_set_mac_config; an upcoming patch will take care of it.

At least, this commit prevents setting up an MTU greater than the current MFS.
It will avoid being in the position of having an MTU set to 9000 on the
netdev with a firmware refusing packets larger than 1536.

A typical trace looks like:
[  377.548696] i40e 0000:5d:00.0 eno5: Error changing mtu to 9000, Max is 1500. MFS is too small.

Signed-off-by: Erwan Velu <e.velu@criteo.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
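
The relationship between MFS and the largest acceptable MTU can be illustrated with a small user-space sketch. It is not driver code: the `sketch_` names are hypothetical, and the 26-byte pad is an assumption standing in for I40E_PACKET_HDR_PAD (Ethernet header, FCS and two VLAN tags). That value is consistent with the test log later in this thread, where the default MFS of 9728 yields a maximum MTU of 9702.

```c
#include <errno.h>

/*
 * Assumption: stands in for the driver's I40E_PACKET_HDR_PAD,
 * i.e. Ethernet header (14) + FCS (4) + two VLAN tags (2 * 4).
 */
#define SKETCH_PACKET_HDR_PAD 26

/* Largest MTU whose frame still fits within an MFS of mfs bytes. */
static int sketch_mfs_to_max_mtu(int mfs)
{
	return mfs - SKETCH_PACKET_HDR_PAD;
}

/*
 * Hypothetical stand-in for the check this patch adds to
 * i40e_change_mtu(): refuse an MTU the firmware would not carry.
 */
static int sketch_check_mtu_against_mfs(int new_mtu, int mfs)
{
	if (new_mtu > sketch_mfs_to_max_mtu(mfs))
		return -EINVAL;
	return 0;
}
```

With an MFS of 0x600 left behind by iPXE, a request for MTU 9000 is rejected by this check, while the same request passes with the default MFS of 9728.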

Comments

Brett Creeley March 14, 2024, 4:10 p.m. UTC | #1
On 3/13/2024 2:07 AM, Erwan Velu wrote:
> Commit 6871a7de705 ("[intelxl] Use admin queue to set port MAC address
> and maximum frame size") from iPXE project set the MFS to 0x600 = 1536.
> See https://github.com/ipxe/ipxe/commit/6871a7de705
> 
> At boot time the i40e driver complains about it with
> the following message but continues.
> 
>          MFS for port 1 has been set below the default: 600
> 
> If the MTU size is increased, the driver accepts it but large packets will
> not be processed by the firmware generating tx_errors. The issue is pretty
> silent for users. i.e doing TCP in such context will generates lots of
> retransmissions until the proper window size (below 1500) will be used.
> 
> To fix this case, it would have been ideal to increase the MFS,
> via i40e_aqc_opc_set_mac_config, incoming patch will take care of it.
> 
> At least, commit prevents setting up an MTU greater than the current MFS.
> It will avoid being in the position of having an MTU set to 9000 on the
> netdev with a firmware refusing packets larger than 1536.
> 
> A typical trace looks like:
> [  377.548696] i40e 0000:5d:00.0 eno5: Error changing mtu to 9000, Max is 1500. MFS is too small.
> 
> Signed-off-by: Erwan Velu <e.velu@criteo.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_main.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index f86578857e8a..85ecf2f3de18 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -2946,7 +2946,7 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
>          struct i40e_netdev_priv *np = netdev_priv(netdev);
>          struct i40e_vsi *vsi = np->vsi;
>          struct i40e_pf *pf = vsi->back;
> -       int frame_size;
> +       int frame_size, mfs, max_mtu;
> 
>          frame_size = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
>          if (new_mtu > frame_size - I40E_PACKET_HDR_PAD) {
> @@ -2955,6 +2955,14 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
>                  return -EINVAL;
>          }
> 
> +       mfs = pf->hw.phy.link_info.max_frame_size;
> +       max_mtu = mfs - I40E_PACKET_HDR_PAD;

If this is how the max_mtu is determined, does it make sense to set this 
before registering the netdev, i.e. netdev->max_mtu in i40e_config_netdev()?

Thanks,

Brett

> +       if (new_mtu > max_mtu) {
> +               netdev_err(netdev, "Error changing mtu to %d, Max is %d. MFS is too small.\n",
> +                          new_mtu, max_mtu);
> +               return -EINVAL;
> +       }
> +
>          netdev_dbg(netdev, "changing MTU from %d to %d\n",
>                     netdev->mtu, new_mtu);
>          netdev->mtu = new_mtu;
> --
> 2.44.0
> 
>
Erwan Velu March 14, 2024, 5:10 p.m. UTC | #2
On 14/03/2024 at 17:10, Brett Creeley wrote:
[...]
> If this is how the max_mtu is determined, does it make sense to set this
> before registering the netdev, i.e. netdev->max_mtu in 
> i40e_config_netdev()? 


The absolute max is properly set, but I think that only holds if we 
ensure the value of the MFS.

So, with another patch that sets the MFS to the right value when a 
bigger MTU is requested, having this value makes sense, since it is the 
absolute max for this device.


Erwan,
Brett Creeley March 14, 2024, 5:55 p.m. UTC | #3
On 3/14/2024 10:10 AM, Erwan Velu wrote:
> On 14/03/2024 at 17:10, Brett Creeley wrote:
> [...]
>> If this is how the max_mtu is determined, does it make sense to set this
>> before registering the netdev, i.e. netdev->max_mtu in
>> i40e_config_netdev()?
> 
> 
> The absolute max is properly set but I think that's only true if we
> ensure the value of the MFS.
> 
> So if with another patch to set the MFS to the right value when asking a
> bigger MTU, having this value makes sense this is the absolute max for
> this device.

AFAIK there is no API for a user to change the max_mtu, so the only way 
the device's MFS would need to change is if it's done during 
initialization time, which should be done before netdev registration anyway.

I guess it's also possible that the driver's XDP configuration could 
cause a change in the device's MFS and netdev->max_mtu, but that would 
be under the rtnl_lock.

Seems like others are happy with it, but FWIW that's my 2 cents, 
otherwise LGTM.

Reviewed-by: Brett Creeley <brett.creeley@amd.com>


> 
> 
> Erwan,
>
Erwan Velu March 14, 2024, 6:04 p.m. UTC | #4
On 14/03/2024 at 18:55, Brett Creeley wrote:
> [...]
> AFAIK there is no API for a user to change the max_mtu, so the only way
> the device's MFS would need to change is if it's done during
> initialization time, which should be done before netdev registration 
> anyway.

Sorry Brett, I was probably unclear and please note that I'm not a 
network developer, just a user that faced a bug.

My initial thought was to check the MFS size in i40e_change_mtu() and, if 
the MFS is too small, increase it.

Maybe just resetting it at init time to the largest value (which seems 
to be the default fw behavior) is a better approach.

I'd love to hear from an Intel developer who knows this driver/cards/fw 
better what the best approach is here.

Erwan,
Tony Nguyen March 14, 2024, 8:31 p.m. UTC | #5
On 3/14/2024 11:04 AM, Erwan Velu wrote:
> 
> On 14/03/2024 at 18:55, Brett Creeley wrote:
>> [...]
>> AFAIK there is no API for a user to change the max_mtu, so the only way
>> the device's MFS would need to change is if it's done during
>> initialization time, which should be done before netdev registration 
>> anyway.
> 
> Sorry Brett, I was probably unclear and please note that I'm not a 
> network developer, just a user that faced a bug.
> 
> My initial though was to check the mfs size in i40e_change_mtu() and if 
> mfs is too small, then let's increase it.
> 
> Maybe just resetting it at init time to the largest value (which seems 
> to be the default fw behavior) is a best approach.
> 
> I'd love to ear from Intel dev that knows this driver/cards/fw better on 
> what's the best approach here.

Setting the MFS size to max values during init and reset would be better; 
this is what the ice driver does. However, this would take implementing 
new AdminQ calls. IMO this patch is ok to prevent the issue being 
reported and allow for ease of backport.

Thanks,
Tony
Erwan Velu March 15, 2024, 9:17 a.m. UTC | #6
On 14/03/2024 at 21:31, Tony Nguyen wrote:
> [..]
> Setting the mfs size to max values during init and reset would better; 
> this is what the ice driver does. However, this would take 
> implementing new AdminQ calls. IMO this patch is ok to prevent the 
> issue being reported and allow for ease of backport.
>
That was my first intention: to ensure that no one else gets stuck in the 
same situation.

It would be nice to backport it to all stable releases once merged.

Erwan,
Brett Creeley March 15, 2024, 4:19 p.m. UTC | #7
On 3/15/2024 2:17 AM, Erwan Velu wrote:
> On 14/03/2024 at 21:31, Tony Nguyen wrote:
>> [..]
>> Setting the mfs size to max values during init and reset would better;
>> this is what the ice driver does. However, this would take
>> implementing new AdminQ calls. IMO this patch is ok to prevent the
>> issue being reported and allow for ease of backport.
>>
> That was my first intention, ensure that no one else get stuck in the
> same situation.
> 
> It would be nice to backport it to all stable releases once merged.
> 
> Erwan,
> 

I'm okay with this approach. Thanks.

Brett
Simon Horman March 18, 2024, 5:45 p.m. UTC | #8
On Wed, Mar 13, 2024 at 10:07:16AM +0100, Erwan Velu wrote:
> Commit 6871a7de705 ("[intelxl] Use admin queue to set port MAC address
> and maximum frame size") from iPXE project set the MFS to 0x600 = 1536.
> See https://github.com/ipxe/ipxe/commit/6871a7de705
> 
> At boot time the i40e driver complains about it with
> the following message but continues.
> 
> 	MFS for port 1 has been set below the default: 600
> 
> If the MTU size is increased, the driver accepts it but large packets will
> not be processed by the firmware generating tx_errors. The issue is pretty
> silent for users. i.e doing TCP in such context will generates lots of
> retransmissions until the proper window size (below 1500) will be used.
> 
> To fix this case, it would have been ideal to increase the MFS,
> via i40e_aqc_opc_set_mac_config, incoming patch will take care of it.
> 
> At least, commit prevents setting up an MTU greater than the current MFS.
> It will avoid being in the position of having an MTU set to 9000 on the
> netdev with a firmware refusing packets larger than 1536.
> 
> A typical trace looks like:
> [  377.548696] i40e 0000:5d:00.0 eno5: Error changing mtu to 9000, Max is 1500. MFS is too small.
> 

Hi Erwan, all,

As a fix, I think this patch warrants a fixes tag.
Perhaps this one is appropriate?

Fixes: 41c445ff0f48 ("i40e: main driver core")

> Signed-off-by: Erwan Velu <e.velu@criteo.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index f86578857e8a..85ecf2f3de18 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -2946,7 +2946,7 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
>  	struct i40e_netdev_priv *np = netdev_priv(netdev);
>  	struct i40e_vsi *vsi = np->vsi;
>  	struct i40e_pf *pf = vsi->back;
> -	int frame_size;
> +	int frame_size, mfs, max_mtu;
>  
>  	frame_size = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
>  	if (new_mtu > frame_size - I40E_PACKET_HDR_PAD) {

I am fine with this patch, so please take what follows as a suggestion
for improvement, possibly as a follow-up. Not as a hard requirement from
my side.

The part of this function between the two hunks of this patch is:

		netdev_err(netdev, "Error changing mtu to %d, Max is %d\n",
			   new_mtu, frame_size - I40E_PACKET_HDR_PAD);

My reading is that with this patch two different limits are
checked wrt maximum MTU size:

1. A VSI level limit, which relates to RX buffer size
2. A PHY level limit that relates to the MFS

That seems fine to me. But the log message for 1 (above) does
not seem particularly informative wrt which limit has been exceeded.

> @@ -2955,6 +2955,14 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
>  		return -EINVAL;
>  	}
>  
> +	mfs = pf->hw.phy.link_info.max_frame_size;
> +	max_mtu = mfs - I40E_PACKET_HDR_PAD;
> +	if (new_mtu > max_mtu) {
> +		netdev_err(netdev, "Error changing mtu to %d, Max is %d. MFS is too small.\n",
> +			   new_mtu, max_mtu);
> +		return -EINVAL;
> +	}
> +
>  	netdev_dbg(netdev, "changing MTU from %d to %d\n",
>  		   netdev->mtu, new_mtu);
>  	netdev->mtu = new_mtu;
> -- 
> 2.44.0
> 
>
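
The two limits described in the review above (a VSI-level limit tied to the RX buffer size and a PHY-level limit tied to the MFS) could be reported with distinct messages so the user can tell which one was exceeded. The following is only a user-space sketch of that idea: every `sketch_` name, the message wording and the 26-byte pad (an assumed stand-in for I40E_PACKET_HDR_PAD) are hypothetical and not taken from the driver.

```c
#include <errno.h>
#include <stdio.h>

/* Assumption: stands in for the driver's I40E_PACKET_HDR_PAD. */
#define SKETCH_PACKET_HDR_PAD 26

enum sketch_mtu_limit {
	SKETCH_MTU_OK,
	SKETCH_MTU_VSI_LIMIT,	/* exceeds the VSI (RX buffer) frame size */
	SKETCH_MTU_MFS_LIMIT,	/* exceeds the PHY-level MFS */
};

/* Report which limit, if any, new_mtu exceeds; the VSI limit is checked first. */
static enum sketch_mtu_limit sketch_which_limit(int new_mtu,
						int vsi_frame_size, int mfs)
{
	if (new_mtu > vsi_frame_size - SKETCH_PACKET_HDR_PAD)
		return SKETCH_MTU_VSI_LIMIT;
	if (new_mtu > mfs - SKETCH_PACKET_HDR_PAD)
		return SKETCH_MTU_MFS_LIMIT;
	return SKETCH_MTU_OK;
}

/* Hypothetical MTU change path that names the limit in its error message. */
static int sketch_change_mtu(int new_mtu, int vsi_frame_size, int mfs)
{
	switch (sketch_which_limit(new_mtu, vsi_frame_size, mfs)) {
	case SKETCH_MTU_VSI_LIMIT:
		fprintf(stderr,
			"Error changing mtu to %d, Max is %d. VSI frame size is too small.\n",
			new_mtu, vsi_frame_size - SKETCH_PACKET_HDR_PAD);
		return -EINVAL;
	case SKETCH_MTU_MFS_LIMIT:
		fprintf(stderr,
			"Error changing mtu to %d, Max is %d. MFS is too small.\n",
			new_mtu, mfs - SKETCH_PACKET_HDR_PAD);
		return -EINVAL;
	default:
		return 0;
	}
}
```

Keeping the classification in its own function means the decision can be tested without capturing log output.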
Sunil Kovvuri Goutham March 19, 2024, 10:26 a.m. UTC | #9
> -----Original Message-----
> From: Simon Horman <horms@kernel.org>
> Sent: Monday, March 18, 2024 11:15 PM
> To: Erwan Velu <erwanaliasr1@gmail.com>
> Cc: Erwan Velu <e.velu@criteo.com>; Jesse Brandeburg
> <jesse.brandeburg@intel.com>; Tony Nguyen
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>;
> Eric Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>;
> Paolo Abeni <pabeni@redhat.com>; intel-wired-lan@lists.osuosl.org;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [EXTERNAL] Re: [PATCH v4 iwl-net] i40e: Prevent setting MTU if
> greater than MFS
> 
> On Wed, Mar 13, 2024 at 10:07:16AM +0100, Erwan Velu wrote:
> > Commit 6871a7de705 ("[intelxl] Use admin queue to set port MAC address
> > and maximum frame size") from iPXE project set the MFS to 0x600 = 1536.
> > See https://github.com/ipxe/ipxe/commit/6871a7de705
> >
> > At boot time the i40e driver complains about it with the following
> > message but continues.
> >
> > 	MFS for port 1 has been set below the default: 600
> >
> > If the MTU size is increased, the driver accepts it but large packets will
> > not be processed by the firmware generating tx_errors. The issue is pretty
> > silent for users. i.e doing TCP in such context will generates lots of
> > retransmissions until the proper window size (below 1500) will be used.
> >
> > To fix this case, it would have been ideal to increase the MFS,
> > via i40e_aqc_opc_set_mac_config, incoming patch will take care of it.
> >
> > At least, commit prevents setting up an MTU greater than the current MFS.
> > It will avoid being in the position of having an MTU set to 9000 on the
> > netdev with a firmware refusing packets larger than 1536.
> >
> > A typical trace looks like:
> > [  377.548696] i40e 0000:5d:00.0 eno5: Error changing mtu to 9000, Max is 1500. MFS is too small.
> >
> 
> Hi Erwan, all,
> 
> As a fix, I think this patch warrants a fixes tag.
> Perhaps this one is appropriate?
> 
> Fixes: 41c445ff0f48 ("i40e: main driver core")
> 
> > Signed-off-by: Erwan Velu <e.velu@criteo.com>
> > ---
> >  drivers/net/ethernet/intel/i40e/i40e_main.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > index f86578857e8a..85ecf2f3de18 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > @@ -2946,7 +2946,7 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
> >  	struct i40e_netdev_priv *np = netdev_priv(netdev);
> >  	struct i40e_vsi *vsi = np->vsi;
> >  	struct i40e_pf *pf = vsi->back;
> > -	int frame_size;
> > +	int frame_size, mfs, max_mtu;
> >
> >  	frame_size = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
> >  	if (new_mtu > frame_size - I40E_PACKET_HDR_PAD) {
> 
> I am fine with this patch, so please take what follows as a suggestion for
> improvement, possibly as a follow-up. Not as a hard requirement from my
> side.
> 
> The part of this function between the two hunks of this patch is:
> 
> 		netdev_err(netdev, "Error changing mtu to %d, Max is %d\n",
> 			   new_mtu, frame_size - I40E_PACKET_HDR_PAD);
> 
> My reading is that with this patch two different limits are checked wrt
> maximum MTU size:
> 
> 1. A VSI level limit, which relates to RX buffer size
> 2. A PHY level limit that relates to the MFS
> 
> That seems fine to me. But the log message for 1 (above) does not seem
> particularly informative wrt which limit has been exceeded.
> 
> > @@ -2955,6 +2955,14 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
> >  		return -EINVAL;
> >  	}
> >
> > +	mfs = pf->hw.phy.link_info.max_frame_size;
> > +	max_mtu = mfs - I40E_PACKET_HDR_PAD;
> > +	if (new_mtu > max_mtu) {
> > +		netdev_err(netdev, "Error changing mtu to %d, Max is %d. MFS is too small.\n",
> > +			   new_mtu, max_mtu);
> > +		return -EINVAL;
> > +	}
> > +

Aren't these driver-specific checks and messages deprecated in favor of the generic one?
Isn't it possible to do the following?

@@ -1607,6 +1607,7 @@ int i40e_aq_get_link_info(struct i40e_hw *hw,
        hw_link_info->ext_info = resp->ext_info;
        hw_link_info->loopback = resp->loopback & I40E_AQ_LOOPBACK_MASK;
        hw_link_info->max_frame_size = le16_to_cpu(resp->max_frame_size);
+       netdev->max_mtu = hw_link_info->max_frame_size - I40E_PACKET_HDR_PAD;
        hw_link_info->pacing = resp->config & I40E_AQ_CONFIG_PACING_MASK;

So that the stack will take care of checking the max MTU.

Thanks,
Sunil.
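
The suggestion above relies on the core networking stack rejecting an MTU outside the device's advertised range before the driver's own handler runs. A simplified user-space model of that range check follows. It is a sketch, not kernel code: the struct, the `sketch_` function names and the 26-byte pad (assumed to match I40E_PACKET_HDR_PAD) are hypothetical. As in the kernel's net_device, a max_mtu of 0 is treated here as "no upper limit".

```c
#include <errno.h>

/* Assumption: stands in for the driver's I40E_PACKET_HDR_PAD. */
#define SKETCH_PACKET_HDR_PAD 26

/* Hypothetical, reduced model of the MTU bounds a net device advertises. */
struct sketch_netdev {
	unsigned int min_mtu;
	unsigned int max_mtu;	/* 0 means no upper bound is enforced */
};

/* Derive the advertised maximum MTU from the MFS reported by the firmware. */
static void sketch_set_max_mtu_from_mfs(struct sketch_netdev *dev,
					unsigned int mfs)
{
	/* Leave max_mtu untouched if the MFS cannot even hold the headers. */
	if (mfs > SKETCH_PACKET_HDR_PAD)
		dev->max_mtu = mfs - SKETCH_PACKET_HDR_PAD;
}

/* Generic range check, performed before any driver-specific handler. */
static int sketch_validate_mtu(const struct sketch_netdev *dev, int new_mtu)
{
	if (new_mtu < 0 || (unsigned int)new_mtu < dev->min_mtu)
		return -EINVAL;
	if (dev->max_mtu > 0 && (unsigned int)new_mtu > dev->max_mtu)
		return -EINVAL;
	return 0;
}
```

In this model the driver only has to keep max_mtu in sync with the MFS; the rejection and its error reporting then come from one generic place.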
Erwan Velu March 19, 2024, 11:38 a.m. UTC | #10
On 18/03/2024 at 18:45, Simon Horman wrote:
> [...]
> Hi Erwan, all,
>
> As a fix, I think this patch warrants a fixes tag.
> Perhaps this one is appropriate?
>
> Fixes: 41c445ff0f48 ("i40e: main driver core")

Simon

Isn't that a bit too generic?

[..]

> I am fine with this patch, so please take what follows as a suggestion
> for improvement, possibly as a follow-up. Not as a hard requirement from
> my side.
>
> The part of this function between the two hunks of this patch is:
>
>                  netdev_err(netdev, "Error changing mtu to %d, Max is %d\n",
>                             new_mtu, frame_size - I40E_PACKET_HDR_PAD);
>
> My reading is that with this patch two different limits are
> checked wrt maximum MTU size:
>
> 1. A VSI level limit, which relates to RX buffer size
> 2. A PHY level limit that relates to the MFS
>
> That seems fine to me. But the log message for 1 (above) does
> not seem particularly informative wrt which limit has been exceeded.

I got some comments around this.

I wanted to keep my patch focused on the MFS issue, but I can 
offer a patch to get a similar output for this. What does WRT stand for?


I also wanted to make another patch for this:

dev_warn(&pdev->dev, "MFS for port %x has been set below the default: %x\n",
         pf->hw.port, val);

The MFS reported as hex without a "0x" prefix is very misleading; I can 
offer a patch for this too.


Erwan,
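
The ambiguity Erwan points out is easy to reproduce in user space: a bare `%x` prints 0x600 as "600", which reads like a decimal value. The sketch below contrasts that with two clearer alternatives. The `sketch_` helpers are hypothetical. The decimal variant mirrors the wording of the dmesg lines that appear later in this thread, but its exact format specifiers are an assumption.

```c
#include <stdio.h>

/* Format used in the warning quoted above: hex digits without a prefix. */
static int sketch_format_mfs_bare_hex(char *buf, size_t len,
				      unsigned int port, unsigned int mfs)
{
	return snprintf(buf, len,
			"MFS for port %x has been set below the default: %x",
			port, mfs);
}

/* Same value with the '#' flag, so the hex base is explicit. */
static int sketch_format_mfs_prefixed_hex(char *buf, size_t len,
					  unsigned int port, unsigned int mfs)
{
	return snprintf(buf, len,
			"MFS for port %x has been set below the default: %#x",
			port, mfs);
}

/* Decimal variant that also prints the default for comparison. */
static int sketch_format_mfs_decimal(char *buf, size_t len, unsigned int port,
				     unsigned int mfs, unsigned int def)
{
	return snprintf(buf, len,
			"MFS for port %u (%u) has been set below the default (%u)",
			port, mfs, def);
}
```

Printing both the current and the default value in decimal lets the reader compare them directly against an MTU, which is also expressed in decimal.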
Simon Horman March 19, 2024, 12:20 p.m. UTC | #11
On Tue, Mar 19, 2024 at 12:38:03PM +0100, Erwan Velu wrote:
> 
> On 18/03/2024 at 18:45, Simon Horman wrote:
> > [...]
> > Hi Erwan, all,
> > 
> > As a fix, I think this patch warrants a fixes tag.
> > Perhaps this one is appropriate?
> > 
> > Fixes: 41c445ff0f48 ("i40e: main driver core")
> 
> Simon
> 
> Isn't that a bit too generic ?

Yes, maybe it is.
What we would be after is the first commit where the
user can hit the problem the patch addresses.

> [..]
> 
> > I am fine with this patch, so please take what follows as a suggestion
> > for improvement, possibly as a follow-up. Not as a hard requirement from
> > my side.
> > 
> > The part of this function between the two hunks of this patch is:
> > 
> >                  netdev_err(netdev, "Error changing mtu to %d, Max is %d\n",
> >                             new_mtu, frame_size - I40E_PACKET_HDR_PAD);
> > 
> > My reading is that with this patch two different limits are
> > checked wrt maximum MTU size:
> > 
> > 1. A VSI level limit, which relates to RX buffer size
> > 2. A PHY level limit that relates to the MFS
> > 
> > That seems fine to me. But the log message for 1 (above) does
> > not seem particularly informative wrt which limit has been exceeded.
> 
> I got some comments around this.
> 
> I wanted to keep my patch being focused on the mfs issue, but I can offer a
> patch to get a similar output for this. What WRT stands for ?
> 
> 
> I wanted also to make another patch for this :
> 
> dev_warn(&pdev->dev, "MFS for port %x has been set below the default:
> %x\n",pf->hw.port, val);
> 
> The MFS reported as hex without a "0x" prefix is very misleading, I can
> offer a patch for this too.

FWIW, I think handling these questions in follow-up patches is fine.
Erwan Velu March 19, 2024, 1:33 p.m. UTC | #12
On 19/03/2024 at 13:20, Simon Horman wrote:
[...]
> FWIIW, I think handling these questions in follow-up patches is fine.

I wonder if the previous patch must be merged first, so I can reference 
it in the commit message, or if I should send it now.

Erwan,
Pucha, HimasekharX Reddy April 19, 2024, 2:26 p.m. UTC | #13
>-----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Erwan Velu
> Sent: Wednesday, March 13, 2024 2:37 PM
> Cc: Velu, Erwan <e.velu@criteo.com>; linux-kernel@vger.kernel.org; Eric Dumazet <edumazet@google.com>; netdev@vger.kernel.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; intel-wired-lan@lists.osuosl.org; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David S. Miller <davem@davemloft.net>
> Subject: [Intel-wired-lan] [PATCH v4 iwl-net] i40e: Prevent setting MTU if greater than MFS
>
> Commit 6871a7de705 ("[intelxl] Use admin queue to set port MAC address and maximum frame size") from iPXE project set the MFS to 0x600 = 1536.
> See https://github.com/ipxe/ipxe/commit/6871a7de705
>
> At boot time the i40e driver complains about it with the following message but continues.
>
>	MFS for port 1 has been set below the default: 600
>
> If the MTU size is increased, the driver accepts it but large packets will not be processed by the firmware generating tx_errors. The issue is pretty silent for users. i.e doing TCP in such context will generates lots of retransmissions until the proper > window size (below 1500) will be used.
>
> To fix this case, it would have been ideal to increase the MFS, via i40e_aqc_opc_set_mac_config, incoming patch will take care of it.
>
> At least, commit prevents setting up an MTU greater than the current MFS.
> It will avoid being in the position of having an MTU set to 9000 on the netdev with a firmware refusing packets larger than 1536.
>
> A typical trace looks like:
> [  377.548696] i40e 0000:5d:00.0 eno5: Error changing mtu to 9000, Max is 1500. MFS is too small.
> 
> Signed-off-by: Erwan Velu <e.velu@criteo.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>


With the patch applied, when we set the MFS to 0x1700 (5888) in the NVM (as seen below) and then set the MTU on PF0 to 9000, it is set to 9000 with no errors and no messages in dmesg.

[root@localhost user]# ip link set mtu 9000 dev enp131s0f0np0
[root@localhost user]# ip link show dev enp131s0f0np0
9: enp131s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff

dmesg when loading the driver:
[257.035823] i40e 0000:83:00.1: MFS for port 1 (5888) has been set below the default (9728)
Pucha, HimasekharX Reddy April 22, 2024, 1:19 p.m. UTC | #14
> -----Original Message-----
> From: Erwan Velu <erwanaliasr1@gmail.com> 
> Sent: Friday, April 19, 2024 8:10 PM
> To: Pucha, HimasekharX Reddy <himasekharx.reddy.pucha@intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH v4 iwl-net] i40e: Prevent setting MTU if greater than MFS
>
> Hmm, that's pretty unexpected.
> Can you print "new_mtu" and "max_mtu" in i40e_change_mtu()?

Hi Velu,

Please find the logs below.


[root@ ~]# ifconfig ens803f0np0
ens803f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:00:00:00:01:00  txqueuelen 1000  (Ethernet)
        RX packets 6  bytes 1908 (1.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4  bytes 1368 (1.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@ ~]# ifconfig ens803f0np0 mtu 9700
[root@ ~]# ifconfig ens803f0np0
ens803f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9700
        ether 00:00:00:00:01:00  txqueuelen 1000  (Ethernet)
        RX packets 10  bytes 3180 (3.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6  bytes 2052 (2.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@~]# ifconfig ens803f0np0 mtu 1000
[root@ ~]# ifconfig ens803f0np0
ens803f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1000
        ether 00:00:00:00:01:00  txqueuelen 1000  (Ethernet)
        RX packets 10  bytes 3180 (3.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6  bytes 2052 (2.0 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Dmesg:

[  +0.013744] i40e 0000:86:00.0: fw 9.140.76856 api 1.15 nvm 9.40 0xefd0ed12 1.3534.0 [8086:158b] [8086:0002]
[  +0.102046] i40e 0000:86:00.0: MAC address: 00:00:00:00:01:00
[  +0.003999] i40e 0000:86:00.0: FW LLDP is enabled
[  +0.006038] i40e 0000:86:00.0 eth0: NIC Link is Up, 25 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: CL108 RS-FEC, Autoneg: True, Flow Control: None
[  +0.000448] i40e 0000:86:00.0: PCI-Express: Speed 8.0GT/s Width x8
[  +0.000385] i40e 0000:86:00.0: MFS for port 0 (5888) has been set below the default (9728)
[  +0.000129] i40e 0000:86:00.0: Features: PF-id[0] VFs: 64 VSIs: 66 QP: 96 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[  +0.004540] i40e 0000:86:00.0 ens803f0np0: renamed from eth0
[  +0.009012] i40e 0000:86:00.1: fw 9.140.76856 api 1.15 nvm 9.40 0xefd0ed12 1.3534.0 [8086:158b] [8086:0002]
[  +0.236975] i40e 0000:86:00.1: MAC address: 00:00:00:00:01:01
[  +0.000252] i40e 0000:86:00.1: FW LLDP is enabled
[  +0.006078] i40e 0000:86:00.1 eth0: NIC Link is Up, 25 Gbps Full Duplex, Requested FEC: CL108 RS-FEC, Negotiated FEC: None, Autoneg: True, Flow Control: None
[  +0.000437] i40e 0000:86:00.1: PCI-Express: Speed 8.0GT/s Width x8
[  +0.000386] i40e 0000:86:00.1: MFS for port 1 (5888) has been set below the default (9728)
[  +0.000129] i40e 0000:86:00.1: Features: PF-id[1] VFs: 64 VSIs: 66 QP: 96 RSS FD_ATR FD_SB NTUPLE DCB VxLAN Geneve PTP VEPA
[  +0.004697] i40e 0000:86:00.1 ens803f1np1: renamed from eth0

[Apr22 10:06] i40e 0000:86:00.0 ens803f0np0: New MTU is 9700, Max MTU is 9702
[ +21.608224] i40e 0000:86:00.0 ens803f0np0: New MTU is 1000, Max MTU is 9702

> On Fri, 19 Apr 2024 at 16:26, Pucha, HimasekharX Reddy <himasekharx.reddy.pucha@intel.com> wrote:
> >
> > >-----Original Message-----
> > > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf 
> > >Of Erwan Velu
> > > Sent: Wednesday, March 13, 2024 2:37 PM
> > > Cc: Velu, Erwan <e.velu@criteo.com>; linux-kernel@vger.kernel.org; 
> > >Eric Dumazet <edumazet@google.com>; netdev@vger.kernel.org; Nguyen, 
> > >Anthony L <anthony.l.nguyen@intel.com>; 
> > >intel-wired-lan@lists.osuosl.org; Jakub Kicinski <kuba@kernel.org>; 
> > >Paolo Abeni <pabeni@redhat.com>; David S. Miller 
> > ><davem@davemloft.net>
> > > Subject: [Intel-wired-lan] [PATCH v4 iwl-net] i40e: Prevent setting 
> > >MTU if greater than MFS
> > >
> > > Commit 6871a7de705 ("[intelxl] Use admin queue to set port MAC address and maximum frame size") from iPXE project set the MFS to 0x600 = 1536.
> > > See https://github.com/ipxe/ipxe/commit/6871a7de705
> > >
> > > At boot time the i40e driver complains about it with the following message but continues.
> > >
> > >       MFS for port 1 has been set below the default: 600
> > >
> > > If the MTU size is increased, the driver accepts it but large packets will not be processed by the firmware generating tx_errors. The issue is pretty silent for users. i.e doing TCP in such context will generates lots of retransmissions until the proper > window size (below 1500) will be used.
> > >
> > > To fix this case, it would have been ideal to increase the MFS, via i40e_aqc_opc_set_mac_config, incoming patch will take care of it.
> > >
> > > At least, commit prevents setting up an MTU greater than the current MFS.
> > > It will avoid being in the position of having an MTU set to 9000 on the netdev with a firmware refusing packets larger than 1536.
> > >
> > > A typical trace looks like:
> > > [  377.548696] i40e 0000:5d:00.0 eno5: Error changing mtu to 9000, Max is 1500. MFS is too small.
> > >
> > > Signed-off-by: Erwan Velu <e.velu@criteo.com>
> > > ---
> > >  drivers/net/ethernet/intel/i40e/i40e_main.c | 10 +++++++++-
> > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > >
> >
> >
> > With patch when we set the MFS to 1700 (5888) in the NVM (as seen below) and then set the MTU on PF0 to 9000 and it set it to 9000 with no errors and no messages in dmesg.
> >
> > [root@localhost user]# ip link set mtu 9000 dev enp131s0f0np0 
> > [root@localhost user]# ip link show dev enp131s0f0np0
> > 9: enp131s0f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
> >     link/ether 00:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
> >
> > dmesg when loading the driver:
> > [257.035823] 140e 0000:83:00.1: MFS for port 1 (5888) has been set 
> > below the default (9728)
>

Patch

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index f86578857e8a..85ecf2f3de18 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2946,7 +2946,7 @@  static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 	struct i40e_pf *pf = vsi->back;
-	int frame_size;
+	int frame_size, mfs, max_mtu;
 
 	frame_size = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
 	if (new_mtu > frame_size - I40E_PACKET_HDR_PAD) {
@@ -2955,6 +2955,14 @@  static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
 		return -EINVAL;
 	}
 
+	mfs = pf->hw.phy.link_info.max_frame_size;
+	max_mtu = mfs - I40E_PACKET_HDR_PAD;
+	if (new_mtu > max_mtu) {
+		netdev_err(netdev, "Error changing mtu to %d, Max is %d. MFS is too small.\n",
+			   new_mtu, max_mtu);
+		return -EINVAL;
+	}
+
 	netdev_dbg(netdev, "changing MTU from %d to %d\n",
 		   netdev->mtu, new_mtu);
 	netdev->mtu = new_mtu;