diff mbox series

[ovs-dev,RFC,v2,1/1] netdev-dpdk: Fix requested MTU size validation.

Message ID 1516296742-23454-1-git-send-email-ian.stokes@intel.com
State Superseded
Delegated to: Ian Stokes
Headers show
Series [ovs-dev,RFC,v2,1/1] netdev-dpdk: Fix requested MTU size validation. | expand

Commit Message

Stokes, Ian Jan. 18, 2018, 5:32 p.m. UTC
This commit replaces MTU_TO_FRAME_LEN(mtu) with
MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().

When setting an MTU we first check if the requested MTU frame
size will exceed the maximum packet frame size supported in
netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later
stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).

However when using rte_ethdev_set_mtu(mtu) the calculation used to check
that the MTU does not exceed the max frame length for that device varies
between DPDK device drivers. For example ixgbe driver calculates MTU
frame length  as

mtu + ETHER_HDR_LEN + ETHER_CRC_LEN

i40e driver calculates it as

mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2

em driver calculates it as

mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE

Currently it is possible to set an MTU for a netdev_dpdk device that exceeds
the upper limit MTU for that devices DPDK driver. This leads to a
segfault. This is because the MTU frame length comparison as is, does
not take into account the addition of the vlan tag overhead expected in
the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the
subsequent dpdk_ethdev_init() will fail before the queues have been
created for the DPDK device. This coupled with assumptions regarding
reconfiguration requirements for the netdev will lead to a segfault when
the rxq is polled for this device.

A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
validating a requested MTU in netdev_dpdk_set_mtu().
MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:

mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)

It's use accounts for the potential of up to two vlan headers being
included in the overhead required by some devices in MTU frame length at
the netdev_dpdk_set_mtu() stage. This allows OVS to flag an error rather
than the DPDK driver if the MTU exceeds the max DPDK frame length. OVS
can fail gracefully at this point by falling back to a default MTU of 1500
and continue to configure the port.

Note: this fix is a work around, a better approach would be if DPDK devices
could report the maximum MTU value that can be requested on a per device
basis. This capability however is not currently available. A downside of
this patch is that the MTU upper limit will be reduced by 8 bytes for
DPDK devices that do not need to account for vlan tags in MTU frame length
driver calculations e.g. ixgbe devices upper MTU limit is reduced from
the OVS point of view from 9710 to 9702.

CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

---
v1->v2
* Add note to limitations in DPDK documentation regarding MTU.
* Correct MTU calculation in commit message.
* Flag that we provision 8 bytes (2 x vlan header) by using
  MTU_TO_MAX_FRAME_LEN in commit message and code comments.
---
 Documentation/intro/install/dpdk.rst |   12 ++++++++++++
 lib/netdev-dpdk.c                    |   13 ++++++++++++-
 2 files changed, 24 insertions(+), 1 deletions(-)

Comments

Flavio Leitner Jan. 19, 2018, 1:32 p.m. UTC | #1
Hi Ian,

On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote:
> This commit replaces MTU_TO_FRAME_LEN(mtu) with
> MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
> NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
> 
> When setting an MTU we first check if the requested MTU frame
> size will exceed the maximum packet frame size supported in
> netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
> ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later
> stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).

Perhaps?
dpdk_eth_dev_init()
rte_eth_dev_set_mtu()


> However when using rte_ethdev_set_mtu(mtu) the calculation used to check
> that the MTU does not exceed the max frame length for that device varies
> between DPDK device drivers. For example ixgbe driver calculates MTU
> frame length  as
> 
> mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
> 
> i40e driver calculates it as
> 
> mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
> 
> em driver calculates it as
> 
> mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
> 
> Currently it is possible to set an MTU for a netdev_dpdk device that exceeds
> the upper limit MTU for that devices DPDK driver. This leads to a
> segfault. This is because the MTU frame length comparison as is, does
> not take into account the addition of the vlan tag overhead expected in
> the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the
> subsequent dpdk_ethdev_init() will fail before the queues have been
> created for the DPDK device. This coupled with assumptions regarding
> reconfiguration requirements for the netdev will lead to a segfault when
> the rxq is polled for this device.
> 
> A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
> validating a requested MTU in netdev_dpdk_set_mtu().
> MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:
> 
> mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)
> 
> It's use accounts for the potential of up to two vlan headers being
> included in the overhead required by some devices in MTU frame length at
> the netdev_dpdk_set_mtu() stage. This allows OVS to flag an error rather
> than the DPDK driver if the MTU exceeds the max DPDK frame length. OVS
> can fail gracefully at this point by falling back to a default MTU of 1500
> and continue to configure the port.
> 
> Note: this fix is a work around, a better approach would be if DPDK devices
> could report the maximum MTU value that can be requested on a per device
> basis. This capability however is not currently available. A downside of
> this patch is that the MTU upper limit will be reduced by 8 bytes for
> DPDK devices that do not need to account for vlan tags in MTU frame length
> driver calculations e.g. ixgbe devices upper MTU limit is reduced from
> the OVS point of view from 9710 to 9702.
> 
> CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
> 
> ---
> v1->v2
> * Add note to limitations in DPDK documentation regarding MTU.
> * Correct MTU calculation in commit message.
> * Flag that we provision 8 bytes (2 x vlan header) by using
>   MTU_TO_MAX_FRAME_LEN in commit message and code comments.
> ---
>  Documentation/intro/install/dpdk.rst |   12 ++++++++++++
>  lib/netdev-dpdk.c                    |   13 ++++++++++++-
>  2 files changed, 24 insertions(+), 1 deletions(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
> index 3fecb5c..5800096 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -585,6 +585,18 @@ Limitations
>  
>  .. _DPDK release notes: http://dpdk.org/doc/guides/rel_notes/release_17_11.html
>  
> +- Upper bound MTU: DPDK device drivers differ in how MTU overhead is
> +  calculated e.g. i40e driver includes 2 x vlan headers in MTU overhead,
> +  em driver includes 1 x vlan header, ixgbe driver does not include a vlan
> +  header in overhead. Currently it is not possible for OVS DPDK to know what
> +  upper bound MTU value is supported for a given device. As such OVS DPDK
> +  must provision for the case where the  maximum MTU value includes 2 x vlan
> +  headers. This reduces the upper bound MTU value for devices that do not
> +  include vlan headers in their overhead by 8 bytes e.g. ixgbe devices  upper

two spaces

> +  bound MTU is reduced from 9710 to 9702. This work around is temporary and
> +  is expected to be removed once a method is provided by DPDK to query the
> +  maximum MTU for a given device.
> +
>  Reporting Bugs
>  --------------
>  
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index e32c7f6..2ac76e3 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
>  {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>  
> -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> +    /* XXX: We need to ensure the requested MTU frame length does not
> +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
> +     * in how the MTU frame length is calculated when rte_ethdev_set_mtu(mtu)
> +     * is called e.g. i40e driver includes 2 x vlan headers in MTU overhead
> +     * the em driver includes 1 x vlan tag in MTU overhead, the ixgbe driver
> +     * does not include vlan tags in MTU overhead. As such we should use
> +     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x vlan headers
> +     * (8 bytes) for comparison. This avoids a failure later with
> +     * rte_ethdev_set_mtu(). This approach should be used until DPDK provides

rte_eth_dev_set_mtu

> +     * a method to retrieve maximum MTU frame length for a given device.
> +     */
> +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>          || mtu < ETHER_MIN_MTU) {
>          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>          return EINVAL;

We got a bug report for another related MTU issue because in older
OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead,
so if the packet length is the maximum allowed, it can't forward
packets with VLAN headers included. So, I agree that this needs
improvements on the DPDK side to report not only the maximum MTU but
also to standardize what means MTU. Today we don't know the overhead
each PMD will account.

In the hope that the committer can fix the small typos above or if
there is another spin:

Acked-by: Flavio Leitner <fbl@sysclose.org>

Thanks Ian!


> -- 
> 1.7.0.7
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Stokes, Ian Jan. 19, 2018, 2:46 p.m. UTC | #2
> Hi Ian,
> 
> On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote:
> > This commit replaces MTU_TO_FRAME_LEN(mtu) with
> > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
> > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
> >
> > When setting an MTU we first check if the requested MTU frame size
> > will exceed the maximum packet frame size supported in
> > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
> > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
> > later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
> 
> Perhaps?
> dpdk_eth_dev_init()
> rte_eth_dev_set_mtu()
> 
> 
> > However when using rte_ethdev_set_mtu(mtu) the calculation used to
> > check that the MTU does not exceed the max frame length for that
> > device varies between DPDK device drivers. For example ixgbe driver
> > calculates MTU frame length  as
> >
> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
> >
> > i40e driver calculates it as
> >
> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
> >
> > em driver calculates it as
> >
> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
> >
> > Currently it is possible to set an MTU for a netdev_dpdk device that
> > exceeds the upper limit MTU for that devices DPDK driver. This leads
> > to a segfault. This is because the MTU frame length comparison as is,
> > does not take into account the addition of the vlan tag overhead
> > expected in the drivers. The netdev_dpdk_set_mtu() call will
> > incorrectly succeed but the subsequent dpdk_ethdev_init() will fail
> > before the queues have been created for the DPDK device. This coupled
> > with assumptions regarding reconfiguration requirements for the netdev
> > will lead to a segfault when the rxq is polled for this device.
> >
> > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
> > validating a requested MTU in netdev_dpdk_set_mtu().
> > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:
> >
> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)
> >
> > It's use accounts for the potential of up to two vlan headers being
> > included in the overhead required by some devices in MTU frame length
> > at the netdev_dpdk_set_mtu() stage. This allows OVS to flag an error
> > rather than the DPDK driver if the MTU exceeds the max DPDK frame
> > length. OVS can fail gracefully at this point by falling back to a
> > default MTU of 1500 and continue to configure the port.
> >
> > Note: this fix is a work around, a better approach would be if DPDK
> > devices could report the maximum MTU value that can be requested on a
> > per device basis. This capability however is not currently available.
> > A downside of this patch is that the MTU upper limit will be reduced
> > by 8 bytes for DPDK devices that do not need to account for vlan tags
> > in MTU frame length driver calculations e.g. ixgbe devices upper MTU
> > limit is reduced from the OVS point of view from 9710 to 9702.
> >
> > CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
> > Signed-off-by: Ian Stokes <ian.stokes@intel.com>
> >
> > ---
> > v1->v2
> > * Add note to limitations in DPDK documentation regarding MTU.
> > * Correct MTU calculation in commit message.
> > * Flag that we provision 8 bytes (2 x vlan header) by using
> >   MTU_TO_MAX_FRAME_LEN in commit message and code comments.
> > ---
> >  Documentation/intro/install/dpdk.rst |   12 ++++++++++++
> >  lib/netdev-dpdk.c                    |   13 ++++++++++++-
> >  2 files changed, 24 insertions(+), 1 deletions(-)
> >
> > diff --git a/Documentation/intro/install/dpdk.rst
> > b/Documentation/intro/install/dpdk.rst
> > index 3fecb5c..5800096 100644
> > --- a/Documentation/intro/install/dpdk.rst
> > +++ b/Documentation/intro/install/dpdk.rst
> > @@ -585,6 +585,18 @@ Limitations
> >
> >  .. _DPDK release notes:
> > http://dpdk.org/doc/guides/rel_notes/release_17_11.html
> >
> > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead is
> > +  calculated e.g. i40e driver includes 2 x vlan headers in MTU
> > +overhead,
> > +  em driver includes 1 x vlan header, ixgbe driver does not include a
> > +vlan
> > +  header in overhead. Currently it is not possible for OVS DPDK to
> > +know what
> > +  upper bound MTU value is supported for a given device. As such OVS
> > +DPDK
> > +  must provision for the case where the  maximum MTU value includes 2
> > +x vlan
> > +  headers. This reduces the upper bound MTU value for devices that do
> > +not
> > +  include vlan headers in their overhead by 8 bytes e.g. ixgbe
> > +devices  upper
> 
> two spaces
> 
> > +  bound MTU is reduced from 9710 to 9702. This work around is
> > + temporary and  is expected to be removed once a method is provided
> > + by DPDK to query the  maximum MTU for a given device.
> > +
> >  Reporting Bugs
> >  --------------
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> > e32c7f6..2ac76e3 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int
> > mtu)  {
> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >
> > -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> > +    /* XXX: We need to ensure the requested MTU frame length does not
> > +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
> > +     * in how the MTU frame length is calculated when
> rte_ethdev_set_mtu(mtu)
> > +     * is called e.g. i40e driver includes 2 x vlan headers in MTU
> overhead
> > +     * the em driver includes 1 x vlan tag in MTU overhead, the ixgbe
> driver
> > +     * does not include vlan tags in MTU overhead. As such we should
> use
> > +     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x vlan
> headers
> > +     * (8 bytes) for comparison. This avoids a failure later with
> > +     * rte_ethdev_set_mtu(). This approach should be used until DPDK
> > + provides
> 
> rte_eth_dev_set_mtu
> 
> > +     * a method to retrieve maximum MTU frame length for a given
> device.
> > +     */
> > +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> >          || mtu < ETHER_MIN_MTU) {
> >          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
> >          return EINVAL;
> 
> We got a bug report for another related MTU issue because in older
> OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead, so
> if the packet length is the maximum allowed, it can't forward packets with
> VLAN headers included. So, I agree that this needs improvements on the
> DPDK side to report not only the maximum MTU but also to standardize what
> means MTU. Today we don't know the overhead each PMD will account.
> 
> In the hope that the committer can fix the small typos above or if there
> is another spin:
> 
> Acked-by: Flavio Leitner <fbl@sysclose.org>

Thanks Flavio,

I agree, we can add this to requirements for DPDK MTU capabilities.

I'll re-spin a v3 and remove the RFC, once there's an ACK from Mark I'll add this to the DPDK_MERGE over the weekend.

Thanks
Ian
> 
> Thanks Ian!
> 
> 
> > --
> > 1.7.0.7
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> --
> Flavio
Mark Kavanagh Jan. 22, 2018, 10:58 a.m. UTC | #3
Hey Ian,

Thanks for this patch.

Apart from the issues that Flavio pointed out, I have a few additional comments inline.

Cheers,
Mark

>From: Stokes, Ian
>Sent: Friday, January 19, 2018 2:47 PM
>To: Flavio Leitner <fbl@sysclose.org>
>Cc: dev@openvswitch.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
>Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested MTU
>size validation.
>
>> Hi Ian,
>>
>> On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote:
>> > This commit replaces MTU_TO_FRAME_LEN(mtu) with
>> > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
>> > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
>> >
>> > When setting an MTU we first check if the requested MTU frame size
>> > will exceed the maximum packet frame size supported in
>> > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
>> > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
>> > later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
>>
>> Perhaps?
>> dpdk_eth_dev_init()
>> rte_eth_dev_set_mtu()
>>
>>
>> > However when using rte_ethdev_set_mtu(mtu) the calculation used to
>> > check that the MTU does not exceed the max frame length for that
>> > device varies between DPDK device drivers. For example ixgbe driver
>> > calculates MTU frame length  as
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
>> >
>> > i40e driver calculates it as
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
>> >
>> > em driver calculates it as
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
>> >
>> > Currently it is possible to set an MTU for a netdev_dpdk device that
>> > exceeds the upper limit MTU for that devices DPDK driver. This leads
>> > to a segfault. This is because the MTU frame length comparison as is,
>> > does not take into account the addition of the vlan tag overhead
>> > expected in the drivers. The netdev_dpdk_set_mtu() call will
>> > incorrectly succeed but the subsequent dpdk_ethdev_init() will fail
>> > before the queues have been created for the DPDK device. This coupled
>> > with assumptions regarding reconfiguration requirements for the netdev
>> > will lead to a segfault when the rxq is polled for this device.
>> >
>> > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
>> > validating a requested MTU in netdev_dpdk_set_mtu().
>> > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:
>> >
>> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)
>> >
>> > It's use accounts for the potential of up to two vlan headers being

Since you're spinning a V3 anyway... ;)
<micronit>Its, not "It's"</micronit>

>> > included in the overhead required by some devices in MTU frame length
>> > at the netdev_dpdk_set_mtu() stage. This allows OVS to flag an error
>> > rather than the DPDK driver if the MTU exceeds the max DPDK frame
>> > length. OVS can fail gracefully at this point by falling back to a

I think 'falling back' here is a bit misleading, as it suggest some kind of rollback, or previous state restoration.
In truth, the MTU in this case never changes; it was initialized to 1500, as stays as such, since the new value was rejected.

>> > default MTU of 1500 and continue to configure the port.
>> >
>> > Note: this fix is a work around, a better approach would be if DPDK
>> > devices could report the maximum MTU value that can be requested on a
>> > per device basis. This capability however is not currently available.
>> > A downside of this patch is that the MTU upper limit will be reduced
>> > by 8 bytes for DPDK devices that do not need to account for vlan tags
>> > in MTU frame length driver calculations e.g. ixgbe devices upper MTU
>> > limit is reduced from the OVS point of view from 9710 to 9702.
>> >
>> > CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
>> > Signed-off-by: Ian Stokes <ian.stokes@intel.com>
>> >
>> > ---
>> > v1->v2
>> > * Add note to limitations in DPDK documentation regarding MTU.
>> > * Correct MTU calculation in commit message.
>> > * Flag that we provision 8 bytes (2 x vlan header) by using
>> >   MTU_TO_MAX_FRAME_LEN in commit message and code comments.
>> > ---
>> >  Documentation/intro/install/dpdk.rst |   12 ++++++++++++
>> >  lib/netdev-dpdk.c                    |   13 ++++++++++++-
>> >  2 files changed, 24 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/Documentation/intro/install/dpdk.rst
>> > b/Documentation/intro/install/dpdk.rst
>> > index 3fecb5c..5800096 100644
>> > --- a/Documentation/intro/install/dpdk.rst
>> > +++ b/Documentation/intro/install/dpdk.rst
>> > @@ -585,6 +585,18 @@ Limitations
>> >
>> >  .. _DPDK release notes:
>> > http://dpdk.org/doc/guides/rel_notes/release_17_11.html
>> >
>> > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead is
>> > +  calculated e.g. i40e driver includes 2 x vlan headers in MTU
>> > +overhead,

As a general point, this is technically frame (i.e L2) overhead, whereas MTU is an L3 concept. I'd rephrase it as 'how the L2 overhead for a given MTU value is calculated'.

>> > +  em driver includes 1 x vlan header, ixgbe driver does not include a
>> > +vlan
>> > +  header in overhead. Currently it is not possible for OVS DPDK to
>> > +know what
>> > +  upper bound MTU value is supported for a given device. As such OVS
>> > +DPDK
>> > +  must provision for the case where the  maximum MTU value includes 2
>> > +x vlan
>> > +  headers. This reduces the upper bound MTU value for devices that do
>> > +not
>> > +  include vlan headers in their overhead by 8 bytes e.g. ixgbe
>> > +devices  upper
>>
>> two spaces
>>
>> > +  bound MTU is reduced from 9710 to 9702. This work around is
>> > + temporary and  is expected to be removed once a method is provided
>> > + by DPDK to query the  maximum MTU for a given device.

Has it been confirmed that DPDK will provide such functionality? If not, you may want to rephrase the previous sentence, so as not to set unrealistic expectations.

>> > +
>> >  Reporting Bugs
>> >  --------------
>> >
>> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
>> > e32c7f6..2ac76e3 100644
>> > --- a/lib/netdev-dpdk.c
>> > +++ b/lib/netdev-dpdk.c
>> > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int
>> > mtu)  {
>> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>> >
>> > -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> > +    /* XXX: We need to ensure the requested MTU frame length does not

I'd probably rephrase to something along these lines: "ensure that the overall frame length of the requested MTU...".

>> > +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
>> > +     * in how the MTU frame length is calculated when
>> rte_ethdev_set_mtu(mtu)
>> > +     * is called e.g. i40e driver includes 2 x vlan headers in MTU
>> overhead
>> > +     * the em driver includes 1 x vlan tag in MTU overhead, the ixgbe
>> driver
>> > +     * does not include vlan tags in MTU overhead. As such we should
>> use
>> > +     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x vlan
>> headers
>> > +     * (8 bytes) for comparison. This avoids a failure later with
>> > +     * rte_ethdev_set_mtu(). This approach should be used until DPDK
>> > + provides
>>
>> rte_eth_dev_set_mtu
>>
>> > +     * a method to retrieve maximum MTU frame length for a given
>> device.
>> > +     */
>> > +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >          || mtu < ETHER_MIN_MTU) {
>> >          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>> >          return EINVAL;
>>
>> We got a bug report for another related MTU issue because in older
>> OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead, so
>> if the packet length is the maximum allowed, it can't forward packets with
>> VLAN headers included. So, I agree that this needs improvements on the
>> DPDK side to report not only the maximum MTU but also to standardize what
>> means MTU. Today we don't know the overhead each PMD will account.
>>
>> In the hope that the committer can fix the small typos above or if there
>> is another spin:
>>
>> Acked-by: Flavio Leitner <fbl@sysclose.org>
>
>Thanks Flavio,
>
>I agree, we can add this to requirements for DPDK MTU capabilities.
>
>I'll re-spin a v3 and remove the RFC, once there's an ACK from Mark I'll add
>this to the DPDK_MERGE over the weekend.
>
>Thanks
>Ian
>>
>> Thanks Ian!
>>
>>
>> > --
>> > 1.7.0.7
>> >
>> > _______________________________________________
>> > dev mailing list
>> > dev@openvswitch.org
>> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>> --
>> Flavio
Stokes, Ian Jan. 22, 2018, 11:15 a.m. UTC | #4
> Hey Ian,
> 
> Thanks for this patch.
> 
> Apart from the issues that Flavio pointed out, I have a few additional
> comments inline.

Thanks Mark, replies inline.

> 
> Cheers,
> Mark
> 
> >From: Stokes, Ian
> >Sent: Friday, January 19, 2018 2:47 PM
> >To: Flavio Leitner <fbl@sysclose.org>
> >Cc: dev@openvswitch.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
> >Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested
> >MTU size validation.
> >
> >> Hi Ian,
> >>
> >> On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote:
> >> > This commit replaces MTU_TO_FRAME_LEN(mtu) with
> >> > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
> >> > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
> >> >
> >> > When setting an MTU we first check if the requested MTU frame size
> >> > will exceed the maximum packet frame size supported in
> >> > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
> >> > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
> >> > later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
> >>
> >> Perhaps?
> >> dpdk_eth_dev_init()
> >> rte_eth_dev_set_mtu()
> >>
> >>
> >> > However when using rte_ethdev_set_mtu(mtu) the calculation used to
> >> > check that the MTU does not exceed the max frame length for that
> >> > device varies between DPDK device drivers. For example ixgbe driver
> >> > calculates MTU frame length  as
> >> >
> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
> >> >
> >> > i40e driver calculates it as
> >> >
> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
> >> >
> >> > em driver calculates it as
> >> >
> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
> >> >
> >> > Currently it is possible to set an MTU for a netdev_dpdk device
> >> > that exceeds the upper limit MTU for that devices DPDK driver. This
> >> > leads to a segfault. This is because the MTU frame length
> >> > comparison as is, does not take into account the addition of the
> >> > vlan tag overhead expected in the drivers. The
> >> > netdev_dpdk_set_mtu() call will incorrectly succeed but the
> >> > subsequent dpdk_ethdev_init() will fail before the queues have been
> >> > created for the DPDK device. This coupled with assumptions
> >> > regarding reconfiguration requirements for the netdev will lead to a
> segfault when the rxq is polled for this device.
> >> >
> >> > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu)
> >> > when validating a requested MTU in netdev_dpdk_set_mtu().
> >> > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:
> >> >
> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)
> >> >
> >> > It's use accounts for the potential of up to two vlan headers being
> 
> Since you're spinning a V3 anyway... ;)
> <micronit>Its, not "It's"</micronit>

Sure.

> 
> >> > included in the overhead required by some devices in MTU frame
> >> > length at the netdev_dpdk_set_mtu() stage. This allows OVS to flag
> >> > an error rather than the DPDK driver if the MTU exceeds the max
> >> > DPDK frame length. OVS can fail gracefully at this point by falling
> >> > back to a
> 
> I think 'falling back' here is a bit misleading, as it suggest some kind
> of rollback, or previous state restoration.
> In truth, the MTU in this case never changes; it was initialized to 1500,
> as stays as such, since the new value was rejected.

Ok, will remove falling back and replace with 'use default'.

> 
> >> > default MTU of 1500 and continue to configure the port.
> >> >
> >> > Note: this fix is a work around, a better approach would be if DPDK
> >> > devices could report the maximum MTU value that can be requested on
> >> > a per device basis. This capability however is not currently
> available.
> >> > A downside of this patch is that the MTU upper limit will be
> >> > reduced by 8 bytes for DPDK devices that do not need to account for
> >> > vlan tags in MTU frame length driver calculations e.g. ixgbe
> >> > devices upper MTU limit is reduced from the OVS point of view from
> 9710 to 9702.
> >> >
> >> > CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
> >> > Signed-off-by: Ian Stokes <ian.stokes@intel.com>
> >> >
> >> > ---
> >> > v1->v2
> >> > * Add note to limitations in DPDK documentation regarding MTU.
> >> > * Correct MTU calculation in commit message.
> >> > * Flag that we provision 8 bytes (2 x vlan header) by using
> >> >   MTU_TO_MAX_FRAME_LEN in commit message and code comments.
> >> > ---
> >> >  Documentation/intro/install/dpdk.rst |   12 ++++++++++++
> >> >  lib/netdev-dpdk.c                    |   13 ++++++++++++-
> >> >  2 files changed, 24 insertions(+), 1 deletions(-)
> >> >
> >> > diff --git a/Documentation/intro/install/dpdk.rst
> >> > b/Documentation/intro/install/dpdk.rst
> >> > index 3fecb5c..5800096 100644
> >> > --- a/Documentation/intro/install/dpdk.rst
> >> > +++ b/Documentation/intro/install/dpdk.rst
> >> > @@ -585,6 +585,18 @@ Limitations
> >> >
> >> >  .. _DPDK release notes:
> >> > http://dpdk.org/doc/guides/rel_notes/release_17_11.html
> >> >
> >> > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead
> >> > +is
> >> > +  calculated e.g. i40e driver includes 2 x vlan headers in MTU
> >> > +overhead,
> 
> As a general point, this is technically frame (i.e L2) overhead, whereas
> MTU is an L3 concept. I'd rephrase it as 'how the L2 overhead for a given
> MTU value is calculated'.

Will fix in v3.

> 
> >> > +  em driver includes 1 x vlan header, ixgbe driver does not
> >> > +include a vlan
> >> > +  header in overhead. Currently it is not possible for OVS DPDK to
> >> > +know what
> >> > +  upper bound MTU value is supported for a given device. As such
> >> > +OVS DPDK
> >> > +  must provision for the case where the  maximum MTU value
> >> > +includes 2 x vlan
> >> > +  headers. This reduces the upper bound MTU value for devices that
> >> > +do not
> >> > +  include vlan headers in their overhead by 8 bytes e.g. ixgbe
> >> > +devices  upper
> >>
> >> two spaces
> >>
> >> > +  bound MTU is reduced from 9710 to 9702. This work around is
> >> > + temporary and  is expected to be removed once a method is
> >> > + provided by DPDK to query the  maximum MTU for a given device.
> 
> Has it been confirmed that DPDK will provide such functionality? If not,
> you may want to rephrase the previous sentence, so as not to set
> unrealistic expectations.

I don't think it's unrealistic, there's agreement from both the OVS and DPDK community at the community calls that a fix like this is required and would not be controversial. I can rephrase above to be:

'This work around is intended to be temporary until DPDK provides a method to query the  maximum MTU for a given device.'

If that helps?

> 
> >> > +
> >> >  Reporting Bugs
> >> >  --------------
> >> >
> >> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> >> > e32c7f6..2ac76e3 100644
> >> > --- a/lib/netdev-dpdk.c
> >> > +++ b/lib/netdev-dpdk.c
> >> > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev,
> >> > int
> >> > mtu)  {
> >> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >> >
> >> > -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> >> > +    /* XXX: We need to ensure the requested MTU frame length does
> >> > + not
> 
> I'd probably rephrase to something along these lines: "ensure that the
> overall frame length of the requested MTU...".

Sure.

> 
> >> > +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers
> differ
> >> > +     * in how the MTU frame length is calculated when
> >> rte_ethdev_set_mtu(mtu)
> >> > +     * is called e.g. i40e driver includes 2 x vlan headers in MTU
> >> overhead
> >> > +     * the em driver includes 1 x vlan tag in MTU overhead, the
> >> > + ixgbe
> >> driver
> >> > +     * does not include vlan tags in MTU overhead. As such we
> >> > + should
> >> use
> >> > +     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x
> >> > + vlan
> >> headers
> >> > +     * (8 bytes) for comparison. This avoids a failure later with
> >> > +     * rte_ethdev_set_mtu(). This approach should be used until
> >> > + DPDK provides
> >>
> >> rte_eth_dev_set_mtu
> >>
> >> > +     * a method to retrieve maximum MTU frame length for a given
> >> device.
> >> > +     */
> >> > +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> >> >          || mtu < ETHER_MIN_MTU) {
> >> >          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
> >> >          return EINVAL;
> >>
> >> We got a bug report for another related MTU issue because in older
> >> OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead,
> >> so if the packet length is the maximum allowed, it can't forward
> >> packets with VLAN headers included. So, I agree that this needs
> >> improvements on the DPDK side to report not only the maximum MTU but
> >> also to standardize what means MTU. Today we don't know the overhead
> each PMD will account.
> >>
> >> In the hope that the committer can fix the small typos above or if
> >> there is another spin:
> >>
> >> Acked-by: Flavio Leitner <fbl@sysclose.org>
> >
> >Thanks Flavio,
> >
> >I agree, we can add this to requirements for DPDK MTU capabilities.
> >
> >I'll re-spin a v3 and remove the RFC, once there's an ACK from Mark
> >I'll add this to the DPDK_MERGE over the weekend.
> >
> >Thanks
> >Ian
> >>
> >> Thanks Ian!
> >>
> >>
> >> > --
> >> > 1.7.0.7
> >> >
> >> > _______________________________________________
> >> > dev mailing list
> >> > dev@openvswitch.org
> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >>
> >> --
> >> Flavio
Mark Kavanagh Jan. 22, 2018, 11:21 a.m. UTC | #5
>From: Stokes, Ian
>Sent: Monday, January 22, 2018 11:16 AM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; Flavio Leitner
><fbl@sysclose.org>
>Cc: dev@openvswitch.org
>Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested MTU
>size validation.
>
>> Hey Ian,
>>
>> Thanks for this patch.
>>
>> Apart from the issues that Flavio pointed out, I have a few additional
>> comments inline.
>
>Thanks Mark, replies inline.

Thanks Ian - one clarification on the final comment inline ;)
-Mark

>
>>
>> Cheers,
>> Mark
>>
>> >From: Stokes, Ian
>> >Sent: Friday, January 19, 2018 2:47 PM
>> >To: Flavio Leitner <fbl@sysclose.org>
>> >Cc: dev@openvswitch.org; Kavanagh, Mark B <mark.b.kavanagh@intel.com>
>> >Subject: RE: [ovs-dev] [RFC PATCH v2 1/1] netdev-dpdk: Fix requested
>> >MTU size validation.
>> >
>> >> Hi Ian,
>> >>
>> >> On Thu, Jan 18, 2018 at 05:32:22PM +0000, Ian Stokes wrote:
>> >> > This commit replaces MTU_TO_FRAME_LEN(mtu) with
>> >> > MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
>> >> > NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
>> >> >
>> >> > When setting an MTU we first check if the requested MTU frame size
>> >> > will exceed the maximum packet frame size supported in
>> >> > netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
>> >> > ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
>> >> > later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
>> >>
>> >> Perhaps?
>> >> dpdk_eth_dev_init()
>> >> rte_eth_dev_set_mtu()
>> >>
>> >>
>> >> > However when using rte_ethdev_set_mtu(mtu) the calculation used to
>> >> > check that the MTU does not exceed the max frame length for that
>> >> > device varies between DPDK device drivers. For example ixgbe driver
>> >> > calculates MTU frame length  as
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
>> >> >
>> >> > i40e driver calculates it as
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
>> >> >
>> >> > em driver calculates it as
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
>> >> >
>> >> > Currently it is possible to set an MTU for a netdev_dpdk device
>> >> > that exceeds the upper limit MTU for that devices DPDK driver. This
>> >> > leads to a segfault. This is because the MTU frame length
>> >> > comparison as is, does not take into account the addition of the
>> >> > vlan tag overhead expected in the drivers. The
>> >> > netdev_dpdk_set_mtu() call will incorrectly succeed but the
>> >> > subsequent dpdk_ethdev_init() will fail before the queues have been
>> >> > created for the DPDK device. This coupled with assumptions
>> >> > regarding reconfiguration requirements for the netdev will lead to a
>> segfault when the rxq is polled for this device.
>> >> >
>> >> > A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu)
>> >> > when validating a requested MTU in netdev_dpdk_set_mtu().
>> >> > MTU_TO_MAX_FRAME_LEN(mtu) is equivalent to the following:
>> >> >
>> >> > mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + (2 * VLAN_HEADER_LEN)
>> >> >
>> >> > It's use accounts for the potential of up to two vlan headers being
>>
>> Since you're spinning a V3 anyway... ;)
>> <micronit>Its, not "It's"</micronit>
>
>Sure.
>
>>
>> >> > included in the overhead required by some devices in MTU frame
>> >> > length at the netdev_dpdk_set_mtu() stage. This allows OVS to flag
>> >> > an error rather than the DPDK driver if the MTU exceeds the max
>> >> > DPDK frame length. OVS can fail gracefully at this point by falling
>> >> > back to a
>>
>> I think 'falling back' here is a bit misleading, as it suggest some kind
>> of rollback, or previous state restoration.
>> In truth, the MTU in this case never changes; it was initialized to 1500,
>> as stays as such, since the new value was rejected.
>
>Ok, will remove falling back and replace with 'use default'.
>
>>
>> >> > default MTU of 1500 and continue to configure the port.
>> >> >
>> >> > Note: this fix is a work around, a better approach would be if DPDK
>> >> > devices could report the maximum MTU value that can be requested on
>> >> > a per device basis. This capability however is not currently
>> available.
>> >> > A downside of this patch is that the MTU upper limit will be
>> >> > reduced by 8 bytes for DPDK devices that do not need to account for
>> >> > vlan tags in MTU frame length driver calculations e.g. ixgbe
>> >> > devices upper MTU limit is reduced from the OVS point of view from
>> 9710 to 9702.
>> >> >
>> >> > CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> >> > Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
>> >> > Signed-off-by: Ian Stokes <ian.stokes@intel.com>
>> >> >
>> >> > ---
>> >> > v1->v2
>> >> > * Add note to limitations in DPDK documentation regarding MTU.
>> >> > * Correct MTU calculation in commit message.
>> >> > * Flag that we provision 8 bytes (2 x vlan header) by using
>> >> >   MTU_TO_MAX_FRAME_LEN in commit message and code comments.
>> >> > ---
>> >> >  Documentation/intro/install/dpdk.rst |   12 ++++++++++++
>> >> >  lib/netdev-dpdk.c                    |   13 ++++++++++++-
>> >> >  2 files changed, 24 insertions(+), 1 deletions(-)
>> >> >
>> >> > diff --git a/Documentation/intro/install/dpdk.rst
>> >> > b/Documentation/intro/install/dpdk.rst
>> >> > index 3fecb5c..5800096 100644
>> >> > --- a/Documentation/intro/install/dpdk.rst
>> >> > +++ b/Documentation/intro/install/dpdk.rst
>> >> > @@ -585,6 +585,18 @@ Limitations
>> >> >
>> >> >  .. _DPDK release notes:
>> >> > http://dpdk.org/doc/guides/rel_notes/release_17_11.html
>> >> >
>> >> > +- Upper bound MTU: DPDK device drivers differ in how MTU overhead
>> >> > +is
>> >> > +  calculated e.g. i40e driver includes 2 x vlan headers in MTU
>> >> > +overhead,
>>
>> As a general point, this is technically frame (i.e L2) overhead, whereas
>> MTU is an L3 concept. I'd rephrase it as 'how the L2 overhead for a given
>> MTU value is calculated'.
>
>Will fix in v3.
>
>>
>> >> > +  em driver includes 1 x vlan header, ixgbe driver does not
>> >> > +include a vlan
>> >> > +  header in overhead. Currently it is not possible for OVS DPDK to
>> >> > +know what
>> >> > +  upper bound MTU value is supported for a given device. As such
>> >> > +OVS DPDK
>> >> > +  must provision for the case where the  maximum MTU value
>> >> > +includes 2 x vlan
>> >> > +  headers. This reduces the upper bound MTU value for devices that
>> >> > +do not
>> >> > +  include vlan headers in their overhead by 8 bytes e.g. ixgbe
>> >> > +devices  upper
>> >>
>> >> two spaces
>> >>
>> >> > +  bound MTU is reduced from 9710 to 9702. This work around is
>> >> > + temporary and  is expected to be removed once a method is
>> >> > + provided by DPDK to query the  maximum MTU for a given device.
>>
>> Has it been confirmed that DPDK will provide such functionality? If not,
>> you may want to rephrase the previous sentence, so as not to set
>> unrealistic expectations.
>
>I don't think it's unrealistic, there's agreement from both the OVS and DPDK
>community at the community calls that a fix like this is required and would
>not be controversial. I can rephrase above to be:

No need to rephrase Ian - I just wanted to know if DPDK had confirmed that they will provide a fix; since they have, it's perfectly fine to set that expectation here.
 
>
>'This work around is intended to be temporary until DPDK provides a method
>to query the  maximum MTU for a given device.'
>
>If that helps?
>
>>
>> >> > +
>> >> >  Reporting Bugs
>> >> >  --------------
>> >> >
>> >> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
>> >> > e32c7f6..2ac76e3 100644
>> >> > --- a/lib/netdev-dpdk.c
>> >> > +++ b/lib/netdev-dpdk.c
>> >> > @@ -2134,7 +2134,18 @@ netdev_dpdk_set_mtu(struct netdev *netdev,
>> >> > int
>> >> > mtu)  {
>> >> >      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>> >> >
>> >> > -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >> > +    /* XXX: We need to ensure the requested MTU frame length does
>> >> > + not
>>
>> I'd probably rephrase to something along these lines: "ensure that the
>> overall frame length of the requested MTU...".
>
>Sure.
>
>>
>> >> > +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers
>> differ
>> >> > +     * in how the MTU frame length is calculated when
>> >> rte_ethdev_set_mtu(mtu)
>> >> > +     * is called e.g. i40e driver includes 2 x vlan headers in MTU
>> >> overhead
>> >> > +     * the em driver includes 1 x vlan tag in MTU overhead, the
>> >> > + ixgbe
>> >> driver
>> >> > +     * does not include vlan tags in MTU overhead. As such we
>> >> > + should
>> >> use
>> >> > +     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x
>> >> > + vlan
>> >> headers
>> >> > +     * (8 bytes) for comparison. This avoids a failure later with
>> >> > +     * rte_ethdev_set_mtu(). This approach should be used until
>> >> > + DPDK provides
>> >>
>> >> rte_eth_dev_set_mtu
>> >>
>> >> > +     * a method to retrieve maximum MTU frame length for a given
>> >> device.
>> >> > +     */
>> >> > +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >> >          || mtu < ETHER_MIN_MTU) {
>> >> >          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>> >> >          return EINVAL;
>> >>
>> >> We got a bug report for another related MTU issue because in older
>> >> OVS/DPDK, the i40e driver didn't include the 2 VLAN headers overhead,
>> >> so if the packet length is the maximum allowed, it can't forward
>> >> packets with VLAN headers included. So, I agree that this needs
>> >> improvements on the DPDK side to report not only the maximum MTU but
>> >> also to standardize what means MTU. Today we don't know the overhead
>> each PMD will account.
>> >>
>> >> In the hope that the committer can fix the small typos above or if
>> >> there is another spin:
>> >>
>> >> Acked-by: Flavio Leitner <fbl@sysclose.org>
>> >
>> >Thanks Flavio,
>> >
>> >I agree, we can add this to requirements for DPDK MTU capabilities.
>> >
>> >I'll re-spin a v3 and remove the RFC, once there's an ACK from Mark
>> >I'll add this to the DPDK_MERGE over the weekend.
>> >
>> >Thanks
>> >Ian
>> >>
>> >> Thanks Ian!
>> >>
>> >>
>> >> > --
>> >> > 1.7.0.7
>> >> >
>> >> > _______________________________________________
>> >> > dev mailing list
>> >> > dev@openvswitch.org
>> >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> >>
>> >> --
>> >> Flavio
diff mbox series

Patch

diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst
index 3fecb5c..5800096 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -585,6 +585,18 @@  Limitations
 
 .. _DPDK release notes: http://dpdk.org/doc/guides/rel_notes/release_17_11.html
 
+- Upper bound MTU: DPDK device drivers differ in how MTU overhead is
+  calculated e.g. i40e driver includes 2 x vlan headers in MTU overhead,
+  em driver includes 1 x vlan header, ixgbe driver does not include a vlan
+  header in overhead. Currently it is not possible for OVS DPDK to know what
+  upper bound MTU value is supported for a given device. As such OVS DPDK
+  must provision for the case where the  maximum MTU value includes 2 x vlan
+  headers. This reduces the upper bound MTU value for devices that do not
+  include vlan headers in their overhead by 8 bytes e.g. ixgbe devices  upper
+  bound MTU is reduced from 9710 to 9702. This work around is temporary and
+  is expected to be removed once a method is provided by DPDK to query the
+  maximum MTU for a given device.
+
 Reporting Bugs
 --------------
 
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index e32c7f6..2ac76e3 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2134,7 +2134,18 @@  netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 
-    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
+    /* XXX: We need to ensure the requested MTU frame length does not
+     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
+     * in how the MTU frame length is calculated when rte_ethdev_set_mtu(mtu)
+     * is called e.g. i40e driver includes 2 x vlan headers in MTU overhead
+     * the em driver includes 1 x vlan tag in MTU overhead, the ixgbe driver
+     * does not include vlan tags in MTU overhead. As such we should use
+     * MTU_TO_MAX_FRAME_LEN(mtu) which includes an additional 2 x vlan headers
+     * (8 bytes) for comparison. This avoids a failure later with
+     * rte_ethdev_set_mtu(). This approach should be used until DPDK provides
+     * a method to retrieve maximum MTU frame length for a given device.
+     */
+    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
         || mtu < ETHER_MIN_MTU) {
         VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
         return EINVAL;