[ovs-dev,RFC,v1,1/1] netdev-dpdk: Fix requested MTU size validation.

Message ID 1515538465-15626-1-git-send-email-ian.stokes@intel.com
State Superseded
Headers show
Series
  • [ovs-dev,RFC,v1,1/1] netdev-dpdk: Fix requested MTU size validation.
Related show

Commit Message

Ian Stokes Jan. 9, 2018, 10:54 p.m.
This commit replaces MTU_TO_FRAME_LEN(mtu) with
MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().

When setting an MTU we first check if the requested MTU frame
size will exceed the maximum packet frame size supported in
netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later
stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).

However when using rte_ethdev_set_mtu(mtu) the calculation used to check
that the MTU does not exceed the max frame length for that device varies
between DPDK device drivers. For example ixgbe driver calculates MTU
frame length  as

mtu + ETHER_HDR_LEN + ETHER_CRC_LEN

i40e driver calculates it as

ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2

em driver calculates it as

mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE

Currently it is possible to set an MTU for a netdev_dpdk device that exceeds
the upper limit MTU for that devices DPDK driver. This leads to a
segfault. This is because the MTU frame length comparison as is, does
not take into account the addition of the vlan tag overhead expected in
the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the
subsequent dpdk_ethdev_init() will fail before before queues have be
created for DPDK device. This coupled with assumptions regarding
reconfiguration requirements for the netdev will lead to a segfault when
the rxq is polled for this device.

A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
validating a requested MTU in netdev_dpdk_set_mtu() to account for vlan
overhead required by some devices in MTU frame length.

Note: this fix is a work around, a better approach would be if DPDK devices
could report the maximum MTU value that can be requested on a per device
basis. This capability however is not currently available. A downside of
this patch is that the MTU upper limit will be reduced by 8 bytes for
DPDK devices that do not need to account for vlan tags in MTU frame length
driver calculations e.g. ixgbe devices upper MTU limit is reduced from
the ovs point of view from 9710 to 9702.

As this patch is RFC it has limited testing to i40e and ixgbe devices.
If acceptable to the community documentation will also have to be
updated under limitations I expect.

CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
---
 lib/netdev-dpdk.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

Comments

Ian Stokes Jan. 9, 2018, 10:57 p.m. | #1
> -----Original Message-----
> From: Stokes, Ian
> Sent: Tuesday, January 9, 2018 10:54 PM
> To: dev@openvswitch.org
> Cc: Stokes, Ian <ian.stokes@intel.com>; Kavanagh, Mark B
> <mark.b.kavanagh@intel.com>
> Subject: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size
> validation.
> 
> This commit replaces MTU_TO_FRAME_LEN(mtu) with
> MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
> NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
> 
> When setting an MTU we first check if the requested MTU frame size will
> exceed the maximum packet frame size supported in netdev_dpdk_set_mtu().
> The MTU frame length is calculated as MTU + ETHER_HEADER + ETHER_CRC. The
> MTU for the device will be set at a later stage in dpdk_ethdev_init()
> using rte_ethdev_set_mtu(mtu).
> 
> However when using rte_ethdev_set_mtu(mtu) the calculation used to check
> that the MTU does not exceed the max frame length for that device varies
> between DPDK device drivers. For example ixgbe driver calculates MTU frame
> length  as
> 
> mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
> 
> i40e driver calculates it as
> 
> ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2

Apologies folks, typo above, calculation should read:

MTU + ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2

Ian
> 
> em driver calculates it as
> 
> mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
> 
> Currently it is possible to set an MTU for a netdev_dpdk device that
> exceeds the upper limit MTU for that devices DPDK driver. This leads to a
> segfault. This is because the MTU frame length comparison as is, does not
> take into account the addition of the vlan tag overhead expected in the
> drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the
> subsequent dpdk_ethdev_init() will fail before before queues have be
> created for DPDK device. This coupled with assumptions regarding
> reconfiguration requirements for the netdev will lead to a segfault when
> the rxq is polled for this device.
> 
> A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
> validating a requested MTU in netdev_dpdk_set_mtu() to account for vlan
> overhead required by some devices in MTU frame length.
> 
> Note: this fix is a work around, a better approach would be if DPDK
> devices could report the maximum MTU value that can be requested on a per
> device basis. This capability however is not currently available. A
> downside of this patch is that the MTU upper limit will be reduced by 8
> bytes for DPDK devices that do not need to account for vlan tags in MTU
> frame length driver calculations e.g. ixgbe devices upper MTU limit is
> reduced from the ovs point of view from 9710 to 9702.
> 
> As this patch is RFC it has limited testing to i40e and ixgbe devices.
> If acceptable to the community documentation will also have to be updated
> under limitations I expect.
> 
> CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
> Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
> ---
>  lib/netdev-dpdk.c |   11 ++++++++++-
>  1 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 364f545..4da5292
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1999,7 +1999,16 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
> {
>      struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> 
> -    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> +    /* XXX: We need to ensure the requested MTU frame length does not
> +     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
> +     * in how the MTU frame length is calculated when
> rte_ethdev_set_mtu(mtu)
> +     * is called e.g. i40e and em drivers include vlan tags as part of
> the
> +     * MTU frame length. As such we should use MTU_TO_MAX_FRAME_LEN(mtu)
> +     * for comparison here to avoid a failure later with
> rte_ethdev_set_mtu().
> +     * This approach should be used until DPDK provides a method to
> retrieve
> +     * maximum MTU frame length for a given device.
> +     */
> +    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>          || mtu < ETHER_MIN_MTU) {
>          VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>          return EINVAL;
> --
> 1.7.0.7
Mark Kavanagh Jan. 12, 2018, 4:59 p.m. | #2
>-----Original Message-----
>From: Stokes, Ian
>Sent: Tuesday, January 9, 2018 10:54 PM
>To: dev@openvswitch.org
>Cc: Stokes, Ian <ian.stokes@intel.com>; Kavanagh, Mark B
><mark.b.kavanagh@intel.com>
>Subject: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size validation.
>

Hey Ian,

Thanks for the patch - some comments inline.

Cheers,
Mark

>This commit replaces MTU_TO_FRAME_LEN(mtu) with
>MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
>NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
>
>When setting an MTU we first check if the requested MTU frame
>size will exceed the maximum packet frame size supported in
>netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
>ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later
>stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
>
>However when using rte_ethdev_set_mtu(mtu) the calculation used to check
>that the MTU does not exceed the max frame length for that device varies
>between DPDK device drivers. For example ixgbe driver calculates MTU
>frame length  as
>
>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
>
>i40e driver calculates it as
>
>ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2

Missing text above, as you've already observed.

>
>em driver calculates it as
>
>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
>
>Currently it is possible to set an MTU for a netdev_dpdk device that exceeds
>the upper limit MTU for that devices DPDK driver. This leads to a
>segfault. This is because the MTU frame length comparison as is, does
>not take into account the addition of the vlan tag overhead expected in
>the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but the
>subsequent dpdk_ethdev_init() will fail before before queues have be

Typos in the line above: repetition of 'before'; also, s/be$/been/ .

>created for DPDK device. This coupled with assumptions regarding
>reconfiguration requirements for the netdev will lead to a segfault when
>the rxq is polled for this device.
>
>A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
>validating a requested MTU in netdev_dpdk_set_mtu() to account for vlan
>overhead required by some devices in MTU frame length.

It would be worthwhile explaining why using MTU_TO_MAX_FRAME_LEN fixes the issue; i.e. MTU_TO_MAX_FRAME_LEN
takes into account 8 additional bytes of overhead when calculating the maximum frame size for a given MTU;
this corresponds to the upper bound of additional overhead factored in to the frame size calculation for
current DPDK drivers. 

>
>Note: this fix is a work around, a better approach would be if DPDK devices
>could report the maximum MTU value that can be requested on a per device
>basis. This capability however is not currently available. A downside of
>this patch is that the MTU upper limit will be reduced by 8 bytes for
>DPDK devices that do not need to account for vlan tags in MTU frame length
>driver calculations e.g. ixgbe devices upper MTU limit is reduced from
>the ovs point of view from 9710 to 9702.

Is this case, how does OVSDB report the port's MTU - as 9710, or as 9702? If the latter, this could be problematic.
At the very least, this behavior should be documented in the errata (as you've outlined below), and also a warning should be issued to the user, to make them aware that MTU truncation has occurred.

>
>As this patch is RFC it has limited testing to i40e and ixgbe devices.
>If acceptable to the community documentation will also have to be
>updated under limitations I expect.
>
>CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
>Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
>Signed-off-by: Ian Stokes <ian.stokes@intel.com>
>---
> lib/netdev-dpdk.c |   11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
>
>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>index 364f545..4da5292 100644
>--- a/lib/netdev-dpdk.c
>+++ b/lib/netdev-dpdk.c
>@@ -1999,7 +1999,16 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
> {
>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>
>-    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>+    /* XXX: We need to ensure the requested MTU frame length does not
>+     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
>+     * in how the MTU frame length is calculated when
>rte_ethdev_set_mtu(mtu)
>+     * is called e.g. i40e and em drivers include vlan tags as part of the

As you've noted in the commit message, the em driver factors in the length of a single VLAN tag, but i40e (among others) factors in two.
It would be worth noting in the comment that the greater value (i.e. 8B) is taken here.

>+     * MTU frame length. As such we should use MTU_TO_MAX_FRAME_LEN(mtu)
>+     * for comparison here to avoid a failure later with
>rte_ethdev_set_mtu().
>+     * This approach should be used until DPDK provides a method to
>retrieve
>+     * maximum MTU frame length for a given device.
>+     */
>+    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>         || mtu < ETHER_MIN_MTU) {
>         VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>         return EINVAL;
>--
>1.7.0.7
Mark Kavanagh Jan. 12, 2018, 5:03 p.m. | #3
Quick follow-up inline.

>-----Original Message-----
>From: Kavanagh, Mark B
>Sent: Friday, January 12, 2018 5:00 PM
>To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
>Subject: RE: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size
>validation.
>
>
>
>>-----Original Message-----
>>From: Stokes, Ian
>>Sent: Tuesday, January 9, 2018 10:54 PM
>>To: dev@openvswitch.org
>>Cc: Stokes, Ian <ian.stokes@intel.com>; Kavanagh, Mark B
>><mark.b.kavanagh@intel.com>
>>Subject: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size validation.
>>
>
>Hey Ian,
>
>Thanks for the patch - some comments inline.
>
>Cheers,
>Mark
>
>>This commit replaces MTU_TO_FRAME_LEN(mtu) with
>>MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
>>NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
>>
>>When setting an MTU we first check if the requested MTU frame
>>size will exceed the maximum packet frame size supported in
>>netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
>>ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a later
>>stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
>>
>>However when using rte_ethdev_set_mtu(mtu) the calculation used to check
>>that the MTU does not exceed the max frame length for that device varies
>>between DPDK device drivers. For example ixgbe driver calculates MTU
>>frame length  as
>>
>>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
>>
>>i40e driver calculates it as
>>
>>ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
>
>Missing text above, as you've already observed.
>
>>
>>em driver calculates it as
>>
>>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
>>
>>Currently it is possible to set an MTU for a netdev_dpdk device that
>exceeds
>>the upper limit MTU for that devices DPDK driver. This leads to a
>>segfault. This is because the MTU frame length comparison as is, does
>>not take into account the addition of the vlan tag overhead expected in
>>the drivers. The netdev_dpdk_set_mtu() call will incorrectly succeed but
>the
>>subsequent dpdk_ethdev_init() will fail before before queues have be
>
>Typos in the line above: repetition of 'before'; also, s/be$/been/ .
>
>>created for DPDK device. This coupled with assumptions regarding
>>reconfiguration requirements for the netdev will lead to a segfault when
>>the rxq is polled for this device.
>>
>>A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
>>validating a requested MTU in netdev_dpdk_set_mtu() to account for vlan
>>overhead required by some devices in MTU frame length.
>
>It would be worthwhile explaining why using MTU_TO_MAX_FRAME_LEN fixes the
>issue; i.e. MTU_TO_MAX_FRAME_LEN
>takes into account 8 additional bytes of overhead when calculating the
>maximum frame size for a given MTU;
>this corresponds to the upper bound of additional overhead factored in to
>the frame size calculation for
>current DPDK drivers.
>
>>
>>Note: this fix is a work around, a better approach would be if DPDK devices
>>could report the maximum MTU value that can be requested on a per device
>>basis. This capability however is not currently available. A downside of
>>this patch is that the MTU upper limit will be reduced by 8 bytes for
>>DPDK devices that do not need to account for vlan tags in MTU frame length
>>driver calculations e.g. ixgbe devices upper MTU limit is reduced from
>>the ovs point of view from 9710 to 9702.
>
>Is this case, how does OVSDB report the port's MTU - as 9710, or as 9702? If
>the latter, this could be problematic.

Scratch that last comment - I just realized that values 9703-9710 will be rejected when attempting to set the MTU.
Thanks,
Mark

>At the very least, this behavior should be documented in the errata (as
>you've outlined below), and also a warning should be issued to the user, to
>make them aware that MTU truncation has occurred.
>
>>
>>As this patch is RFC it has limited testing to i40e and ixgbe devices.
>>If acceptable to the community documentation will also have to be
>>updated under limitations I expect.
>>
>>CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
>>Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
>>Signed-off-by: Ian Stokes <ian.stokes@intel.com>
>>---
>> lib/netdev-dpdk.c |   11 ++++++++++-
>> 1 files changed, 10 insertions(+), 1 deletions(-)
>>
>>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>index 364f545..4da5292 100644
>>--- a/lib/netdev-dpdk.c
>>+++ b/lib/netdev-dpdk.c
>>@@ -1999,7 +1999,16 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
>> {
>>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>
>>-    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>>+    /* XXX: We need to ensure the requested MTU frame length does not
>>+     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
>>+     * in how the MTU frame length is calculated when
>>rte_ethdev_set_mtu(mtu)
>>+     * is called e.g. i40e and em drivers include vlan tags as part of the
>
>As you've noted in the commit message, the em driver factors in the length
>of a single VLAN tag, but i40e (among others) factors in two.
>It would be worth noting in the comment that the greater value (i.e. 8B) is
>taken here.
>
>>+     * MTU frame length. As such we should use MTU_TO_MAX_FRAME_LEN(mtu)
>>+     * for comparison here to avoid a failure later with
>>rte_ethdev_set_mtu().
>>+     * This approach should be used until DPDK provides a method to
>>retrieve
>>+     * maximum MTU frame length for a given device.
>>+     */
>>+    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>>         || mtu < ETHER_MIN_MTU) {
>>         VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>>         return EINVAL;
>>--
>>1.7.0.7
Ian Stokes Jan. 18, 2018, 5:34 p.m. | #4
> Quick follow-up inline.
> 
> >-----Original Message-----
> >From: Kavanagh, Mark B
> >Sent: Friday, January 12, 2018 5:00 PM
> >To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
> >Subject: RE: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size
> >validation.
> >
> >
> >
> >>-----Original Message-----
> >>From: Stokes, Ian
> >>Sent: Tuesday, January 9, 2018 10:54 PM
> >>To: dev@openvswitch.org
> >>Cc: Stokes, Ian <ian.stokes@intel.com>; Kavanagh, Mark B
> >><mark.b.kavanagh@intel.com>
> >>Subject: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size
> validation.
> >>
> >
> >Hey Ian,
> >
> >Thanks for the patch - some comments inline.
> >
> >Cheers,
> >Mark

Thanks for the feedback Mark, v2 based on it has been posted.

https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/343390.html

Thanks
Ian
> >
> >>This commit replaces MTU_TO_FRAME_LEN(mtu) with
> >>MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
> >>NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
> >>
> >>When setting an MTU we first check if the requested MTU frame size
> >>will exceed the maximum packet frame size supported in
> >>netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
> >>ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
> >>later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
> >>
> >>However when using rte_ethdev_set_mtu(mtu) the calculation used to
> >>check that the MTU does not exceed the max frame length for that
> >>device varies between DPDK device drivers. For example ixgbe driver
> >>calculates MTU frame length  as
> >>
> >>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
> >>
> >>i40e driver calculates it as
> >>
> >>ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
> >
> >Missing text above, as you've already observed.
> >
> >>
> >>em driver calculates it as
> >>
> >>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
> >>
> >>Currently it is possible to set an MTU for a netdev_dpdk device that
> >exceeds
> >>the upper limit MTU for that devices DPDK driver. This leads to a
> >>segfault. This is because the MTU frame length comparison as is, does
> >>not take into account the addition of the vlan tag overhead expected
> >>in the drivers. The netdev_dpdk_set_mtu() call will incorrectly
> >>succeed but
> >the
> >>subsequent dpdk_ethdev_init() will fail before before queues have be
> >
> >Typos in the line above: repetition of 'before'; also, s/be$/been/ .
> >
> >>created for DPDK device. This coupled with assumptions regarding
> >>reconfiguration requirements for the netdev will lead to a segfault
> >>when the rxq is polled for this device.
> >>
> >>A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
> >>validating a requested MTU in netdev_dpdk_set_mtu() to account for
> >>vlan overhead required by some devices in MTU frame length.
> >
> >It would be worthwhile explaining why using MTU_TO_MAX_FRAME_LEN fixes
> >the issue; i.e. MTU_TO_MAX_FRAME_LEN takes into account 8 additional
> >bytes of overhead when calculating the maximum frame size for a given
> >MTU; this corresponds to the upper bound of additional overhead
> >factored in to the frame size calculation for current DPDK drivers.
> >
> >>
> >>Note: this fix is a work around, a better approach would be if DPDK
> >>devices could report the maximum MTU value that can be requested on a
> >>per device basis. This capability however is not currently available.
> >>A downside of this patch is that the MTU upper limit will be reduced
> >>by 8 bytes for DPDK devices that do not need to account for vlan tags
> >>in MTU frame length driver calculations e.g. ixgbe devices upper MTU
> >>limit is reduced from the ovs point of view from 9710 to 9702.
> >
> >Is this case, how does OVSDB report the port's MTU - as 9710, or as
> >9702? If the latter, this could be problematic.
> 
> Scratch that last comment - I just realized that values 9703-9710 will be
> rejected when attempting to set the MTU.
> Thanks,
> Mark
> 
> >At the very least, this behavior should be documented in the errata (as
> >you've outlined below), and also a warning should be issued to the
> >user, to make them aware that MTU truncation has occurred.
> >
> >>
> >>As this patch is RFC it has limited testing to i40e and ixgbe devices.
> >>If acceptable to the community documentation will also have to be
> >>updated under limitations I expect.
> >>
> >>CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
> >>Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
> >>Signed-off-by: Ian Stokes <ian.stokes@intel.com>
> >>---
> >> lib/netdev-dpdk.c |   11 ++++++++++-
> >> 1 files changed, 10 insertions(+), 1 deletions(-)
> >>
> >>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
> >>364f545..4da5292 100644
> >>--- a/lib/netdev-dpdk.c
> >>+++ b/lib/netdev-dpdk.c
> >>@@ -1999,7 +1999,16 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int
> >>mtu)  {
> >>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> >>
> >>-    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> >>+    /* XXX: We need to ensure the requested MTU frame length does not
> >>+     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
> >>+     * in how the MTU frame length is calculated when
> >>rte_ethdev_set_mtu(mtu)
> >>+     * is called e.g. i40e and em drivers include vlan tags as part
> >>+ of the
> >
> >As you've noted in the commit message, the em driver factors in the
> >length of a single VLAN tag, but i40e (among others) factors in two.
> >It would be worth noting in the comment that the greater value (i.e.
> >8B) is taken here.
> >
> >>+     * MTU frame length. As such we should use
> MTU_TO_MAX_FRAME_LEN(mtu)
> >>+     * for comparison here to avoid a failure later with
> >>rte_ethdev_set_mtu().
> >>+     * This approach should be used until DPDK provides a method to
> >>retrieve
> >>+     * maximum MTU frame length for a given device.
> >>+     */
> >>+    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
> >>         || mtu < ETHER_MIN_MTU) {
> >>         VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
> >>         return EINVAL;
> >>--
> >>1.7.0.7
Mark Kavanagh Jan. 18, 2018, 7:37 p.m. | #5
>From: Stokes, Ian
>Sent: Thursday, January 18, 2018 5:35 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; dev@openvswitch.org
>Subject: RE: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size
>validation.
>
>> Quick follow-up inline.
>>
>> >-----Original Message-----
>> >From: Kavanagh, Mark B
>> >Sent: Friday, January 12, 2018 5:00 PM
>> >To: Stokes, Ian <ian.stokes@intel.com>; dev@openvswitch.org
>> >Subject: RE: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size
>> >validation.
>> >
>> >
>> >
>> >>-----Original Message-----
>> >>From: Stokes, Ian
>> >>Sent: Tuesday, January 9, 2018 10:54 PM
>> >>To: dev@openvswitch.org
>> >>Cc: Stokes, Ian <ian.stokes@intel.com>; Kavanagh, Mark B
>> >><mark.b.kavanagh@intel.com>
>> >>Subject: [RFC PATCH v1 1/1] netdev-dpdk: Fix requested MTU size
>> validation.
>> >>
>> >
>> >Hey Ian,
>> >
>> >Thanks for the patch - some comments inline.
>> >
>> >Cheers,
>> >Mark
>
>Thanks for the feedback Mark, v2 based on it has been posted.
>
>https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/343390.html

Cheers Ian - I'll try to get to this tomorrow.
-Mark

>
>Thanks
>Ian
>> >
>> >>This commit replaces MTU_TO_FRAME_LEN(mtu) with
>> >>MTU_TO_MAX_FRAME_LEN(mtu) when validating if an MTU will exceed
>> >>NETDEV_DPDK_MAX_PKT_LEN in netdev_dpdk_set_mtu().
>> >>
>> >>When setting an MTU we first check if the requested MTU frame size
>> >>will exceed the maximum packet frame size supported in
>> >>netdev_dpdk_set_mtu(). The MTU frame length is calculated as MTU +
>> >>ETHER_HEADER + ETHER_CRC. The MTU for the device will be set at a
>> >>later stage in dpdk_ethdev_init() using rte_ethdev_set_mtu(mtu).
>> >>
>> >>However when using rte_ethdev_set_mtu(mtu) the calculation used to
>> >>check that the MTU does not exceed the max frame length for that
>> >>device varies between DPDK device drivers. For example ixgbe driver
>> >>calculates MTU frame length  as
>> >>
>> >>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN
>> >>
>> >>i40e driver calculates it as
>> >>
>> >>ETHER_HDR_LEN + ETHER_CRC_LEN + I40E_VLAN_TAG_SIZE * 2
>> >
>> >Missing text above, as you've already observed.
>> >
>> >>
>> >>em driver calculates it as
>> >>
>> >>mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE
>> >>
>> >>Currently it is possible to set an MTU for a netdev_dpdk device that
>> >exceeds
>> >>the upper limit MTU for that devices DPDK driver. This leads to a
>> >>segfault. This is because the MTU frame length comparison as is, does
>> >>not take into account the addition of the vlan tag overhead expected
>> >>in the drivers. The netdev_dpdk_set_mtu() call will incorrectly
>> >>succeed but
>> >the
>> >>subsequent dpdk_ethdev_init() will fail before before queues have be
>> >
>> >Typos in the line above: repetition of 'before'; also, s/be$/been/ .
>> >
>> >>created for DPDK device. This coupled with assumptions regarding
>> >>reconfiguration requirements for the netdev will lead to a segfault
>> >>when the rxq is polled for this device.
>> >>
>> >>A simple way to avoid this is by using MTU_TO_MAX_FRAME_LEN(mtu) when
>> >>validating a requested MTU in netdev_dpdk_set_mtu() to account for
>> >>vlan overhead required by some devices in MTU frame length.
>> >
>> >It would be worthwhile explaining why using MTU_TO_MAX_FRAME_LEN fixes
>> >the issue; i.e. MTU_TO_MAX_FRAME_LEN takes into account 8 additional
>> >bytes of overhead when calculating the maximum frame size for a given
>> >MTU; this corresponds to the upper bound of additional overhead
>> >factored in to the frame size calculation for current DPDK drivers.
>> >
>> >>
>> >>Note: this fix is a work around, a better approach would be if DPDK
>> >>devices could report the maximum MTU value that can be requested on a
>> >>per device basis. This capability however is not currently available.
>> >>A downside of this patch is that the MTU upper limit will be reduced
>> >>by 8 bytes for DPDK devices that do not need to account for vlan tags
>> >>in MTU frame length driver calculations e.g. ixgbe devices upper MTU
>> >>limit is reduced from the ovs point of view from 9710 to 9702.
>> >
>> >Is this case, how does OVSDB report the port's MTU - as 9710, or as
>> >9702? If the latter, this could be problematic.
>>
>> Scratch that last comment - I just realized that values 9703-9710 will be
>> rejected when attempting to set the MTU.
>> Thanks,
>> Mark
>>
>> >At the very least, this behavior should be documented in the errata (as
>> >you've outlined below), and also a warning should be issued to the
>> >user, to make them aware that MTU truncation has occurred.
>> >
>> >>
>> >>As this patch is RFC it has limited testing to i40e and ixgbe devices.
>> >>If acceptable to the community documentation will also have to be
>> >>updated under limitations I expect.
>> >>
>> >>CC: Mark Kavanagh <mark.b.kavanagh@intel.com>
>> >>Fixes: 0072e931 ("netdev-dpdk: add support for jumbo frames")
>> >>Signed-off-by: Ian Stokes <ian.stokes@intel.com>
>> >>---
>> >> lib/netdev-dpdk.c |   11 ++++++++++-
>> >> 1 files changed, 10 insertions(+), 1 deletions(-)
>> >>
>> >>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
>> >>364f545..4da5292 100644
>> >>--- a/lib/netdev-dpdk.c
>> >>+++ b/lib/netdev-dpdk.c
>> >>@@ -1999,7 +1999,16 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int
>> >>mtu)  {
>> >>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>> >>
>> >>-    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >>+    /* XXX: We need to ensure the requested MTU frame length does not
>> >>+     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
>> >>+     * in how the MTU frame length is calculated when
>> >>rte_ethdev_set_mtu(mtu)
>> >>+     * is called e.g. i40e and em drivers include vlan tags as part
>> >>+ of the
>> >
>> >As you've noted in the commit message, the em driver factors in the
>> >length of a single VLAN tag, but i40e (among others) factors in two.
>> >It would be worth noting in the comment that the greater value (i.e.
>> >8B) is taken here.
>> >
>> >>+     * MTU frame length. As such we should use
>> MTU_TO_MAX_FRAME_LEN(mtu)
>> >>+     * for comparison here to avoid a failure later with
>> >>rte_ethdev_set_mtu().
>> >>+     * This approach should be used until DPDK provides a method to
>> >>retrieve
>> >>+     * maximum MTU frame length for a given device.
>> >>+     */
>> >>+    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
>> >>         || mtu < ETHER_MIN_MTU) {
>> >>         VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
>> >>         return EINVAL;
>> >>--
>> >>1.7.0.7

Patch

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 364f545..4da5292 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1999,7 +1999,16 @@  netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 
-    if (MTU_TO_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
+    /* XXX: We need to ensure the requested MTU frame length does not
+     * surpass the NETDEV_DPDK_MAX_PKT_LEN. DPDK device drivers differ
+     * in how the MTU frame length is calculated when rte_ethdev_set_mtu(mtu)
+     * is called e.g. i40e and em drivers include vlan tags as part of the
+     * MTU frame length. As such we should use MTU_TO_MAX_FRAME_LEN(mtu)
+     * for comparison here to avoid a failure later with rte_ethdev_set_mtu().
+     * This approach should be used until DPDK provides a method to retrieve
+     * maximum MTU frame length for a given device.
+     */
+    if (MTU_TO_MAX_FRAME_LEN(mtu) > NETDEV_DPDK_MAX_PKT_LEN
         || mtu < ETHER_MIN_MTU) {
         VLOG_WARN("%s: unsupported MTU %d\n", dev->up.name, mtu);
         return EINVAL;