Message ID | 20180112204459.2311-1-blp@ovn.org |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev] dpif-netlink-rtnl: Work around MTU bug in kernel GRE driver. | expand |
On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > overrides the MTU to 1472 bytes. This commit works around the problem by > following up a request to create a GRE device with a second request to set > the MTU. > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > Reported-by: Eric Garver <e@erig.me> > Reported-by: James Page <james.page@ubuntu.com> > Signed-off-by: Ben Pfaff <blp@ovn.org> > --- > This is not properly tested. It needs to be tested before it makes sense > to commit it. > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > index fe9c8ed7104f..0bf965d29f41 100644 > --- a/lib/dpif-netlink-rtnl.c > +++ b/lib/dpif-netlink-rtnl.c > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > nl_msg_end_nested(&request, linkinfo_off); > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > + if (!err && type == OVS_VPORT_TYPE_GRE) { > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > + * RTM_NEWLINK, by setting the MTU again. See > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > + ofpbuf_clear(&request); > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > + NLM_F_REQUEST | NLM_F_ACK); > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > + nl_msg_put_string(&request, IFLA_IFNAME, name); > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > + > + err = nl_transact(NETLINK_ROUTE, &request, NULL); > + } > > exit: > ofpbuf_uninit(&request); > -- > 2.10.2 > This looks sane to me. It would be nice if James could test it though. Acked-by: Eric Garver <e@erig.me>
On Fri, Jan 12, 2018 at 04:35:06PM -0500, Eric Garver wrote: > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > overrides the MTU to 1472 bytes. This commit works around the problem by > > following up a request to create a GRE device with a second request to set > > the MTU. > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > Reported-by: Eric Garver <e@erig.me> > > Reported-by: James Page <james.page@ubuntu.com> > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > --- > > This is not properly tested. It needs to be tested before it makes sense > > to commit it. > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > index fe9c8ed7104f..0bf965d29f41 100644 > > --- a/lib/dpif-netlink-rtnl.c > > +++ b/lib/dpif-netlink-rtnl.c > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > > nl_msg_end_nested(&request, linkinfo_off); > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > > + * RTM_NEWLINK, by setting the MTU again. See > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > > + ofpbuf_clear(&request); > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > + NLM_F_REQUEST | NLM_F_ACK); > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > + > > + err = nl_transact(NETLINK_ROUTE, &request, NULL); > > + } > > > > exit: > > ofpbuf_uninit(&request); > > -- > > 2.10.2 > > > > This looks sane to me. It would be nice if James could test it though. On second thought.. if SETLINK fails we'll report "err" that the tunnel didn't get created, but it actually did. We can either ignore the err from SETLINK or destroy the interface. Depends on if you prefer low MTU tunnel or no tunnel. :)
On Fri, Jan 12, 2018 at 04:51:00PM -0500, Eric Garver wrote: > On Fri, Jan 12, 2018 at 04:35:06PM -0500, Eric Garver wrote: > > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > > overrides the MTU to 1472 bytes. This commit works around the problem by > > > following up a request to create a GRE device with a second request to set > > > the MTU. > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > > Reported-by: Eric Garver <e@erig.me> > > > Reported-by: James Page <james.page@ubuntu.com> > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > --- > > > This is not properly tested. It needs to be tested before it makes sense > > > to commit it. > > > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > > index fe9c8ed7104f..0bf965d29f41 100644 > > > --- a/lib/dpif-netlink-rtnl.c > > > +++ b/lib/dpif-netlink-rtnl.c > > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > > > nl_msg_end_nested(&request, linkinfo_off); > > > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > > > + * RTM_NEWLINK, by setting the MTU again. See > > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > > > + ofpbuf_clear(&request); > > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > > + NLM_F_REQUEST | NLM_F_ACK); > > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > > + > > > + err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > + } > > > > > > exit: > > > ofpbuf_uninit(&request); > > > -- > > > 2.10.2 > > > > > > > This looks sane to me. It would be nice if James could test it though. > > On second thought.. if SETLINK fails we'll report "err" that the tunnel > didn't get created, but it actually did. We can either ignore the err > from SETLINK or destroy the interface. Depends on if you prefer low MTU > tunnel or no tunnel. :) I guess it would be slightly better to report no error (and log the real error), because there's no guarantee that DELLINK would succeed, and what do we do if it doesn't?
On Fri, Jan 12, 2018 at 02:04:11PM -0800, Ben Pfaff wrote: > On Fri, Jan 12, 2018 at 04:51:00PM -0500, Eric Garver wrote: > > On Fri, Jan 12, 2018 at 04:35:06PM -0500, Eric Garver wrote: > > > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > > > overrides the MTU to 1472 bytes. This commit works around the problem by > > > > following up a request to create a GRE device with a second request to set > > > > the MTU. > > > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > > > Reported-by: Eric Garver <e@erig.me> > > > > Reported-by: James Page <james.page@ubuntu.com> > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > > --- > > > > This is not properly tested. It needs to be tested before it makes sense > > > > to commit it. > > > > > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > > > 1 file changed, 13 insertions(+) > > > > > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > > > index fe9c8ed7104f..0bf965d29f41 100644 > > > > --- a/lib/dpif-netlink-rtnl.c > > > > +++ b/lib/dpif-netlink-rtnl.c > > > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > > > > nl_msg_end_nested(&request, linkinfo_off); > > > > > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > > > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > > > > + * RTM_NEWLINK, by setting the MTU again. See > > > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > > > > + ofpbuf_clear(&request); > > > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > > > + NLM_F_REQUEST | NLM_F_ACK); > > > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > > > + > > > > + err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > > + } > > > > > > > > exit: > > > > ofpbuf_uninit(&request); > > > > -- > > > > 2.10.2 > > > > > > > > > > This looks sane to me. It would be nice if James could test it though. > > > > On second thought.. if SETLINK fails we'll report "err" that the tunnel > > didn't get created, but it actually did. We can either ignore the err > > from SETLINK or destroy the interface. Depends on if you prefer low MTU > > tunnel or no tunnel. :) > > I guess it would be slightly better to report no error (and log the real > error), because there's no guarantee that DELLINK would succeed, and > what do we do if it doesn't? Right. I'm good with ignoring the SETLINK error.
On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > overrides the MTU to 1472 bytes. This commit works around the problem by > following up a request to create a GRE device with a second request to set > the MTU. > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > Reported-by: Eric Garver <e@erig.me> > Reported-by: James Page <james.page@ubuntu.com> > Signed-off-by: Ben Pfaff <blp@ovn.org> > --- > This is not properly tested. It needs to be tested before it makes sense > to commit it. > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > index fe9c8ed7104f..0bf965d29f41 100644 > --- a/lib/dpif-netlink-rtnl.c > +++ b/lib/dpif-netlink-rtnl.c > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > nl_msg_end_nested(&request, linkinfo_off); > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > + if (!err && type == OVS_VPORT_TYPE_GRE) { > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > + * RTM_NEWLINK, by setting the MTU again. See > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > + ofpbuf_clear(&request); > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > + NLM_F_REQUEST | NLM_F_ACK); > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > + nl_msg_put_string(&request, IFLA_IFNAME, name); > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); James' testing found that this value will be rejected by the kernel. GRE driver uses ip_tunnel_change_mtu() with the "strict" parameter, so it will return an error instead of clamping the value. GENEVE/VXLAN always clamp the value. I think 65478 is the max_mtu for gretap accounting for optional csum, key, seq in gre_calc_hlen(). It may be slightly higher for how OVS configures gretap. 0xFFF8 - eth(14) - ipv4(20) - gre(16) = 65478 > + > + err = nl_transact(NETLINK_ROUTE, &request, NULL); > + } > > exit: > ofpbuf_uninit(&request); > -- > 2.10.2
On Tue, Jan 16, 2018 at 10:42:16AM -0500, Eric Garver wrote: > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > overrides the MTU to 1472 bytes. This commit works around the problem by > > following up a request to create a GRE device with a second request to set > > the MTU. > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > Reported-by: Eric Garver <e@erig.me> > > Reported-by: James Page <james.page@ubuntu.com> > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > --- > > This is not properly tested. It needs to be tested before it makes sense > > to commit it. > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > index fe9c8ed7104f..0bf965d29f41 100644 > > --- a/lib/dpif-netlink-rtnl.c > > +++ b/lib/dpif-netlink-rtnl.c > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > > nl_msg_end_nested(&request, linkinfo_off); > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > > + * RTM_NEWLINK, by setting the MTU again. See > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > > + ofpbuf_clear(&request); > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > + NLM_F_REQUEST | NLM_F_ACK); > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > James' testing found that this value will be rejected by the kernel. GRE > driver uses ip_tunnel_change_mtu() with the "strict" parameter, so it > will return an error instead of clamping the value. GENEVE/VXLAN always > clamp the value. > > I think 65478 is the max_mtu for gretap accounting for optional csum, > key, seq in gre_calc_hlen(). It may be slightly higher for how OVS > configures gretap. > > 0xFFF8 - eth(14) - ipv4(20) - gre(16) = 65478 Do you think that it's worthwhile using the maximum possible MTU, or should we be a little conservative and use 65400 or 65000? Either of those values would be future-proof, I think. Thanks, Ben.
On Tue, Jan 16, 2018 at 09:16:47AM -0800, Ben Pfaff wrote: > On Tue, Jan 16, 2018 at 10:42:16AM -0500, Eric Garver wrote: > > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > > overrides the MTU to 1472 bytes. This commit works around the problem by > > > following up a request to create a GRE device with a second request to set > > > the MTU. > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > > Reported-by: Eric Garver <e@erig.me> > > > Reported-by: James Page <james.page@ubuntu.com> > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > --- > > > This is not properly tested. It needs to be tested before it makes sense > > > to commit it. > > > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > > index fe9c8ed7104f..0bf965d29f41 100644 > > > --- a/lib/dpif-netlink-rtnl.c > > > +++ b/lib/dpif-netlink-rtnl.c > > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > > > nl_msg_end_nested(&request, linkinfo_off); > > > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > > > + * RTM_NEWLINK, by setting the MTU again. See > > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > > > + ofpbuf_clear(&request); > > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > > + NLM_F_REQUEST | NLM_F_ACK); > > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > > > James' testing found that this value will be rejected by the kernel. GRE > > driver uses ip_tunnel_change_mtu() with the "strict" parameter, so it > > will return an error instead of clamping the value. GENEVE/VXLAN always > > clamp the value. > > > > I think 65478 is the max_mtu for gretap accounting for optional csum, > > key, seq in gre_calc_hlen(). It may be slightly higher for how OVS > > configures gretap. > > > > 0xFFF8 - eth(14) - ipv4(20) - gre(16) = 65478 > > Do you think that it's worthwhile using the maximum possible MTU, or I don't think it's worth possibly hitting another one of these issues. > should we be a little conservative and use 65400 or 65000? Either of > those values would be future-proof, I think. I agree 65000 would be okay, but if we're going to do it then lets do it for everything (GENEVE/VXLAN/GRE) instead of assuming the kernel will clamp the value for us on newlink/setlink.
On Tue, Jan 16, 2018 at 02:08:45PM -0500, Eric Garver wrote: > On Tue, Jan 16, 2018 at 09:16:47AM -0800, Ben Pfaff wrote: > > On Tue, Jan 16, 2018 at 10:42:16AM -0500, Eric Garver wrote: > > > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > > > overrides the MTU to 1472 bytes. This commit works around the problem by > > > > following up a request to create a GRE device with a second request to set > > > > the MTU. > > > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > > > Reported-by: Eric Garver <e@erig.me> > > > > Reported-by: James Page <james.page@ubuntu.com> > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > > --- > > > > This is not properly tested. It needs to be tested before it makes sense > > > > to commit it. > > > > > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > > > 1 file changed, 13 insertions(+) > > > > > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > > > index fe9c8ed7104f..0bf965d29f41 100644 > > > > --- a/lib/dpif-netlink-rtnl.c > > > > +++ b/lib/dpif-netlink-rtnl.c > > > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, > > > > nl_msg_end_nested(&request, linkinfo_off); > > > > > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > > > + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in > > > > + * RTM_NEWLINK, by setting the MTU again. See > > > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ > > > > + ofpbuf_clear(&request); > > > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > > > + NLM_F_REQUEST | NLM_F_ACK); > > > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > > > > > James' testing found that this value will be rejected by the kernel. GRE > > > driver uses ip_tunnel_change_mtu() with the "strict" parameter, so it > > > will return an error instead of clamping the value. GENEVE/VXLAN always > > > clamp the value. > > > > > > I think 65478 is the max_mtu for gretap accounting for optional csum, > > > key, seq in gre_calc_hlen(). It may be slightly higher for how OVS > > > configures gretap. > > > > > > 0xFFF8 - eth(14) - ipv4(20) - gre(16) = 65478 > > > > Do you think that it's worthwhile using the maximum possible MTU, or > > I don't think it's worth possibly hitting another one of these issues. > > > should we be a little conservative and use 65400 or 65000? Either of > > those values would be future-proof, I think. > > I agree 65000 would be okay, but if we're going to do it then lets do it > for everything (GENEVE/VXLAN/GRE) instead of assuming the kernel will > clamp the value for us on newlink/setlink. OK, great. I posted v3 that does this and makes the error handling fix we agreed on: https://patchwork.ozlabs.org/project/openvswitch/list/?series=23965
Hi Ben On Wed, 17 Jan 2018 at 20:05 Ben Pfaff <blp@ovn.org> wrote: > On Tue, Jan 16, 2018 at 02:08:45PM -0500, Eric Garver wrote: > > On Tue, Jan 16, 2018 at 09:16:47AM -0800, Ben Pfaff wrote: > > > On Tue, Jan 16, 2018 at 10:42:16AM -0500, Eric Garver wrote: > > > > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > > > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > > > > overrides the MTU to 1472 bytes. This commit works around the > problem by > > > > > following up a request to create a GRE device with a second > request to set > > > > > the MTU. > > > > > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > > > > Reported-by: Eric Garver <e@erig.me> > > > > > Reported-by: James Page <james.page@ubuntu.com> > > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > > > --- > > > > > This is not properly tested. It needs to be tested before it > makes sense > > > > > to commit it. > > > > > > > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > > > > 1 file changed, 13 insertions(+) > > > > > > > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > > > > index fe9c8ed7104f..0bf965d29f41 100644 > > > > > --- a/lib/dpif-netlink-rtnl.c > > > > > +++ b/lib/dpif-netlink-rtnl.c > > > > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct > netdev_tunnel_config *tnl_cfg, > > > > > nl_msg_end_nested(&request, linkinfo_off); > > > > > > > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > > > > + /* Work around a bug in kernel GRE driver, which ignores > IFLA_MTU in > > > > > + * RTM_NEWLINK, by setting the MTU again. See > > > > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. > */ > > > > > + ofpbuf_clear(&request); > > > > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > > > > + NLM_F_REQUEST | NLM_F_ACK); > > > > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > > > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > > > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > > > > > > > James' testing found that this value will be rejected by the kernel. > GRE > > > > driver uses ip_tunnel_change_mtu() with the "strict" parameter, so it > > > > will return an error instead of clamping the value. GENEVE/VXLAN > always > > > > clamp the value. > > > > > > > > I think 65478 is the max_mtu for gretap accounting for optional csum, > > > > key, seq in gre_calc_hlen(). It may be slightly higher for how OVS > > > > configures gretap. > > > > > > > > 0xFFF8 - eth(14) - ipv4(20) - gre(16) = 65478 > > > > > > Do you think that it's worthwhile using the maximum possible MTU, or > > > > I don't think it's worth possibly hitting another one of these issues. > > > > > should we be a little conservative and use 65400 or 65000? Either of > > > those values would be future-proof, I think. > > > > I agree 65000 would be okay, but if we're going to do it then lets do it > > for everything (GENEVE/VXLAN/GRE) instead of assuming the kernel will > > clamp the value for us on newlink/setlink. > > OK, great. > > I posted v3 that does this and makes the error handling fix we agreed > on: > https://patchwork.ozlabs.org/project/openvswitch/list/?series=23965 retested with your new patches; confirmed that MTU for GRE and VXLAN ports are set consistently to 65000. Cheers James
On Thu, Jan 18, 2018 at 08:19:06AM +0000, James Page wrote: > Hi Ben > > On Wed, 17 Jan 2018 at 20:05 Ben Pfaff <blp@ovn.org> wrote: > > > On Tue, Jan 16, 2018 at 02:08:45PM -0500, Eric Garver wrote: > > > On Tue, Jan 16, 2018 at 09:16:47AM -0800, Ben Pfaff wrote: > > > > On Tue, Jan 16, 2018 at 10:42:16AM -0500, Eric Garver wrote: > > > > > On Fri, Jan 12, 2018 at 12:44:59PM -0800, Ben Pfaff wrote: > > > > > > The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and > > > > > > overrides the MTU to 1472 bytes. This commit works around the > > problem by > > > > > > following up a request to create a GRE device with a second > > request to set > > > > > > the MTU. > > > > > > > > > > > > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 > > > > > > Reported-by: Eric Garver <e@erig.me> > > > > > > Reported-by: James Page <james.page@ubuntu.com> > > > > > > Signed-off-by: Ben Pfaff <blp@ovn.org> > > > > > > --- > > > > > > This is not properly tested. It needs to be tested before it > > makes sense > > > > > > to commit it. > > > > > > > > > > > > lib/dpif-netlink-rtnl.c | 13 +++++++++++++ > > > > > > 1 file changed, 13 insertions(+) > > > > > > > > > > > > diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c > > > > > > index fe9c8ed7104f..0bf965d29f41 100644 > > > > > > --- a/lib/dpif-netlink-rtnl.c > > > > > > +++ b/lib/dpif-netlink-rtnl.c > > > > > > @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct > > netdev_tunnel_config *tnl_cfg, > > > > > > nl_msg_end_nested(&request, linkinfo_off); > > > > > > > > > > > > err = nl_transact(NETLINK_ROUTE, &request, NULL); > > > > > > + if (!err && type == OVS_VPORT_TYPE_GRE) { > > > > > > + /* Work around a bug in kernel GRE driver, which ignores > > IFLA_MTU in > > > > > > + * RTM_NEWLINK, by setting the MTU again. See > > > > > > + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. > > */ > > > > > > + ofpbuf_clear(&request); > > > > > > + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, > > > > > > + NLM_F_REQUEST | NLM_F_ACK); > > > > > > + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); > > > > > > + nl_msg_put_string(&request, IFLA_IFNAME, name); > > > > > > + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); > > > > > > > > > > James' testing found that this value will be rejected by the kernel. > > GRE > > > > > driver uses ip_tunnel_change_mtu() with the "strict" parameter, so it > > > > > will return an error instead of clamping the value. GENEVE/VXLAN > > always > > > > > clamp the value. > > > > > > > > > > I think 65478 is the max_mtu for gretap accounting for optional csum, > > > > > key, seq in gre_calc_hlen(). It may be slightly higher for how OVS > > > > > configures gretap. > > > > > > > > > > 0xFFF8 - eth(14) - ipv4(20) - gre(16) = 65478 > > > > > > > > Do you think that it's worthwhile using the maximum possible MTU, or > > > > > > I don't think it's worth possibly hitting another one of these issues. > > > > > > > should we be a little conservative and use 65400 or 65000? Either of > > > > those values would be future-proof, I think. > > > > > > I agree 65000 would be okay, but if we're going to do it then lets do it > > > for everything (GENEVE/VXLAN/GRE) instead of assuming the kernel will > > > clamp the value for us on newlink/setlink. > > > > OK, great. > > > > I posted v3 that does this and makes the error handling fix we agreed > > on: > > https://patchwork.ozlabs.org/project/openvswitch/list/?series=23965 > > > retested with your new patches; confirmed that MTU for GRE and VXLAN ports > are set consistently to 65000. > > Cheers > > James Thanks for the testing! I applied these patches to master and branch-2.9.
diff --git a/lib/dpif-netlink-rtnl.c b/lib/dpif-netlink-rtnl.c index fe9c8ed7104f..0bf965d29f41 100644 --- a/lib/dpif-netlink-rtnl.c +++ b/lib/dpif-netlink-rtnl.c @@ -329,6 +329,19 @@ dpif_netlink_rtnl_create(const struct netdev_tunnel_config *tnl_cfg, nl_msg_end_nested(&request, linkinfo_off); err = nl_transact(NETLINK_ROUTE, &request, NULL); + if (!err && type == OVS_VPORT_TYPE_GRE) { + /* Work around a bug in kernel GRE driver, which ignores IFLA_MTU in + * RTM_NEWLINK, by setting the MTU again. See + * https://bugzilla.redhat.com/show_bug.cgi?id=1488484. */ + ofpbuf_clear(&request); + nl_msg_put_nlmsghdr(&request, 0, RTM_SETLINK, + NLM_F_REQUEST | NLM_F_ACK); + ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg)); + nl_msg_put_string(&request, IFLA_IFNAME, name); + nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX); + + err = nl_transact(NETLINK_ROUTE, &request, NULL); + } exit: ofpbuf_uninit(&request);
The kernel GRE driver ignores IFLA_MTU in RTM_NEWLINK requests and overrides the MTU to 1472 bytes. This commit works around the problem by following up a request to create a GRE device with a second request to set the MTU. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1488484 Reported-by: Eric Garver <e@erig.me> Reported-by: James Page <james.page@ubuntu.com> Signed-off-by: Ben Pfaff <blp@ovn.org> --- This is not properly tested. It needs to be tested before it makes sense to commit it. lib/dpif-netlink-rtnl.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)