Message ID | MEYP282MB330286C88D76C634B380C4E4CD6F9@MEYP282MB3302.AUSP282.PROD.OUTLOOK.COM |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev,v4] netdev-vport : Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
On 12/8/21 03:57, lin huang wrote: > From: linhuang <linhuang@ruijie.com.cn> > > Userspace tunnel doesn't have a valid device in the kernel. So > get_ifindex() function (ioctl) always get error during > adding a port, deleting a port or updating a port status. > > The info log is > "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) > on vxlan_sys_4789 device failed: No such device" > > If there are a lot of userspace tunnel ports on a bridge, the > iface_refresh_netdev_status() function will spend a lot of time. > > So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just > return -ENODEV. > > Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> > Test-by: Mike Pattrick <mkp@redhat.com> > --- > lib/netdev-vport.c | 6 ++++++ > vswitchd/bridge.c | 2 ++ > 2 files changed, 8 insertions(+) > Applied. Thanks! Best regards, Ilya Maximets.
On 2021-12-09 3:16 PM, Ilya Maximets wrote: > On 12/8/21 03:57, lin huang wrote: >> From: linhuang <linhuang@ruijie.com.cn> >> >> Userspace tunnel doesn't have a valid device in the kernel. So >> get_ifindex() function (ioctl) always get error during >> adding a port, deleting a port or updating a port status. >> >> The info log is >> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) >> on vxlan_sys_4789 device failed: No such device" >> >> If there are a lot of userspace tunnel ports on a bridge, the >> iface_refresh_netdev_status() function will spend a lot of time. >> >> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just >> return -ENODEV. >> >> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >> Test-by: Mike Pattrick <mkp@redhat.com> >> --- >> lib/netdev-vport.c | 6 ++++++ >> vswitchd/bridge.c | 2 ++ >> 2 files changed, 8 insertions(+) >> > > Applied. Thanks! > > Best regards, Ilya Maximets. > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Croid%40nvidia.com%7C8865bff084424563a30308d9bb16357f%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637746526329615023%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=53XCrfcyaY1orDpnmjdf7FV5espcWKPElCMDJKVU5Jw%3D&reserved=0 Hi, We encounter an offload issue with this commit that OVS doesn't add rules to TC and keeps on with OVS dp. To reproduce the issue, configure ovs with bridge and 2 ports, restart openvswitch service and then do ping between the ports. Removing/readding port to bridge or calling dpctl/dump-flows "fix" the issue and ovs calls TC again. From first look it's because br->ofproto->type is not the same pointer usually set in netdev_ports_insert() which can be dpi_class->type or static "system" and in the netdev_ports_* functions the code uses pointer equal instead of strcmp() to check the type. So netdev_ports_lookup() returns error now. 2021-12-14T10:56:18.240Z|00010|dpif_netlink(handler1)|ERR|XXX parse_flow_put 2243 get port for type system ret (nil) Even after changing all netdev_ports_* to use strcmp() and netdev_ports_lookup() is ok and we do get to tc flow put. Looks like we get an error from tc to offload. so checking this part now. 2021-12-14T11:23:09.230Z|00035|dpif_netlink(handler1)|ERR|failed to offload flow: No such device: enp8s0f0_0 So still looking on this but wanted to share about the behavior. Thanks, Roi
On 2021-12-14 1:59 PM, Roi Dayan wrote: > > > On 2021-12-09 3:16 PM, Ilya Maximets wrote: >> On 12/8/21 03:57, lin huang wrote: >>> From: linhuang <linhuang@ruijie.com.cn> >>> >>> Userspace tunnel doesn't have a valid device in the kernel. So >>> get_ifindex() function (ioctl) always get error during >>> adding a port, deleting a port or updating a port status. >>> >>> The info log is >>> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) >>> on vxlan_sys_4789 device failed: No such device" >>> >>> If there are a lot of userspace tunnel ports on a bridge, the >>> iface_refresh_netdev_status() function will spend a lot of time. >>> >>> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just >>> return -ENODEV. >>> >>> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >>> Test-by: Mike Pattrick <mkp@redhat.com> >>> --- >>> lib/netdev-vport.c | 6 ++++++ >>> vswitchd/bridge.c | 2 ++ >>> 2 files changed, 8 insertions(+) >>> >> >> Applied. Thanks! >> >> Best regards, Ilya Maximets. >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Croid%40nvidia.com%7C8865bff084424563a30308d9bb16357f%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637746526329615023%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=53XCrfcyaY1orDpnmjdf7FV5espcWKPElCMDJKVU5Jw%3D&reserved=0 > > > Hi, > > We encounter an offload issue with this commit that OVS doesn't add > rules to TC and keeps on with OVS dp. > > To reproduce the issue, configure ovs with bridge and 2 ports, > restart openvswitch service and then do ping between the ports. > Removing/readding port to bridge or calling dpctl/dump-flows "fix" the > issue and ovs calls TC again. > > From first look it's because br->ofproto->type is not the same pointer > usually set in netdev_ports_insert() which can be dpi_class->type or > static "system" and in the netdev_ports_* functions the code uses > pointer equal instead of strcmp() to check the type. > So netdev_ports_lookup() returns error now. > > 2021-12-14T10:56:18.240Z|00010|dpif_netlink(handler1)|ERR|XXX > parse_flow_put 2243 get port for type system ret (nil) > > Even after changing all netdev_ports_* to use strcmp() and > netdev_ports_lookup() is ok and we do get to tc flow put. > Looks like we get an error from tc to offload. so checking this part > now. > > 2021-12-14T11:23:09.230Z|00035|dpif_netlink(handler1)|ERR|failed to > offload flow: No such device: enp8s0f0_0 > > So still looking on this but wanted to share about the behavior. > > Thanks, > Roi I forgot to mention but just to make it clear, need to set hw-offload=true.
On 12/14/21 13:07, Roi Dayan wrote: > > > On 2021-12-14 1:59 PM, Roi Dayan wrote: >> >> >> On 2021-12-09 3:16 PM, Ilya Maximets wrote: >>> On 12/8/21 03:57, lin huang wrote: >>>> From: linhuang <linhuang@ruijie.com.cn> >>>> >>>> Userspace tunnel doesn't have a valid device in the kernel. So >>>> get_ifindex() function (ioctl) always get error during >>>> adding a port, deleting a port or updating a port status. >>>> >>>> The info log is >>>> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) >>>> on vxlan_sys_4789 device failed: No such device" >>>> >>>> If there are a lot of userspace tunnel ports on a bridge, the >>>> iface_refresh_netdev_status() function will spend a lot of time. >>>> >>>> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just >>>> return -ENODEV. >>>> >>>> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >>>> Test-by: Mike Pattrick <mkp@redhat.com> >>>> --- >>>> lib/netdev-vport.c | 6 ++++++ >>>> vswitchd/bridge.c | 2 ++ >>>> 2 files changed, 8 insertions(+) >>>> >>> >>> Applied. Thanks! >>> >>> Best regards, Ilya Maximets. >>> _______________________________________________ >>> dev mailing list >>> dev@openvswitch.org >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Croid%40nvidia.com%7C8865bff084424563a30308d9bb16357f%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637746526329615023%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=53XCrfcyaY1orDpnmjdf7FV5espcWKPElCMDJKVU5Jw%3D&reserved=0 >> >> >> Hi, >> >> We encounter an offload issue with this commit that OVS doesn't add >> rules to TC and keeps on with OVS dp. >> >> To reproduce the issue, configure ovs with bridge and 2 ports, >> restart openvswitch service and then do ping between the ports. >> Removing/readding port to bridge or calling dpctl/dump-flows "fix" the >> issue and ovs calls TC again. >> >> From first look it's because br->ofproto->type is not the same pointer >> usually set in netdev_ports_insert() which can be dpi_class->type or >> static "system" and in the netdev_ports_* functions the code uses >> pointer equal instead of strcmp() to check the type. >> So netdev_ports_lookup() returns error now. >> >> 2021-12-14T10:56:18.240Z|00010|dpif_netlink(handler1)|ERR|XXX parse_flow_put 2243 get port for type system ret (nil) >> >> Even after changing all netdev_ports_* to use strcmp() and >> netdev_ports_lookup() is ok and we do get to tc flow put. >> Looks like we get an error from tc to offload. so checking this part >> now. >> >> 2021-12-14T11:23:09.230Z|00035|dpif_netlink(handler1)|ERR|failed to offload flow: No such device: enp8s0f0_0 >> >> So still looking on this but wanted to share about the behavior. >> >> Thanks, >> Roi > > I forgot to mention but just to make it clear, > need to set hw-offload=true. > Ugh. Sorry, my fault. Looking at the patch again now and this version seems weird. I don't know WTF I was thinking. The main thing is that we're calling netdev_set_dpif_type() twice with different arguments. First time with br->ofproto->type as a type and second time with dpif_normalize_type(dpif_type(dpif)). And these are not the same. And as you correctly noticed br->ofproto->type is not a constant string unlike the normalized one, hence cannot be checked by a pointer comparison. Could you try this: diff --git a/lib/dpif.c b/lib/dpif.c index 38bcb47cb..cff0bc2db 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -364,7 +364,8 @@ do_open(const char *name, const char *type, bool create, struct dpif **dpifp) err = netdev_open(dpif_port.name, dpif_port.type, &netdev); if (!err) { - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); + netdev_set_dpif_type(netdev, dpif_type_str); + netdev_ports_insert(netdev, &dpif_port); netdev_close(netdev); } else { VLOG_WARN("could not open netdev %s type %s: %s", @@ -602,10 +603,12 @@ dpif_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t *port_nop) const char *dpif_type_str = dpif_normalize_type(dpif_type(dpif)); struct dpif_port dpif_port; + netdev_set_dpif_type(netdev, dpif_type_str); + dpif_port.type = CONST_CAST(char *, netdev_get_type(netdev)); dpif_port.name = CONST_CAST(char *, netdev_name); dpif_port.port_no = port_no; - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); + netdev_ports_insert(netdev, &dpif_port); } } else { VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s", diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c index 8075cfbd8..b8dc108f3 100644 --- a/lib/netdev-offload.c +++ b/lib/netdev-offload.c @@ -573,12 +573,14 @@ netdev_ports_lookup(odp_port_t port_no, const char *dpif_type) } int -netdev_ports_insert(struct netdev *netdev, const char *dpif_type, - struct dpif_port *dpif_port) +netdev_ports_insert(struct netdev *netdev, struct dpif_port *dpif_port) { + const char *dpif_type = netdev_get_dpif_type(netdev); struct port_to_netdev_data *data; int ifindex = netdev_get_ifindex(netdev); + ovs_assert(dpif_type); + ovs_rwlock_wrlock(&netdev_hmap_rwlock); if (netdev_ports_lookup(dpif_port->port_no, dpif_type)) { ovs_rwlock_unlock(&netdev_hmap_rwlock); @@ -596,8 +598,6 @@ netdev_ports_insert(struct netdev *netdev, const char *dpif_type, data->ifindex = -1; } - netdev_set_dpif_type(netdev, dpif_type); - hmap_insert(&port_to_netdev, &data->portno_node, netdev_ports_hash(dpif_port->port_no, dpif_type)); ovs_rwlock_unlock(&netdev_hmap_rwlock); diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h index e7fcedae9..43a98d499 100644 --- a/lib/netdev-offload.h +++ b/lib/netdev-offload.h @@ -108,8 +108,7 @@ bool netdev_is_offload_rebalance_policy_enabled(void); int netdev_flow_get_n_flows(struct netdev *netdev, uint64_t *n_flows); struct dpif_port; -int netdev_ports_insert(struct netdev *, const char *dpif_type, - struct dpif_port *); +int netdev_ports_insert(struct netdev *, struct dpif_port *); struct netdev *netdev_ports_get(odp_port_t port, const char *dpif_type); int netdev_ports_remove(odp_port_t port, const char *dpif_type); odp_port_t netdev_ifindex_to_odp_port(int ifindex); diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index 513ef7ea9..5223aa897 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -2052,8 +2052,6 @@ iface_do_create(const struct bridge *br, goto error; } - netdev_set_dpif_type(netdev, br->ofproto->type); - error = iface_set_netdev_config(iface_cfg, netdev, errp); if (error) { goto error; --- ? Or else, we can just revert the patch. Best regards, Ilya Maximets.
On 2021-12-14 3:14 PM, Ilya Maximets wrote: > On 12/14/21 13:07, Roi Dayan wrote: >> >> >> On 2021-12-14 1:59 PM, Roi Dayan wrote: >>> >>> >>> On 2021-12-09 3:16 PM, Ilya Maximets wrote: >>>> On 12/8/21 03:57, lin huang wrote: >>>>> From: linhuang <linhuang@ruijie.com.cn> >>>>> >>>>> Userspace tunnel doesn't have a valid device in the kernel. So >>>>> get_ifindex() function (ioctl) always get error during >>>>> adding a port, deleting a port or updating a port status. >>>>> >>>>> The info log is >>>>> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) >>>>> on vxlan_sys_4789 device failed: No such device" >>>>> >>>>> If there are a lot of userspace tunnel ports on a bridge, the >>>>> iface_refresh_netdev_status() function will spend a lot of time. >>>>> >>>>> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just >>>>> return -ENODEV. >>>>> >>>>> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >>>>> Test-by: Mike Pattrick <mkp@redhat.com> >>>>> --- >>>>> lib/netdev-vport.c | 6 ++++++ >>>>> vswitchd/bridge.c | 2 ++ >>>>> 2 files changed, 8 insertions(+) >>>>> >>>> >>>> Applied. Thanks! >>>> >>>> Best regards, Ilya Maximets. >>>> _______________________________________________ >>>> dev mailing list >>>> dev@openvswitch.org >>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Croid%40nvidia.com%7C6700a97a81bd41ff4a2208d9bf03a65f%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637750844672247450%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=jdOCdoJ9gWoj4kbl6dRTRAg0vIXdgoHYKAfVRjg1z44%3D&reserved=0 >>> >>> >>> Hi, >>> >>> We encounter an offload issue with this commit that OVS doesn't add >>> rules to TC and keeps on with OVS dp. >>> >>> To reproduce the issue, configure ovs with bridge and 2 ports, >>> restart openvswitch service and then do ping between the ports. >>> Removing/readding port to bridge or calling dpctl/dump-flows "fix" the >>> issue and ovs calls TC again. >>> >>> From first look it's because br->ofproto->type is not the same pointer >>> usually set in netdev_ports_insert() which can be dpi_class->type or >>> static "system" and in the netdev_ports_* functions the code uses >>> pointer equal instead of strcmp() to check the type. >>> So netdev_ports_lookup() returns error now. >>> >>> 2021-12-14T10:56:18.240Z|00010|dpif_netlink(handler1)|ERR|XXX parse_flow_put 2243 get port for type system ret (nil) >>> >>> Even after changing all netdev_ports_* to use strcmp() and >>> netdev_ports_lookup() is ok and we do get to tc flow put. >>> Looks like we get an error from tc to offload. so checking this part >>> now. >>> >>> 2021-12-14T11:23:09.230Z|00035|dpif_netlink(handler1)|ERR|failed to offload flow: No such device: enp8s0f0_0 >>> >>> So still looking on this but wanted to share about the behavior. >>> >>> Thanks, >>> Roi >> >> I forgot to mention but just to make it clear, >> need to set hw-offload=true. >> > > Ugh. Sorry, my fault. Looking at the patch again now and this version > seems weird. I don't know WTF I was thinking. The main thing is that > we're calling netdev_set_dpif_type() twice with different arguments. > First time with br->ofproto->type as a type and second time with > dpif_normalize_type(dpif_type(dpif)). And these are not the same. > And as you correctly noticed br->ofproto->type is not a constant string > unlike the normalized one, hence cannot be checked by a pointer comparison. > > > Could you try this: > > diff --git a/lib/dpif.c b/lib/dpif.c > index 38bcb47cb..cff0bc2db 100644 > --- a/lib/dpif.c > +++ b/lib/dpif.c > @@ -364,7 +364,8 @@ do_open(const char *name, const char *type, bool create, struct dpif **dpifp) > err = netdev_open(dpif_port.name, dpif_port.type, &netdev); > > if (!err) { > - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); > + netdev_set_dpif_type(netdev, dpif_type_str); > + netdev_ports_insert(netdev, &dpif_port); > netdev_close(netdev); > } else { > VLOG_WARN("could not open netdev %s type %s: %s", > @@ -602,10 +603,12 @@ dpif_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t *port_nop) > const char *dpif_type_str = dpif_normalize_type(dpif_type(dpif)); > struct dpif_port dpif_port; > > + netdev_set_dpif_type(netdev, dpif_type_str); > + > dpif_port.type = CONST_CAST(char *, netdev_get_type(netdev)); > dpif_port.name = CONST_CAST(char *, netdev_name); > dpif_port.port_no = port_no; > - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); > + netdev_ports_insert(netdev, &dpif_port); > } > } else { > VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s", > diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c > index 8075cfbd8..b8dc108f3 100644 > --- a/lib/netdev-offload.c > +++ b/lib/netdev-offload.c > @@ -573,12 +573,14 @@ netdev_ports_lookup(odp_port_t port_no, const char *dpif_type) > } > > int > -netdev_ports_insert(struct netdev *netdev, const char *dpif_type, > - struct dpif_port *dpif_port) > +netdev_ports_insert(struct netdev *netdev, struct dpif_port *dpif_port) > { > + const char *dpif_type = netdev_get_dpif_type(netdev); > struct port_to_netdev_data *data; > int ifindex = netdev_get_ifindex(netdev); > > + ovs_assert(dpif_type); > + > ovs_rwlock_wrlock(&netdev_hmap_rwlock); > if (netdev_ports_lookup(dpif_port->port_no, dpif_type)) { > ovs_rwlock_unlock(&netdev_hmap_rwlock); > @@ -596,8 +598,6 @@ netdev_ports_insert(struct netdev *netdev, const char *dpif_type, > data->ifindex = -1; > } > > - netdev_set_dpif_type(netdev, dpif_type); > - > hmap_insert(&port_to_netdev, &data->portno_node, > netdev_ports_hash(dpif_port->port_no, dpif_type)); > ovs_rwlock_unlock(&netdev_hmap_rwlock); > diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h > index e7fcedae9..43a98d499 100644 > --- a/lib/netdev-offload.h > +++ b/lib/netdev-offload.h > @@ -108,8 +108,7 @@ bool netdev_is_offload_rebalance_policy_enabled(void); > int netdev_flow_get_n_flows(struct netdev *netdev, uint64_t *n_flows); > > struct dpif_port; > -int netdev_ports_insert(struct netdev *, const char *dpif_type, > - struct dpif_port *); > +int netdev_ports_insert(struct netdev *, struct dpif_port *); > struct netdev *netdev_ports_get(odp_port_t port, const char *dpif_type); > int netdev_ports_remove(odp_port_t port, const char *dpif_type); > odp_port_t netdev_ifindex_to_odp_port(int ifindex); > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c > index 513ef7ea9..5223aa897 100644 > --- a/vswitchd/bridge.c > +++ b/vswitchd/bridge.c > @@ -2052,8 +2052,6 @@ iface_do_create(const struct bridge *br, > goto error; > } > > - netdev_set_dpif_type(netdev, br->ofproto->type); > - > error = iface_set_netdev_config(iface_cfg, netdev, errp); > if (error) { > goto error; > --- > > ? > > Or else, we can just revert the patch. > > Best regards, Ilya Maximets. Hi Ilya, Thanks for the patch. The patch is fine but I think it doesn't do much. You remove the set dpif type of br->ofproto->type which is eliminating the issue of not offloading to tc. Leaving only the strcmp when trying to get ifindex. You refactored netdev_ports_insert() not to accept dpif_type to set to the netdev and instead you set it just before calling. So it doesn't change any logic or am I missing something? it's ok by me if you think its more readable and tc issue is resolved. I also tried to reproduce the original log msg that fails to get ifindex on vxlan_sys_4789 and I didn't reproduce so I can't verify about it. The origin patch assumption was that in iface_do_create() we were missing netdev dpif type and thus got the error trying to config some devices. so maybe we were not? Thanks, Roi
On 12/15/21 10:50, Roi Dayan wrote: > > > On 2021-12-14 3:14 PM, Ilya Maximets wrote: >> On 12/14/21 13:07, Roi Dayan wrote: >>> >>> >>> On 2021-12-14 1:59 PM, Roi Dayan wrote: >>>> >>>> >>>> On 2021-12-09 3:16 PM, Ilya Maximets wrote: >>>>> On 12/8/21 03:57, lin huang wrote: >>>>>> From: linhuang <linhuang@ruijie.com.cn> >>>>>> >>>>>> Userspace tunnel doesn't have a valid device in the kernel. So >>>>>> get_ifindex() function (ioctl) always get error during >>>>>> adding a port, deleting a port or updating a port status. >>>>>> >>>>>> The info log is >>>>>> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) >>>>>> on vxlan_sys_4789 device failed: No such device" >>>>>> >>>>>> If there are a lot of userspace tunnel ports on a bridge, the >>>>>> iface_refresh_netdev_status() function will spend a lot of time. >>>>>> >>>>>> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just >>>>>> return -ENODEV. >>>>>> >>>>>> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >>>>>> Test-by: Mike Pattrick <mkp@redhat.com> >>>>>> --- >>>>>> lib/netdev-vport.c | 6 ++++++ >>>>>> vswitchd/bridge.c | 2 ++ >>>>>> 2 files changed, 8 insertions(+) >>>>>> >>>>> >>>>> Applied. Thanks! >>>>> >>>>> Best regards, Ilya Maximets. >>>>> _______________________________________________ >>>>> dev mailing list >>>>> dev@openvswitch.org >>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Croid%40nvidia.com%7C6700a97a81bd41ff4a2208d9bf03a65f%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637750844672247450%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=jdOCdoJ9gWoj4kbl6dRTRAg0vIXdgoHYKAfVRjg1z44%3D&reserved=0 >>>> >>>> >>>> Hi, >>>> >>>> We encounter an offload issue with this commit that OVS doesn't add >>>> rules to TC and keeps on with OVS dp. >>>> >>>> To reproduce the issue, configure ovs with bridge and 2 ports, >>>> restart openvswitch service and then do ping between the ports. >>>> Removing/readding port to bridge or calling dpctl/dump-flows "fix" the >>>> issue and ovs calls TC again. >>>> >>>> From first look it's because br->ofproto->type is not the same pointer >>>> usually set in netdev_ports_insert() which can be dpi_class->type or >>>> static "system" and in the netdev_ports_* functions the code uses >>>> pointer equal instead of strcmp() to check the type. >>>> So netdev_ports_lookup() returns error now. >>>> >>>> 2021-12-14T10:56:18.240Z|00010|dpif_netlink(handler1)|ERR|XXX parse_flow_put 2243 get port for type system ret (nil) >>>> >>>> Even after changing all netdev_ports_* to use strcmp() and >>>> netdev_ports_lookup() is ok and we do get to tc flow put. >>>> Looks like we get an error from tc to offload. so checking this part >>>> now. >>>> >>>> 2021-12-14T11:23:09.230Z|00035|dpif_netlink(handler1)|ERR|failed to offload flow: No such device: enp8s0f0_0 >>>> >>>> So still looking on this but wanted to share about the behavior. >>>> >>>> Thanks, >>>> Roi >>> >>> I forgot to mention but just to make it clear, >>> need to set hw-offload=true. >>> >> >> Ugh. Sorry, my fault. Looking at the patch again now and this version >> seems weird. I don't know WTF I was thinking. The main thing is that >> we're calling netdev_set_dpif_type() twice with different arguments. >> First time with br->ofproto->type as a type and second time with >> dpif_normalize_type(dpif_type(dpif)). And these are not the same. >> And as you correctly noticed br->ofproto->type is not a constant string >> unlike the normalized one, hence cannot be checked by a pointer comparison. >> >> >> Could you try this: >> >> diff --git a/lib/dpif.c b/lib/dpif.c >> index 38bcb47cb..cff0bc2db 100644 >> --- a/lib/dpif.c >> +++ b/lib/dpif.c >> @@ -364,7 +364,8 @@ do_open(const char *name, const char *type, bool create, struct dpif **dpifp) >> err = netdev_open(dpif_port.name, dpif_port.type, &netdev); >> if (!err) { >> - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); >> + netdev_set_dpif_type(netdev, dpif_type_str); >> + netdev_ports_insert(netdev, &dpif_port); >> netdev_close(netdev); >> } else { >> VLOG_WARN("could not open netdev %s type %s: %s", >> @@ -602,10 +603,12 @@ dpif_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t *port_nop) >> const char *dpif_type_str = dpif_normalize_type(dpif_type(dpif)); >> struct dpif_port dpif_port; >> + netdev_set_dpif_type(netdev, dpif_type_str); >> + >> dpif_port.type = CONST_CAST(char *, netdev_get_type(netdev)); >> dpif_port.name = CONST_CAST(char *, netdev_name); >> dpif_port.port_no = port_no; >> - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); >> + netdev_ports_insert(netdev, &dpif_port); >> } >> } else { >> VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s", >> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c >> index 8075cfbd8..b8dc108f3 100644 >> --- a/lib/netdev-offload.c >> +++ b/lib/netdev-offload.c >> @@ -573,12 +573,14 @@ netdev_ports_lookup(odp_port_t port_no, const char *dpif_type) >> } >> int >> -netdev_ports_insert(struct netdev *netdev, const char *dpif_type, >> - struct dpif_port *dpif_port) >> +netdev_ports_insert(struct netdev *netdev, struct dpif_port *dpif_port) >> { >> + const char *dpif_type = netdev_get_dpif_type(netdev); >> struct port_to_netdev_data *data; >> int ifindex = netdev_get_ifindex(netdev); >> + ovs_assert(dpif_type); >> + >> ovs_rwlock_wrlock(&netdev_hmap_rwlock); >> if (netdev_ports_lookup(dpif_port->port_no, dpif_type)) { >> ovs_rwlock_unlock(&netdev_hmap_rwlock); >> @@ -596,8 +598,6 @@ netdev_ports_insert(struct netdev *netdev, const char *dpif_type, >> data->ifindex = -1; >> } >> - netdev_set_dpif_type(netdev, dpif_type); >> - >> hmap_insert(&port_to_netdev, &data->portno_node, >> netdev_ports_hash(dpif_port->port_no, dpif_type)); >> ovs_rwlock_unlock(&netdev_hmap_rwlock); >> diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h >> index e7fcedae9..43a98d499 100644 >> --- a/lib/netdev-offload.h >> +++ b/lib/netdev-offload.h >> @@ -108,8 +108,7 @@ bool netdev_is_offload_rebalance_policy_enabled(void); >> int netdev_flow_get_n_flows(struct netdev *netdev, uint64_t *n_flows); >> struct dpif_port; >> -int netdev_ports_insert(struct netdev *, const char *dpif_type, >> - struct dpif_port *); >> +int netdev_ports_insert(struct netdev *, struct dpif_port *); >> struct netdev *netdev_ports_get(odp_port_t port, const char *dpif_type); >> int netdev_ports_remove(odp_port_t port, const char *dpif_type); >> odp_port_t netdev_ifindex_to_odp_port(int ifindex); >> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c >> index 513ef7ea9..5223aa897 100644 >> --- a/vswitchd/bridge.c >> +++ b/vswitchd/bridge.c >> @@ -2052,8 +2052,6 @@ iface_do_create(const struct bridge *br, >> goto error; >> } >> - netdev_set_dpif_type(netdev, br->ofproto->type); >> - >> error = iface_set_netdev_config(iface_cfg, netdev, errp); >> if (error) { >> goto error; >> --- >> >> ? >> >> Or else, we can just revert the patch. >> >> Best regards, Ilya Maximets. > > > Hi Ilya, > > Thanks for the patch. The patch is fine but I think it doesn't do much. > > You remove the set dpif type of br->ofproto->type which is eliminating > the issue of not offloading to tc. > Leaving only the strcmp when trying to get ifindex. > > You refactored netdev_ports_insert() not to accept dpif_type to set > to the netdev and instead you set it just before calling. > So it doesn't change any logic or am I missing something? The idea is to set the dpif_type before the first call to netdev_get_ifindex. Previously we had the netdev_get_ifindex() right at the start of netdev_ports_insert() and that triggered the INFO log. The patch added another netdev_set_dpif_type() to the bridge.c which used an incorrect argument, and didn't remove the "extra" call inside the netdev_ports_insert() for some reason. What I proposed above is to keep calling netdev_set_dpif_type() before the netdev_ports_insert(), so the first call to the netdev_get_ifindex() happens after the type is already set, so we don't have the INFO log. Also removing the old "extra" call inside the netdev_ports_insert(), since it's not needed. The call moved from bridge.c to dpif.c, because we need to have access to the dpif class, but bridge.c should not. Not trying to set the dpif_type inside the netdev_ports_insert(), because it's used now outside the offloading context. So, I think, it's cleaner to move the netdev_set_dpif_type() call outside of the netdev-offload module. So, yeah, the logic did not change much. Using the correct argument for the netdev_set_dpif_type() and refactoring along a way to make the code a bit cleaner. Does that make sense? > it's ok by me if you think its more readable and tc issue is resolved. > > I also tried to reproduce the original log msg that fails to get ifindex > on vxlan_sys_4789 and I didn't reproduce so I can't verify about it. > > The origin patch assumption was that in iface_do_create() we were > missing netdev dpif type and thus got the error trying to config some > devices. so maybe we were not? The issue was that netdev_ports_insert() requests ifindex before setting the dpif type, so we wasn't able to check it. And since the userspace tunnel devices doesn't exist in the kernel, attempt to get the ifindex fails and prints the INFO log. I can confirm that netdev_ports_insert() is a first place where ifindex is requested, so setting dpif type before calling this function should fix the original logging issue. The offloading works fine with the above change applied, right? If so, I'll send an official patch. Best regards, Ilya Maximets.
On 2021-12-15 2:44 PM, Ilya Maximets wrote: > On 12/15/21 10:50, Roi Dayan wrote: >> >> >> On 2021-12-14 3:14 PM, Ilya Maximets wrote: >>> On 12/14/21 13:07, Roi Dayan wrote: >>>> >>>> >>>> On 2021-12-14 1:59 PM, Roi Dayan wrote: >>>>> >>>>> >>>>> On 2021-12-09 3:16 PM, Ilya Maximets wrote: >>>>>> On 12/8/21 03:57, lin huang wrote: >>>>>>> From: linhuang <linhuang@ruijie.com.cn> >>>>>>> >>>>>>> Userspace tunnel doesn't have a valid device in the kernel. So >>>>>>> get_ifindex() function (ioctl) always get error during >>>>>>> adding a port, deleting a port or updating a port status. >>>>>>> >>>>>>> The info log is >>>>>>> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) >>>>>>> on vxlan_sys_4789 device failed: No such device" >>>>>>> >>>>>>> If there are a lot of userspace tunnel ports on a bridge, the >>>>>>> iface_refresh_netdev_status() function will spend a lot of time. >>>>>>> >>>>>>> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just >>>>>>> return -ENODEV. >>>>>>> >>>>>>> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >>>>>>> Test-by: Mike Pattrick <mkp@redhat.com> >>>>>>> --- >>>>>>> lib/netdev-vport.c | 6 ++++++ >>>>>>> vswitchd/bridge.c | 2 ++ >>>>>>> 2 files changed, 8 insertions(+) >>>>>>> >>>>>> >>>>>> Applied. Thanks! >>>>>> >>>>>> Best regards, Ilya Maximets. >>>>>> _______________________________________________ >>>>>> dev mailing list >>>>>> dev@openvswitch.org >>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Croid%40nvidia.com%7C68774137f0cd4286502e08d9bfc8b9a8%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637751691102561850%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=OxGVmhBKUZpiBAkH9e3OLWPV%2F4O7baVYmosmBsXwQb8%3D&reserved=0 >>>>> >>>>> >>>>> Hi, >>>>> >>>>> We encounter an offload issue with this commit that OVS doesn't add >>>>> rules to TC and keeps on with OVS dp. >>>>> >>>>> To reproduce the issue, configure ovs with bridge and 2 ports, >>>>> restart openvswitch service and then do ping between the ports. >>>>> Removing/readding port to bridge or calling dpctl/dump-flows "fix" the >>>>> issue and ovs calls TC again. >>>>> >>>>> From first look it's because br->ofproto->type is not the same pointer >>>>> usually set in netdev_ports_insert() which can be dpi_class->type or >>>>> static "system" and in the netdev_ports_* functions the code uses >>>>> pointer equal instead of strcmp() to check the type. >>>>> So netdev_ports_lookup() returns error now. >>>>> >>>>> 2021-12-14T10:56:18.240Z|00010|dpif_netlink(handler1)|ERR|XXX parse_flow_put 2243 get port for type system ret (nil) >>>>> >>>>> Even after changing all netdev_ports_* to use strcmp() and >>>>> netdev_ports_lookup() is ok and we do get to tc flow put. >>>>> Looks like we get an error from tc to offload. so checking this part >>>>> now. >>>>> >>>>> 2021-12-14T11:23:09.230Z|00035|dpif_netlink(handler1)|ERR|failed to offload flow: No such device: enp8s0f0_0 >>>>> >>>>> So still looking on this but wanted to share about the behavior. >>>>> >>>>> Thanks, >>>>> Roi >>>> >>>> I forgot to mention but just to make it clear, >>>> need to set hw-offload=true. >>>> >>> >>> Ugh. Sorry, my fault. Looking at the patch again now and this version >>> seems weird. I don't know WTF I was thinking. The main thing is that >>> we're calling netdev_set_dpif_type() twice with different arguments. >>> First time with br->ofproto->type as a type and second time with >>> dpif_normalize_type(dpif_type(dpif)). And these are not the same. >>> And as you correctly noticed br->ofproto->type is not a constant string >>> unlike the normalized one, hence cannot be checked by a pointer comparison. >>> >>> >>> Could you try this: >>> >>> diff --git a/lib/dpif.c b/lib/dpif.c >>> index 38bcb47cb..cff0bc2db 100644 >>> --- a/lib/dpif.c >>> +++ b/lib/dpif.c >>> @@ -364,7 +364,8 @@ do_open(const char *name, const char *type, bool create, struct dpif **dpifp) >>> err = netdev_open(dpif_port.name, dpif_port.type, &netdev); >>> if (!err) { >>> - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); >>> + netdev_set_dpif_type(netdev, dpif_type_str); >>> + netdev_ports_insert(netdev, &dpif_port); >>> netdev_close(netdev); >>> } else { >>> VLOG_WARN("could not open netdev %s type %s: %s", >>> @@ -602,10 +603,12 @@ dpif_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t *port_nop) >>> const char *dpif_type_str = dpif_normalize_type(dpif_type(dpif)); >>> struct dpif_port dpif_port; >>> + netdev_set_dpif_type(netdev, dpif_type_str); >>> + >>> dpif_port.type = CONST_CAST(char *, netdev_get_type(netdev)); >>> dpif_port.name = CONST_CAST(char *, netdev_name); >>> dpif_port.port_no = port_no; >>> - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); >>> + netdev_ports_insert(netdev, &dpif_port); >>> } >>> } else { >>> VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s", >>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c >>> index 8075cfbd8..b8dc108f3 100644 >>> --- a/lib/netdev-offload.c >>> +++ b/lib/netdev-offload.c >>> @@ -573,12 +573,14 @@ netdev_ports_lookup(odp_port_t port_no, const char *dpif_type) >>> } >>> int >>> -netdev_ports_insert(struct netdev *netdev, const char *dpif_type, >>> - struct dpif_port *dpif_port) >>> +netdev_ports_insert(struct netdev *netdev, struct dpif_port *dpif_port) >>> { >>> + const char *dpif_type = netdev_get_dpif_type(netdev); >>> struct port_to_netdev_data *data; >>> int ifindex = netdev_get_ifindex(netdev); >>> + ovs_assert(dpif_type); >>> + >>> ovs_rwlock_wrlock(&netdev_hmap_rwlock); >>> if (netdev_ports_lookup(dpif_port->port_no, dpif_type)) { >>> ovs_rwlock_unlock(&netdev_hmap_rwlock); >>> @@ -596,8 +598,6 @@ netdev_ports_insert(struct netdev *netdev, const char *dpif_type, >>> data->ifindex = -1; >>> } >>> - netdev_set_dpif_type(netdev, dpif_type); >>> - >>> hmap_insert(&port_to_netdev, &data->portno_node, >>> netdev_ports_hash(dpif_port->port_no, dpif_type)); >>> ovs_rwlock_unlock(&netdev_hmap_rwlock); >>> diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h >>> index e7fcedae9..43a98d499 100644 >>> --- a/lib/netdev-offload.h >>> +++ b/lib/netdev-offload.h >>> @@ -108,8 +108,7 @@ bool netdev_is_offload_rebalance_policy_enabled(void); >>> int netdev_flow_get_n_flows(struct netdev *netdev, uint64_t *n_flows); >>> struct dpif_port; >>> -int netdev_ports_insert(struct netdev *, const char *dpif_type, >>> - struct dpif_port *); >>> +int netdev_ports_insert(struct netdev *, struct dpif_port *); >>> struct netdev *netdev_ports_get(odp_port_t port, const char *dpif_type); >>> int netdev_ports_remove(odp_port_t port, const char *dpif_type); >>> odp_port_t netdev_ifindex_to_odp_port(int ifindex); >>> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c >>> index 513ef7ea9..5223aa897 100644 >>> --- a/vswitchd/bridge.c >>> +++ b/vswitchd/bridge.c >>> @@ -2052,8 +2052,6 @@ iface_do_create(const struct bridge *br, >>> goto error; >>> } >>> - netdev_set_dpif_type(netdev, br->ofproto->type); >>> - >>> error = iface_set_netdev_config(iface_cfg, netdev, errp); >>> if (error) { >>> goto error; >>> --- >>> >>> ? >>> >>> Or else, we can just revert the patch. >>> >>> Best regards, Ilya Maximets. >> >> >> Hi Ilya, >> >> Thanks for the patch. The patch is fine but I think it doesn't do much. >> >> You remove the set dpif type of br->ofproto->type which is eliminating >> the issue of not offloading to tc. >> Leaving only the strcmp when trying to get ifindex. >> >> You refactored netdev_ports_insert() not to accept dpif_type to set >> to the netdev and instead you set it just before calling. >> So it doesn't change any logic or am I missing something? > > The idea is to set the dpif_type before the first call to netdev_get_ifindex. > Previously we had the netdev_get_ifindex() right at the start of > netdev_ports_insert() and that triggered the INFO log. The patch added > another netdev_set_dpif_type() to the bridge.c which used an incorrect > argument, and didn't remove the "extra" call inside the netdev_ports_insert() > for some reason. > > What I proposed above is to keep calling netdev_set_dpif_type() before the > netdev_ports_insert(), so the first call to the netdev_get_ifindex() happens > after the type is already set, so we don't have the INFO log. Also removing > the old "extra" call inside the netdev_ports_insert(), since it's not needed. > > The call moved from bridge.c to dpif.c, because we need to have access to the > dpif class, but bridge.c should not. > > Not trying to set the dpif_type inside the netdev_ports_insert(), because it's > used now outside the offloading context. So, I think, it's cleaner to move the > netdev_set_dpif_type() call outside of the netdev-offload module. > > So, yeah, the logic did not change much. Using the correct argument for > the netdev_set_dpif_type() and refactoring along a way to make the code > a bit cleaner. > > Does that make sense? > >> it's ok by me if you think its more readable and tc issue is resolved. >> >> I also tried to reproduce the original log msg that fails to get ifindex >> on vxlan_sys_4789 and I didn't reproduce so I can't verify about it. >> >> The origin patch assumption was that in iface_do_create() we were >> missing netdev dpif type and thus got the error trying to config some >> devices. so maybe we were not? > > The issue was that netdev_ports_insert() requests ifindex before > setting the dpif type, so we wasn't able to check it. And since > the userspace tunnel devices doesn't exist in the kernel, attempt > to get the ifindex fails and prints the INFO log. > > I can confirm that netdev_ports_insert() is a first place where > ifindex is requested, so setting dpif type before calling this > function should fix the original logging issue. > > > The offloading works fine with the above change applied, right? > If so, I'll send an official patch. > > Best regards, Ilya Maximets. right. I missed the get_ifindex call at the top of netdev_ports_insert() when reading. so everything looks good. thanks. if important, you can also move get_ifindex call inside netdev_ports_insert() to right before we use it so if port exist there won't be a call to get_ifindex at all.
On 12/16/21 14:25, Roi Dayan wrote: > > > On 2021-12-15 2:44 PM, Ilya Maximets wrote: >> On 12/15/21 10:50, Roi Dayan wrote: >>> >>> >>> On 2021-12-14 3:14 PM, Ilya Maximets wrote: >>>> On 12/14/21 13:07, Roi Dayan wrote: >>>>> >>>>> >>>>> On 2021-12-14 1:59 PM, Roi Dayan wrote: >>>>>> >>>>>> >>>>>> On 2021-12-09 3:16 PM, Ilya Maximets wrote: >>>>>>> On 12/8/21 03:57, lin huang wrote: >>>>>>>> From: linhuang <linhuang@ruijie.com.cn> >>>>>>>> >>>>>>>> Userspace tunnel doesn't have a valid device in the kernel. So >>>>>>>> get_ifindex() function (ioctl) always get error during >>>>>>>> adding a port, deleting a port or updating a port status. >>>>>>>> >>>>>>>> The info log is >>>>>>>> "2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX) >>>>>>>> on vxlan_sys_4789 device failed: No such device" >>>>>>>> >>>>>>>> If there are a lot of userspace tunnel ports on a bridge, the >>>>>>>> iface_refresh_netdev_status() function will spend a lot of time. >>>>>>>> >>>>>>>> So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just >>>>>>>> return -ENODEV. >>>>>>>> >>>>>>>> Signed-off-by: Lin Huang <linhuang@ruijie.com.cn> >>>>>>>> Test-by: Mike Pattrick <mkp@redhat.com> >>>>>>>> --- >>>>>>>> lib/netdev-vport.c | 6 ++++++ >>>>>>>> vswitchd/bridge.c | 2 ++ >>>>>>>> 2 files changed, 8 insertions(+) >>>>>>>> >>>>>>> >>>>>>> Applied. Thanks! >>>>>>> >>>>>>> Best regards, Ilya Maximets. >>>>>>> _______________________________________________ >>>>>>> dev mailing list >>>>>>> dev@openvswitch.org >>>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=04%7C01%7Croid%40nvidia.com%7C68774137f0cd4286502e08d9bfc8b9a8%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637751691102561850%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=OxGVmhBKUZpiBAkH9e3OLWPV%2F4O7baVYmosmBsXwQb8%3D&reserved=0 >>>>>> >>>>>> >>>>>> Hi, >>>>>> >>>>>> We encounter an offload issue with this commit that OVS doesn't add >>>>>> rules to TC and keeps on with OVS dp. >>>>>> >>>>>> To reproduce the issue, configure ovs with bridge and 2 ports, >>>>>> restart openvswitch service and then do ping between the ports. >>>>>> Removing/readding port to bridge or calling dpctl/dump-flows "fix" the >>>>>> issue and ovs calls TC again. >>>>>> >>>>>> From first look it's because br->ofproto->type is not the same pointer >>>>>> usually set in netdev_ports_insert() which can be dpi_class->type or >>>>>> static "system" and in the netdev_ports_* functions the code uses >>>>>> pointer equal instead of strcmp() to check the type. >>>>>> So netdev_ports_lookup() returns error now. >>>>>> >>>>>> 2021-12-14T10:56:18.240Z|00010|dpif_netlink(handler1)|ERR|XXX parse_flow_put 2243 get port for type system ret (nil) >>>>>> >>>>>> Even after changing all netdev_ports_* to use strcmp() and >>>>>> netdev_ports_lookup() is ok and we do get to tc flow put. >>>>>> Looks like we get an error from tc to offload. so checking this part >>>>>> now. >>>>>> >>>>>> 2021-12-14T11:23:09.230Z|00035|dpif_netlink(handler1)|ERR|failed to offload flow: No such device: enp8s0f0_0 >>>>>> >>>>>> So still looking on this but wanted to share about the behavior. >>>>>> >>>>>> Thanks, >>>>>> Roi >>>>> >>>>> I forgot to mention but just to make it clear, >>>>> need to set hw-offload=true. >>>>> >>>> >>>> Ugh. Sorry, my fault. Looking at the patch again now and this version >>>> seems weird. I don't know WTF I was thinking. The main thing is that >>>> we're calling netdev_set_dpif_type() twice with different arguments. >>>> First time with br->ofproto->type as a type and second time with >>>> dpif_normalize_type(dpif_type(dpif)). And these are not the same. >>>> And as you correctly noticed br->ofproto->type is not a constant string >>>> unlike the normalized one, hence cannot be checked by a pointer comparison. >>>> >>>> >>>> Could you try this: >>>> >>>> diff --git a/lib/dpif.c b/lib/dpif.c >>>> index 38bcb47cb..cff0bc2db 100644 >>>> --- a/lib/dpif.c >>>> +++ b/lib/dpif.c >>>> @@ -364,7 +364,8 @@ do_open(const char *name, const char *type, bool create, struct dpif **dpifp) >>>> err = netdev_open(dpif_port.name, dpif_port.type, &netdev); >>>> if (!err) { >>>> - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); >>>> + netdev_set_dpif_type(netdev, dpif_type_str); >>>> + netdev_ports_insert(netdev, &dpif_port); >>>> netdev_close(netdev); >>>> } else { >>>> VLOG_WARN("could not open netdev %s type %s: %s", >>>> @@ -602,10 +603,12 @@ dpif_port_add(struct dpif *dpif, struct netdev *netdev, odp_port_t *port_nop) >>>> const char *dpif_type_str = dpif_normalize_type(dpif_type(dpif)); >>>> struct dpif_port dpif_port; >>>> + netdev_set_dpif_type(netdev, dpif_type_str); >>>> + >>>> dpif_port.type = CONST_CAST(char *, netdev_get_type(netdev)); >>>> dpif_port.name = CONST_CAST(char *, netdev_name); >>>> dpif_port.port_no = port_no; >>>> - netdev_ports_insert(netdev, dpif_type_str, &dpif_port); >>>> + netdev_ports_insert(netdev, &dpif_port); >>>> } >>>> } else { >>>> VLOG_WARN_RL(&error_rl, "%s: failed to add %s as port: %s", >>>> diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c >>>> index 8075cfbd8..b8dc108f3 100644 >>>> --- a/lib/netdev-offload.c >>>> +++ b/lib/netdev-offload.c >>>> @@ -573,12 +573,14 @@ netdev_ports_lookup(odp_port_t port_no, const char *dpif_type) >>>> } >>>> int >>>> -netdev_ports_insert(struct netdev *netdev, const char *dpif_type, >>>> - struct dpif_port *dpif_port) >>>> +netdev_ports_insert(struct netdev *netdev, struct dpif_port *dpif_port) >>>> { >>>> + const char *dpif_type = netdev_get_dpif_type(netdev); >>>> struct port_to_netdev_data *data; >>>> int ifindex = netdev_get_ifindex(netdev); >>>> + ovs_assert(dpif_type); >>>> + >>>> ovs_rwlock_wrlock(&netdev_hmap_rwlock); >>>> if (netdev_ports_lookup(dpif_port->port_no, dpif_type)) { >>>> ovs_rwlock_unlock(&netdev_hmap_rwlock); >>>> @@ -596,8 +598,6 @@ netdev_ports_insert(struct netdev *netdev, const char *dpif_type, >>>> data->ifindex = -1; >>>> } >>>> - netdev_set_dpif_type(netdev, dpif_type); >>>> - >>>> hmap_insert(&port_to_netdev, &data->portno_node, >>>> netdev_ports_hash(dpif_port->port_no, dpif_type)); >>>> ovs_rwlock_unlock(&netdev_hmap_rwlock); >>>> diff --git a/lib/netdev-offload.h b/lib/netdev-offload.h >>>> index e7fcedae9..43a98d499 100644 >>>> --- a/lib/netdev-offload.h >>>> +++ b/lib/netdev-offload.h >>>> @@ -108,8 +108,7 @@ bool netdev_is_offload_rebalance_policy_enabled(void); >>>> int netdev_flow_get_n_flows(struct netdev *netdev, uint64_t *n_flows); >>>> struct dpif_port; >>>> -int netdev_ports_insert(struct netdev *, const char *dpif_type, >>>> - struct dpif_port *); >>>> +int netdev_ports_insert(struct netdev *, struct dpif_port *); >>>> struct netdev *netdev_ports_get(odp_port_t port, const char *dpif_type); >>>> int netdev_ports_remove(odp_port_t port, const char *dpif_type); >>>> odp_port_t netdev_ifindex_to_odp_port(int ifindex); >>>> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c >>>> index 513ef7ea9..5223aa897 100644 >>>> --- a/vswitchd/bridge.c >>>> +++ b/vswitchd/bridge.c >>>> @@ -2052,8 +2052,6 @@ iface_do_create(const struct bridge *br, >>>> goto error; >>>> } >>>> - netdev_set_dpif_type(netdev, br->ofproto->type); >>>> - >>>> error = iface_set_netdev_config(iface_cfg, netdev, errp); >>>> if (error) { >>>> goto error; >>>> --- >>>> >>>> ? >>>> >>>> Or else, we can just revert the patch. >>>> >>>> Best regards, Ilya Maximets. >>> >>> >>> Hi Ilya, >>> >>> Thanks for the patch. The patch is fine but I think it doesn't do much. >>> >>> You remove the set dpif type of br->ofproto->type which is eliminating >>> the issue of not offloading to tc. >>> Leaving only the strcmp when trying to get ifindex. >>> >>> You refactored netdev_ports_insert() not to accept dpif_type to set >>> to the netdev and instead you set it just before calling. >>> So it doesn't change any logic or am I missing something? >> >> The idea is to set the dpif_type before the first call to netdev_get_ifindex. >> Previously we had the netdev_get_ifindex() right at the start of >> netdev_ports_insert() and that triggered the INFO log. The patch added >> another netdev_set_dpif_type() to the bridge.c which used an incorrect >> argument, and didn't remove the "extra" call inside the netdev_ports_insert() >> for some reason. >> >> What I proposed above is to keep calling netdev_set_dpif_type() before the >> netdev_ports_insert(), so the first call to the netdev_get_ifindex() happens >> after the type is already set, so we don't have the INFO log. Also removing >> the old "extra" call inside the netdev_ports_insert(), since it's not needed. >> >> The call moved from bridge.c to dpif.c, because we need to have access to the >> dpif class, but bridge.c should not. >> >> Not trying to set the dpif_type inside the netdev_ports_insert(), because it's >> used now outside the offloading context. So, I think, it's cleaner to move the >> netdev_set_dpif_type() call outside of the netdev-offload module. >> >> So, yeah, the logic did not change much. Using the correct argument for >> the netdev_set_dpif_type() and refactoring along a way to make the code >> a bit cleaner. >> >> Does that make sense? >> >>> it's ok by me if you think its more readable and tc issue is resolved. >>> >>> I also tried to reproduce the original log msg that fails to get ifindex >>> on vxlan_sys_4789 and I didn't reproduce so I can't verify about it. >>> >>> The origin patch assumption was that in iface_do_create() we were >>> missing netdev dpif type and thus got the error trying to config some >>> devices. so maybe we were not? >> >> The issue was that netdev_ports_insert() requests ifindex before >> setting the dpif type, so we wasn't able to check it. And since >> the userspace tunnel devices doesn't exist in the kernel, attempt >> to get the ifindex fails and prints the INFO log. >> >> I can confirm that netdev_ports_insert() is a first place where >> ifindex is requested, so setting dpif type before calling this >> function should fix the original logging issue. >> >> >> The offloading works fine with the above change applied, right? >> If so, I'll send an official patch. >> >> Best regards, Ilya Maximets. > > right. I missed the get_ifindex call at the top of > netdev_ports_insert() when reading. > > so everything looks good. thanks. OK. Thanks! I'll prepare and send a proper patch. > > if important, you can also move get_ifindex call inside > netdev_ports_insert() to right before we use it so if port > exist there won't be a call to get_ifindex at all. That's an interesting point. But it might be better to not move a syscall under the write lock here. I'll keep as-is for now. We can change that later, if that will be an issue. Best regards, Ilya Maximets.
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index 499c0291c9..64331f74bf 100644 --- a/lib/netdev-vport.c +++ b/lib/netdev-vport.c @@ -1151,6 +1151,12 @@ netdev_vport_get_ifindex(const struct netdev *netdev_) { char buf[NETDEV_VPORT_NAME_BUFSIZE]; const char *name = netdev_vport_get_dpif_port(netdev_, buf, sizeof(buf)); + const char *dpif_type = netdev_get_dpif_type(netdev_); + + if (dpif_type && strcmp(dpif_type, "system")) { + /* Not a system device. */ + return -ENODEV; + } return linux_get_ifindex(name); } diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index 5223aa8970..513ef7ea9c 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -2052,6 +2052,8 @@ iface_do_create(const struct bridge *br, goto error; } + netdev_set_dpif_type(netdev, br->ofproto->type); + error = iface_set_netdev_config(iface_cfg, netdev, errp); if (error) { goto error;