Message ID | 20220704224505.1117988-3-i.maximets@ovn.org |
---|---|
State | Superseded |
Headers | show |
Series | tc: Fixes for tunnel offloading. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 5 Jul 2022, at 0:45, Ilya Maximets wrote: > TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may > create a masked match on this field. > > This change is important as we're not clearing the masks which wasn't > really used, so if OVS requests match on ports, we should use the > mask and clear, otherwise offloading will fail. > > Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Changes look good to me, and all system-traffic.at tests that were passing without the change are still passing, including the failing ERSPAN ones. Acked-by: Eelco Chaudron <echaudro@redhat.com>
On 2022-07-06 1:26 PM, Eelco Chaudron wrote: > > > On 5 Jul 2022, at 0:45, Ilya Maximets wrote: > >> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >> create a masked match on this field. >> >> This change is important as we're not clearing the masks which wasn't >> really used, so if OVS requests match on ports, we should use the >> mask and clear, otherwise offloading will fail. >> >> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> > > Changes look good to me, and all system-traffic.at tests that were passing without the change are still passing, including the failing ERSPAN ones. > > Acked-by: Eelco Chaudron <echaudro@redhat.com> > can we postpone to merge this? After this patch simple vxlan rules won't have enc dst port on decap rules and currently mlx5 driver must have enc dst port to do the offload of the decap rule. I'm checking about this limitation and if we can remove it but if we can postpone a bit so we won't break users with our nics.. thanks, Roi
On 2022-07-06 3:34 PM, Roi Dayan wrote: > > > On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >> >> >> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >> >>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>> create a masked match on this field. >>> >>> This change is important as we're not clearing the masks which wasn't >>> really used, so if OVS requests match on ports, we should use the >>> mask and clear, otherwise offloading will fail. >>> >>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >> >> Changes look good to me, and all system-traffic.at tests that were >> passing without the change are still passing, including the failing >> ERSPAN ones. >> >> Acked-by: Eelco Chaudron <echaudro@redhat.com> >> > > can we postpone to merge this? > After this patch simple vxlan rules won't have enc dst port on decap > rules and currently mlx5 driver must have enc dst port to do the > offload of the decap rule. > I'm checking about this limitation and if we can remove it but > if we can postpone a bit so we won't break users with our nics.. > > thanks, > Roi I finished with the testing and checking I needed. I don't have a problem with the patch. if something else will raise i'll start a different discussion. Acked-by: Roi Dayan <roid@nvidia.com>
On 2022-07-06 5:00 PM, Roi Dayan wrote: > > > On 2022-07-06 3:34 PM, Roi Dayan wrote: >> >> >> On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >>> >>> >>> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >>> >>>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>>> create a masked match on this field. >>>> >>>> This change is important as we're not clearing the masks which wasn't >>>> really used, so if OVS requests match on ports, we should use the >>>> mask and clear, otherwise offloading will fail. >>>> >>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >>> >>> Changes look good to me, and all system-traffic.at tests that were >>> passing without the change are still passing, including the failing >>> ERSPAN ones. >>> >>> Acked-by: Eelco Chaudron <echaudro@redhat.com> >>> >> >> can we postpone to merge this? >> After this patch simple vxlan rules won't have enc dst port on decap >> rules and currently mlx5 driver must have enc dst port to do the >> offload of the decap rule. >> I'm checking about this limitation and if we can remove it but >> if we can postpone a bit so we won't break users with our nics.. >> >> thanks, >> Roi > > > I finished with the testing and checking I needed. > I don't have a problem with the patch. if something else will raise i'll > start a different discussion. > > Acked-by: Roi Dayan <roid@nvidia.com> I have a concern now. today for an offload driver to offload a tunnel decap rule it registers to TC indirect callback. Adding a TC rule on a tunnel device calls all registered drivers to offload the rule. With a single tunnel on a device this is ok but if a system have multiple tunnels with same src/dst ip but different enc dst port, adding a tc rule on one of the tunnel devices without matching enc dst port will call a driver to offload such a rule in the hw, but that rule is potentially matching all sw tunnels. a driver will have to match implicit on the the enc_dst_port to distinguish between the tunnels. OVS doesn't match explicit on enc_dst_port also when having multiple tunnels as the in SW the packets will be on the correct tunnel device and then pass tc or ovs datapath. After this patch offload drivers must to implicit match the enc_dst_port to avoid having hw rule matching all the tunnels with same src/dst ip. taking the tunnel maybe. e.g. for vxlan tunnel from vxlan->cfg.dst_port What do you think?
On 2022-07-07 9:48 AM, Roi Dayan wrote: > > > On 2022-07-06 5:00 PM, Roi Dayan wrote: >> >> >> On 2022-07-06 3:34 PM, Roi Dayan wrote: >>> >>> >>> On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >>>> >>>> >>>> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >>>> >>>>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>>>> create a masked match on this field. >>>>> >>>>> This change is important as we're not clearing the masks which wasn't >>>>> really used, so if OVS requests match on ports, we should use the >>>>> mask and clear, otherwise offloading will fail. >>>>> >>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >>>> >>>> Changes look good to me, and all system-traffic.at tests that were >>>> passing without the change are still passing, including the failing >>>> ERSPAN ones. >>>> >>>> Acked-by: Eelco Chaudron <echaudro@redhat.com> >>>> >>> >>> can we postpone to merge this? >>> After this patch simple vxlan rules won't have enc dst port on decap >>> rules and currently mlx5 driver must have enc dst port to do the >>> offload of the decap rule. >>> I'm checking about this limitation and if we can remove it but >>> if we can postpone a bit so we won't break users with our nics.. >>> >>> thanks, >>> Roi >> >> >> I finished with the testing and checking I needed. >> I don't have a problem with the patch. if something else will raise i'll >> start a different discussion. >> >> Acked-by: Roi Dayan <roid@nvidia.com> > > > I have a concern now. today for an offload driver to offload > a tunnel decap rule it registers to TC indirect callback. > > Adding a TC rule on a tunnel device calls all registered drivers > to offload the rule. > With a single tunnel on a device this is ok but if a system have > multiple tunnels with same src/dst ip but different enc dst port, > adding a tc rule on one of the tunnel devices without matching > enc dst port will call a driver to offload such a rule in the hw, > but that rule is potentially matching all sw tunnels. a driver > will have to match implicit on the the enc_dst_port to distinguish > between the tunnels. > > OVS doesn't match explicit on enc_dst_port also when having multiple > tunnels as the in SW the packets will be on the correct > tunnel device and then pass tc or ovs datapath. > > After this patch offload drivers must to implicit match the > enc_dst_port to avoid having hw rule matching all the tunnels > with same src/dst ip. taking the tunnel maybe. > e.g. for vxlan tunnel from vxlan->cfg.dst_port > > What do you think? to support what i said about implicit match for geneve and mpls for example will require kernel change to actually expose their struct in a header file. I think we should keep ovs TC to keep explicit match on enc_dst_port. We can add a comment explaining why.
On 7/7/22 09:01, Roi Dayan wrote: > > > On 2022-07-07 9:48 AM, Roi Dayan wrote: >> >> >> On 2022-07-06 5:00 PM, Roi Dayan wrote: >>> >>> >>> On 2022-07-06 3:34 PM, Roi Dayan wrote: >>>> >>>> >>>> On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >>>>> >>>>> >>>>> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >>>>> >>>>>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>>>>> create a masked match on this field. >>>>>> >>>>>> This change is important as we're not clearing the masks which wasn't >>>>>> really used, so if OVS requests match on ports, we should use the >>>>>> mask and clear, otherwise offloading will fail. >>>>>> >>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >>>>> >>>>> Changes look good to me, and all system-traffic.at tests that were passing without the change are still passing, including the failing ERSPAN ones. >>>>> >>>>> Acked-by: Eelco Chaudron <echaudro@redhat.com> >>>>> >>>> >>>> can we postpone to merge this? >>>> After this patch simple vxlan rules won't have enc dst port on decap >>>> rules and currently mlx5 driver must have enc dst port to do the >>>> offload of the decap rule. >>>> I'm checking about this limitation and if we can remove it but >>>> if we can postpone a bit so we won't break users with our nics.. >>>> >>>> thanks, >>>> Roi >>> >>> >>> I finished with the testing and checking I needed. >>> I don't have a problem with the patch. if something else will raise i'll >>> start a different discussion. >>> >>> Acked-by: Roi Dayan <roid@nvidia.com> >> >> >> I have a concern now. today for an offload driver to offload >> a tunnel decap rule it registers to TC indirect callback. >> >> Adding a TC rule on a tunnel device calls all registered drivers >> to offload the rule. >> With a single tunnel on a device this is ok but if a system have multiple tunnels with same src/dst ip but different enc dst port, >> adding a tc rule on one of the tunnel devices without matching >> enc dst port will call a driver to offload such a rule in the hw, >> but that rule is potentially matching all sw tunnels. a driver >> will have to match implicit on the the enc_dst_port to distinguish >> between the tunnels. >> >> OVS doesn't match explicit on enc_dst_port also when having multiple >> tunnels as the in SW the packets will be on the correct >> tunnel device and then pass tc or ovs datapath. >> >> After this patch offload drivers must to implicit match the >> enc_dst_port to avoid having hw rule matching all the tunnels >> with same src/dst ip. taking the tunnel maybe. >> e.g. for vxlan tunnel from vxlan->cfg.dst_port >> >> What do you think? > > to support what i said about implicit match for geneve and mpls for > example will require kernel change to actually expose their struct > in a header file. But the match contains the input port (tunnel port) right? And that should be sufficient to distinguish tunnels. Does it mean that HW offload in the kernel ignores that match? Or am I reading that wrong? > > I think we should keep ovs TC to keep explicit match on enc_dst_port. > We can add a comment explaining why.
On 2022-07-12 12:23 PM, Ilya Maximets wrote: > On 7/7/22 09:01, Roi Dayan wrote: >> >> >> On 2022-07-07 9:48 AM, Roi Dayan wrote: >>> >>> >>> On 2022-07-06 5:00 PM, Roi Dayan wrote: >>>> >>>> >>>> On 2022-07-06 3:34 PM, Roi Dayan wrote: >>>>> >>>>> >>>>> On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >>>>>> >>>>>> >>>>>> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >>>>>> >>>>>>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>>>>>> create a masked match on this field. >>>>>>> >>>>>>> This change is important as we're not clearing the masks which wasn't >>>>>>> really used, so if OVS requests match on ports, we should use the >>>>>>> mask and clear, otherwise offloading will fail. >>>>>>> >>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >>>>>> >>>>>> Changes look good to me, and all system-traffic.at tests that were passing without the change are still passing, including the failing ERSPAN ones. >>>>>> >>>>>> Acked-by: Eelco Chaudron <echaudro@redhat.com> >>>>>> >>>>> >>>>> can we postpone to merge this? >>>>> After this patch simple vxlan rules won't have enc dst port on decap >>>>> rules and currently mlx5 driver must have enc dst port to do the >>>>> offload of the decap rule. >>>>> I'm checking about this limitation and if we can remove it but >>>>> if we can postpone a bit so we won't break users with our nics.. >>>>> >>>>> thanks, >>>>> Roi >>>> >>>> >>>> I finished with the testing and checking I needed. >>>> I don't have a problem with the patch. if something else will raise i'll >>>> start a different discussion. >>>> >>>> Acked-by: Roi Dayan <roid@nvidia.com> >>> >>> >>> I have a concern now. today for an offload driver to offload >>> a tunnel decap rule it registers to TC indirect callback. >>> >>> Adding a TC rule on a tunnel device calls all registered drivers >>> to offload the rule. >>> With a single tunnel on a device this is ok but if a system have multiple tunnels with same src/dst ip but different enc dst port, >>> adding a tc rule on one of the tunnel devices without matching >>> enc dst port will call a driver to offload such a rule in the hw, >>> but that rule is potentially matching all sw tunnels. a driver >>> will have to match implicit on the the enc_dst_port to distinguish >>> between the tunnels. >>> >>> OVS doesn't match explicit on enc_dst_port also when having multiple >>> tunnels as the in SW the packets will be on the correct >>> tunnel device and then pass tc or ovs datapath. >>> >>> After this patch offload drivers must to implicit match the >>> enc_dst_port to avoid having hw rule matching all the tunnels >>> with same src/dst ip. taking the tunnel maybe. >>> e.g. for vxlan tunnel from vxlan->cfg.dst_port >>> >>> What do you think? >> >> to support what i said about implicit match for geneve and mpls for >> example will require kernel change to actually expose their struct >> in a header file. > > But the match contains the input port (tunnel port) right? And that > should be sufficient to distinguish tunnels. Does it mean that HW > offload in the kernel ignores that match? Or am I reading that wrong? > in sw the matching of the rules is done after the packet is already placed on the sw port. so rule on vxlan1 will patch only packets on vxlan1. for offloaded rule, the input port is ignored for tunnels as once offloaded, the hw match an incoming packet and is not aware of the sw ports (tunnels). >> >> I think we should keep ovs TC to keep explicit match on enc_dst_port. >> We can add a comment explaining why. >
On 7/12/22 11:47, Roi Dayan wrote: > > > On 2022-07-12 12:23 PM, Ilya Maximets wrote: >> On 7/7/22 09:01, Roi Dayan wrote: >>> >>> >>> On 2022-07-07 9:48 AM, Roi Dayan wrote: >>>> >>>> >>>> On 2022-07-06 5:00 PM, Roi Dayan wrote: >>>>> >>>>> >>>>> On 2022-07-06 3:34 PM, Roi Dayan wrote: >>>>>> >>>>>> >>>>>> On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >>>>>>> >>>>>>> >>>>>>> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >>>>>>> >>>>>>>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>>>>>>> create a masked match on this field. >>>>>>>> >>>>>>>> This change is important as we're not clearing the masks which wasn't >>>>>>>> really used, so if OVS requests match on ports, we should use the >>>>>>>> mask and clear, otherwise offloading will fail. >>>>>>>> >>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >>>>>>> >>>>>>> Changes look good to me, and all system-traffic.at tests that were passing without the change are still passing, including the failing ERSPAN ones. >>>>>>> >>>>>>> Acked-by: Eelco Chaudron <echaudro@redhat.com> >>>>>>> >>>>>> >>>>>> can we postpone to merge this? >>>>>> After this patch simple vxlan rules won't have enc dst port on decap >>>>>> rules and currently mlx5 driver must have enc dst port to do the >>>>>> offload of the decap rule. >>>>>> I'm checking about this limitation and if we can remove it but >>>>>> if we can postpone a bit so we won't break users with our nics.. >>>>>> >>>>>> thanks, >>>>>> Roi >>>>> >>>>> >>>>> I finished with the testing and checking I needed. >>>>> I don't have a problem with the patch. if something else will raise i'll >>>>> start a different discussion. >>>>> >>>>> Acked-by: Roi Dayan <roid@nvidia.com> >>>> >>>> >>>> I have a concern now. today for an offload driver to offload >>>> a tunnel decap rule it registers to TC indirect callback. >>>> >>>> Adding a TC rule on a tunnel device calls all registered drivers >>>> to offload the rule. >>>> With a single tunnel on a device this is ok but if a system have multiple tunnels with same src/dst ip but different enc dst port, >>>> adding a tc rule on one of the tunnel devices without matching >>>> enc dst port will call a driver to offload such a rule in the hw, >>>> but that rule is potentially matching all sw tunnels. a driver >>>> will have to match implicit on the the enc_dst_port to distinguish >>>> between the tunnels. >>>> >>>> OVS doesn't match explicit on enc_dst_port also when having multiple >>>> tunnels as the in SW the packets will be on the correct >>>> tunnel device and then pass tc or ovs datapath. >>>> >>>> After this patch offload drivers must to implicit match the >>>> enc_dst_port to avoid having hw rule matching all the tunnels >>>> with same src/dst ip. taking the tunnel maybe. >>>> e.g. for vxlan tunnel from vxlan->cfg.dst_port >>>> >>>> What do you think? >>> >>> to support what i said about implicit match for geneve and mpls for >>> example will require kernel change to actually expose their struct >>> in a header file. >> >> But the match contains the input port (tunnel port) right? And that >> should be sufficient to distinguish tunnels. Does it mean that HW >> offload in the kernel ignores that match? Or am I reading that wrong? >> > > in sw the matching of the rules is done after the packet is already > placed on the sw port. so rule on vxlan1 will patch only packets on > vxlan1. > > for offloaded rule, the input port is ignored for tunnels as once > offloaded, the hw match an incoming packet and is not aware of the > sw ports (tunnels). Sounds like an architectural bug in the kernel. Should the driver reject offload of rules with non-exact match on tunnel dst port in this case? OTOH, doesn't the original flow that is matching on the tunnel packet have a match on tp_dst? I'm not 100% sure that is always true though, but it probably is in most cases. > >>> >>> I think we should keep ovs TC to keep explicit match on enc_dst_port. >>> We can add a comment explaining why. >>
On 2022-07-13 7:46 PM, Ilya Maximets wrote: > On 7/12/22 11:47, Roi Dayan wrote: >> >> >> On 2022-07-12 12:23 PM, Ilya Maximets wrote: >>> On 7/7/22 09:01, Roi Dayan wrote: >>>> >>>> >>>> On 2022-07-07 9:48 AM, Roi Dayan wrote: >>>>> >>>>> >>>>> On 2022-07-06 5:00 PM, Roi Dayan wrote: >>>>>> >>>>>> >>>>>> On 2022-07-06 3:34 PM, Roi Dayan wrote: >>>>>>> >>>>>>> >>>>>>> On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >>>>>>>> >>>>>>>>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>>>>>>>> create a masked match on this field. >>>>>>>>> >>>>>>>>> This change is important as we're not clearing the masks which wasn't >>>>>>>>> really used, so if OVS requests match on ports, we should use the >>>>>>>>> mask and clear, otherwise offloading will fail. >>>>>>>>> >>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >>>>>>>> >>>>>>>> Changes look good to me, and all system-traffic.at tests that were passing without the change are still passing, including the failing ERSPAN ones. >>>>>>>> >>>>>>>> Acked-by: Eelco Chaudron <echaudro@redhat.com> >>>>>>>> >>>>>>> >>>>>>> can we postpone to merge this? >>>>>>> After this patch simple vxlan rules won't have enc dst port on decap >>>>>>> rules and currently mlx5 driver must have enc dst port to do the >>>>>>> offload of the decap rule. >>>>>>> I'm checking about this limitation and if we can remove it but >>>>>>> if we can postpone a bit so we won't break users with our nics.. >>>>>>> >>>>>>> thanks, >>>>>>> Roi >>>>>> >>>>>> >>>>>> I finished with the testing and checking I needed. >>>>>> I don't have a problem with the patch. if something else will raise i'll >>>>>> start a different discussion. >>>>>> >>>>>> Acked-by: Roi Dayan <roid@nvidia.com> >>>>> >>>>> >>>>> I have a concern now. today for an offload driver to offload >>>>> a tunnel decap rule it registers to TC indirect callback. >>>>> >>>>> Adding a TC rule on a tunnel device calls all registered drivers >>>>> to offload the rule. >>>>> With a single tunnel on a device this is ok but if a system have multiple tunnels with same src/dst ip but different enc dst port, >>>>> adding a tc rule on one of the tunnel devices without matching >>>>> enc dst port will call a driver to offload such a rule in the hw, >>>>> but that rule is potentially matching all sw tunnels. a driver >>>>> will have to match implicit on the the enc_dst_port to distinguish >>>>> between the tunnels. >>>>> >>>>> OVS doesn't match explicit on enc_dst_port also when having multiple >>>>> tunnels as the in SW the packets will be on the correct >>>>> tunnel device and then pass tc or ovs datapath. >>>>> >>>>> After this patch offload drivers must to implicit match the >>>>> enc_dst_port to avoid having hw rule matching all the tunnels >>>>> with same src/dst ip. taking the tunnel maybe. >>>>> e.g. for vxlan tunnel from vxlan->cfg.dst_port >>>>> >>>>> What do you think? >>>> >>>> to support what i said about implicit match for geneve and mpls for >>>> example will require kernel change to actually expose their struct >>>> in a header file. >>> >>> But the match contains the input port (tunnel port) right? And that >>> should be sufficient to distinguish tunnels. Does it mean that HW >>> offload in the kernel ignores that match? Or am I reading that wrong? >>> >> >> in sw the matching of the rules is done after the packet is already >> placed on the sw port. so rule on vxlan1 will patch only packets on >> vxlan1. >> >> for offloaded rule, the input port is ignored for tunnels as once >> offloaded, the hw match an incoming packet and is not aware of the >> sw ports (tunnels). > > Sounds like an architectural bug in the kernel. Should the driver > reject offload of rules with non-exact match on tunnel dst port in > this case? The drivers do reject such rules to avoid the issue described. > > OTOH, doesn't the original flow that is matching on the tunnel > packet have a match on tp_dst? I'm not 100% sure that is always > true though, but it probably is in most cases. > I didn't understand the question. what original flow? with this commit the match doesn't pass tp_dst match to tc. >> >>>> >>>> I think we should keep ovs TC to keep explicit match on enc_dst_port. >>>> We can add a comment explaining why. >>> >
On 7/14/22 14:23, Roi Dayan wrote: > > > On 2022-07-13 7:46 PM, Ilya Maximets wrote: >> On 7/12/22 11:47, Roi Dayan wrote: >>> >>> >>> On 2022-07-12 12:23 PM, Ilya Maximets wrote: >>>> On 7/7/22 09:01, Roi Dayan wrote: >>>>> >>>>> >>>>> On 2022-07-07 9:48 AM, Roi Dayan wrote: >>>>>> >>>>>> >>>>>> On 2022-07-06 5:00 PM, Roi Dayan wrote: >>>>>>> >>>>>>> >>>>>>> On 2022-07-06 3:34 PM, Roi Dayan wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 2022-07-06 1:26 PM, Eelco Chaudron wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 5 Jul 2022, at 0:45, Ilya Maximets wrote: >>>>>>>>> >>>>>>>>>> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may >>>>>>>>>> create a masked match on this field. >>>>>>>>>> >>>>>>>>>> This change is important as we're not clearing the masks which wasn't >>>>>>>>>> really used, so if OVS requests match on ports, we should use the >>>>>>>>>> mask and clear, otherwise offloading will fail. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> >>>>>>>>> >>>>>>>>> Changes look good to me, and all system-traffic.at tests that were passing without the change are still passing, including the failing ERSPAN ones. >>>>>>>>> >>>>>>>>> Acked-by: Eelco Chaudron <echaudro@redhat.com> >>>>>>>>> >>>>>>>> >>>>>>>> can we postpone to merge this? >>>>>>>> After this patch simple vxlan rules won't have enc dst port on decap >>>>>>>> rules and currently mlx5 driver must have enc dst port to do the >>>>>>>> offload of the decap rule. >>>>>>>> I'm checking about this limitation and if we can remove it but >>>>>>>> if we can postpone a bit so we won't break users with our nics.. >>>>>>>> >>>>>>>> thanks, >>>>>>>> Roi >>>>>>> >>>>>>> >>>>>>> I finished with the testing and checking I needed. >>>>>>> I don't have a problem with the patch. if something else will raise i'll >>>>>>> start a different discussion. >>>>>>> >>>>>>> Acked-by: Roi Dayan <roid@nvidia.com> >>>>>> >>>>>> >>>>>> I have a concern now. today for an offload driver to offload >>>>>> a tunnel decap rule it registers to TC indirect callback. >>>>>> >>>>>> Adding a TC rule on a tunnel device calls all registered drivers >>>>>> to offload the rule. >>>>>> With a single tunnel on a device this is ok but if a system have multiple tunnels with same src/dst ip but different enc dst port, >>>>>> adding a tc rule on one of the tunnel devices without matching >>>>>> enc dst port will call a driver to offload such a rule in the hw, >>>>>> but that rule is potentially matching all sw tunnels. a driver >>>>>> will have to match implicit on the the enc_dst_port to distinguish >>>>>> between the tunnels. >>>>>> >>>>>> OVS doesn't match explicit on enc_dst_port also when having multiple >>>>>> tunnels as the in SW the packets will be on the correct >>>>>> tunnel device and then pass tc or ovs datapath. >>>>>> >>>>>> After this patch offload drivers must to implicit match the >>>>>> enc_dst_port to avoid having hw rule matching all the tunnels >>>>>> with same src/dst ip. taking the tunnel maybe. >>>>>> e.g. for vxlan tunnel from vxlan->cfg.dst_port >>>>>> >>>>>> What do you think? >>>>> >>>>> to support what i said about implicit match for geneve and mpls for >>>>> example will require kernel change to actually expose their struct >>>>> in a header file. >>>> >>>> But the match contains the input port (tunnel port) right? And that >>>> should be sufficient to distinguish tunnels. Does it mean that HW >>>> offload in the kernel ignores that match? Or am I reading that wrong? >>>> >>> >>> in sw the matching of the rules is done after the packet is already >>> placed on the sw port. so rule on vxlan1 will patch only packets on >>> vxlan1. >>> >>> for offloaded rule, the input port is ignored for tunnels as once >>> offloaded, the hw match an incoming packet and is not aware of the >>> sw ports (tunnels). >> >> Sounds like an architectural bug in the kernel. Should the driver >> reject offload of rules with non-exact match on tunnel dst port in >> this case? > > The drivers do reject such rules to avoid the issue described. So the behavior is correct, it just won't be offloaded. OK. > >> >> OTOH, doesn't the original flow that is matching on the tunnel >> packet have a match on tp_dst? I'm not 100% sure that is always >> true though, but it probably is in most cases. >> > > I didn't understand the question. what original flow? > with this commit the match doesn't pass tp_dst match to tc. I meant 'tp_dst' match in the first datapath flow that matches on the encapsulated packet and sends it to a tunnel port for decapsulation. The second flow that matches on tunnel metadata after decapsulation will have tunnel metadata tp_dst with a zero mask, while the first flow will likely have an exact match on the tp_dst, because it was used to decide to which tunnel port this packet should go for decapsulation. I just was thinking that if the driver/HW can view these two flows as one, there should be sufficient data to not swallow the unrelated traffic. But I don't really know that looks like on a driver/HW level. > >>> >>>>> >>>>> I think we should keep ovs TC to keep explicit match on enc_dst_port. >>>>> We can add a comment explaining why. >>>> >>
diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c index e62687c82..3f166599c 100644 --- a/lib/netdev-offload-tc.c +++ b/lib/netdev-offload-tc.c @@ -1651,6 +1651,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match, flower.mask.tunnel.ipv6.ipv6_dst = tnl_mask->ipv6_dst; flower.mask.tunnel.tos = tnl_mask->ip_tos; flower.mask.tunnel.ttl = tnl_mask->ip_ttl; + flower.mask.tunnel.tp_src = tnl_mask->tp_src; + flower.mask.tunnel.tp_dst = tnl_mask->tp_dst; flower.mask.tunnel.id = (tnl->flags & FLOW_TNL_F_KEY) ? tnl_mask->tun_id : 0; memset(&tnl_mask->tun_id, 0, sizeof tnl_mask->tun_id); memset(&tnl_mask->ip_src, 0, sizeof tnl_mask->ip_src); @@ -1659,6 +1661,8 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match, memset(&tnl_mask->ipv6_dst, 0, sizeof tnl_mask->ipv6_dst); memset(&tnl_mask->ip_tos, 0, sizeof tnl_mask->ip_tos); memset(&tnl_mask->ip_ttl, 0, sizeof tnl_mask->ip_ttl); + memset(&tnl_mask->tp_src, 0, sizeof tnl_mask->tp_src); + memset(&tnl_mask->tp_dst, 0, sizeof tnl_mask->tp_dst); tnl_mask->flags &= ~FLOW_TNL_F_KEY; /* XXX: This is wrong! We're ignoring DF and CSUM flags configuration diff --git a/lib/tc.c b/lib/tc.c index bbb8c86f7..17df1ee77 100644 --- a/lib/tc.c +++ b/lib/tc.c @@ -760,9 +760,11 @@ nl_parse_flower_tunnel(struct nlattr **attrs, struct tc_flower *flower) flower->key.tunnel.ipv6.ipv6_dst = nl_attr_get_in6_addr(attrs[TCA_FLOWER_KEY_ENC_IPV6_DST]); } - if (attrs[TCA_FLOWER_KEY_ENC_UDP_DST_PORT]) { + if (attrs[TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK]) { flower->key.tunnel.tp_dst = nl_attr_get_be16(attrs[TCA_FLOWER_KEY_ENC_UDP_DST_PORT]); + flower->mask.tunnel.tp_dst = + nl_attr_get_be16(attrs[TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK]); } if (attrs[TCA_FLOWER_KEY_ENC_IP_TOS_MASK]) { flower->key.tunnel.tos = @@ -2849,13 +2851,14 @@ nl_msg_put_flower_tunnel(struct ofpbuf *request, struct tc_flower *flower) struct in6_addr *ipv6_dst_mask = &flower->mask.tunnel.ipv6.ipv6_dst; struct in6_addr *ipv6_src = &flower->key.tunnel.ipv6.ipv6_src; struct in6_addr *ipv6_dst = &flower->key.tunnel.ipv6.ipv6_dst; - ovs_be16 tp_dst = flower->key.tunnel.tp_dst; ovs_be32 id = be64_to_be32(flower->key.tunnel.id); + ovs_be16 tp_dst = flower->key.tunnel.tp_dst; uint8_t tos = flower->key.tunnel.tos; uint8_t ttl = flower->key.tunnel.ttl; uint8_t tos_mask = flower->mask.tunnel.tos; uint8_t ttl_mask = flower->mask.tunnel.ttl; ovs_be64 id_mask = flower->mask.tunnel.id; + ovs_be16 tp_dst_mask = flower->mask.tunnel.tp_dst; if (ipv4_dst_mask || ipv4_src_mask) { nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_IPV4_DST_MASK, @@ -2881,8 +2884,10 @@ nl_msg_put_flower_tunnel(struct ofpbuf *request, struct tc_flower *flower) nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TTL, ttl); nl_msg_put_u8(request, TCA_FLOWER_KEY_ENC_IP_TTL_MASK, ttl_mask); } - if (tp_dst) { + if (tp_dst_mask) { nl_msg_put_be16(request, TCA_FLOWER_KEY_ENC_UDP_DST_PORT, tp_dst); + nl_msg_put_be16(request, TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, + tp_dst_mask); } if (id_mask) { nl_msg_put_be32(request, TCA_FLOWER_KEY_ENC_KEY_ID, id);
TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may create a masked match on this field. This change is important as we're not clearing the masks which wasn't really used, so if OVS requests match on ports, we should use the mask and clear, otherwise offloading will fail. Signed-off-by: Ilya Maximets <i.maximets@ovn.org> --- lib/netdev-offload-tc.c | 4 ++++ lib/tc.c | 11 ++++++++--- 2 files changed, 12 insertions(+), 3 deletions(-)