Message ID | 20230313102751.594104-1-roid@nvidia.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev,1/1] tc: translate mirror/stolen to mirred | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
On 13 Mar 2023, at 11:27, Roi Dayan wrote: > From: Oz Shlomo <ozsh@nvidia.com> > > Currently jumping over a output-to-port action is translated to tc > mirror action and stolen control action. > However, the tc control action is not propagated to the hw offload action, > thus the hardware action will mirror the packet and continue to the next > action. > > Transalte mirror/stolen to "mirred egress redirect" action which terminates > the action list. I did not review the code, but before I do, can you add a test case for this scenario? Asking as there is quite some translation logic involved and I want to make sure we are not breaking any of this. > Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.") > Signed-off-by: Oz Shlomo <ozsh@nvidia.com> > Reviewed-by: Roi Dayan <roid@nvidia.com> > --- > lib/tc.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/lib/tc.c b/lib/tc.c > index 4c07e22162e7..fe06d3c110b6 100644 > --- a/lib/tc.c > +++ b/lib/tc.c > @@ -3341,13 +3341,16 @@ nl_msg_put_flower_acts(struct ofpbuf *request, struct tc_flower *flower) > } > action->jump_action = JUMP_ACTION_STOP; > } else { > - if (ingress) { > - nl_msg_put_act_mirred(request, ifindex, action_pc, > - TCA_INGRESS_MIRROR); > + int out_action; > + > + if (action_pc == TC_ACT_STOLEN) { > + out_action = TCA_EGRESS_REDIR; > } else { > - nl_msg_put_act_mirred(request, ifindex, action_pc, > - TCA_EGRESS_MIRROR); > + out_action = TCA_EGRESS_MIRROR; > } > + > + nl_msg_put_act_mirred(request, ifindex, action_pc, > + out_action); > } > nl_msg_put_act_cookie(request, &flower->act_cookie); > nl_msg_put_act_flags(request); > -- > 2.38.0
On 13/03/2023 12:35, Eelco Chaudron wrote: > > > On 13 Mar 2023, at 11:27, Roi Dayan wrote: > >> From: Oz Shlomo <ozsh@nvidia.com> >> >> Currently jumping over a output-to-port action is translated to tc >> mirror action and stolen control action. >> However, the tc control action is not propagated to the hw offload action, >> thus the hardware action will mirror the packet and continue to the next >> action. >> >> Transalte mirror/stolen to "mirred egress redirect" action which terminates >> the action list. > > I did not review the code, but before I do, can you add a test case for this scenario? Asking as there is quite some translation logic involved and I want to make sure we are not breaking any of this. thanks Eelco. We'll take a look about a test case for this and the fix cleaning chains commit. > >> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.") >> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> >> Reviewed-by: Roi Dayan <roid@nvidia.com> >> --- >> lib/tc.c | 13 ++++++++----- >> 1 file changed, 8 insertions(+), 5 deletions(-) >> >> diff --git a/lib/tc.c b/lib/tc.c >> index 4c07e22162e7..fe06d3c110b6 100644 >> --- a/lib/tc.c >> +++ b/lib/tc.c >> @@ -3341,13 +3341,16 @@ nl_msg_put_flower_acts(struct ofpbuf *request, struct tc_flower *flower) >> } >> action->jump_action = JUMP_ACTION_STOP; >> } else { >> - if (ingress) { >> - nl_msg_put_act_mirred(request, ifindex, action_pc, >> - TCA_INGRESS_MIRROR); >> + int out_action; >> + >> + if (action_pc == TC_ACT_STOLEN) { >> + out_action = TCA_EGRESS_REDIR; >> } else { >> - nl_msg_put_act_mirred(request, ifindex, action_pc, >> - TCA_EGRESS_MIRROR); >> + out_action = TCA_EGRESS_MIRROR; >> } >> + >> + nl_msg_put_act_mirred(request, ifindex, action_pc, >> + out_action); >> } >> nl_msg_put_act_cookie(request, &flower->act_cookie); >> nl_msg_put_act_flags(request); >> -- >> 2.38.0 >
On 3/13/23 11:27, Roi Dayan wrote: > From: Oz Shlomo <ozsh@nvidia.com> > > Currently jumping over a output-to-port action is translated to tc > mirror action and stolen control action. > However, the tc control action is not propagated to the hw offload action, > thus the hardware action will mirror the packet and continue to the next > action. This sounds like a bug in the hardware offloading in the kernel/driver. Shouldn't this be fixed in the kernel? Bets regards, Ilya Maximets.
On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote: > On 3/13/23 11:27, Roi Dayan wrote: > > From: Oz Shlomo <ozsh@nvidia.com> > > > > Currently jumping over a output-to-port action is translated to tc > > mirror action and stolen control action. > > However, the tc control action is not propagated to the hw offload action, > > thus the hardware action will mirror the packet and continue to the next > > action. > > This sounds like a bug in the hardware offloading in the kernel/driver. > Shouldn't this be fixed in the kernel? Hi Ilya, I think what we are seeing is a mismatch between how OVS translates flows to TC, and how TC actually works (since forever). It seems to me that fixing the translation, as this patch does, is the right approach. But I could be missing something (often the case).
On 3/14/23 12:15, Simon Horman wrote: > On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote: >> On 3/13/23 11:27, Roi Dayan wrote: >>> From: Oz Shlomo <ozsh@nvidia.com> >>> >>> Currently jumping over a output-to-port action is translated to tc >>> mirror action and stolen control action. >>> However, the tc control action is not propagated to the hw offload action, >>> thus the hardware action will mirror the packet and continue to the next >>> action. >> >> This sounds like a bug in the hardware offloading in the kernel/driver. >> Shouldn't this be fixed in the kernel? > > Hi Ilya, > > I think what we are seeing is a mismatch between how > OVS translates flows to TC, and how TC actually works (since forever). > > It seems to me that fixing the translation, as this patch does, > is the right approach. But I could be missing something (often the case). I'm not a TC expert, but the description sounded like the driver or some other code in the kernel is ignoring the TC_ACT_STOLEN. If that's correct, that must be a bug in the kernel. We could still change the way OVS translates flows for TC in order to avoid the kernel bug as long as we're not introducing anything nasty in the flow translation logic. But it still remains a kernel bug that needs to be fixed regardless. So, I was just making sure that the issue is worked on from the kernel side. And if it can be fixed in the kernel in a reasonable time frame, then maybe we don't need to change OVS after all. This jumping code is fairly complex already. Best regards, Ilya Maximets.
On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote: > On 3/14/23 12:15, Simon Horman wrote: > > On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote: > >> On 3/13/23 11:27, Roi Dayan wrote: > >>> From: Oz Shlomo <ozsh@nvidia.com> > >>> > >>> Currently jumping over a output-to-port action is translated to tc > >>> mirror action and stolen control action. > >>> However, the tc control action is not propagated to the hw offload action, > >>> thus the hardware action will mirror the packet and continue to the next > >>> action. > >> > >> This sounds like a bug in the hardware offloading in the kernel/driver. > >> Shouldn't this be fixed in the kernel? > > > > Hi Ilya, > > > > I think what we are seeing is a mismatch between how > > OVS translates flows to TC, and how TC actually works (since forever). > > > > It seems to me that fixing the translation, as this patch does, > > is the right approach. But I could be missing something (often the case). > > I'm not a TC expert, but the description sounded like the driver > or some other code in the kernel is ignoring the TC_ACT_STOLEN. > If that's correct, that must be a bug in the kernel. > > We could still change the way OVS translates flows for TC in order > to avoid the kernel bug as long as we're not introducing anything > nasty in the flow translation logic. But it still remains a kernel > bug that needs to be fixed regardless. So, I was just making sure > that the issue is worked on from the kernel side. And if it can > be fixed in the kernel in a reasonable time frame, then maybe we > don't need to change OVS after all. This jumping code is fairly > complex already. Thanks Ilya, I understand your point. And I agree that the kernel treatment of TC_ACT_STOLEN needs to be understood/investigated.
On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote: > From: Oz Shlomo <ozsh@nvidia.com> > > Currently jumping over a output-to-port action is translated to tc > mirror action and stolen control action. > However, the tc control action is not propagated to the hw offload action, > thus the hardware action will mirror the packet and continue to the next > action. > > Transalte mirror/stolen to "mirred egress redirect" action which terminates > the action list. > > Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.") > Signed-off-by: Oz Shlomo <ozsh@nvidia.com> > Reviewed-by: Roi Dayan <roid@nvidia.com> Perhaps I messed something up, but I am seeing a lot of failed check-offloads tests [1] with this patch applied on top of Eelco's patches to add those tests to the GitHub workflow [2]. [1] https://github.com/horms/ovs/actions/runs/4415894043/jobs/7741856452 [2] [PATCH v3 2/2] ci: Run tc offload tests in GitHub Actions. https://mail.openvswitch.org/pipermail/ovs-dev/2023-March/402672.html
On 15/03/2023 11:00, Simon Horman wrote: > On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote: >> From: Oz Shlomo <ozsh@nvidia.com> >> >> Currently jumping over a output-to-port action is translated to tc >> mirror action and stolen control action. >> However, the tc control action is not propagated to the hw offload action, >> thus the hardware action will mirror the packet and continue to the next >> action. >> >> Transalte mirror/stolen to "mirred egress redirect" action which terminates >> the action list. >> >> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.") >> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> >> Reviewed-by: Roi Dayan <roid@nvidia.com> > Perhaps I messed something up, but I am seeing a lot of failed > check-offloads tests [1] with this patch applied on top of > Eelco's patches to add those tests to the GitHub workflow [2]. Thanks Simon, We will check this patch again in our lab > > [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms%2Fovs%2Factions%2Fruns%2F4415894043%2Fjobs%2F7741856452&data=05%7C01%7Cozsh%40nvidia.com%7C07c987cf05514686c62b08db2533cd3a%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638144676641214551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=VKe3N4zZAXxlZrVwfmneLB5fN0vJxlAuwQKIp4RqwIo%3D&reserved=0 > [2] [PATCH v3 2/2] ci: Run tc offload tests in GitHub Actions. > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2023-March%2F402672.html&data=05%7C01%7Cozsh%40nvidia.com%7C07c987cf05514686c62b08db2533cd3a%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638144676641214551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=JjH%2Bfay7I9kSOlJefXWjboA%2FFRGRWy6TB5kq1BOaFgc%3D&reserved=0
On Wed, Mar 15, 2023 at 11:52:16AM +0200, Oz Shlomo wrote: > > On 15/03/2023 11:00, Simon Horman wrote: > > On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote: > > > From: Oz Shlomo <ozsh@nvidia.com> > > > > > > Currently jumping over a output-to-port action is translated to tc > > > mirror action and stolen control action. > > > However, the tc control action is not propagated to the hw offload action, > > > thus the hardware action will mirror the packet and continue to the next > > > action. > > > > > > Transalte mirror/stolen to "mirred egress redirect" action which terminates > > > the action list. > > > > > > Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.") > > > Signed-off-by: Oz Shlomo <ozsh@nvidia.com> > > > Reviewed-by: Roi Dayan <roid@nvidia.com> > > Perhaps I messed something up, but I am seeing a lot of failed > > check-offloads tests [1] with this patch applied on top of > > Eelco's patches to add those tests to the GitHub workflow [2]. > > Thanks Simon, > > We will check this patch again in our lab Thanks Oz, much appreciated.
On Wed, Mar 15, 2023 at 09:33:09AM +0100, Simon Horman wrote: > On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote: > > On 3/14/23 12:15, Simon Horman wrote: > > > On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote: > > >> On 3/13/23 11:27, Roi Dayan wrote: > > >>> From: Oz Shlomo <ozsh@nvidia.com> > > >>> > > >>> Currently jumping over a output-to-port action is translated to tc > > >>> mirror action and stolen control action. > > >>> However, the tc control action is not propagated to the hw offload action, > > >>> thus the hardware action will mirror the packet and continue to the next > > >>> action. > > >> > > >> This sounds like a bug in the hardware offloading in the kernel/driver. > > >> Shouldn't this be fixed in the kernel? > > > > > > Hi Ilya, > > > > > > I think what we are seeing is a mismatch between how > > > OVS translates flows to TC, and how TC actually works (since forever). > > > > > > It seems to me that fixing the translation, as this patch does, > > > is the right approach. But I could be missing something (often the case). > > > > I'm not a TC expert, but the description sounded like the driver > > or some other code in the kernel is ignoring the TC_ACT_STOLEN. > > If that's correct, that must be a bug in the kernel. That's correct, but it's more like a tc feature that hasn't been implemented in the HWOL side of it and that shouldn't have been used here. Or a gap, or so, but still. It's not a small piece of code. Tc is super flexible and that flexibility is not there in the driver sometimes. This is only becoming a problem now because with check_pkt_larger() ovs will generate action lists that don't execute entirely. > > > > We could still change the way OVS translates flows for TC in order > > to avoid the kernel bug as long as we're not introducing anything > > nasty in the flow translation logic. But it still remains a kernel > > bug that needs to be fixed regardless. So, I was just making sure > > that the issue is worked on from the kernel side. And if it can > > be fixed in the kernel in a reasonable time frame, then maybe we > > don't need to change OVS after all. This jumping code is fairly > > complex already. > > Thanks Ilya, > > I understand your point. And I agree that the kernel treatment of > TC_ACT_STOLEN needs to be understood/investigated. Nvidia folks please feel free to chime in here, but it seems to me that it that it would be nice to avoid having to implement this in the driver. Perhaps some safety checks, to reject offloading stuff that it knows that it will be broken, and that's it. Well, as Ilya said, as long as it's not nasty from ovs side, of course. As I was saying above, I think the motivation for this patch is check_pkt_larger, on which it could have: actions:check_pkt_larger(value,action1,action2) and we want to stop the execution after action1. Tc allows more than one way of doing it, and from tc perspective, the new way this patch is proposing is fine. But then, it could also be: actions:check_pkt_larger(value,action1,action2),action3 and then we don't want to stop the execution after action1, but continue on action3. I don't know the details on the translation here and optimizations that are done, but there will be a moment that 2 jumps are needed (j1: action1->action3, j2: match->action2), and then the control action that is getting ignored here now, won't even be there (well, it will be, but it will be a PIPE one). Hopefully this long text (sorry) makes sense? Marcelo
On 3/15/23 16:18, Marcelo Ricardo Leitner wrote: > On Wed, Mar 15, 2023 at 09:33:09AM +0100, Simon Horman wrote: >> On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote: >>> On 3/14/23 12:15, Simon Horman wrote: >>>> On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote: >>>>> On 3/13/23 11:27, Roi Dayan wrote: >>>>>> From: Oz Shlomo <ozsh@nvidia.com> >>>>>> >>>>>> Currently jumping over a output-to-port action is translated to tc >>>>>> mirror action and stolen control action. >>>>>> However, the tc control action is not propagated to the hw offload action, >>>>>> thus the hardware action will mirror the packet and continue to the next >>>>>> action. >>>>> >>>>> This sounds like a bug in the hardware offloading in the kernel/driver. >>>>> Shouldn't this be fixed in the kernel? >>>> >>>> Hi Ilya, >>>> >>>> I think what we are seeing is a mismatch between how >>>> OVS translates flows to TC, and how TC actually works (since forever). >>>> >>>> It seems to me that fixing the translation, as this patch does, >>>> is the right approach. But I could be missing something (often the case). >>> >>> I'm not a TC expert, but the description sounded like the driver >>> or some other code in the kernel is ignoring the TC_ACT_STOLEN. >>> If that's correct, that must be a bug in the kernel. > > That's correct, but it's more like a tc feature that hasn't been > implemented in the HWOL side of it and that shouldn't have been used > here. Or a gap, or so, but still. It's not a small piece of code. Tc > is super flexible and that flexibility is not there in the driver > sometimes. Sure, but "something is complex" is not really an excuse for that something to not operate correctly. :) Unfortunately, it's a common theme with hardware offload in TC incorrectly accepting and misinterpreting what shouldn't have been accepted in the first place, or ignoring things that shouldn't have been ignored. And this issue in TC should have been addressed when the hardware offload parts of TC were first introduced. If not by implementing the "missing feature", then, at least, by making sure that rules that are using this feature are not going to hardware. Either way it's an internal TC bug and not OVS'es fault. Again, to be clear, I'm OK with fixing it on OVS side by not using broken API, but the API should be fixed in the kernel, regardless. Otherwise, someone else will hit that issue. It may even be us again in the future. Best regards, Ilya Maximets.
On Tue, Mar 21, 2023 at 06:05:20PM +0100, Ilya Maximets wrote: > On 3/15/23 16:18, Marcelo Ricardo Leitner wrote: > > On Wed, Mar 15, 2023 at 09:33:09AM +0100, Simon Horman wrote: > >> On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote: > >>> On 3/14/23 12:15, Simon Horman wrote: > >>>> On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote: > >>>>> On 3/13/23 11:27, Roi Dayan wrote: > >>>>>> From: Oz Shlomo <ozsh@nvidia.com> > >>>>>> > >>>>>> Currently jumping over a output-to-port action is translated to tc > >>>>>> mirror action and stolen control action. > >>>>>> However, the tc control action is not propagated to the hw offload action, > >>>>>> thus the hardware action will mirror the packet and continue to the next > >>>>>> action. > >>>>> > >>>>> This sounds like a bug in the hardware offloading in the kernel/driver. > >>>>> Shouldn't this be fixed in the kernel? > >>>> > >>>> Hi Ilya, > >>>> > >>>> I think what we are seeing is a mismatch between how > >>>> OVS translates flows to TC, and how TC actually works (since forever). > >>>> > >>>> It seems to me that fixing the translation, as this patch does, > >>>> is the right approach. But I could be missing something (often the case). > >>> > >>> I'm not a TC expert, but the description sounded like the driver > >>> or some other code in the kernel is ignoring the TC_ACT_STOLEN. > >>> If that's correct, that must be a bug in the kernel. > > > > That's correct, but it's more like a tc feature that hasn't been > > implemented in the HWOL side of it and that shouldn't have been used > > here. Or a gap, or so, but still. It's not a small piece of code. Tc > > is super flexible and that flexibility is not there in the driver > > sometimes. > > Sure, but "something is complex" is not really an excuse for that > something to not operate correctly. :) > > Unfortunately, it's a common theme with hardware offload in TC > incorrectly accepting and misinterpreting what shouldn't have been > accepted in the first place, or ignoring things that shouldn't have > been ignored. > > And this issue in TC should have been addressed when the hardware > offload parts of TC were first introduced. If not by implementing > the "missing feature", then, at least, by making sure that rules that > are using this feature are not going to hardware. > > Either way it's an internal TC bug and not OVS'es fault. > > Again, to be clear, I'm OK with fixing it on OVS side by not using > broken API, but the API should be fixed in the kernel, regardless. > Otherwise, someone else will hit that issue. It may even be us > again in the future. Ack. I agree if the API is broken it should (also) be fixed. Oz, can we work on making that so?
On 3/15/23 11:51, Simon Horman wrote: > On Wed, Mar 15, 2023 at 11:52:16AM +0200, Oz Shlomo wrote: >> >> On 15/03/2023 11:00, Simon Horman wrote: >>> On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote: >>>> From: Oz Shlomo <ozsh@nvidia.com> >>>> >>>> Currently jumping over a output-to-port action is translated to tc >>>> mirror action and stolen control action. >>>> However, the tc control action is not propagated to the hw offload action, >>>> thus the hardware action will mirror the packet and continue to the next >>>> action. >>>> >>>> Transalte mirror/stolen to "mirred egress redirect" action which terminates >>>> the action list. >>>> >>>> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.") >>>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> >>>> Reviewed-by: Roi Dayan <roid@nvidia.com> >>> Perhaps I messed something up, but I am seeing a lot of failed >>> check-offloads tests [1] with this patch applied on top of >>> Eelco's patches to add those tests to the GitHub workflow [2]. >> >> Thanks Simon, >> >> We will check this patch again in our lab > > Thanks Oz, much appreciated. I see a lot of failures as well. They are caused by 'Kernel flower acknowledgment does not match request!' message. And that doesn't sound good. The system is Ubuntu 22.10 in my case. I'm marking this patch as 'Changes Requested' for now. Best regards, Ilya Maximets.
diff --git a/lib/tc.c b/lib/tc.c index 4c07e22162e7..fe06d3c110b6 100644 --- a/lib/tc.c +++ b/lib/tc.c @@ -3341,13 +3341,16 @@ nl_msg_put_flower_acts(struct ofpbuf *request, struct tc_flower *flower) } action->jump_action = JUMP_ACTION_STOP; } else { - if (ingress) { - nl_msg_put_act_mirred(request, ifindex, action_pc, - TCA_INGRESS_MIRROR); + int out_action; + + if (action_pc == TC_ACT_STOLEN) { + out_action = TCA_EGRESS_REDIR; } else { - nl_msg_put_act_mirred(request, ifindex, action_pc, - TCA_EGRESS_MIRROR); + out_action = TCA_EGRESS_MIRROR; } + + nl_msg_put_act_mirred(request, ifindex, action_pc, + out_action); } nl_msg_put_act_cookie(request, &flower->act_cookie); nl_msg_put_act_flags(request);