diff mbox series

[ovs-dev,1/1] tc: translate mirror/stolen to mirred

Message ID 20230313102751.594104-1-roid@nvidia.com
State Changes Requested
Headers show
Series [ovs-dev,1/1] tc: translate mirror/stolen to mirred | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Roi Dayan March 13, 2023, 10:27 a.m. UTC
From: Oz Shlomo <ozsh@nvidia.com>

Currently jumping over a output-to-port action is translated to tc
mirror action and stolen control action.
However, the tc control action is not propagated to the hw offload action,
thus the hardware action will mirror the packet and continue to the next
action.

Transalte mirror/stolen to "mirred egress redirect" action which terminates
the action list.

Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.")
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
---
 lib/tc.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Comments

Eelco Chaudron March 13, 2023, 10:35 a.m. UTC | #1
On 13 Mar 2023, at 11:27, Roi Dayan wrote:

> From: Oz Shlomo <ozsh@nvidia.com>
>
> Currently jumping over a output-to-port action is translated to tc
> mirror action and stolen control action.
> However, the tc control action is not propagated to the hw offload action,
> thus the hardware action will mirror the packet and continue to the next
> action.
>
> Transalte mirror/stolen to "mirred egress redirect" action which terminates
> the action list.

I did not review the code, but before I do, can you add a test case for this scenario? Asking as there is quite some translation logic involved and I want to make sure we are not breaking any of this.

> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.")
> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
> Reviewed-by: Roi Dayan <roid@nvidia.com>
> ---
>  lib/tc.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/lib/tc.c b/lib/tc.c
> index 4c07e22162e7..fe06d3c110b6 100644
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -3341,13 +3341,16 @@ nl_msg_put_flower_acts(struct ofpbuf *request, struct tc_flower *flower)
>                      }
>                      action->jump_action = JUMP_ACTION_STOP;
>                  } else {
> -                    if (ingress) {
> -                        nl_msg_put_act_mirred(request, ifindex, action_pc,
> -                                              TCA_INGRESS_MIRROR);
> +                    int out_action;
> +
> +                    if (action_pc == TC_ACT_STOLEN) {
> +                        out_action = TCA_EGRESS_REDIR;
>                      } else {
> -                        nl_msg_put_act_mirred(request, ifindex, action_pc,
> -                                              TCA_EGRESS_MIRROR);
> +                        out_action = TCA_EGRESS_MIRROR;
>                      }
> +
> +                    nl_msg_put_act_mirred(request, ifindex, action_pc,
> +                                          out_action);
>                  }
>                  nl_msg_put_act_cookie(request, &flower->act_cookie);
>                  nl_msg_put_act_flags(request);
> -- 
> 2.38.0
Roi Dayan March 13, 2023, 10:38 a.m. UTC | #2
On 13/03/2023 12:35, Eelco Chaudron wrote:
> 
> 
> On 13 Mar 2023, at 11:27, Roi Dayan wrote:
> 
>> From: Oz Shlomo <ozsh@nvidia.com>
>>
>> Currently jumping over a output-to-port action is translated to tc
>> mirror action and stolen control action.
>> However, the tc control action is not propagated to the hw offload action,
>> thus the hardware action will mirror the packet and continue to the next
>> action.
>>
>> Transalte mirror/stolen to "mirred egress redirect" action which terminates
>> the action list.
> 
> I did not review the code, but before I do, can you add a test case for this scenario? Asking as there is quite some translation logic involved and I want to make sure we are not breaking any of this.

thanks Eelco.
We'll take a look about a test case for this and the fix cleaning chains commit.


> 
>> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.")
>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
>> Reviewed-by: Roi Dayan <roid@nvidia.com>
>> ---
>>  lib/tc.c | 13 ++++++++-----
>>  1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/lib/tc.c b/lib/tc.c
>> index 4c07e22162e7..fe06d3c110b6 100644
>> --- a/lib/tc.c
>> +++ b/lib/tc.c
>> @@ -3341,13 +3341,16 @@ nl_msg_put_flower_acts(struct ofpbuf *request, struct tc_flower *flower)
>>                      }
>>                      action->jump_action = JUMP_ACTION_STOP;
>>                  } else {
>> -                    if (ingress) {
>> -                        nl_msg_put_act_mirred(request, ifindex, action_pc,
>> -                                              TCA_INGRESS_MIRROR);
>> +                    int out_action;
>> +
>> +                    if (action_pc == TC_ACT_STOLEN) {
>> +                        out_action = TCA_EGRESS_REDIR;
>>                      } else {
>> -                        nl_msg_put_act_mirred(request, ifindex, action_pc,
>> -                                              TCA_EGRESS_MIRROR);
>> +                        out_action = TCA_EGRESS_MIRROR;
>>                      }
>> +
>> +                    nl_msg_put_act_mirred(request, ifindex, action_pc,
>> +                                          out_action);
>>                  }
>>                  nl_msg_put_act_cookie(request, &flower->act_cookie);
>>                  nl_msg_put_act_flags(request);
>> -- 
>> 2.38.0
>
Ilya Maximets March 13, 2023, 8:34 p.m. UTC | #3
On 3/13/23 11:27, Roi Dayan wrote:
> From: Oz Shlomo <ozsh@nvidia.com>
> 
> Currently jumping over a output-to-port action is translated to tc
> mirror action and stolen control action.
> However, the tc control action is not propagated to the hw offload action,
> thus the hardware action will mirror the packet and continue to the next
> action.

This sounds like a bug in the hardware offloading in the kernel/driver.
Shouldn't this be fixed in the kernel?

Bets regards, Ilya Maximets.
Simon Horman March 14, 2023, 11:15 a.m. UTC | #4
On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote:
> On 3/13/23 11:27, Roi Dayan wrote:
> > From: Oz Shlomo <ozsh@nvidia.com>
> > 
> > Currently jumping over a output-to-port action is translated to tc
> > mirror action and stolen control action.
> > However, the tc control action is not propagated to the hw offload action,
> > thus the hardware action will mirror the packet and continue to the next
> > action.
> 
> This sounds like a bug in the hardware offloading in the kernel/driver.
> Shouldn't this be fixed in the kernel?

Hi Ilya,

I think what we are seeing is a mismatch between how
OVS translates flows to TC, and how TC actually works (since forever).

It seems to me that fixing the translation, as this patch does,
is the right approach. But I could be missing something (often the case).
Ilya Maximets March 14, 2023, 5:49 p.m. UTC | #5
On 3/14/23 12:15, Simon Horman wrote:
> On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote:
>> On 3/13/23 11:27, Roi Dayan wrote:
>>> From: Oz Shlomo <ozsh@nvidia.com>
>>>
>>> Currently jumping over a output-to-port action is translated to tc
>>> mirror action and stolen control action.
>>> However, the tc control action is not propagated to the hw offload action,
>>> thus the hardware action will mirror the packet and continue to the next
>>> action.
>>
>> This sounds like a bug in the hardware offloading in the kernel/driver.
>> Shouldn't this be fixed in the kernel?
> 
> Hi Ilya,
> 
> I think what we are seeing is a mismatch between how
> OVS translates flows to TC, and how TC actually works (since forever).
> 
> It seems to me that fixing the translation, as this patch does,
> is the right approach. But I could be missing something (often the case).

I'm not a TC expert, but the description sounded like the driver
or some other code in the kernel is ignoring the TC_ACT_STOLEN.
If that's correct, that must be a bug in the kernel.

We could still change the way OVS translates flows for TC in order
to avoid the kernel bug as long as we're not introducing anything
nasty in the flow translation logic.  But it still remains a kernel
bug that needs to be fixed regardless.  So, I was just making sure
that the issue is worked on from the kernel side.  And if it can
be fixed in the kernel in a reasonable time frame, then maybe we
don't need to change OVS after all.  This jumping code is fairly
complex already.

Best regards, Ilya Maximets.
Simon Horman March 15, 2023, 8:33 a.m. UTC | #6
On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote:
> On 3/14/23 12:15, Simon Horman wrote:
> > On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote:
> >> On 3/13/23 11:27, Roi Dayan wrote:
> >>> From: Oz Shlomo <ozsh@nvidia.com>
> >>>
> >>> Currently jumping over a output-to-port action is translated to tc
> >>> mirror action and stolen control action.
> >>> However, the tc control action is not propagated to the hw offload action,
> >>> thus the hardware action will mirror the packet and continue to the next
> >>> action.
> >>
> >> This sounds like a bug in the hardware offloading in the kernel/driver.
> >> Shouldn't this be fixed in the kernel?
> > 
> > Hi Ilya,
> > 
> > I think what we are seeing is a mismatch between how
> > OVS translates flows to TC, and how TC actually works (since forever).
> > 
> > It seems to me that fixing the translation, as this patch does,
> > is the right approach. But I could be missing something (often the case).
> 
> I'm not a TC expert, but the description sounded like the driver
> or some other code in the kernel is ignoring the TC_ACT_STOLEN.
> If that's correct, that must be a bug in the kernel.
> 
> We could still change the way OVS translates flows for TC in order
> to avoid the kernel bug as long as we're not introducing anything
> nasty in the flow translation logic.  But it still remains a kernel
> bug that needs to be fixed regardless.  So, I was just making sure
> that the issue is worked on from the kernel side.  And if it can
> be fixed in the kernel in a reasonable time frame, then maybe we
> don't need to change OVS after all.  This jumping code is fairly
> complex already.

Thanks Ilya,

I understand your point. And I agree that the kernel treatment of
TC_ACT_STOLEN needs to be understood/investigated.
Simon Horman March 15, 2023, 9 a.m. UTC | #7
On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote:
> From: Oz Shlomo <ozsh@nvidia.com>
> 
> Currently jumping over a output-to-port action is translated to tc
> mirror action and stolen control action.
> However, the tc control action is not propagated to the hw offload action,
> thus the hardware action will mirror the packet and continue to the next
> action.
> 
> Transalte mirror/stolen to "mirred egress redirect" action which terminates
> the action list.
> 
> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.")
> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
> Reviewed-by: Roi Dayan <roid@nvidia.com>

Perhaps I messed something up, but I am seeing a lot of failed
check-offloads tests [1] with this patch applied on top of
Eelco's patches to add those tests to the GitHub workflow [2].

[1] https://github.com/horms/ovs/actions/runs/4415894043/jobs/7741856452
[2] [PATCH v3 2/2] ci: Run tc offload tests in GitHub Actions.
    https://mail.openvswitch.org/pipermail/ovs-dev/2023-March/402672.html
Oz Shlomo March 15, 2023, 9:52 a.m. UTC | #8
On 15/03/2023 11:00, Simon Horman wrote:
> On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote:
>> From: Oz Shlomo <ozsh@nvidia.com>
>>
>> Currently jumping over a output-to-port action is translated to tc
>> mirror action and stolen control action.
>> However, the tc control action is not propagated to the hw offload action,
>> thus the hardware action will mirror the packet and continue to the next
>> action.
>>
>> Transalte mirror/stolen to "mirred egress redirect" action which terminates
>> the action list.
>>
>> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.")
>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
>> Reviewed-by: Roi Dayan <roid@nvidia.com>
> Perhaps I messed something up, but I am seeing a lot of failed
> check-offloads tests [1] with this patch applied on top of
> Eelco's patches to add those tests to the GitHub workflow [2].

Thanks Simon,

We will check this patch again in our lab

>
> [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fhorms%2Fovs%2Factions%2Fruns%2F4415894043%2Fjobs%2F7741856452&data=05%7C01%7Cozsh%40nvidia.com%7C07c987cf05514686c62b08db2533cd3a%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638144676641214551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=VKe3N4zZAXxlZrVwfmneLB5fN0vJxlAuwQKIp4RqwIo%3D&reserved=0
> [2] [PATCH v3 2/2] ci: Run tc offload tests in GitHub Actions.
>      https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fpipermail%2Fovs-dev%2F2023-March%2F402672.html&data=05%7C01%7Cozsh%40nvidia.com%7C07c987cf05514686c62b08db2533cd3a%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638144676641214551%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=JjH%2Bfay7I9kSOlJefXWjboA%2FFRGRWy6TB5kq1BOaFgc%3D&reserved=0
Simon Horman March 15, 2023, 10:51 a.m. UTC | #9
On Wed, Mar 15, 2023 at 11:52:16AM +0200, Oz Shlomo wrote:
> 
> On 15/03/2023 11:00, Simon Horman wrote:
> > On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote:
> > > From: Oz Shlomo <ozsh@nvidia.com>
> > > 
> > > Currently jumping over a output-to-port action is translated to tc
> > > mirror action and stolen control action.
> > > However, the tc control action is not propagated to the hw offload action,
> > > thus the hardware action will mirror the packet and continue to the next
> > > action.
> > > 
> > > Transalte mirror/stolen to "mirred egress redirect" action which terminates
> > > the action list.
> > > 
> > > Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.")
> > > Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
> > > Reviewed-by: Roi Dayan <roid@nvidia.com>
> > Perhaps I messed something up, but I am seeing a lot of failed
> > check-offloads tests [1] with this patch applied on top of
> > Eelco's patches to add those tests to the GitHub workflow [2].
> 
> Thanks Simon,
> 
> We will check this patch again in our lab

Thanks Oz, much appreciated.
Marcelo Leitner March 15, 2023, 3:18 p.m. UTC | #10
On Wed, Mar 15, 2023 at 09:33:09AM +0100, Simon Horman wrote:
> On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote:
> > On 3/14/23 12:15, Simon Horman wrote:
> > > On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote:
> > >> On 3/13/23 11:27, Roi Dayan wrote:
> > >>> From: Oz Shlomo <ozsh@nvidia.com>
> > >>>
> > >>> Currently jumping over a output-to-port action is translated to tc
> > >>> mirror action and stolen control action.
> > >>> However, the tc control action is not propagated to the hw offload action,
> > >>> thus the hardware action will mirror the packet and continue to the next
> > >>> action.
> > >>
> > >> This sounds like a bug in the hardware offloading in the kernel/driver.
> > >> Shouldn't this be fixed in the kernel?
> > >
> > > Hi Ilya,
> > >
> > > I think what we are seeing is a mismatch between how
> > > OVS translates flows to TC, and how TC actually works (since forever).
> > >
> > > It seems to me that fixing the translation, as this patch does,
> > > is the right approach. But I could be missing something (often the case).
> >
> > I'm not a TC expert, but the description sounded like the driver
> > or some other code in the kernel is ignoring the TC_ACT_STOLEN.
> > If that's correct, that must be a bug in the kernel.

That's correct, but it's more like a tc feature that hasn't been
implemented in the HWOL side of it and that shouldn't have been used
here. Or a gap, or so, but still. It's not a small piece of code. Tc
is super flexible and that flexibility is not there in the driver
sometimes.

This is only becoming a problem now because with check_pkt_larger()
ovs will generate action lists that don't execute entirely.

> >
> > We could still change the way OVS translates flows for TC in order
> > to avoid the kernel bug as long as we're not introducing anything
> > nasty in the flow translation logic.  But it still remains a kernel
> > bug that needs to be fixed regardless.  So, I was just making sure
> > that the issue is worked on from the kernel side.  And if it can
> > be fixed in the kernel in a reasonable time frame, then maybe we
> > don't need to change OVS after all.  This jumping code is fairly
> > complex already.
>
> Thanks Ilya,
>
> I understand your point. And I agree that the kernel treatment of
> TC_ACT_STOLEN needs to be understood/investigated.

Nvidia folks please feel free to chime in here, but it seems to me that
it that it would be nice to avoid having to implement this in the
driver. Perhaps some safety checks, to reject offloading stuff that it
knows that it will be broken, and that's it. Well, as Ilya said, as
long as it's not nasty from ovs side, of course.

As I was saying above, I think the motivation for this patch is
check_pkt_larger, on which it could have:
actions:check_pkt_larger(value,action1,action2)
and we want to stop the execution after action1. Tc allows more than
one way of doing it, and from tc perspective, the new way this patch
is proposing is fine. But then, it could also be:
actions:check_pkt_larger(value,action1,action2),action3
and then we
don't want to stop the execution after action1, but continue on
action3.

I don't know the details on the translation here and optimizations
that are done, but there will be a moment that 2 jumps are needed (j1:
action1->action3, j2: match->action2), and then the control action
that is getting ignored here now, won't even be there (well, it will
be, but it will be a PIPE one).

Hopefully this long text (sorry) makes sense?

  Marcelo
Ilya Maximets March 21, 2023, 5:05 p.m. UTC | #11
On 3/15/23 16:18, Marcelo Ricardo Leitner wrote:
> On Wed, Mar 15, 2023 at 09:33:09AM +0100, Simon Horman wrote:
>> On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote:
>>> On 3/14/23 12:15, Simon Horman wrote:
>>>> On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote:
>>>>> On 3/13/23 11:27, Roi Dayan wrote:
>>>>>> From: Oz Shlomo <ozsh@nvidia.com>
>>>>>>
>>>>>> Currently jumping over a output-to-port action is translated to tc
>>>>>> mirror action and stolen control action.
>>>>>> However, the tc control action is not propagated to the hw offload action,
>>>>>> thus the hardware action will mirror the packet and continue to the next
>>>>>> action.
>>>>>
>>>>> This sounds like a bug in the hardware offloading in the kernel/driver.
>>>>> Shouldn't this be fixed in the kernel?
>>>>
>>>> Hi Ilya,
>>>>
>>>> I think what we are seeing is a mismatch between how
>>>> OVS translates flows to TC, and how TC actually works (since forever).
>>>>
>>>> It seems to me that fixing the translation, as this patch does,
>>>> is the right approach. But I could be missing something (often the case).
>>>
>>> I'm not a TC expert, but the description sounded like the driver
>>> or some other code in the kernel is ignoring the TC_ACT_STOLEN.
>>> If that's correct, that must be a bug in the kernel.
> 
> That's correct, but it's more like a tc feature that hasn't been
> implemented in the HWOL side of it and that shouldn't have been used
> here. Or a gap, or so, but still. It's not a small piece of code. Tc
> is super flexible and that flexibility is not there in the driver
> sometimes.

Sure, but "something is complex" is not really an excuse for that
something to not operate correctly. :)

Unfortunately, it's a common theme with hardware offload in TC
incorrectly accepting and misinterpreting what shouldn't have been
accepted in the first place, or ignoring things that shouldn't have
been ignored.

And this issue in TC should have been addressed when the hardware
offload parts of TC were first introduced.  If not by implementing
the "missing feature", then, at least, by making sure that rules that
are using this feature are not going to hardware.

Either way it's an internal TC bug and not OVS'es fault.

Again, to be clear, I'm OK with fixing it on OVS side by not using
broken API, but the API should be fixed in the kernel, regardless.
Otherwise, someone else will hit that issue.  It may even be us
again in the future.

Best regards, Ilya Maximets.
Simon Horman March 22, 2023, 11:06 a.m. UTC | #12
On Tue, Mar 21, 2023 at 06:05:20PM +0100, Ilya Maximets wrote:
> On 3/15/23 16:18, Marcelo Ricardo Leitner wrote:
> > On Wed, Mar 15, 2023 at 09:33:09AM +0100, Simon Horman wrote:
> >> On Tue, Mar 14, 2023 at 06:49:25PM +0100, Ilya Maximets wrote:
> >>> On 3/14/23 12:15, Simon Horman wrote:
> >>>> On Mon, Mar 13, 2023 at 09:34:30PM +0100, Ilya Maximets wrote:
> >>>>> On 3/13/23 11:27, Roi Dayan wrote:
> >>>>>> From: Oz Shlomo <ozsh@nvidia.com>
> >>>>>>
> >>>>>> Currently jumping over a output-to-port action is translated to tc
> >>>>>> mirror action and stolen control action.
> >>>>>> However, the tc control action is not propagated to the hw offload action,
> >>>>>> thus the hardware action will mirror the packet and continue to the next
> >>>>>> action.
> >>>>>
> >>>>> This sounds like a bug in the hardware offloading in the kernel/driver.
> >>>>> Shouldn't this be fixed in the kernel?
> >>>>
> >>>> Hi Ilya,
> >>>>
> >>>> I think what we are seeing is a mismatch between how
> >>>> OVS translates flows to TC, and how TC actually works (since forever).
> >>>>
> >>>> It seems to me that fixing the translation, as this patch does,
> >>>> is the right approach. But I could be missing something (often the case).
> >>>
> >>> I'm not a TC expert, but the description sounded like the driver
> >>> or some other code in the kernel is ignoring the TC_ACT_STOLEN.
> >>> If that's correct, that must be a bug in the kernel.
> > 
> > That's correct, but it's more like a tc feature that hasn't been
> > implemented in the HWOL side of it and that shouldn't have been used
> > here. Or a gap, or so, but still. It's not a small piece of code. Tc
> > is super flexible and that flexibility is not there in the driver
> > sometimes.
> 
> Sure, but "something is complex" is not really an excuse for that
> something to not operate correctly. :)
> 
> Unfortunately, it's a common theme with hardware offload in TC
> incorrectly accepting and misinterpreting what shouldn't have been
> accepted in the first place, or ignoring things that shouldn't have
> been ignored.
> 
> And this issue in TC should have been addressed when the hardware
> offload parts of TC were first introduced.  If not by implementing
> the "missing feature", then, at least, by making sure that rules that
> are using this feature are not going to hardware.
> 
> Either way it's an internal TC bug and not OVS'es fault.
> 
> Again, to be clear, I'm OK with fixing it on OVS side by not using
> broken API, but the API should be fixed in the kernel, regardless.
> Otherwise, someone else will hit that issue.  It may even be us
> again in the future.

Ack. I agree if the API is broken it should (also) be fixed.

Oz, can we work on making that so?
Ilya Maximets April 26, 2023, 3:05 p.m. UTC | #13
On 3/15/23 11:51, Simon Horman wrote:
> On Wed, Mar 15, 2023 at 11:52:16AM +0200, Oz Shlomo wrote:
>>
>> On 15/03/2023 11:00, Simon Horman wrote:
>>> On Mon, Mar 13, 2023 at 12:27:51PM +0200, Roi Dayan wrote:
>>>> From: Oz Shlomo <ozsh@nvidia.com>
>>>>
>>>> Currently jumping over a output-to-port action is translated to tc
>>>> mirror action and stolen control action.
>>>> However, the tc control action is not propagated to the hw offload action,
>>>> thus the hardware action will mirror the packet and continue to the next
>>>> action.
>>>>
>>>> Transalte mirror/stolen to "mirred egress redirect" action which terminates
>>>> the action list.
>>>>
>>>> Fixes: e4daf88a4390 ("netdev-offload-tc: Handle check_pkt_len datapath action.")
>>>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
>>>> Reviewed-by: Roi Dayan <roid@nvidia.com>
>>> Perhaps I messed something up, but I am seeing a lot of failed
>>> check-offloads tests [1] with this patch applied on top of
>>> Eelco's patches to add those tests to the GitHub workflow [2].
>>
>> Thanks Simon,
>>
>> We will check this patch again in our lab
> 
> Thanks Oz, much appreciated.

I see a lot of failures as well.  They are caused by 'Kernel flower acknowledgment
does not match request!' message.  And that doesn't sound good.

The system is Ubuntu 22.10 in my case.

I'm marking this patch as 'Changes Requested' for now.

Best regards, Ilya Maximets.
diff mbox series

Patch

diff --git a/lib/tc.c b/lib/tc.c
index 4c07e22162e7..fe06d3c110b6 100644
--- a/lib/tc.c
+++ b/lib/tc.c
@@ -3341,13 +3341,16 @@  nl_msg_put_flower_acts(struct ofpbuf *request, struct tc_flower *flower)
                     }
                     action->jump_action = JUMP_ACTION_STOP;
                 } else {
-                    if (ingress) {
-                        nl_msg_put_act_mirred(request, ifindex, action_pc,
-                                              TCA_INGRESS_MIRROR);
+                    int out_action;
+
+                    if (action_pc == TC_ACT_STOLEN) {
+                        out_action = TCA_EGRESS_REDIR;
                     } else {
-                        nl_msg_put_act_mirred(request, ifindex, action_pc,
-                                              TCA_EGRESS_MIRROR);
+                        out_action = TCA_EGRESS_MIRROR;
                     }
+
+                    nl_msg_put_act_mirred(request, ifindex, action_pc,
+                                          out_action);
                 }
                 nl_msg_put_act_cookie(request, &flower->act_cookie);
                 nl_msg_put_act_flags(request);