diff mbox series

[ovs-dev,1/1] netdev-tc-offloads: Support match on priority tags

Message ID 20190521121151.14008-1-elibr@mellanox.com
State Accepted
Headers show
Series [ovs-dev,1/1] netdev-tc-offloads: Support match on priority tags | expand

Commit Message

Eli Britstein May 21, 2019, 12:11 p.m. UTC
The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
either the VID, PCP or CFI are non-zero. For priority-tag packets
there is a VLAN tag header with a zero VLAN TCI. Match on existence of
VLAN header (TPID) regardless of TCI matching.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
---
 lib/netdev-tc-offloads.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Comments

Eli Britstein May 31, 2019, 7:35 a.m. UTC | #1
Ping
Simon Horman May 31, 2019, 12:45 p.m. UTC | #2
Sorry Eli,

I had missed this but I have it now.

On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr@mellanox.com> wrote:

> Ping
> ------------------------------
> *From:* Eli Britstein <elibr@mellanox.com>
> *Sent:* Tuesday, May 21, 2019 3:11:51 PM
> *To:* dev@openvswitch.org; Simon Horman
> *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
> *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority tags
>
> The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
> either the VID, PCP or CFI are non-zero. For priority-tag packets
> there is a VLAN tag header with a zero VLAN TCI. Match on existence of
> VLAN header (TPID) regardless of TCI matching.
>
> Signed-off-by: Eli Britstein <elibr@mellanox.com>
> Reviewed-by: Roi Dayan <roid@mellanox.com>
> ---
>  lib/netdev-tc-offloads.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
> index d5c66acc1..ef9ee0786 100644
> --- a/lib/netdev-tc-offloads.c
> +++ b/lib/netdev-tc-offloads.c
> @@ -1146,6 +1146,10 @@ netdev_tc_flow_put(struct netdev *netdev, struct
> match *match,
>      }
>      mask->mpls_lse[0] = 0;
>
> +    if (eth_type_vlan(key->vlans[0].tpid)) {
> +        flower.key.encap_eth_type[0] = flower.key.eth_type;
> +        flower.key.eth_type = key->vlans[0].tpid;
> +    }
>      if (mask->vlans[0].tci) {
>          ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
>          ovs_be16 pcp_mask = mask->vlans[0].tci & htons(VLAN_PCP_MASK);
> @@ -1166,8 +1170,6 @@ netdev_tc_flow_put(struct netdev *netdev, struct
> match *match,
>                  VLOG_DBG_RL(&rl, "vlan_prio[0]: %d\n",
>                              flower.key.vlan_prio[0]);
>              }
> -            flower.key.encap_eth_type[0] = flower.key.eth_type;
> -            flower.key.eth_type = key->vlans[0].tpid;
>          } else if (mask->vlans[0].tci == htons(0xffff) &&
>                     ntohs(key->vlans[0].tci) == 0) {
>              /* exact && no vlan */
> @@ -1177,6 +1179,10 @@ netdev_tc_flow_put(struct netdev *netdev, struct
> match *match,
>          }
>      }
>
> +    if (eth_type_vlan(key->vlans[1].tpid)) {
> +        flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
> +        flower.key.encap_eth_type[0] = key->vlans[1].tpid;
> +    }
>      if (mask->vlans[1].tci) {
>          ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
>          ovs_be16 pcp_mask = mask->vlans[1].tci & htons(VLAN_PCP_MASK);
> @@ -1196,8 +1202,6 @@ netdev_tc_flow_put(struct netdev *netdev, struct
> match *match,
>                  flower.mask.vlan_prio[1] =
> vlan_tci_to_pcp(mask->vlans[1].tci);
>                  VLOG_DBG_RL(&rl, "vlan_prio[1]: %d",
> flower.key.vlan_prio[1]);
>              }
> -            flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
> -            flower.key.encap_eth_type[0] = key->vlans[1].tpid;
>          } else if (mask->vlans[1].tci == htons(0xffff) &&
>                     ntohs(key->vlans[1].tci) == 0) {
>              /* exact && no vlan */
> --
> 2.17.2
>
>
Simon Horman June 3, 2019, 2:42 p.m. UTC | #3
On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
> Sorry Eli,
> 
> I had missed this but I have it now.
> 
> On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr@mellanox.com> wrote:
> 
> > Ping
> > ------------------------------
> > *From:* Eli Britstein <elibr@mellanox.com>
> > *Sent:* Tuesday, May 21, 2019 3:11:51 PM
> > *To:* dev@openvswitch.org; Simon Horman
> > *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
> > *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority tags
> >
> > The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
> > either the VID, PCP or CFI are non-zero. For priority-tag packets
> > there is a VLAN tag header with a zero VLAN TCI. Match on existence of
> > VLAN header (TPID) regardless of TCI matching.
> >
> > Signed-off-by: Eli Britstein <elibr@mellanox.com>
> > Reviewed-by: Roi Dayan <roid@mellanox.com>

Thanks this seems to be a nice enhancement, applied to master.
Louis Peens June 11, 2019, 7:28 a.m. UTC | #4
On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman@netronome.com>
wrote:

> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
> > Sorry Eli,
> >
> > I had missed this but I have it now.
> >
> > On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr@mellanox.com> wrote:
> >
> > > Ping
> > > ------------------------------
> > > *From:* Eli Britstein <elibr@mellanox.com>
> > > *Sent:* Tuesday, May 21, 2019 3:11:51 PM
> > > *To:* dev@openvswitch.org; Simon Horman
> > > *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
> > > *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
> tags
> > >
> > > The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
> > > either the VID, PCP or CFI are non-zero. For priority-tag packets
> > > there is a VLAN tag header with a zero VLAN TCI. Match on existence of
> > > VLAN header (TPID) regardless of TCI matching.
> > >
> > > Signed-off-by: Eli Britstein <elibr@mellanox.com>
> > > Reviewed-by: Roi Dayan <roid@mellanox.com>
>
> Thanks this seems to be a nice enhancement, applied to master.
>

Hey

During some internal testing we found that this patch broke some
set-L4 cases when using tc-offloads. Setting up a simple bridge and
flow rule looking something like this:

  ovs-vsctl show:
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "veth0r"
            Interface "veth0r"
        Port "veth1r"
            Interface "veth1r"

  ovs-ofctl add-flow br0 cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
    actions=set_field:4002-\>tcp_dst,output:veth1r

and then sending VLAN tagged traffic leads to packets not egressing the
port.

The TC rule that gets installed looks like this:
  vlan_ethtype ip
  eth_type ipv4
  ip_proto tcp
  dst_port 4000
  ip_flags nofrag
  not_in_hw
        action order 1:  pedit action pipe keys 1
         index 1 ref 1 bind 1 installed 3 sec used 3 sec
         key #0  at tcp+0: val 00000fa2 mask ffff0000
        Action statistics:
        Sent 368 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: csum (tcp) action pipe
        index 1 ref 1 bind 1 installed 3 sec used 3 sec
        Action statistics:
        Sent 368 bytes 8 pkt (dropped 8, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 3: mirred (Egress Redirect to device ens260np1) stolen
        index 1 ref 1 bind 1 installed 3 sec used 3 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
        cookie 8eb331199f4d41dc739e769a2f79f1ba

Note the drop counters in the csum(tcp) section. Before the patch the rule
looks like this:
  eth_type ipv4
  ip_proto tcp
  dst_port 4000
  ip_flags nofrag
  not_in_hw
        action order 1:  pedit action pipe keys 1
         index 1 ref 1 bind 1 installed 3 sec used 3 sec
         key #0  at tcp+0: val 00000fa2 mask ffff0000
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 2: csum (tcp) action pipe
        index 1 ref 1 bind 1 installed 3 sec used 3 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0

        action order 3: mirred (Egress Redirect to device ens260np1) stolen
        index 1 ref 1 bind 1 installed 3 sec used 3 sec
        Action statistics:
        Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
        backlog 0b 0p requeues 0
        cookie 83b3464d0c47e3747956bca7e8857a53

Here I suspect there may be a different issue as none of the counters
increase
so the TC rule is probably never hit (confirmed by the datapath rule timing
out even with continuous traffic), but at least packets are correctly
egressing via the userspace fallback path. It looks like this patch is
installing a rule in TC that does get hit, but is now mysteriously
dropping packets in the middle of the action processing pipeline.

Another strange observation is that this only seems to happen when setting
one of the layer4 fields, setting layer3 seems to work as expected.

Regards
Louis Peens


> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Ilya Maximets June 11, 2019, 11:21 a.m. UTC | #5
> On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman at netronome.com>
> wrote:
> 
>> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
>> > Sorry Eli,
>> >
>> > I had missed this but I have it now.
>> >
>> > On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr at mellanox.com> wrote:
>> >
>> > > Ping
>> > > ------------------------------
>> > > *From:* Eli Britstein <elibr at mellanox.com>
>> > > *Sent:* Tuesday, May 21, 2019 3:11:51 PM
>> > > *To:* dev at openvswitch.org; Simon Horman
>> > > *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
>> > > *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
>> tags
>> > >
>> > > The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
>> > > either the VID, PCP or CFI are non-zero. For priority-tag packets
>> > > there is a VLAN tag header with a zero VLAN TCI. Match on existence of
>> > > VLAN header (TPID) regardless of TCI matching.
>> > >
>> > > Signed-off-by: Eli Britstein <elibr at mellanox.com>
>> > > Reviewed-by: Roi Dayan <roid at mellanox.com>
>>
>> Thanks this seems to be a nice enhancement, applied to master.
>>
> 
> Hey
> 
> During some internal testing we found that this patch broke some
> set-L4 cases when using tc-offloads. Setting up a simple bridge and
> flow rule looking something like this:
> 
>   ovs-vsctl show:
>     Bridge "br0"
>         Port "br0"
>             Interface "br0"
>                 type: internal
>         Port "veth0r"
>             Interface "veth0r"
>         Port "veth1r"
>             Interface "veth1r"
> 
>   ovs-ofctl add-flow br0 cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
>     actions=set_field:4002-\>tcp_dst,output:veth1r
> 
> and then sending VLAN tagged traffic leads to packets not egressing the
> port.
> 
> The TC rule that gets installed looks like this:
>   vlan_ethtype ip

There must be no vlan_ethertype match just because there was no such field match
in the original flow.

Looking at the code I see that it checks for key.tpid which makes no sense,
because it must check for the mask first. At least there must be something like
this:

diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 2af0f10d9..7e7160426 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
     }
     mask->mpls_lse[0] = 0;
 
-    if (eth_type_vlan(key->vlans[0].tpid)) {
+    if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) {
         flower.key.encap_eth_type[0] = flower.key.eth_type;
+        flower.mask.encap_eth_type[0] = flower.mask.eth_type;
         flower.key.eth_type = key->vlans[0].tpid;
+        flower.mask.eth_type = mask->vlans[0].tpid;
     }
     if (mask->vlans[0].tci) {
         ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
@@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
         }
     }
 
-    if (eth_type_vlan(key->vlans[1].tpid)) {
+    if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) {
         flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
+        flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0];
         flower.key.encap_eth_type[0] = key->vlans[1].tpid;
+        flower.mask.encap_eth_type[0] = mask->vlans[1].tpid;
     }
     if (mask->vlans[1].tci) {
         ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
---

I'm not an expert here, so it needs checking and review form someone more
familiar with flower.


>   eth_type ipv4
>   ip_proto tcp
>   dst_port 4000
>   ip_flags nofrag
>   not_in_hw
>         action order 1:  pedit action pipe keys 1
>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>          key #0  at tcp+0: val 00000fa2 mask ffff0000
>         Action statistics:
>         Sent 368 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
>         backlog 0b 0p requeues 0
> 
>         action order 2: csum (tcp) action pipe
>         index 1 ref 1 bind 1 installed 3 sec used 3 sec
>         Action statistics:
>         Sent 368 bytes 8 pkt (dropped 8, overlimits 0 requeues 0)
>         backlog 0b 0p requeues 0
> 
>         action order 3: mirred (Egress Redirect to device ens260np1) stolen
>         index 1 ref 1 bind 1 installed 3 sec used 3 sec
>         Action statistics:
>         Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>         backlog 0b 0p requeues 0
>         cookie 8eb331199f4d41dc739e769a2f79f1ba
> 
> Note the drop counters in the csum(tcp) section. Before the patch the rule
> looks like this:
>   eth_type ipv4
>   ip_proto tcp
>   dst_port 4000
>   ip_flags nofrag
>   not_in_hw
>         action order 1:  pedit action pipe keys 1
>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>          key #0  at tcp+0: val 00000fa2 mask ffff0000
>         Action statistics:
>         Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>         backlog 0b 0p requeues 0
> 
>         action order 2: csum (tcp) action pipe
>         index 1 ref 1 bind 1 installed 3 sec used 3 sec
>         Action statistics:
>         Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>         backlog 0b 0p requeues 0
> 
>         action order 3: mirred (Egress Redirect to device ens260np1) stolen
>         index 1 ref 1 bind 1 installed 3 sec used 3 sec
>         Action statistics:
>         Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>         backlog 0b 0p requeues 0
>         cookie 83b3464d0c47e3747956bca7e8857a53
> 
> Here I suspect there may be a different issue as none of the counters
> increase
> so the TC rule is probably never hit (confirmed by the datapath rule timing
> out even with continuous traffic), but at least packets are correctly
> egressing via the userspace fallback path. It looks like this patch is
> installing a rule in TC that does get hit, but is now mysteriously
> dropping packets in the middle of the action processing pipeline.
> 
> Another strange observation is that this only seems to happen when setting
> one of the layer4 fields, setting layer3 seems to work as expected.
> 
> Regards
> Louis Peens
> 
> 
>> _______________________________________________
>> dev mailing list
>> dev at openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
Eli Britstein June 12, 2019, 9:53 a.m. UTC | #6
On 6/11/2019 2:21 PM, Ilya Maximets wrote:
>> On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman at netronome.com>
>> wrote:
>>
>>> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
>>>> Sorry Eli,
>>>>
>>>> I had missed this but I have it now.
>>>>
>>>> On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr at mellanox.com> wrote:
>>>>
>>>>> Ping
>>>>> ------------------------------
>>>>> *From:* Eli Britstein <elibr at mellanox.com>
>>>>> *Sent:* Tuesday, May 21, 2019 3:11:51 PM
>>>>> *To:* dev at openvswitch.org; Simon Horman
>>>>> *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
>>>>> *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
>>> tags
>>>>> The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
>>>>> either the VID, PCP or CFI are non-zero. For priority-tag packets
>>>>> there is a VLAN tag header with a zero VLAN TCI. Match on existence of
>>>>> VLAN header (TPID) regardless of TCI matching.
>>>>>
>>>>> Signed-off-by: Eli Britstein <elibr at mellanox.com>
>>>>> Reviewed-by: Roi Dayan <roid at mellanox.com>
>>> Thanks this seems to be a nice enhancement, applied to master.
>>>
>> Hey
>>
>> During some internal testing we found that this patch broke some
>> set-L4 cases when using tc-offloads. Setting up a simple bridge and
>> flow rule looking something like this:
>>
>>    ovs-vsctl show:
>>      Bridge "br0"
>>          Port "br0"
>>              Interface "br0"
>>                  type: internal
>>          Port "veth0r"
>>              Interface "veth0r"
>>          Port "veth1r"
>>              Interface "veth1r"
>>
>>    ovs-ofctl add-flow br0 cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
>>      actions=set_field:4002-\>tcp_dst,output:veth1r
>>
>> and then sending VLAN tagged traffic leads to packets not egressing the
>> port.
>>
>> The TC rule that gets installed looks like this:
>>    vlan_ethtype ip
> There must be no vlan_ethertype match just because there was no such field match
> in the original flow.
>
> Looking at the code I see that it checks for key.tpid which makes no sense,
> because it must check for the mask first. At least there must be something like
> this:
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 2af0f10d9..7e7160426 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
>       }
>       mask->mpls_lse[0] = 0;
>   
> -    if (eth_type_vlan(key->vlans[0].tpid)) {
> +    if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) {
>           flower.key.encap_eth_type[0] = flower.key.eth_type;
> +        flower.mask.encap_eth_type[0] = flower.mask.eth_type;
>           flower.key.eth_type = key->vlans[0].tpid;
> +        flower.mask.eth_type = mask->vlans[0].tpid;
>       }
>       if (mask->vlans[0].tci) {
>           ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
> @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
>           }
>       }
>   
> -    if (eth_type_vlan(key->vlans[1].tpid)) {
> +    if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) {
>           flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
> +        flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0];
>           flower.key.encap_eth_type[0] = key->vlans[1].tpid;
> +        flower.mask.encap_eth_type[0] = mask->vlans[1].tpid;
>       }
>       if (mask->vlans[1].tci) {
>           ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
> ---
>
> I'm not an expert here, so it needs checking and review form someone more
> familiar with flower.
This seems correct. Need to test it.
>
>
>>    eth_type ipv4
>>    ip_proto tcp
>>    dst_port 4000
>>    ip_flags nofrag
>>    not_in_hw
>>          action order 1:  pedit action pipe keys 1
>>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>>          Action statistics:
>>          Sent 368 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 2: csum (tcp) action pipe
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 368 bytes 8 pkt (dropped 8, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 3: mirred (Egress Redirect to device ens260np1) stolen
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>          cookie 8eb331199f4d41dc739e769a2f79f1ba
>>
>> Note the drop counters in the csum(tcp) section. Before the patch the rule
>> looks like this:
>>    eth_type ipv4
>>    ip_proto tcp
>>    dst_port 4000
>>    ip_flags nofrag
>>    not_in_hw
>>          action order 1:  pedit action pipe keys 1
>>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 2: csum (tcp) action pipe
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 3: mirred (Egress Redirect to device ens260np1) stolen
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>          cookie 83b3464d0c47e3747956bca7e8857a53
>>
>> Here I suspect there may be a different issue as none of the counters
>> increase
>> so the TC rule is probably never hit (confirmed by the datapath rule timing
>> out even with continuous traffic), but at least packets are correctly
>> egressing via the userspace fallback path. It looks like this patch is
>> installing a rule in TC that does get hit, but is now mysteriously
>> dropping packets in the middle of the action processing pipeline.
>>
>> Another strange observation is that this only seems to happen when setting
>> one of the layer4 fields, setting layer3 seems to work as expected.

I tried setting such tc rule (using tc tool), and it works OK for me.

Make sure you have commit 2ecba2d1e45b ("net: sched: act_csum: Fix csum 
calc for tagged packets") in your kernel.

>>
>> Regards
>> Louis Peens
>>
>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org
>>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&amp;data=02%7C01%7Celibr%40mellanox.com%7Cc6de245c204d41f12c4408d6ee5ef971%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636958489017862941&amp;sdata=Z4b89HrFIv%2F%2BMnGVuyUqACYLBag%2Fmxi8YYt8i0omc3A%3D&amp;reserved=0
>>>
Louis Peens June 14, 2019, 2:40 p.m. UTC | #7
On Wed, Jun 12, 2019 at 11:53 AM Eli Britstein <elibr@mellanox.com> wrote:

>
> On 6/11/2019 2:21 PM, Ilya Maximets wrote:
> >> On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman at
> netronome.com>
> >> wrote:
> >>
> >>> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
> >>>> Sorry Eli,
> >>>>
> >>>> I had missed this but I have it now.
> >>>>
> >>>> On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr at mellanox.com>
> wrote:
> >>>>
> >>>>> Ping
> >>>>> ------------------------------
> >>>>> *From:* Eli Britstein <elibr at mellanox.com>
> >>>>> *Sent:* Tuesday, May 21, 2019 3:11:51 PM
> >>>>> *To:* dev at openvswitch.org; Simon Horman
> >>>>> *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
> >>>>> *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
> >>> tags
> >>>>> The logic by which a TC rule has a VLAN match is by the VLAN TCI
> field,
> >>>>> either the VID, PCP or CFI are non-zero. For priority-tag packets
> >>>>> there is a VLAN tag header with a zero VLAN TCI. Match on existence
> of
> >>>>> VLAN header (TPID) regardless of TCI matching.
> >>>>>
> >>>>> Signed-off-by: Eli Britstein <elibr at mellanox.com>
> >>>>> Reviewed-by: Roi Dayan <roid at mellanox.com>
> >>> Thanks this seems to be a nice enhancement, applied to master.
> >>>
> >> Hey
> >>
> >> During some internal testing we found that this patch broke some
> >> set-L4 cases when using tc-offloads. Setting up a simple bridge and
> >> flow rule looking something like this:
> >>
> >>    ovs-vsctl show:
> >>      Bridge "br0"
> >>          Port "br0"
> >>              Interface "br0"
> >>                  type: internal
> >>          Port "veth0r"
> >>              Interface "veth0r"
> >>          Port "veth1r"
> >>              Interface "veth1r"
> >>
> >>    ovs-ofctl add-flow br0
> cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
> >>      actions=set_field:4002-\>tcp_dst,output:veth1r
> >>
> >> and then sending VLAN tagged traffic leads to packets not egressing the
> >> port.
> >>
> >> The TC rule that gets installed looks like this:
> >>    vlan_ethtype ip
> > There must be no vlan_ethertype match just because there was no such
> field match
> > in the original flow.
> >
> > Looking at the code I see that it checks for key.tpid which makes no
> sense,
> > because it must check for the mask first. At least there must be
> something like
> > this:
> >
> > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> > index 2af0f10d9..7e7160426 100644
> > --- a/lib/netdev-offload-tc.c
> > +++ b/lib/netdev-offload-tc.c
> > @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct
> match *match,
> >       }
> >       mask->mpls_lse[0] = 0;
> >
> > -    if (eth_type_vlan(key->vlans[0].tpid)) {
> > +    if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) {
> >           flower.key.encap_eth_type[0] = flower.key.eth_type;
> > +        flower.mask.encap_eth_type[0] = flower.mask.eth_type;
> >           flower.key.eth_type = key->vlans[0].tpid;
> > +        flower.mask.eth_type = mask->vlans[0].tpid;
> >       }
> >       if (mask->vlans[0].tci) {
> >           ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
> > @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct
> match *match,
> >           }
> >       }
> >
> > -    if (eth_type_vlan(key->vlans[1].tpid)) {
> > +    if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) {
> >           flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
> > +        flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0];
> >           flower.key.encap_eth_type[0] = key->vlans[1].tpid;
> > +        flower.mask.encap_eth_type[0] = mask->vlans[1].tpid;
> >       }
> >       if (mask->vlans[1].tci) {
> >           ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
> > ---
> >
> > I'm not an expert here, so it needs checking and review form someone more
> > familiar with flower.
> This seems correct. Need to test it.
> >
> >
> >>    eth_type ipv4
> >>    ip_proto tcp
> >>    dst_port 4000
> >>    ip_flags nofrag
> >>    not_in_hw
> >>          action order 1:  pedit action pipe keys 1
> >>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
> >>           key #0  at tcp+0: val 00000fa2 mask ffff0000
> >>          Action statistics:
> >>          Sent 368 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
> >>          backlog 0b 0p requeues 0
> >>
> >>          action order 2: csum (tcp) action pipe
> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
> >>          Action statistics:
> >>          Sent 368 bytes 8 pkt (dropped 8, overlimits 0 requeues 0)
> >>          backlog 0b 0p requeues 0
> >>
> >>          action order 3: mirred (Egress Redirect to device ens260np1)
> stolen
> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
> >>          Action statistics:
> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>          backlog 0b 0p requeues 0
> >>          cookie 8eb331199f4d41dc739e769a2f79f1ba
> >>
> >> Note the drop counters in the csum(tcp) section. Before the patch the
> rule
> >> looks like this:
> >>    eth_type ipv4
> >>    ip_proto tcp
> >>    dst_port 4000
> >>    ip_flags nofrag
> >>    not_in_hw
> >>          action order 1:  pedit action pipe keys 1
> >>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
> >>           key #0  at tcp+0: val 00000fa2 mask ffff0000
> >>          Action statistics:
> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>          backlog 0b 0p requeues 0
> >>
> >>          action order 2: csum (tcp) action pipe
> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
> >>          Action statistics:
> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>          backlog 0b 0p requeues 0
> >>
> >>          action order 3: mirred (Egress Redirect to device ens260np1)
> stolen
> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
> >>          Action statistics:
> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >>          backlog 0b 0p requeues 0
> >>          cookie 83b3464d0c47e3747956bca7e8857a53
> >>
> >> Here I suspect there may be a different issue as none of the counters
> >> increase
> >> so the TC rule is probably never hit (confirmed by the datapath rule
> timing
> >> out even with continuous traffic), but at least packets are correctly
> >> egressing via the userspace fallback path. It looks like this patch is
> >> installing a rule in TC that does get hit, but is now mysteriously
> >> dropping packets in the middle of the action processing pipeline.
> >>
> >> Another strange observation is that this only seems to happen when
> setting
> >> one of the layer4 fields, setting layer3 seems to work as expected.
>
> I tried setting such tc rule (using tc tool), and it works OK for me.
>
> Make sure you have commit 2ecba2d1e45b ("net: sched: act_csum: Fix csum
> calc for tagged packets") in your kernel.
>
Hey, thanks for the input so far. Currently I'm still seeing this issue,
even with the above mentioned
csum patch included, but I suspect the issue I'm seeing is probably
related. I can reproduce the issue using
TC rules only, so this moves the blame from OVS. In case somebody
wants do dig deeper, here is my TC setup:

echo "Create veth interfaces"
ip link add $V0r type veth peer name $V0n
ip link add $V1r type veth peer name $V1n
echo "Recreate qdiscs"
tc qdisc del dev $V0r handle ffff: ingress
tc qdisc del dev $V1r handle ffff: ingress
tc qdisc add dev $V0r handle ffff: ingress
tc qdisc add dev $V1r handle ffff: ingress
echo "Add flow rules"
match="802.1Q flower vlan_ethtype ipv4 ip_proto tcp dst_port 4000 ip_flags
nofrag"
action="pedit ex munge tcp dport set 4002 pipe csum tcp "
action+="pipe mirred egress redirect dev $V1r"
tc filter add dev $V0r parent ffff: protocol ${match} action ${action}

The pcap I use looks like this:
reading from file tcp_vlan.pcap, link-type EN10MB (Ethernet)
15:16:55.293780 52:12:ef:32:45:ab > 54:a2:32:ef:54:1b, ethertype 802.1Q
(0x8100), length 64: vlan 100, p 2, ethertype IPv4, (tos 0xc, ttl 64, id 1,
offset 0, flags [none], proto TCP (6), length 46)
    1.2.3.4.3000 > 5.6.7.8.4000: Flags [S], cksum 0x2529 (correct), seq
0:6, win 8192, length 6

With the above config I see the same behaviour where the packets are
dropped at the csum action part. Further instrumentation in the
act_csum module shows that there is skb corruption happening - the 4002
value that is suppose to end up in the TCP destination
field ends up in the IP-header length field. This makes me think that there
is possibly another issue similar to the act_csum one that needs to be
fixed in
act_pedit but I've not found that yet.

This was on the linux stable tree at tag v5.2-rc4. Hope somebody here with
more knowledge on the kernel tc code can provide some input.
Thanks

Louis Peens

>
> >>
> >> Regards
> >> Louis Peens
> >>
> >>
> >>> _______________________________________________
> >>> dev mailing list
> >>> dev at openvswitch.org
> >>>
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&amp;data=02%7C01%7Celibr%40mellanox.com%7Cc6de245c204d41f12c4408d6ee5ef971%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636958489017862941&amp;sdata=Z4b89HrFIv%2F%2BMnGVuyUqACYLBag%2Fmxi8YYt8i0omc3A%3D&amp;reserved=0
> >>>
>
Eli Britstein June 14, 2019, 4:01 p.m. UTC | #8
On 6/14/2019 5:40 PM, Louis Peens wrote:


On Wed, Jun 12, 2019 at 11:53 AM Eli Britstein <elibr@mellanox.com<mailto:elibr@mellanox.com>> wrote:

On 6/11/2019 2:21 PM, Ilya Maximets wrote:
>> On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman at netronome.com<https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fnetronome.com&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227223260&sdata=nO14Y2PABBRpM4JGuW26l5Fg5oP47m0VDz10xGCcheI%3D&reserved=0>>
>> wrote:
>>
>>> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
>>>> Sorry Eli,
>>>>
>>>> I had missed this but I have it now.
>>>>
>>>> On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr at mellanox.com<http://mellanox.com>> wrote:
>>>>
>>>>> Ping
>>>>> ------------------------------
>>>>> *From:* Eli Britstein <elibr at mellanox.com<http://mellanox.com>>
>>>>> *Sent:* Tuesday, May 21, 2019 3:11:51 PM
>>>>> *To:* dev at openvswitch.org<https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fopenvswitch.org&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227223260&sdata=t3dIZiyDY7ZUmrjZQd%2FwCQ3QSCxsQGhVa8bevoD1zwQ%3D&reserved=0>; Simon Horman
>>>>> *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
>>>>> *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
>>> tags
>>>>> The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
>>>>> either the VID, PCP or CFI are non-zero. For priority-tag packets
>>>>> there is a VLAN tag header with a zero VLAN TCI. Match on existence of
>>>>> VLAN header (TPID) regardless of TCI matching.
>>>>>
>>>>> Signed-off-by: Eli Britstein <elibr at mellanox.com<http://mellanox.com>>
>>>>> Reviewed-by: Roi Dayan <roid at mellanox.com<http://mellanox.com>>
>>> Thanks this seems to be a nice enhancement, applied to master.
>>>
>> Hey
>>
>> During some internal testing we found that this patch broke some
>> set-L4 cases when using tc-offloads. Setting up a simple bridge and
>> flow rule looking something like this:
>>
>>    ovs-vsctl show:
>>      Bridge "br0"
>>          Port "br0"
>>              Interface "br0"
>>                  type: internal
>>          Port "veth0r"
>>              Interface "veth0r"
>>          Port "veth1r"
>>              Interface "veth1r"
>>
>>    ovs-ofctl add-flow br0 cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
>>      actions=set_field:4002-\>tcp_dst,output:veth1r
>>
>> and then sending VLAN tagged traffic leads to packets not egressing the
>> port.
>>
>> The TC rule that gets installed looks like this:
>>    vlan_ethtype ip
> There must be no vlan_ethertype match just because there was no such field match
> in the original flow.
>
> Looking at the code I see that it checks for key.tpid which makes no sense,
> because it must check for the mask first. At least there must be something like
> this:
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 2af0f10d9..7e7160426 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
>       }
>       mask->mpls_lse[0] = 0;
>
> -    if (eth_type_vlan(key->vlans[0].tpid)) {
> +    if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) {
>           flower.key.encap_eth_type[0] = flower.key.eth_type;
> +        flower.mask.encap_eth_type[0] = flower.mask.eth_type;
>           flower.key.eth_type = key->vlans[0].tpid;
> +        flower.mask.eth_type = mask->vlans[0].tpid;
>       }
>       if (mask->vlans[0].tci) {
>           ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
> @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
>           }
>       }
>
> -    if (eth_type_vlan(key->vlans[1].tpid)) {
> +    if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) {
>           flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
> +        flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0];
>           flower.key.encap_eth_type[0] = key->vlans[1].tpid;
> +        flower.mask.encap_eth_type[0] = mask->vlans[1].tpid;
>       }
>       if (mask->vlans[1].tci) {
>           ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
> ---
>
> I'm not an expert here, so it needs checking and review form someone more
> familiar with flower.
This seems correct. Need to test it.
>
>
>>    eth_type ipv4
>>    ip_proto tcp
>>    dst_port 4000
>>    ip_flags nofrag
>>    not_in_hw
>>          action order 1:  pedit action pipe keys 1
>>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>>          Action statistics:
>>          Sent 368 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 2: csum (tcp) action pipe
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 368 bytes 8 pkt (dropped 8, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 3: mirred (Egress Redirect to device ens260np1) stolen
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>          cookie 8eb331199f4d41dc739e769a2f79f1ba
>>
>> Note the drop counters in the csum(tcp) section. Before the patch the rule
>> looks like this:
>>    eth_type ipv4
>>    ip_proto tcp
>>    dst_port 4000
>>    ip_flags nofrag
>>    not_in_hw
>>          action order 1:  pedit action pipe keys 1
>>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 2: csum (tcp) action pipe
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>
>>          action order 3: mirred (Egress Redirect to device ens260np1) stolen
>>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>>          Action statistics:
>>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>          backlog 0b 0p requeues 0
>>          cookie 83b3464d0c47e3747956bca7e8857a53
>>
>> Here I suspect there may be a different issue as none of the counters
>> increase
>> so the TC rule is probably never hit (confirmed by the datapath rule timing
>> out even with continuous traffic), but at least packets are correctly
>> egressing via the userspace fallback path. It looks like this patch is
>> installing a rule in TC that does get hit, but is now mysteriously
>> dropping packets in the middle of the action processing pipeline.
>>
>> Another strange observation is that this only seems to happen when setting
>> one of the layer4 fields, setting layer3 seems to work as expected.

I tried setting such tc rule (using tc tool), and it works OK for me.

Make sure you have commit 2ecba2d1e45b ("net: sched: act_csum: Fix csum
calc for tagged packets") in your kernel.
Hey, thanks for the input so far. Currently I'm still seeing this issue, even with the above mentioned
csum patch included, but I suspect the issue I'm seeing is probably related. I can reproduce the issue using
TC rules only, so this moves the blame from OVS. In case somebody
wants do dig deeper, here is my TC setup:

echo "Create veth interfaces"
ip link add $V0r type veth peer name $V0n
ip link add $V1r type veth peer name $V1n
echo "Recreate qdiscs"
tc qdisc del dev $V0r handle ffff: ingress
tc qdisc del dev $V1r handle ffff: ingress
tc qdisc add dev $V0r handle ffff: ingress
tc qdisc add dev $V1r handle ffff: ingress
echo "Add flow rules"
match="802.1Q flower vlan_ethtype ipv4 ip_proto tcp dst_port 4000 ip_flags nofrag"
action="pedit ex munge tcp dport set 4002 pipe csum tcp "
action+="pipe mirred egress redirect dev $V1r"
tc filter add dev $V0r parent ffff: protocol ${match} action ${action}

The pcap I use looks like this:
reading from file tcp_vlan.pcap, link-type EN10MB (Ethernet)
15:16:55.293780 52:12:ef:32:45:ab > 54:a2:32:ef:54:1b, ethertype 802.1Q (0x8100), length 64: vlan 100, p 2, ethertype IPv4, (tos 0xc, ttl 64, id 1, offset 0, flags [none], proto TCP (6), length 46)
    1.2.3.4.3000 > 5.6.7.8.4000: Flags [S], cksum 0x2529 (correct), seq 0:6, win 8192, length 6

With the above config I see the same behaviour where the packets are dropped at the csum action part. Further instrumentation in the
act_csum module shows that there is skb corruption happening - the 4002 value that is suppose to end up in the TCP destination
field ends up in the IP-header length field. This makes me think that there is possibly another issue similar to the act_csum one that needs to be fixed in
act_pedit but I've not found that yet.

This was on the linux stable tree at tag v5.2-rc4. Hope somebody here with more knowledge on the kernel tc code can provide some input.
Thanks

Can you try this? it is still not final (and not yet accepted), but maybe it fixes your issue

https://lore.kernel.org/netdev/830aa8f07b528b50b212c01d53de6ec651500535.1559322531.git.dcaratti@redhat.com/

Louis Peens

>>
>> Regards
>> Louis Peens
>>
>>
>>> _______________________________________________
>>> dev mailing list
>>> dev at openvswitch.org<https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fopenvswitch.org&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227233250&sdata=IX1K1vVoW%2B1EH8WJzpZ8sY0jHbpH%2F5fkSj5HE3rAfvU%3D&reserved=0>
>>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&amp;data=02%7C01%7Celibr%40mellanox.com%7Cc6de245c204d41f12c4408d6ee5ef971%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636958489017862941&amp;sdata=Z4b89HrFIv%2F%2BMnGVuyUqACYLBag%2Fmxi8YYt8i0omc3A%3D&amp;reserved=0<https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227233250&sdata=7THf8MLACqYwOrnsT1xIOiBS1mfMJyBPOM5ijl%2Fwr44%3D&reserved=0>
>>>
Ilya Maximets June 19, 2019, 7:55 a.m. UTC | #9
On 12.06.2019 12:53, Eli Britstein wrote:
> 
> On 6/11/2019 2:21 PM, Ilya Maximets wrote:
>>> On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman at netronome.com>
>>> wrote:
>>>
>>>> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
>>>>> Sorry Eli,
>>>>>
>>>>> I had missed this but I have it now.
>>>>>
>>>>> On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr at mellanox.com> wrote:
>>>>>
>>>>>> Ping
>>>>>> ------------------------------
>>>>>> *From:* Eli Britstein <elibr at mellanox.com>
>>>>>> *Sent:* Tuesday, May 21, 2019 3:11:51 PM
>>>>>> *To:* dev at openvswitch.org; Simon Horman
>>>>>> *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
>>>>>> *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
>>>> tags
>>>>>> The logic by which a TC rule has a VLAN match is by the VLAN TCI field,
>>>>>> either the VID, PCP or CFI are non-zero. For priority-tag packets
>>>>>> there is a VLAN tag header with a zero VLAN TCI. Match on existence of
>>>>>> VLAN header (TPID) regardless of TCI matching.
>>>>>>
>>>>>> Signed-off-by: Eli Britstein <elibr at mellanox.com>
>>>>>> Reviewed-by: Roi Dayan <roid at mellanox.com>
>>>> Thanks this seems to be a nice enhancement, applied to master.
>>>>
>>> Hey
>>>
>>> During some internal testing we found that this patch broke some
>>> set-L4 cases when using tc-offloads. Setting up a simple bridge and
>>> flow rule looking something like this:
>>>
>>>    ovs-vsctl show:
>>>      Bridge "br0"
>>>          Port "br0"
>>>              Interface "br0"
>>>                  type: internal
>>>          Port "veth0r"
>>>              Interface "veth0r"
>>>          Port "veth1r"
>>>              Interface "veth1r"
>>>
>>>    ovs-ofctl add-flow br0 cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
>>>      actions=set_field:4002-\>tcp_dst,output:veth1r
>>>
>>> and then sending VLAN tagged traffic leads to packets not egressing the
>>> port.
>>>
>>> The TC rule that gets installed looks like this:
>>>    vlan_ethtype ip
>> There must be no vlan_ethertype match just because there was no such field match
>> in the original flow.
>>
>> Looking at the code I see that it checks for key.tpid which makes no sense,
>> because it must check for the mask first. At least there must be something like
>> this:
>>
>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>> index 2af0f10d9..7e7160426 100644
>> --- a/lib/netdev-offload-tc.c
>> +++ b/lib/netdev-offload-tc.c
>> @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
>>       }
>>       mask->mpls_lse[0] = 0;
>>   
>> -    if (eth_type_vlan(key->vlans[0].tpid)) {
>> +    if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) {
>>           flower.key.encap_eth_type[0] = flower.key.eth_type;
>> +        flower.mask.encap_eth_type[0] = flower.mask.eth_type;
>>           flower.key.eth_type = key->vlans[0].tpid;
>> +        flower.mask.eth_type = mask->vlans[0].tpid;
>>       }
>>       if (mask->vlans[0].tci) {
>>           ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
>> @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct match *match,
>>           }
>>       }
>>   
>> -    if (eth_type_vlan(key->vlans[1].tpid)) {
>> +    if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) {
>>           flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
>> +        flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0];
>>           flower.key.encap_eth_type[0] = key->vlans[1].tpid;
>> +        flower.mask.encap_eth_type[0] = mask->vlans[1].tpid;
>>       }
>>       if (mask->vlans[1].tci) {
>>           ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
>> ---
>>
>> I'm not an expert here, so it needs checking and review form someone more
>> familiar with flower.
> This seems correct. Need to test it.

I'll send this as an official patch.
Louis Peens June 27, 2019, 12:55 p.m. UTC | #10
On Fri, Jun 14, 2019 at 6:01 PM Eli Britstein <elibr@mellanox.com> wrote:

>
> On 6/14/2019 5:40 PM, Louis Peens wrote:
>
>
>
> On Wed, Jun 12, 2019 at 11:53 AM Eli Britstein <elibr@mellanox.com> wrote:
>
>>
>> On 6/11/2019 2:21 PM, Ilya Maximets wrote:
>> >> On Mon, Jun 3, 2019 at 4:42 PM Simon Horman <simon.horman at
>> netronome.com
>> <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fnetronome.com&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227223260&sdata=nO14Y2PABBRpM4JGuW26l5Fg5oP47m0VDz10xGCcheI%3D&reserved=0>
>> >
>> >> wrote:
>> >>
>> >>> On Fri, May 31, 2019 at 02:45:25PM +0200, Simon Horman wrote:
>> >>>> Sorry Eli,
>> >>>>
>> >>>> I had missed this but I have it now.
>> >>>>
>> >>>> On Fri, 31 May 2019 at 09:35, Eli Britstein <elibr at mellanox.com>
>> wrote:
>> >>>>
>> >>>>> Ping
>> >>>>> ------------------------------
>> >>>>> *From:* Eli Britstein <elibr at mellanox.com>
>> >>>>> *Sent:* Tuesday, May 21, 2019 3:11:51 PM
>> >>>>> *To:* dev at openvswitch.org
>> <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fopenvswitch.org&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227223260&sdata=t3dIZiyDY7ZUmrjZQd%2FwCQ3QSCxsQGhVa8bevoD1zwQ%3D&reserved=0>;
>> Simon Horman
>> >>>>> *Cc:* Roi Dayan; Paul Blakey; Eli Cohen; Eli Britstein
>> >>>>> *Subject:* [PATCH 1/1] netdev-tc-offloads: Support match on priority
>> >>> tags
>> >>>>> The logic by which a TC rule has a VLAN match is by the VLAN TCI
>> field,
>> >>>>> either the VID, PCP or CFI are non-zero. For priority-tag packets
>> >>>>> there is a VLAN tag header with a zero VLAN TCI. Match on existence
>> of
>> >>>>> VLAN header (TPID) regardless of TCI matching.
>> >>>>>
>> >>>>> Signed-off-by: Eli Britstein <elibr at mellanox.com>
>> >>>>> Reviewed-by: Roi Dayan <roid at mellanox.com>
>> >>> Thanks this seems to be a nice enhancement, applied to master.
>> >>>
>> >> Hey
>> >>
>> >> During some internal testing we found that this patch broke some
>> >> set-L4 cases when using tc-offloads. Setting up a simple bridge and
>> >> flow rule looking something like this:
>> >>
>> >>    ovs-vsctl show:
>> >>      Bridge "br0"
>> >>          Port "br0"
>> >>              Interface "br0"
>> >>                  type: internal
>> >>          Port "veth0r"
>> >>              Interface "veth0r"
>> >>          Port "veth1r"
>> >>              Interface "veth1r"
>> >>
>> >>    ovs-ofctl add-flow br0
>> cookie=1,table=0,in_port=veth0r,tcp,tcp_dst=4000, \
>> >>      actions=set_field:4002-\>tcp_dst,output:veth1r
>> >>
>> >> and then sending VLAN tagged traffic leads to packets not egressing the
>> >> port.
>> >>
>> >> The TC rule that gets installed looks like this:
>> >>    vlan_ethtype ip
>> > There must be no vlan_ethertype match just because there was no such
>> field match
>> > in the original flow.
>> >
>> > Looking at the code I see that it checks for key.tpid which makes no
>> sense,
>> > because it must check for the mask first. At least there must be
>> something like
>> > this:
>> >
>> > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>> > index 2af0f10d9..7e7160426 100644
>> > --- a/lib/netdev-offload-tc.c
>> > +++ b/lib/netdev-offload-tc.c
>> > @@ -1146,9 +1146,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct
>> match *match,
>> >       }
>> >       mask->mpls_lse[0] = 0;
>> >
>> > -    if (eth_type_vlan(key->vlans[0].tpid)) {
>> > +    if (mask->vlans[0].tpid && eth_type_vlan(key->vlans[0].tpid)) {
>> >           flower.key.encap_eth_type[0] = flower.key.eth_type;
>> > +        flower.mask.encap_eth_type[0] = flower.mask.eth_type;
>> >           flower.key.eth_type = key->vlans[0].tpid;
>> > +        flower.mask.eth_type = mask->vlans[0].tpid;
>> >       }
>> >       if (mask->vlans[0].tci) {
>> >           ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
>> > @@ -1179,9 +1181,11 @@ netdev_tc_flow_put(struct netdev *netdev, struct
>> match *match,
>> >           }
>> >       }
>> >
>> > -    if (eth_type_vlan(key->vlans[1].tpid)) {
>> > +    if (mask->vlans[1].tpid && eth_type_vlan(key->vlans[1].tpid)) {
>> >           flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
>> > +        flower.mask.encap_eth_type[1] = flower.mask.encap_eth_type[0];
>> >           flower.key.encap_eth_type[0] = key->vlans[1].tpid;
>> > +        flower.mask.encap_eth_type[0] = mask->vlans[1].tpid;
>> >       }
>> >       if (mask->vlans[1].tci) {
>> >           ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
>> > ---
>> >
>> > I'm not an expert here, so it needs checking and review form someone
>> more
>> > familiar with flower.
>> This seems correct. Need to test it.
>> >
>> >
>> >>    eth_type ipv4
>> >>    ip_proto tcp
>> >>    dst_port 4000
>> >>    ip_flags nofrag
>> >>    not_in_hw
>> >>          action order 1:  pedit action pipe keys 1
>> >>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>> >>          Action statistics:
>> >>          Sent 368 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 2: csum (tcp) action pipe
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 368 bytes 8 pkt (dropped 8, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 3: mirred (Egress Redirect to device ens260np1)
>> stolen
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>          cookie 8eb331199f4d41dc739e769a2f79f1ba
>> >>
>> >> Note the drop counters in the csum(tcp) section. Before the patch the
>> rule
>> >> looks like this:
>> >>    eth_type ipv4
>> >>    ip_proto tcp
>> >>    dst_port 4000
>> >>    ip_flags nofrag
>> >>    not_in_hw
>> >>          action order 1:  pedit action pipe keys 1
>> >>           index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>           key #0  at tcp+0: val 00000fa2 mask ffff0000
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 2: csum (tcp) action pipe
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>
>> >>          action order 3: mirred (Egress Redirect to device ens260np1)
>> stolen
>> >>          index 1 ref 1 bind 1 installed 3 sec used 3 sec
>> >>          Action statistics:
>> >>          Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>> >>          backlog 0b 0p requeues 0
>> >>          cookie 83b3464d0c47e3747956bca7e8857a53
>> >>
>> >> Here I suspect there may be a different issue as none of the counters
>> >> increase
>> >> so the TC rule is probably never hit (confirmed by the datapath rule
>> timing
>> >> out even with continuous traffic), but at least packets are correctly
>> >> egressing via the userspace fallback path. It looks like this patch is
>> >> installing a rule in TC that does get hit, but is now mysteriously
>> >> dropping packets in the middle of the action processing pipeline.
>> >>
>> >> Another strange observation is that this only seems to happen when
>> setting
>> >> one of the layer4 fields, setting layer3 seems to work as expected.
>>
>> I tried setting such tc rule (using tc tool), and it works OK for me.
>>
>> Make sure you have commit 2ecba2d1e45b ("net: sched: act_csum: Fix csum
>> calc for tagged packets") in your kernel.
>>
> Hey, thanks for the input so far. Currently I'm still seeing this issue,
> even with the above mentioned
> csum patch included, but I suspect the issue I'm seeing is probably
> related. I can reproduce the issue using
> TC rules only, so this moves the blame from OVS. In case somebody
> wants do dig deeper, here is my TC setup:
>
> echo "Create veth interfaces"
> ip link add $V0r type veth peer name $V0n
> ip link add $V1r type veth peer name $V1n
> echo "Recreate qdiscs"
> tc qdisc del dev $V0r handle ffff: ingress
> tc qdisc del dev $V1r handle ffff: ingress
> tc qdisc add dev $V0r handle ffff: ingress
> tc qdisc add dev $V1r handle ffff: ingress
> echo "Add flow rules"
> match="802.1Q flower vlan_ethtype ipv4 ip_proto tcp dst_port 4000 ip_flags
> nofrag"
> action="pedit ex munge tcp dport set 4002 pipe csum tcp "
> action+="pipe mirred egress redirect dev $V1r"
> tc filter add dev $V0r parent ffff: protocol ${match} action ${action}
>
> The pcap I use looks like this:
> reading from file tcp_vlan.pcap, link-type EN10MB (Ethernet)
> 15:16:55.293780 52:12:ef:32:45:ab > 54:a2:32:ef:54:1b, ethertype 802.1Q
> (0x8100), length 64: vlan 100, p 2, ethertype IPv4, (tos 0xc, ttl 64, id 1,
> offset 0, flags [none], proto TCP (6), length 46)
>     1.2.3.4.3000 > 5.6.7.8.4000: Flags [S], cksum 0x2529 (correct), seq
> 0:6, win 8192, length 6
>
> With the above config I see the same behaviour where the packets are
> dropped at the csum action part. Further instrumentation in the
> act_csum module shows that there is skb corruption happening - the 4002
> value that is suppose to end up in the TCP destination
> field ends up in the IP-header length field. This makes me think that
> there is possibly another issue similar to the act_csum one that needs to
> be fixed in
> act_pedit but I've not found that yet.
>
> This was on the linux stable tree at tag v5.2-rc4. Hope somebody here with
> more knowledge on the kernel tc code can provide some input.
> Thanks
>
> Can you try this? it is still not final (and not yet accepted), but maybe
> it fixes your issue
>
>
> https://lore.kernel.org/netdev/830aa8f07b528b50b212c01d53de6ec651500535.1559322531.git.dcaratti@redhat.com/
>
Hey
Sorry about the delay, I only got back to this now. Thanks for the pointer
to the patches above, they do seem to be in the
correct direction. Unfortunately they don't make a difference in my case
since it only makes changes for L3 and not L4 (tcp src/dst in my test).
I tried implementing something similar for L4 but could not do so with any
success yet.
I did find that both: "skb_network_offset" and "skb_transport_offset" are 0
in the
"TCA_PEDIT_KEY_EX_HDR_TYPE_TCP/UDP" case statement in
"pedit_skb_hdr_offset". This is inside the
"skb_transport_header_was_set" check, so I would expect the
transport_offset to be non-zero or at least different to the network_offset.
That explains why pedit is changing the value at the incorrect offset,
however I could not figure out what is causing this yet.

That is the update from my side, maybe this triggers another idea from
someone on what is going wrong.
Thanks

>
> Louis Peens
>
>>
>> >>
>> >> Regards
>> >> Louis Peens
>> >>
>> >>
>> >>> _______________________________________________
>> >>> dev mailing list
>> >>> dev at openvswitch.org
>> <https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fopenvswitch.org&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227233250&sdata=IX1K1vVoW%2B1EH8WJzpZ8sY0jHbpH%2F5fkSj5HE3rAfvU%3D&reserved=0>
>> >>>
>> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&amp;data=02%7C01%7Celibr%40mellanox.com%7Cc6de245c204d41f12c4408d6ee5ef971%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636958489017862941&amp;sdata=Z4b89HrFIv%2F%2BMnGVuyUqACYLBag%2Fmxi8YYt8i0omc3A%3D&amp;reserved=0
>> <https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmail.openvswitch.org%2Fmailman%2Flistinfo%2Fovs-dev&data=02%7C01%7Celibr%40mellanox.com%7C139fdfc2545245c825a208d6f0d6389d%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636961200227233250&sdata=7THf8MLACqYwOrnsT1xIOiBS1mfMJyBPOM5ijl%2Fwr44%3D&reserved=0>
>> >>>
>>
>
>
>
diff mbox series

Patch

diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
index d5c66acc1..ef9ee0786 100644
--- a/lib/netdev-tc-offloads.c
+++ b/lib/netdev-tc-offloads.c
@@ -1146,6 +1146,10 @@  netdev_tc_flow_put(struct netdev *netdev, struct match *match,
     }
     mask->mpls_lse[0] = 0;
 
+    if (eth_type_vlan(key->vlans[0].tpid)) {
+        flower.key.encap_eth_type[0] = flower.key.eth_type;
+        flower.key.eth_type = key->vlans[0].tpid;
+    }
     if (mask->vlans[0].tci) {
         ovs_be16 vid_mask = mask->vlans[0].tci & htons(VLAN_VID_MASK);
         ovs_be16 pcp_mask = mask->vlans[0].tci & htons(VLAN_PCP_MASK);
@@ -1166,8 +1170,6 @@  netdev_tc_flow_put(struct netdev *netdev, struct match *match,
                 VLOG_DBG_RL(&rl, "vlan_prio[0]: %d\n",
                             flower.key.vlan_prio[0]);
             }
-            flower.key.encap_eth_type[0] = flower.key.eth_type;
-            flower.key.eth_type = key->vlans[0].tpid;
         } else if (mask->vlans[0].tci == htons(0xffff) &&
                    ntohs(key->vlans[0].tci) == 0) {
             /* exact && no vlan */
@@ -1177,6 +1179,10 @@  netdev_tc_flow_put(struct netdev *netdev, struct match *match,
         }
     }
 
+    if (eth_type_vlan(key->vlans[1].tpid)) {
+        flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
+        flower.key.encap_eth_type[0] = key->vlans[1].tpid;
+    }
     if (mask->vlans[1].tci) {
         ovs_be16 vid_mask = mask->vlans[1].tci & htons(VLAN_VID_MASK);
         ovs_be16 pcp_mask = mask->vlans[1].tci & htons(VLAN_PCP_MASK);
@@ -1196,8 +1202,6 @@  netdev_tc_flow_put(struct netdev *netdev, struct match *match,
                 flower.mask.vlan_prio[1] = vlan_tci_to_pcp(mask->vlans[1].tci);
                 VLOG_DBG_RL(&rl, "vlan_prio[1]: %d", flower.key.vlan_prio[1]);
             }
-            flower.key.encap_eth_type[1] = flower.key.encap_eth_type[0];
-            flower.key.encap_eth_type[0] = key->vlans[1].tpid;
         } else if (mask->vlans[1].tci == htons(0xffff) &&
                    ntohs(key->vlans[1].tci) == 0) {
             /* exact && no vlan */