diff mbox series

[ovs-dev] ofproto: Fix for frequent invalidation of flows due to mismatch in mask bits

Message ID 1579191443-27484-1-git-send-email-vishal.deep.ajmera@ericsson.com
State Deferred
Headers show
Series [ovs-dev] ofproto: Fix for frequent invalidation of flows due to mismatch in mask bits | expand

Commit Message

Li,Rongqing via dev Jan. 16, 2020, 4:17 p.m. UTC
The wildcard bits in the installed megaflow entry could be different from
the bits originally generated by the ofproto layer. Datapath implementation
wildcards those match fields which are not present in the incoming packet
before installing the flow.

When the revalidator thread validates a megaflow, it will first query the
ofproto layer to get the wildcard bits and then compare it against the
wildcard bits in the megaflow. If the bits are different the entry will be
removed. A subsequent packet will again result in the same megaflow entry
being installed only for it to be removed by the revalidator thread. This
cycle will continue and will significantly degrade performance.

An earlier patch fixing the issue for MPLS and VLAN was sent.
However similar problem now appears for IPv6 datapath flows.

This patch addresses the issue in a generic way.

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
---
 ofproto/ofproto-dpif-upcall.c | 14 +++++++++++++-
 ofproto/ofproto-dpif-xlate.c  | 20 --------------------
 2 files changed, 13 insertions(+), 21 deletions(-)

Comments

Ben Pfaff Jan. 21, 2020, 9:53 p.m. UTC | #1
On Thu, Jan 16, 2020 at 09:47:23PM +0530, Vishal Deep Ajmera via dev wrote:
> The wildcard bits in the installed megaflow entry could be different from
> the bits originally generated by the ofproto layer. Datapath implementation
> wildcards those match fields which are not present in the incoming packet
> before installing the flow.
> 
> When the revalidator thread validates a megaflow, it will first query the
> ofproto layer to get the wildcard bits and then compare it against the
> wildcard bits in the megaflow. If the bits are different the entry will be
> removed. A subsequent packet will again result in the same megaflow entry
> being installed only for it to be removed by the revalidator thread. This
> cycle will continue and will significantly degrade performance.
> 
> An earlier patch fixing the issue for MPLS and VLAN was sent.
> However similar problem now appears for IPv6 datapath flows.
> 
> This patch addresses the issue in a generic way.
> 
> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>

This is clever!  I think that this kind of approach is the right one.

I don't yet understand why this should be moved from xlate_wc_finish()
to revalidate_ukey__().  Can you help me understand that?
Li,Rongqing via dev Jan. 23, 2020, 10:26 a.m. UTC | #2
> This is clever!  I think that this kind of approach is the right one.
> 
> I don't yet understand why this should be moved from xlate_wc_finish()
> to revalidate_ukey__().  Can you help me understand that?

Hi Ben,

Thanks for reviewing the patch. When I use the same fix in xlate_wc_finish()
I get several unit test failures: 0738  0756  0763  0768  2247  2251  2255
2256.

0738:
-Megaflow:
recirc_id=0,eth,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=2.2.2.2,tun_tos=3,tun_fl
ags=-df-csum-key,in_port=1,nw_ecn=3,nw_frag=no
+Megaflow:
recirc_id=0,eth,ip,tun_src=1.1.1.1,tun_dst=2.2.2.2,tun_tos=3,tun_flags=-df-c
sum-key,in_port=1,nw_ecn=3,nw_frag=no

0756:
-Megaflow:
recirc_id=0,eth,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_fl
ags=+df-csum+key,tun_metadata0,tun_metadata1=NP,tun_metadata2=NP,in_port=1,n
w_frag=no
+Megaflow:
recirc_id=0,eth,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_fl
ags=+df-csum+key,tun_metadata0,in_port=1,nw_frag=no

0763:
+2020-01-23T06:05:58.695Z|00002|odp_util(revalidator9)|WARN|unexpected L3
matching with masked Ethertype 0x800/0
+2020-01-23T06:05:58.695Z|00003|odp_util(revalidator9)|WARN|the flow mask in
error is:
skb_priority(0),tunnel(tun_id=0xffffffffffffffff,src=255.255.255.255,dst=255
.255.255.255,tos=0xff,ttl=0,flags(df|csum|key)),skb_mark(0),ct_state(0),ct_z
one(0),ct_mark(0),ct_label(0),recirc_id(0xffffffff),dp_hash(0),in_port(42949
67295),packet_type(ns=65535,id=0xffff),eth_type(0x0000),ipv4(src=0.0.0.0,dst
=0.0.0.0,proto=0,tos=0x3,ttl=0,frag=<error>),icmp(type=0,code=0), for the
following flow key:
packet_type=(1,0x800),tun_id=0x1c8,tun_src=1.1.2.92,tun_dst=1.1.2.88,tun_ipv
6_src=::,tun_ipv6_dst=::,tun_gbp_id=0,tun_gbp_flags=0,tun_tos=3,tun_ttl=64,t
un_erspan_ver=0,tun_flags=key,in_port=3,nw_src=30.0.0.1,nw_dst=30.0.0.2,nw_p
roto=1,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0

And likewise. Any idea why these failures? It looks to me that the patch
wild-carded more mask bits then expected.

Regards,
Vishal
Li,Rongqing via dev Feb. 4, 2020, 9:15 a.m. UTC | #3
> Thanks for reviewing the patch. When I use the same fix in
xlate_wc_finish()
> I get several unit test failures: 0738  0756  0763  0768  2247  2251  2255
> 2256.
> 
> 0738:
> -Megaflow:
>
recirc_id=0,eth,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=2.2.2.2,tun_tos=3,tun_fl
> ags=-df-csum-key,in_port=1,nw_ecn=3,nw_frag=no
> +Megaflow:
>
recirc_id=0,eth,ip,tun_src=1.1.1.1,tun_dst=2.2.2.2,tun_tos=3,tun_flags=-df-c
> sum-key,in_port=1,nw_ecn=3,nw_frag=no
> 
> 0756:
> -Megaflow:
>
recirc_id=0,eth,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_fl
> ags=+df-
> csum+key,tun_metadata0,tun_metadata1=NP,tun_metadata2=NP,in_port=1,n
> w_frag=no
> +Megaflow:
>
recirc_id=0,eth,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=1.1.1.2,tun_tos=0,tun_fl
> ags=+df-csum+key,tun_metadata0,in_port=1,nw_frag=no
> 
> 0763:
> +2020-01-23T06:05:58.695Z|00002|odp_util(revalidator9)|WARN|unexpected
> L3
> matching with masked Ethertype 0x800/0
> +2020-01-23T06:05:58.695Z|00003|odp_util(revalidator9)|WARN|the flow
> mask in
> error is:
>
skb_priority(0),tunnel(tun_id=0xffffffffffffffff,src=255.255.255.255,dst=255
>
.255.255.255,tos=0xff,ttl=0,flags(df|csum|key)),skb_mark(0),ct_state(0),ct_z
>
one(0),ct_mark(0),ct_label(0),recirc_id(0xffffffff),dp_hash(0),in_port(42949
>
67295),packet_type(ns=65535,id=0xffff),eth_type(0x0000),ipv4(src=0.0.0.0,dst
> =0.0.0.0,proto=0,tos=0x3,ttl=0,frag=<error>),icmp(type=0,code=0), for the
> following flow key:
>
packet_type=(1,0x800),tun_id=0x1c8,tun_src=1.1.2.92,tun_dst=1.1.2.88,tun_ipv
>
6_src=::,tun_ipv6_dst=::,tun_gbp_id=0,tun_gbp_flags=0,tun_tos=3,tun_ttl=64,t
> un_erspan_ver=0,tun_flags=key,in_port=3,nw_src=30.0.0.1,nw_dst=30.0.0.2,n
> w_p
> roto=1,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0
> 
> And likewise. Any idea why these failures? It looks to me that the patch
> wild-carded more mask bits then expected.

Hi Ben,

To me it looks like we will need an equivalent of flow_wc_map() which is
less stricter than
flow_wildcards_init_for_packet() if we want to add this check in
xlate_wc_finish().

Let me know if the current solution has issues. I can try to move this check
to
xlate_wc_finish() otherwise.

Warm Regards,
Vishal Ajmera
Li,Rongqing via dev Feb. 12, 2020, 8:38 a.m. UTC | #4
> To me it looks like we will need an equivalent of flow_wc_map() which is
> less stricter than
> flow_wildcards_init_for_packet() if we want to add this check in
> xlate_wc_finish().
> 
> Let me know if the current solution has issues. I can try to move this
check
> to
> xlate_wc_finish() otherwise.

Hi Ben,

Any thoughts on this? Should I write an equivalent of flow_wc_map() and use
it in xlate_wc_finish()
or we use a stricter version i.e. flow_wildcards_init_for_packet() and use
in revalidate_ukey__?

Warm Regards,
Vishal Ajmera
Ben Pfaff March 6, 2020, 9:41 p.m. UTC | #5
On Wed, Feb 12, 2020 at 08:38:51AM +0000, Vishal Deep Ajmera wrote:
> > To me it looks like we will need an equivalent of flow_wc_map() which is
> > less stricter than
> > flow_wildcards_init_for_packet() if we want to add this check in
> > xlate_wc_finish().
> > 
> > Let me know if the current solution has issues. I can try to move this
> check
> > to
> > xlate_wc_finish() otherwise.
> 
> Hi Ben,
> 
> Any thoughts on this? Should I write an equivalent of flow_wc_map() and use
> it in xlate_wc_finish()
> or we use a stricter version i.e. flow_wildcards_init_for_packet() and use
> in revalidate_ukey__?

My big problem for review here is that I don't understand the motivation
for moving the code around.  Sure, it *works*, but why?  Why did you
think to do it?
Vishal Deep Ajmera April 20, 2020, 7:15 a.m. UTC | #6
> >
> > Any thoughts on this? Should I write an equivalent of flow_wc_map() and
> use
> > it in xlate_wc_finish()
> > or we use a stricter version i.e. flow_wildcards_init_for_packet() and
use
> > in revalidate_ukey__?
> 
> My big problem for review here is that I don't understand the motivation
> for moving the code around.  Sure, it *works*, but why?  Why did you
> think to do it?

The reason to move the code from xlate_wc_finish() to revalidate_ukey__() is
as it caused quite a few unit tests to fail.
After looking at the failures I notice that Megaflow is not matching the
'expected' value.

For e.g. failure for test 0738 is:

-Megaflow:
recirc_id=0,eth,ip,tun_id=0,tun_src=1.1.1.1,tun_dst=2.2.2.2,tun_tos=3,tun_fl
ags=-df-csum-key,in_port=1,nw_ecn=3,nw_frag=no
+Megaflow:
recirc_id=0,eth,ip,tun_src=1.1.1.1,tun_dst=2.2.2.2,tun_tos=3,tun_flags=-df-c
sum-key,in_port=1,nw_ecn=3,nw_frag=no

flow_wildcards_init_for_packet() unwildcards 'tun_id' only when either
FLOW_TNL_F_KEY is set in tunnel flags or tun->ip_dst is not set & tun_id is
non-zero.

However the 'expected' Megaflow itself is not correct in this case as
tun_flags = -key and tun_dst is set, so tun_id should have been wildcarded
but it is not.

Let me know if it makes sense.

Warm Regards,
Vishal Ajmera
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 409286a..1371486 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2169,7 +2169,7 @@  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
 {
     struct xlate_out *xoutp;
     struct netflow *netflow;
-    struct flow_wildcards dp_mask, wc;
+    struct flow_wildcards dp_mask, wc, wc_from_flow;
     enum reval_result result;
     struct reval_context ctx = {
         .odp_actions = odp_actions,
@@ -2215,6 +2215,18 @@  revalidate_ukey__(struct udpif *udpif, const struct udpif_key *ukey,
         goto exit;
     }
 
+    /* Clear flow wildcard bits for fields which are not present
+     * in the original packet header. These wildcards may get set
+     * due to push/set_field actions. This results into frequent
+     * invalidation of datapath flows by revalidator thread. */
+
+    /* Create wc mask based on incoming packet. */
+    flow_wildcards_init_for_packet(&wc_from_flow,
+                                   &ctx.flow);
+
+    /* Clear mask fields in ctx which are not relevant for packet. */
+    flow_wildcards_and(ctx.wc, &wc_from_flow, ctx.wc);
+
     /* Do not modify if any bit is wildcarded by the installed datapath flow,
      * but not the newly revalidated wildcard mask (wc), i.e., if revalidation
      * tells that the datapath flow is now too generic and must be narrowed
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 4407f9c..42fdb9f 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -7363,26 +7363,6 @@  xlate_wc_finish(struct xlate_ctx *ctx)
         ctx->wc->masks.tp_src = 0;
         ctx->wc->masks.tp_dst = 0;
     }
-
-    /* Clear flow wildcard bits for fields which are not present
-     * in the original packet header. These wildcards may get set
-     * due to push/set_field actions. This results into frequent
-     * invalidation of datapath flows by revalidator thread. */
-
-    /* Clear mpls label wc bits if original packet is non-mpls. */
-    if (!eth_type_mpls(ctx->xin->upcall_flow->dl_type)) {
-        for (i = 0; i < FLOW_MAX_MPLS_LABELS; i++) {
-            ctx->wc->masks.mpls_lse[i] = 0;
-        }
-    }
-    /* Clear vlan header wc bits if original packet does not have
-     * vlan header. */
-    for (i = 0; i < FLOW_MAX_VLAN_HEADERS; i++) {
-        if (!eth_type_vlan(ctx->xin->upcall_flow->vlans[i].tpid)) {
-            ctx->wc->masks.vlans[i].tpid = 0;
-            ctx->wc->masks.vlans[i].tci = 0;
-        }
-    }
 }
 
 /* Translates the flow, actions, or rule in 'xin' into datapath actions in