diff mbox series

[ovs-dev,v4,1/3] ofproto-dpif-upcall: fix push_dp_ops

Message ID 20221003040220.1153222-2-hepeng.0320@bytedance.com
State Changes Requested
Headers show
Series [ovs-dev,v4,1/3] ofproto-dpif-upcall: fix push_dp_ops | expand

Commit Message

Peng He Oct. 3, 2022, 4:02 a.m. UTC
push_dp_ops only handles delete ops errors but ignores the modify
ops results. It's better to handle all the dp operation errors in
a consistent way.

We observe in the production environment that sometimes a megaflow
with wrong actions keep staying in datapath. The coverage command shows
revalidators have dumped several times, however the correct
actions are not set. This implies that the ukey's action does not
equal to the meagaflow's, i.e. revalidators think the underlying
megaflow's actions are correct however they are not.

We also check the megaflow using the ofproto/trace command, and the
actions are not matched with the ones in the actual magaflow. By
performing a revalidator/purge command, the right actions are set.

This patch prevents the inconsistency by considering modify failure
in revalidators.

Signed-off-by: Peng He <hepeng.0320@bytedance.com>
---
 ofproto/ofproto-dpif-upcall.c | 37 ++++++++++++++++++++++++-----------
 1 file changed, 26 insertions(+), 11 deletions(-)

Comments

Eelco Chaudron Oct. 19, 2022, 10:44 a.m. UTC | #1
On 3 Oct 2022, at 6:02, Peng He wrote:

> push_dp_ops only handles delete ops errors but ignores the modify
> ops results. It's better to handle all the dp operation errors in
> a consistent way.
>
> We observe in the production environment that sometimes a megaflow
> with wrong actions keep staying in datapath. The coverage command shows
> revalidators have dumped several times, however the correct
> actions are not set. This implies that the ukey's action does not
> equal to the meagaflow's, i.e. revalidators think the underlying
> megaflow's actions are correct however they are not.
>
> We also check the megaflow using the ofproto/trace command, and the
> actions are not matched with the ones in the actual magaflow. By
> performing a revalidator/purge command, the right actions are set.
>
> This patch prevents the inconsistency by considering modify failure
> in revalidators.
>
> Signed-off-by: Peng He <hepeng.0320@bytedance.com>
> ---
>  ofproto/ofproto-dpif-upcall.c | 37 ++++++++++++++++++++++++-----------
>  1 file changed, 26 insertions(+), 11 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index 7ad728adf..dd1abfdee 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2416,26 +2416,41 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, size_t n_ops)
>
>      for (i = 0; i < n_ops; i++) {
>          struct ukey_op *op = &ops[i];
> -        struct dpif_flow_stats *push, *stats, push_buf;
> -
> -        stats = op->dop.flow_del.stats;
> -        push = &push_buf;
> -
> -        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> -            /* Only deleted flows need their stats pushed. */
> -            continue;
> -        }
>
>          if (op->dop.error) {
> -            /* flow_del error, 'stats' is unusable. */
>              if (op->ukey) {
>                  ovs_mutex_lock(&op->ukey->mutex);
> -                transition_ukey(op->ukey, UKEY_EVICTED);
> +                if (op->dop.type == DPIF_OP_FLOW_DEL)
> +                    transition_ukey(op->ukey, UKEY_EVICTED);
> +                else {
> +                    /* modify error has variety reasons, it's easy
> +                     * to transition the state into UKEY_EVICTED, and
> +                     * let revalidator_sweep remove the ukey.
> +                     *
> +                     * In the next round of revalidating, the revalidator
> +                     * can recover the ukey through ukey_create_from_dpif_flow,
> +                     * if recover fails, the revalidator will remove the datapath
> +                     * flow itself.
> +                     */
> +                    transition_ukey(op->ukey, UKEY_EVICTING);
> +                    transition_ukey(op->ukey, UKEY_EVICTED);
> +                }
>                  ovs_mutex_unlock(&op->ukey->mutex);
>              }
> +            /* if modify or delete fails, there is no need to push stats */
>              continue;
>          }
>
> +        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> +            /* Only deleted flows need their stats pushed. */

The code looks good, however, what was the conclusion on updating stats on other flows that now got deleted instead of updated? Do they not need their stats pushed? See v3 discussion.

> +            continue;
> +        }
> +
> +        struct dpif_flow_stats *push, *stats, push_buf;
> +
> +        stats = op->dop.flow_del.stats;
> +        push = &push_buf;
> +
>          if (op->ukey) {
>              ovs_mutex_lock(&op->ukey->mutex);
>              transition_ukey(op->ukey, UKEY_EVICTED);
> -- 
> 2.25.1
Peng He Oct. 19, 2022, 1:15 p.m. UTC | #2
sorry to miss that part,

will take some time to think if we need it or not.

BTW, looks like the kernel handle_upcalls code path has also this issue.

the code is here:

       for (i = 0; i < n_ops; i++) {
           struct udpif_key *ukey = ops[i].ukey;

           if (ukey) {
               ovs_mutex_lock(&ukey->mutex);
               if (ops[i].dop.error) {
                   transition_ukey(ukey, UKEY_EVICTED);
               } else if (ukey->state < UKEY_OPERATIONAL) {
                   transition_ukey(ukey, UKEY_OPERATIONAL);
               }
               ovs_mutex_unlock(&ukey->mutex);
           }
       }

Eelco Chaudron <echaudro@redhat.com> 于2022年10月19日周三 18:44写道:

>
>
> On 3 Oct 2022, at 6:02, Peng He wrote:
>
> > push_dp_ops only handles delete ops errors but ignores the modify
> > ops results. It's better to handle all the dp operation errors in
> > a consistent way.
> >
> > We observe in the production environment that sometimes a megaflow
> > with wrong actions keep staying in datapath. The coverage command shows
> > revalidators have dumped several times, however the correct
> > actions are not set. This implies that the ukey's action does not
> > equal to the meagaflow's, i.e. revalidators think the underlying
> > megaflow's actions are correct however they are not.
> >
> > We also check the megaflow using the ofproto/trace command, and the
> > actions are not matched with the ones in the actual magaflow. By
> > performing a revalidator/purge command, the right actions are set.
> >
> > This patch prevents the inconsistency by considering modify failure
> > in revalidators.
> >
> > Signed-off-by: Peng He <hepeng.0320@bytedance.com>
> > ---
> >  ofproto/ofproto-dpif-upcall.c | 37 ++++++++++++++++++++++++-----------
> >  1 file changed, 26 insertions(+), 11 deletions(-)
> >
> > diff --git a/ofproto/ofproto-dpif-upcall.c
> b/ofproto/ofproto-dpif-upcall.c
> > index 7ad728adf..dd1abfdee 100644
> > --- a/ofproto/ofproto-dpif-upcall.c
> > +++ b/ofproto/ofproto-dpif-upcall.c
> > @@ -2416,26 +2416,41 @@ push_dp_ops(struct udpif *udpif, struct ukey_op
> *ops, size_t n_ops)
> >
> >      for (i = 0; i < n_ops; i++) {
> >          struct ukey_op *op = &ops[i];
> > -        struct dpif_flow_stats *push, *stats, push_buf;
> > -
> > -        stats = op->dop.flow_del.stats;
> > -        push = &push_buf;
> > -
> > -        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> > -            /* Only deleted flows need their stats pushed. */
> > -            continue;
> > -        }
> >
> >          if (op->dop.error) {
> > -            /* flow_del error, 'stats' is unusable. */
> >              if (op->ukey) {
> >                  ovs_mutex_lock(&op->ukey->mutex);
> > -                transition_ukey(op->ukey, UKEY_EVICTED);
> > +                if (op->dop.type == DPIF_OP_FLOW_DEL)
> > +                    transition_ukey(op->ukey, UKEY_EVICTED);
> > +                else {
> > +                    /* modify error has variety reasons, it's easy
> > +                     * to transition the state into UKEY_EVICTED, and
> > +                     * let revalidator_sweep remove the ukey.
> > +                     *
> > +                     * In the next round of revalidating, the
> revalidator
> > +                     * can recover the ukey through
> ukey_create_from_dpif_flow,
> > +                     * if recover fails, the revalidator will remove
> the datapath
> > +                     * flow itself.
> > +                     */
> > +                    transition_ukey(op->ukey, UKEY_EVICTING);
> > +                    transition_ukey(op->ukey, UKEY_EVICTED);
> > +                }
> >                  ovs_mutex_unlock(&op->ukey->mutex);
> >              }
> > +            /* if modify or delete fails, there is no need to push
> stats */
> >              continue;
> >          }
> >
> > +        if (op->dop.type != DPIF_OP_FLOW_DEL) {
> > +            /* Only deleted flows need their stats pushed. */
>
> The code looks good, however, what was the conclusion on updating stats on
> other flows that now got deleted instead of updated? Do they not need their
> stats pushed? See v3 discussion.
>
> > +            continue;
> > +        }
> > +
> > +        struct dpif_flow_stats *push, *stats, push_buf;
> > +
> > +        stats = op->dop.flow_del.stats;
> > +        push = &push_buf;
> > +
> >          if (op->ukey) {
> >              ovs_mutex_lock(&op->ukey->mutex);
> >              transition_ukey(op->ukey, UKEY_EVICTED);
> > --
> > 2.25.1
>
>
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 7ad728adf..dd1abfdee 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -2416,26 +2416,41 @@  push_dp_ops(struct udpif *udpif, struct ukey_op *ops, size_t n_ops)
 
     for (i = 0; i < n_ops; i++) {
         struct ukey_op *op = &ops[i];
-        struct dpif_flow_stats *push, *stats, push_buf;
-
-        stats = op->dop.flow_del.stats;
-        push = &push_buf;
-
-        if (op->dop.type != DPIF_OP_FLOW_DEL) {
-            /* Only deleted flows need their stats pushed. */
-            continue;
-        }
 
         if (op->dop.error) {
-            /* flow_del error, 'stats' is unusable. */
             if (op->ukey) {
                 ovs_mutex_lock(&op->ukey->mutex);
-                transition_ukey(op->ukey, UKEY_EVICTED);
+                if (op->dop.type == DPIF_OP_FLOW_DEL)
+                    transition_ukey(op->ukey, UKEY_EVICTED);
+                else {
+                    /* modify error has variety reasons, it's easy
+                     * to transition the state into UKEY_EVICTED, and
+                     * let revalidator_sweep remove the ukey.
+                     *
+                     * In the next round of revalidating, the revalidator
+                     * can recover the ukey through ukey_create_from_dpif_flow,
+                     * if recover fails, the revalidator will remove the datapath
+                     * flow itself.
+                     */
+                    transition_ukey(op->ukey, UKEY_EVICTING);
+                    transition_ukey(op->ukey, UKEY_EVICTED);
+                }
                 ovs_mutex_unlock(&op->ukey->mutex);
             }
+            /* if modify or delete fails, there is no need to push stats */
             continue;
         }
 
+        if (op->dop.type != DPIF_OP_FLOW_DEL) {
+            /* Only deleted flows need their stats pushed. */
+            continue;
+        }
+
+        struct dpif_flow_stats *push, *stats, push_buf;
+
+        stats = op->dop.flow_del.stats;
+        push = &push_buf;
+
         if (op->ukey) {
             ovs_mutex_lock(&op->ukey->mutex);
             transition_ukey(op->ukey, UKEY_EVICTED);