diff mbox series

[ovs-dev] ofproto-dpif: Init ukey->dump_seq to zero

Message ID 1522841162-5066-1-git-send-email-jan.scheurich@ericsson.com
State Accepted
Headers show
Series [ovs-dev] ofproto-dpif: Init ukey->dump_seq to zero | expand

Commit Message

Jan Scheurich April 4, 2018, 11:26 a.m. UTC
In the current implementation the dump_seq of a new datapath flow ukey
is set to seq_read(udpif->dump_seq). This implies that any revalidation
during the current dump_seq period (up to 500 ms) is skipped.

This can trigger incorrect behavior, for example when the the creation of
datapath flow triggers a PACKET_IN to the controller, which which course
the controller installs a new flow entry that should invalidate the
original datapath flow.

Initializing ukey->dump_seq to zero implies that the first dump of the
flow, be it for revalidation or dumping statistics, will always be
executed as zero is not a valid value of the ovs_seq.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>

---
 ofproto/ofproto-dpif-upcall.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

Comments

Ben Pfaff April 4, 2018, 5:19 p.m. UTC | #1
On Wed, Apr 04, 2018 at 01:26:02PM +0200, Jan Scheurich wrote:
> In the current implementation the dump_seq of a new datapath flow ukey
> is set to seq_read(udpif->dump_seq). This implies that any revalidation
> during the current dump_seq period (up to 500 ms) is skipped.
> 
> This can trigger incorrect behavior, for example when the the creation of
> datapath flow triggers a PACKET_IN to the controller, which which course
> the controller installs a new flow entry that should invalidate the
> original datapath flow.
> 
> Initializing ukey->dump_seq to zero implies that the first dump of the
> flow, be it for revalidation or dumping statistics, will always be
> executed as zero is not a valid value of the ovs_seq.
> 
> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>

If we are going to do this, then we should delete the 'dump_seq' member
of struct upcall, because it will always be zero.  It is also worth
considering whether the other caller of ukey_create__() should pass 0,
and if so then we can delete the 'dump_seq' parameter of
ukey_create__().

Thanks,

Ben.
Jan Scheurich April 4, 2018, 7:03 p.m. UTC | #2
Thanks Ben,

I hope what my patch does is precisely what you suggest. I had the same thoughts when I had a closer look at the code.

Regards, Jan

> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: Wednesday, 04 April, 2018 19:20
> To: Jan Scheurich <jan.scheurich@ericsson.com>
> Cc: dev@openvswitch.org; Zoltán Balogh <zoltan.balogh@ericsson.com>; jpettit@ovn.org
> Subject: Re: [PATCH] ofproto-dpif: Init ukey->dump_seq to zero
> 
> On Wed, Apr 04, 2018 at 01:26:02PM +0200, Jan Scheurich wrote:
> > In the current implementation the dump_seq of a new datapath flow ukey
> > is set to seq_read(udpif->dump_seq). This implies that any revalidation
> > during the current dump_seq period (up to 500 ms) is skipped.
> >
> > This can trigger incorrect behavior, for example when the the creation of
> > datapath flow triggers a PACKET_IN to the controller, which which course
> > the controller installs a new flow entry that should invalidate the
> > original datapath flow.
> >
> > Initializing ukey->dump_seq to zero implies that the first dump of the
> > flow, be it for revalidation or dumping statistics, will always be
> > executed as zero is not a valid value of the ovs_seq.
> >
> > Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
> 
> If we are going to do this, then we should delete the 'dump_seq' member
> of struct upcall, because it will always be zero.  It is also worth
> considering whether the other caller of ukey_create__() should pass 0,
> and if so then we can delete the 'dump_seq' parameter of
> ukey_create__().
> 
> Thanks,
> 
> Ben.
Ben Pfaff April 4, 2018, 8:28 p.m. UTC | #3
Oh, that's weird.  It's as if I didn't read the patch.  Maybe I just
read some preliminary version in another thread.

Anyway, you're totally right.  I applied this to master.  If you're
seeing problems in another branch, let me know and I will backport.

On Wed, Apr 04, 2018 at 07:03:43PM +0000, Jan Scheurich wrote:
> Thanks Ben,
> 
> I hope what my patch does is precisely what you suggest. I had the same thoughts when I had a closer look at the code.
> 
> Regards, Jan
> 
> > -----Original Message-----
> > From: Ben Pfaff [mailto:blp@ovn.org]
> > Sent: Wednesday, 04 April, 2018 19:20
> > To: Jan Scheurich <jan.scheurich@ericsson.com>
> > Cc: dev@openvswitch.org; Zoltán Balogh <zoltan.balogh@ericsson.com>; jpettit@ovn.org
> > Subject: Re: [PATCH] ofproto-dpif: Init ukey->dump_seq to zero
> > 
> > On Wed, Apr 04, 2018 at 01:26:02PM +0200, Jan Scheurich wrote:
> > > In the current implementation the dump_seq of a new datapath flow ukey
> > > is set to seq_read(udpif->dump_seq). This implies that any revalidation
> > > during the current dump_seq period (up to 500 ms) is skipped.
> > >
> > > This can trigger incorrect behavior, for example when the the creation of
> > > datapath flow triggers a PACKET_IN to the controller, which which course
> > > the controller installs a new flow entry that should invalidate the
> > > original datapath flow.
> > >
> > > Initializing ukey->dump_seq to zero implies that the first dump of the
> > > flow, be it for revalidation or dumping statistics, will always be
> > > executed as zero is not a valid value of the ovs_seq.
> > >
> > > Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
> > 
> > If we are going to do this, then we should delete the 'dump_seq' member
> > of struct upcall, because it will always be zero.  It is also worth
> > considering whether the other caller of ukey_create__() should pass 0,
> > and if so then we can delete the 'dump_seq' parameter of
> > ukey_create__().
> > 
> > Thanks,
> > 
> > Ben.
Jan Scheurich April 6, 2018, 9:34 a.m. UTC | #4
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: Wednesday, 04 April, 2018 22:28
> 
> Oh, that's weird.  It's as if I didn't read the patch.  Maybe I just
> read some preliminary version in another thread.
> 
> Anyway, you're totally right.  I applied this to master.  If you're
> seeing problems in another branch, let me know and I will backport.

Thanks!

I think the issue was introduced into OVS by the following commit a long time ago. 

commit 23597df052262dec961fd86eb7c54d10984a1ec0
Author: Joe Stringer <joestringer@nicira.com>
Date:   Fri Jul 25 13:54:24 2014 +1200

It's a temporary glitch that can cause unexpected behavior only within the first few hundred milliseconds after datapath flow creation. It is most likely to affect "reactive" controller use cases (MAC learning, ARP handling), like the OVN test case that now failed with a small change of timing. So it is possible that one could notice short packet drops or duplicate PACKET_INs in real SDN deployments when looking close enough.

My preference would be to backport it all the way to OVS 2.5. But of course I don't have proof that there are actual problems out in the field that it would solve.

One could also do a systematic search of unit test cases that apply "sleep" or "time/warp" work-arounds for the issue and simplify these on master branch. But I fear I won't time for that.

Regards, Jan
Ben Pfaff April 6, 2018, 5:44 p.m. UTC | #5
[also CCing Joe on the chance that he wants to comment]

On Fri, Apr 06, 2018 at 09:34:38AM +0000, Jan Scheurich wrote:
> > -----Original Message-----
> > From: Ben Pfaff [mailto:blp@ovn.org]
> > Sent: Wednesday, 04 April, 2018 22:28
> > 
> > Oh, that's weird.  It's as if I didn't read the patch.  Maybe I just
> > read some preliminary version in another thread.
> > 
> > Anyway, you're totally right.  I applied this to master.  If you're
> > seeing problems in another branch, let me know and I will backport.
> 
> Thanks!
> 
> I think the issue was introduced into OVS by the following commit a long time ago. 
> 
> commit 23597df052262dec961fd86eb7c54d10984a1ec0
> Author: Joe Stringer <joestringer@nicira.com>
> Date:   Fri Jul 25 13:54:24 2014 +1200
> 
> It's a temporary glitch that can cause unexpected behavior only within
> the first few hundred milliseconds after datapath flow creation. It is
> most likely to affect "reactive" controller use cases (MAC learning,
> ARP handling), like the OVN test case that now failed with a small
> change of timing. So it is possible that one could notice short packet
> drops or duplicate PACKET_INs in real SDN deployments when looking
> close enough.
> 
> My preference would be to backport it all the way to OVS 2.5. But of
> course I don't have proof that there are actual problems out in the
> field that it would solve.

OK.

I am going to sit on this for a few days and see whether anyone reports
unusual issues.  If nothing arises, I'll backport as far as reasonable.

Thanks,

Ben.
Joe Stringer April 8, 2018, 8:07 p.m. UTC | #6
On 6 April 2018 at 10:44, Ben Pfaff <blp@ovn.org> wrote:
> [also CCing Joe on the chance that he wants to comment]
>
> On Fri, Apr 06, 2018 at 09:34:38AM +0000, Jan Scheurich wrote:
>> > -----Original Message-----
>> > From: Ben Pfaff [mailto:blp@ovn.org]
>> > Sent: Wednesday, 04 April, 2018 22:28
>> >
>> > Oh, that's weird.  It's as if I didn't read the patch.  Maybe I just
>> > read some preliminary version in another thread.
>> >
>> > Anyway, you're totally right.  I applied this to master.  If you're
>> > seeing problems in another branch, let me know and I will backport.
>>
>> Thanks!
>>
>> I think the issue was introduced into OVS by the following commit a long time ago.
>>
>> commit 23597df052262dec961fd86eb7c54d10984a1ec0
>> Author: Joe Stringer <joestringer@nicira.com>
>> Date:   Fri Jul 25 13:54:24 2014 +1200
>>
>> It's a temporary glitch that can cause unexpected behavior only within
>> the first few hundred milliseconds after datapath flow creation. It is
>> most likely to affect "reactive" controller use cases (MAC learning,
>> ARP handling), like the OVN test case that now failed with a small
>> change of timing. So it is possible that one could notice short packet
>> drops or duplicate PACKET_INs in real SDN deployments when looking
>> close enough.

I see, so there's two seqs used in this code, dump_seq and reval_seq.
The purpose of the dump_seq is just to handle the case where a thread
dumps the exact same flow from the kernel datapath twice in quick
succession. This may happen because the kernel API doesn't provide a
guarantee about dumping an exact snapshot of the entire table, so if
one set of flows (~10-15) are dumped from the kernel, then a flow is
inserted into the bucket that is being dumped right now, then a
subsequent dump will begin from the same bucket/index as where the
previous dump ended, and perhaps rather than starting from the very
next flow, it starts with the last flow of the previous dump. Hence we
see the same flow twice.

From this perspective, setting the dump_seq during upcall processing
does not make sense, so I agree with the basis of the patch. However,
I note that you also set it to 0 in ukey_create_from_dpif_flow(),
which is actually a dump so the same flow could be revalidated twice.
This doesn't quite seem right according to the above reasoning, but
that said looking over the code I can't see anything problematic with
handling it this way. Creating ukeys from dpif_flows is very much a
corner case and it seems unlikely to have a large impact even if there
is some logical mismatch in this case. (Equally, because it's such a
corner case, people are far less likely to notice a problem here..)

If I follow the order of relevant executions is something like:

* Openflow has set of rules R1
* dump_seq is set to N1
* new datapath flow arrives, creates upcall with N1 and associated
ukey with same dump_seq
* datapath flow is forwarded to controller, controller pushes new
openflow flow, rules are now R2
* during the same dump, the revalidator dumps the newly installed
flow, finds the ukey, sees the dump_seq N1 and skips revalidating it
although there are new rules R2
* traffic continues to be forwarded according to R1 until the next dump
* Because of rule transition R1->R2, another flow dump / revalidation
round is immediately triggered via reval_seq, with minimum bound of
starting in T+5ms
* dump_seq is set to N2
* During this second revalidation round, the datapath flow is actually
checked, determined to be forwarding per out-of-date behaviour, and
updated / evicted.

I would think that if you have a small number of flows, a new
revalidation round should be triggered fairly quickly due to reval_seq
(scheduled to happen in minimum ~5ms, as per updif_revalidator()). In
this window the flow will not be revalidated so you would still see
blips like this. On the other hand, if you have a large number of
flows, statistically you may not end up dumping the newly installed
flow during the same dump, so it may still take hundreds of
milliseconds for the dump to finish, to trigger revalidation due to
the R1->R2 rule change, then for the flow to eventually be revalidated
to implement the new behaviour. I don't know that it would make a big
difference there. That said, for the case where the flow /is/ dumped
during the same round, its actions may be updated so OVS would be more
responsive to OpenFlow changes in that case.

If you think there's a slightly different timeline, I'd be curious of
your thoughts on it though I won't provide any guarantee I'll be able
to provide further insight into it :-)

I believe that you observe an improvement in behaviour with the patch,
so I have no objections. Logically it seems the right direction. That
said, I doubt that it's able to make the datapath more responsive in
all cases as it's really quite difficult to synchronize the state
between OpenFlow and the datapath within a short period. The general
rule we tend to observe in this area of the code is "Try to be less
than 1 second behind the current OpenFlow layer's forwarding rules".

Cheers,
Joe
Ben Pfaff April 16, 2018, 9:47 p.m. UTC | #7
On Fri, Apr 06, 2018 at 10:44:13AM -0700, Ben Pfaff wrote:
> [also CCing Joe on the chance that he wants to comment]
> 
> On Fri, Apr 06, 2018 at 09:34:38AM +0000, Jan Scheurich wrote:
> > > -----Original Message-----
> > > From: Ben Pfaff [mailto:blp@ovn.org]
> > > Sent: Wednesday, 04 April, 2018 22:28
> > > 
> > > Oh, that's weird.  It's as if I didn't read the patch.  Maybe I just
> > > read some preliminary version in another thread.
> > > 
> > > Anyway, you're totally right.  I applied this to master.  If you're
> > > seeing problems in another branch, let me know and I will backport.
> > 
> > Thanks!
> > 
> > I think the issue was introduced into OVS by the following commit a long time ago. 
> > 
> > commit 23597df052262dec961fd86eb7c54d10984a1ec0
> > Author: Joe Stringer <joestringer@nicira.com>
> > Date:   Fri Jul 25 13:54:24 2014 +1200
> > 
> > It's a temporary glitch that can cause unexpected behavior only within
> > the first few hundred milliseconds after datapath flow creation. It is
> > most likely to affect "reactive" controller use cases (MAC learning,
> > ARP handling), like the OVN test case that now failed with a small
> > change of timing. So it is possible that one could notice short packet
> > drops or duplicate PACKET_INs in real SDN deployments when looking
> > close enough.
> > 
> > My preference would be to backport it all the way to OVS 2.5. But of
> > course I don't have proof that there are actual problems out in the
> > field that it would solve.
> 
> OK.
> 
> I am going to sit on this for a few days and see whether anyone reports
> unusual issues.  If nothing arises, I'll backport as far as reasonable.

I backported to branch-2.9 and branch-2.8.
Jan Scheurich April 17, 2018, 7:50 a.m. UTC | #8
> >
> > OK.
> >
> > I am going to sit on this for a few days and see whether anyone reports
> > unusual issues.  If nothing arises, I'll backport as far as reasonable.
> 
> I backported to branch-2.9 and branch-2.8.

Thanks, Ben.
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 7bfeedd..00160e1 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -231,7 +231,6 @@  struct upcall {
     bool ukey_persists;            /* Set true to keep 'ukey' beyond the
                                       lifetime of this upcall. */
 
-    uint64_t dump_seq;             /* udpif->dump_seq at translation time. */
     uint64_t reval_seq;            /* udpif->reval_seq at translation time. */
 
     /* Not used by the upcall callback interface. */
@@ -1159,7 +1158,6 @@  upcall_xlate(struct udpif *udpif, struct upcall *upcall,
          * with pushing its stats eventually. */
     }
 
-    upcall->dump_seq = seq_read(udpif->dump_seq);
     upcall->reval_seq = seq_read(udpif->reval_seq);
 
     xerr = xlate_actions(&xin, &upcall->xout);
@@ -1633,7 +1631,7 @@  ukey_create__(const struct nlattr *key, size_t key_len,
               const struct nlattr *mask, size_t mask_len,
               bool ufid_present, const ovs_u128 *ufid,
               const unsigned pmd_id, const struct ofpbuf *actions,
-              uint64_t dump_seq, uint64_t reval_seq, long long int used,
+              uint64_t reval_seq, long long int used,
               uint32_t key_recirc_id, struct xlate_out *xout)
     OVS_NO_THREAD_SAFETY_ANALYSIS
 {
@@ -1654,7 +1652,7 @@  ukey_create__(const struct nlattr *key, size_t key_len,
     ukey_set_actions(ukey, actions);
 
     ovs_mutex_init(&ukey->mutex);
-    ukey->dump_seq = dump_seq;
+    ukey->dump_seq = 0;     /* Not yet dumped */
     ukey->reval_seq = reval_seq;
     ukey->state = UKEY_CREATED;
     ukey->state_thread = ovsthread_id_self();
@@ -1704,8 +1702,7 @@  ukey_create_from_upcall(struct upcall *upcall, struct flow_wildcards *wc)
 
     return ukey_create__(keybuf.data, keybuf.size, maskbuf.data, maskbuf.size,
                          true, upcall->ufid, upcall->pmd_id,
-                         &upcall->put_actions, upcall->dump_seq,
-                         upcall->reval_seq, 0,
+                         &upcall->put_actions, upcall->reval_seq, 0,
                          upcall->have_recirc_ref ? upcall->recirc->id : 0,
                          &upcall->xout);
 }
@@ -1717,7 +1714,7 @@  ukey_create_from_dpif_flow(const struct udpif *udpif,
 {
     struct dpif_flow full_flow;
     struct ofpbuf actions;
-    uint64_t dump_seq, reval_seq;
+    uint64_t reval_seq;
     uint64_t stub[DPIF_FLOW_BUFSIZE / 8];
     const struct nlattr *a;
     unsigned int left;
@@ -1754,12 +1751,11 @@  ukey_create_from_dpif_flow(const struct udpif *udpif,
         }
     }
 
-    dump_seq = seq_read(udpif->dump_seq);
     reval_seq = seq_read(udpif->reval_seq) - 1; /* Ensure revalidation. */
     ofpbuf_use_const(&actions, &flow->actions, flow->actions_len);
     *ukey = ukey_create__(flow->key, flow->key_len,
                           flow->mask, flow->mask_len, flow->ufid_present,
-                          &flow->ufid, flow->pmd_id, &actions, dump_seq,
+                          &flow->ufid, flow->pmd_id, &actions,
                           reval_seq, flow->stats.used, 0, NULL);
 
     return 0;