[ovs-dev,RFC] connmgr: support changing openflow versions without restarting
diff mbox series

Message ID f7t1rrfwekl.fsf@dhcp-25.97.bos.redhat.com
State New
Headers show
Series
  • [ovs-dev,RFC] connmgr: support changing openflow versions without restarting
Related show

Commit Message

Aaron Conole Jan. 31, 2020, 8:34 p.m. UTC
When commit a0baa7dfa4fe ("connmgr: Make treatment of active and passive
connections more uniform") was applied, it didn't take into account
that a reconfiguration of the allowed_versions setting needs to
propagate to the rconn and pvconn (and lower connection) elements.

This change inelegantly deletes and recreates these pvconn and rconn
elements.  There's probably better ways of doing this (like maybe
adding the knob to rconn/vconn and kicking off a resync?) but I don't
know the openflow spec well enough to know whether it's possible or
supported.

A new test is added to ensure we don't break the behavior again.

Fixes: a0baa7dfa4fe ("connmgr: Make treatment of active and passive
connections more uniform")

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
---
2.21.0

Comments

Ben Pfaff Feb. 6, 2020, 7:27 p.m. UTC | #1
On Fri, Jan 31, 2020 at 03:34:02PM -0500, Aaron Conole wrote:
> When commit a0baa7dfa4fe ("connmgr: Make treatment of active and passive
> connections more uniform") was applied, it didn't take into account
> that a reconfiguration of the allowed_versions setting needs to
> propagate to the rconn and pvconn (and lower connection) elements.
> 
> This change inelegantly deletes and recreates these pvconn and rconn
> elements.  There's probably better ways of doing this (like maybe
> adding the knob to rconn/vconn and kicking off a resync?) but I don't
> know the openflow spec well enough to know whether it's possible or
> supported.
> 
> A new test is added to ensure we don't break the behavior again.
> 
> Fixes: a0baa7dfa4fe ("connmgr: Make treatment of active and passive
> connections more uniform")
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>

I spent some time working to make it possible to change this kind of
thing midstream without deleting and recreating everything.  It
unfortunately gets bogged down in nasty corner cases, so I don't want to
pursue it unless it turns out to be important.

I think this patch does introduce a problem, though.  The ofservice code
assumes in a few places that an ofservice has a pvconn or an rconn.  If
the pvconn goes away and there's no rconn, then we'll get crashes I
think.  See ofservice_run(), for example.  The code needs to avoid that.
Aaron Conole Feb. 12, 2020, 6:53 p.m. UTC | #2
Ben Pfaff <blp@ovn.org> writes:

> On Fri, Jan 31, 2020 at 03:34:02PM -0500, Aaron Conole wrote:
>> When commit a0baa7dfa4fe ("connmgr: Make treatment of active and passive
>> connections more uniform") was applied, it didn't take into account
>> that a reconfiguration of the allowed_versions setting needs to
>> propagate to the rconn and pvconn (and lower connection) elements.
>> 
>> This change inelegantly deletes and recreates these pvconn and rconn
>> elements.  There's probably better ways of doing this (like maybe
>> adding the knob to rconn/vconn and kicking off a resync?) but I don't
>> know the openflow spec well enough to know whether it's possible or
>> supported.
>> 
>> A new test is added to ensure we don't break the behavior again.
>> 
>> Fixes: a0baa7dfa4fe ("connmgr: Make treatment of active and passive
>> connections more uniform")
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>
> I spent some time working to make it possible to change this kind of
> thing midstream without deleting and recreating everything.  It
> unfortunately gets bogged down in nasty corner cases, so I don't want to
> pursue it unless it turns out to be important.
>
> I think this patch does introduce a problem, though.  The ofservice code
> assumes in a few places that an ofservice has a pvconn or an rconn.  If
> the pvconn goes away and there's no rconn, then we'll get crashes I
> think.  See ofservice_run(), for example.  The code needs to avoid that.

Okay, I'll take a closer look.  One method I can thing to use for this
would be to rcu-ify the pvconn and rconn pointers in the ofservice
struct.  Then swap in a new pointer, and defer release to the next sync
period.  This means that at all times there should be a valid rconn and
pvconn pointer.

BUT

I don't know if this will work for the pvconn (because I don't know if
it will run into some kind of resource conflict with the listening
socket), but maybe it will need to have SO_REUSEPORT added to the list
of socket options in inet_open_passive (for the pvconn).

I guess for rconn I would expect that the remote endpoint should have
some listen backlog, so I don't think anything should be needed in that
case.

WDYT?
Ben Pfaff Feb. 13, 2020, 11:48 p.m. UTC | #3
On Wed, Feb 12, 2020 at 01:53:17PM -0500, Aaron Conole wrote:
> Ben Pfaff <blp@ovn.org> writes:
> 
> > On Fri, Jan 31, 2020 at 03:34:02PM -0500, Aaron Conole wrote:
> >> When commit a0baa7dfa4fe ("connmgr: Make treatment of active and passive
> >> connections more uniform") was applied, it didn't take into account
> >> that a reconfiguration of the allowed_versions setting needs to
> >> propagate to the rconn and pvconn (and lower connection) elements.
> >> 
> >> This change inelegantly deletes and recreates these pvconn and rconn
> >> elements.  There's probably better ways of doing this (like maybe
> >> adding the knob to rconn/vconn and kicking off a resync?) but I don't
> >> know the openflow spec well enough to know whether it's possible or
> >> supported.
> >> 
> >> A new test is added to ensure we don't break the behavior again.
> >> 
> >> Fixes: a0baa7dfa4fe ("connmgr: Make treatment of active and passive
> >> connections more uniform")
> >> 
> >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> >
> > I spent some time working to make it possible to change this kind of
> > thing midstream without deleting and recreating everything.  It
> > unfortunately gets bogged down in nasty corner cases, so I don't want to
> > pursue it unless it turns out to be important.
> >
> > I think this patch does introduce a problem, though.  The ofservice code
> > assumes in a few places that an ofservice has a pvconn or an rconn.  If
> > the pvconn goes away and there's no rconn, then we'll get crashes I
> > think.  See ofservice_run(), for example.  The code needs to avoid that.
> 
> Okay, I'll take a closer look.  One method I can thing to use for this
> would be to rcu-ify the pvconn and rconn pointers in the ofservice
> struct.  Then swap in a new pointer, and defer release to the next sync
> period.  This means that at all times there should be a valid rconn and
> pvconn pointer.

This seems like a lot of work versus just removing an ofservice that
can't be opened.

Patch
diff mbox series

diff --git a/ofproto/connmgr.c b/ofproto/connmgr.c
index 51d656cba..37c616691 100644
--- a/ofproto/connmgr.c
+++ b/ofproto/connmgr.c
@@ -2021,6 +2021,26 @@  ofservice_reconfigure(struct ofservice *ofservice,
      * version. */
     if (ofservice->s.allowed_versions != settings->allowed_versions) {
         ofservice_close_all(ofservice);
+        if (ofservice->pvconn) {
+            struct pvconn *pvconn = NULL;
+            pvconn_close(ofservice->pvconn);
+            /* there is a potentially squelched error here */
+            pvconn_open(ofservice->target, settings->allowed_versions,
+                                    settings->dscp, &pvconn);
+            ofservice->pvconn = pvconn;
+        }
+
+        if (ofservice->rconn) {
+            struct rconn *rconn = NULL;
+            rconn_destroy(ofservice->rconn);
+
+            char *name = ofconn_make_name(ofservice->connmgr, ofservice->target);
+            rconn = rconn_create(5, 8, settings->dscp, settings->allowed_versions);
+            rconn_connect(rconn, ofservice->target, name);
+            free(name);
+            ofservice->rconn = rconn;
+        }
+
     }
 
     ofservice->s = *settings;
diff --git a/tests/bridge.at b/tests/bridge.at
index d48463e26..904f1381c 100644
--- a/tests/bridge.at
+++ b/tests/bridge.at
@@ -103,3 +103,20 @@  AT_CHECK([ovs-appctl -t ovs-vswitchd version], [0], [ignore])
 OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 AT_CLEANUP
+
+AT_SETUP([bridge - change ofproto versions])
+dnl Start vswitch and add a version test bridge
+OVS_VSWITCHD_START(
+    [add-br vr_test0 -- \
+     set bridge vr_test0 datapath-type=dummy \
+                         protocols=OpenFlow10])
+
+dnl set the version to include, say, OpenFlow14
+AT_CHECK([ovs-vsctl set bridge vr_test0 protocols=OpenFlow10,OpenFlow14])
+
+dnl now try to use bundle action on a flow
+AT_CHECK([ovs-ofctl add-flow vr_test0 --bundle actions=normal])
+
+OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+AT_CLEANUP