diff mbox

[ovs-dev] rconn: Avoid abort for ill-behaved remote.

Message ID 20161223172343.4405-1-blp@ovn.org
State Accepted
Headers show

Commit Message

Ben Pfaff Dec. 23, 2016, 5:23 p.m. UTC
If an rconn peer fails to send a hello message, the version number doesn't
get set.  Later, if the peer delays long enough, the rconn attempts to send
an echo request but assert-fails instead because it doesn't know what
version to use.  This fixes the problem.

To reproduce this problem:

    make sandbox
    ovs-vsctl add-br br0
    ovs-vsctl set-controller br0 ptcp:12345
    nc 127.0.0.1 12345

and wait 10 seconds for ovs-vswitchd to die.  (Then exit the sandbox.)

Reported-by: 张东亚 <fortitude.zhang@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
---
 lib/rconn.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

Comments

Justin Pettit Dec. 23, 2016, 9:02 p.m. UTC | #1
> On Dec 23, 2016, at 9:23 AM, Ben Pfaff <blp@ovn.org> wrote:
> 
> If an rconn peer fails to send a hello message, the version number doesn't
> get set.  Later, if the peer delays long enough, the rconn attempts to send
> an echo request but assert-fails instead because it doesn't know what
> version to use.  This fixes the problem.
> 
> To reproduce this problem:
> 
>    make sandbox
>    ovs-vsctl add-br br0
>    ovs-vsctl set-controller br0 ptcp:12345
>    nc 127.0.0.1 12345
> 
> and wait 10 seconds for ovs-vswitchd to die.  (Then exit the sandbox.)
> 
> Reported-by: 张东亚 <fortitude.zhang@gmail.com>
> Signed-off-by: Ben Pfaff <blp@ovn.org>

Acked-by: Justin Pettit <jpettit@ovn.org>

--Justin
Ben Pfaff Dec. 23, 2016, 9:31 p.m. UTC | #2
On Fri, Dec 23, 2016 at 01:02:00PM -0800, Justin Pettit wrote:
> 
> > On Dec 23, 2016, at 9:23 AM, Ben Pfaff <blp@ovn.org> wrote:
> > 
> > If an rconn peer fails to send a hello message, the version number doesn't
> > get set.  Later, if the peer delays long enough, the rconn attempts to send
> > an echo request but assert-fails instead because it doesn't know what
> > version to use.  This fixes the problem.
> > 
> > To reproduce this problem:
> > 
> >    make sandbox
> >    ovs-vsctl add-br br0
> >    ovs-vsctl set-controller br0 ptcp:12345
> >    nc 127.0.0.1 12345
> > 
> > and wait 10 seconds for ovs-vswitchd to die.  (Then exit the sandbox.)
> > 
> > Reported-by: 张东亚 <fortitude.zhang@gmail.com>
> > Signed-off-by: Ben Pfaff <blp@ovn.org>
> 
> Acked-by: Justin Pettit <jpettit@ovn.org>

Thanks.  I applied this to master and backported it as far as
branch-2.0.
diff mbox

Patch

diff --git a/lib/rconn.c b/lib/rconn.c
index 0c1812a..8a29864 100644
--- a/lib/rconn.c
+++ b/lib/rconn.c
@@ -558,19 +558,23 @@  run_ACTIVE(struct rconn *rc)
 {
     if (timed_out(rc)) {
         unsigned int base = MAX(rc->last_activity, rc->state_entered);
-        int version;
-
         VLOG_DBG("%s: idle %u seconds, sending inactivity probe",
                  rc->name, (unsigned int) (time_now() - base));
 
-        version = rconn_get_version__(rc);
-        ovs_assert(version >= 0 && version <= 0xff);
-
         /* Ordering is important here: rconn_send() can transition to BACKOFF,
          * and we don't want to transition back to IDLE if so, because then we
          * can end up queuing a packet with vconn == NULL and then *boom*. */
         state_transition(rc, S_IDLE);
-        rconn_send__(rc, make_echo_request(version), NULL);
+
+        /* Send an echo request if we can.  (If version negotiation is not
+         * complete, that is, if we did not yet receive a "hello" message from
+         * the peer, we do not know the version to use, so we don't send
+         * anything.) */
+        int version = rconn_get_version__(rc);
+        if (version >= 0 && version <= 0xff) {
+            rconn_send__(rc, make_echo_request(version), NULL);
+        }
+
         return;
     }