[ovs-dev,1/2] ovs python: ovs.stream.open_block() returns success even if the remote is unreachable

Message ID 20180710192639.31368-1-nusiddiq@redhat.com
State New
Headers show
Series
  • Partial cluster support in Python IDL client
Related show

Commit Message

Numan Siddique July 10, 2018, 7:26 p.m.
From: Numan Siddique <nusiddiq@redhat.com>

The python function ovs.socket_util.check_connection_completion() uses select()
(provided by python) to monitor the socket file descriptor. The select()
returns 1 when the file descriptor becomes ready. For error cases like -
111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
expects the exceptfds list to be set by select(). But that is not the case.
As per the select() man page, writefds list will be set for POLLERR.
Please see "Correspondence between select() and poll() notifications" section of select(2)
man page.

Because of this behavior, ovs.socket_util.check_connection_completion() returns success
even if the remote is unreachable or not listening on the port.

This patch fixes this issue by adding a wrapper function - check_connection_completion_status()
which calls sock.connect_ex() to get the status of the connection if
ovs.socket_util.check_connection_completion() returns success.

The test cases added fails without the fix in this patch.

Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
---
 python/ovs/socket_util.py | 34 ++++++++++++++++++++++++++++++++++
 python/ovs/stream.py      | 16 +++++++++++++---
 tests/automake.mk         |  1 +
 tests/ovsdb-idl.at        | 16 ++++++++++++++++
 tests/test-stream.py      | 32 ++++++++++++++++++++++++++++++++
 5 files changed, 96 insertions(+), 3 deletions(-)
 create mode 100644 tests/test-stream.py

Comments

Ben Pfaff July 10, 2018, 8:38 p.m. | #1
On Wed, Jul 11, 2018 at 12:56:39AM +0530, nusiddiq@redhat.com wrote:
> From: Numan Siddique <nusiddiq@redhat.com>
> 
> The python function ovs.socket_util.check_connection_completion() uses select()
> (provided by python) to monitor the socket file descriptor. The select()
> returns 1 when the file descriptor becomes ready. For error cases like -
> 111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
> expects the exceptfds list to be set by select(). But that is not the case.
> As per the select() man page, writefds list will be set for POLLERR.
> Please see "Correspondence between select() and poll() notifications" section of select(2)
> man page.
> 
> Because of this behavior, ovs.socket_util.check_connection_completion() returns success
> even if the remote is unreachable or not listening on the port.
> 
> This patch fixes this issue by adding a wrapper function - check_connection_completion_status()
> which calls sock.connect_ex() to get the status of the connection if
> ovs.socket_util.check_connection_completion() returns success.
> 
> The test cases added fails without the fix in this patch.
> 
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>

Can we just code check_connection_completion() like we do in C, by
directly using select() on Windows and poll() everywhere else?  The
approach in this patch seems a little indirect to me.

Thanks,

Ben.
Numan Siddique July 11, 2018, 7:07 a.m. | #2
On Wed, Jul 11, 2018 at 2:08 AM Ben Pfaff <blp@ovn.org> wrote:

> On Wed, Jul 11, 2018 at 12:56:39AM +0530, nusiddiq@redhat.com wrote:
> > From: Numan Siddique <nusiddiq@redhat.com>
> >
> > The python function ovs.socket_util.check_connection_completion() uses
> select()
> > (provided by python) to monitor the socket file descriptor. The select()
> > returns 1 when the file descriptor becomes ready. For error cases like -
> > 111 (Connection refused) and 113 (No route to host) (POLLERR),
> ovs.poller._SelectSelect.poll()
> > expects the exceptfds list to be set by select(). But that is not the
> case.
> > As per the select() man page, writefds list will be set for POLLERR.
> > Please see "Correspondence between select() and poll() notifications"
> section of select(2)
> > man page.
> >
> > Because of this behavior, ovs.socket_util.check_connection_completion()
> returns success
> > even if the remote is unreachable or not listening on the port.
> >
> > This patch fixes this issue by adding a wrapper function -
> check_connection_completion_status()
> > which calls sock.connect_ex() to get the status of the connection if
> > ovs.socket_util.check_connection_completion() returns success.
> >
> > The test cases added fails without the fix in this patch.
> >
> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>
> Can we just code check_connection_completion() like we do in C, by
> directly using select() on Windows and poll() everywhere else?  The
> approach in this patch seems a little indirect to me.
>
>
Thanks for the review. As per the comments here -
https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L51
*****
# eventlet/gevent doesn't support select.poll. If select.poll is used,
# python interpreter is blocked as a whole instead of switching from the
# current thread that is about to block to other runnable thread.
# So emulate select.poll by select.select because using python means that
# performance isn't so important.
*******

we cannot use poll() in python. I tried with this patch to test it out -
https://paste.fedoraproject.org/paste/YUhgVte-BOjgid-ojmHZJw
All the ovs python idl tests pass. But it doesn't work with openstack
networking-ovn. The whole neutron-server process just blocks.

I don't see any other way for python. Once select() returns we have to use
some mechanism to get the error code.
Does calling sock.connect_ex() bother  you ?

Thanks
Numan



> Thanks,
>
> Ben.
>
Ben Pfaff July 11, 2018, 4:56 p.m. | #3
On Wed, Jul 11, 2018 at 12:37:28PM +0530, Numan Siddique wrote:
> On Wed, Jul 11, 2018 at 2:08 AM Ben Pfaff <blp@ovn.org> wrote:
> 
> > On Wed, Jul 11, 2018 at 12:56:39AM +0530, nusiddiq@redhat.com wrote:
> > > From: Numan Siddique <nusiddiq@redhat.com>
> > >
> > > The python function ovs.socket_util.check_connection_completion() uses
> > select()
> > > (provided by python) to monitor the socket file descriptor. The select()
> > > returns 1 when the file descriptor becomes ready. For error cases like -
> > > 111 (Connection refused) and 113 (No route to host) (POLLERR),
> > ovs.poller._SelectSelect.poll()
> > > expects the exceptfds list to be set by select(). But that is not the
> > case.
> > > As per the select() man page, writefds list will be set for POLLERR.
> > > Please see "Correspondence between select() and poll() notifications"
> > section of select(2)
> > > man page.
> > >
> > > Because of this behavior, ovs.socket_util.check_connection_completion()
> > returns success
> > > even if the remote is unreachable or not listening on the port.
> > >
> > > This patch fixes this issue by adding a wrapper function -
> > check_connection_completion_status()
> > > which calls sock.connect_ex() to get the status of the connection if
> > > ovs.socket_util.check_connection_completion() returns success.
> > >
> > > The test cases added fails without the fix in this patch.
> > >
> > > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> >
> > Can we just code check_connection_completion() like we do in C, by
> > directly using select() on Windows and poll() everywhere else?  The
> > approach in this patch seems a little indirect to me.
> >
> >
> Thanks for the review. As per the comments here -
> https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L51
> *****
> # eventlet/gevent doesn't support select.poll. If select.poll is used,
> # python interpreter is blocked as a whole instead of switching from the
> # current thread that is about to block to other runnable thread.
> # So emulate select.poll by select.select because using python means that
> # performance isn't so important.
> *******
> 
> we cannot use poll() in python. I tried with this patch to test it out -
> https://paste.fedoraproject.org/paste/YUhgVte-BOjgid-ojmHZJw
> All the ovs python idl tests pass. But it doesn't work with openstack
> networking-ovn. The whole neutron-server process just blocks.
> 
> I don't see any other way for python. Once select() returns we have to use
> some mechanism to get the error code.
> Does calling sock.connect_ex() bother  you ?

I don't mean to change what poller.py does.  I mean to implement
ovs.socket_util.check_connection_completion in terms of select.poll and
select.select directly.  The comment in python/ovs/poller.py should not
be relevant because the use of select.poll in
check_connection_completion would never block (it would use a timeout of
0).

Thanks,

Ben.
Numan Siddique July 11, 2018, 8:07 p.m. | #4
On Wed, Jul 11, 2018 at 10:26 PM Ben Pfaff <blp@ovn.org> wrote:

> On Wed, Jul 11, 2018 at 12:37:28PM +0530, Numan Siddique wrote:
> > On Wed, Jul 11, 2018 at 2:08 AM Ben Pfaff <blp@ovn.org> wrote:
> >
> > > On Wed, Jul 11, 2018 at 12:56:39AM +0530, nusiddiq@redhat.com wrote:
> > > > From: Numan Siddique <nusiddiq@redhat.com>
> > > >
> > > > The python function ovs.socket_util.check_connection_completion()
> uses
> > > select()
> > > > (provided by python) to monitor the socket file descriptor. The
> select()
> > > > returns 1 when the file descriptor becomes ready. For error cases
> like -
> > > > 111 (Connection refused) and 113 (No route to host) (POLLERR),
> > > ovs.poller._SelectSelect.poll()
> > > > expects the exceptfds list to be set by select(). But that is not the
> > > case.
> > > > As per the select() man page, writefds list will be set for POLLERR.
> > > > Please see "Correspondence between select() and poll() notifications"
> > > section of select(2)
> > > > man page.
> > > >
> > > > Because of this behavior,
> ovs.socket_util.check_connection_completion()
> > > returns success
> > > > even if the remote is unreachable or not listening on the port.
> > > >
> > > > This patch fixes this issue by adding a wrapper function -
> > > check_connection_completion_status()
> > > > which calls sock.connect_ex() to get the status of the connection if
> > > > ovs.socket_util.check_connection_completion() returns success.
> > > >
> > > > The test cases added fails without the fix in this patch.
> > > >
> > > > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> > >
> > > Can we just code check_connection_completion() like we do in C, by
> > > directly using select() on Windows and poll() everywhere else?  The
> > > approach in this patch seems a little indirect to me.
> > >
> > >
> > Thanks for the review. As per the comments here -
> > https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L51
> > *****
> > # eventlet/gevent doesn't support select.poll. If select.poll is used,
> > # python interpreter is blocked as a whole instead of switching from the
> > # current thread that is about to block to other runnable thread.
> > # So emulate select.poll by select.select because using python means that
> > # performance isn't so important.
> > *******
> >
> > we cannot use poll() in python. I tried with this patch to test it out -
> > https://paste.fedoraproject.org/paste/YUhgVte-BOjgid-ojmHZJw
> > All the ovs python idl tests pass. But it doesn't work with openstack
> > networking-ovn. The whole neutron-server process just blocks.
> >
> > I don't see any other way for python. Once select() returns we have to
> use
> > some mechanism to get the error code.
> > Does calling sock.connect_ex() bother  you ?
>
> I don't mean to change what poller.py does.  I mean to implement
> ovs.socket_util.check_connection_completion in terms of select.poll and
> select.select directly.  The comment in python/ovs/poller.py should not
> be relevant because the use of select.poll in
> check_connection_completion would never block (it would use a timeout of
> 0).
>
>
Hi Ben,

For python applications which use eventlet, select.poll/epoll is not
available. 'select ' python module
is monkey patched and poll and few other functions are deleted.

https://github.com/eventlet/eventlet/blob/master/NEWS#L155
https://github.com/eventlet/eventlet/blob/master/eventlet/green/select.py#L9
https://github.com/eventlet/eventlet/blob/master/eventlet/patcher.py#L305

Thanks
Numan


Thanks,
>
> Ben.
>

Patch

diff --git a/python/ovs/socket_util.py b/python/ovs/socket_util.py
index 403104936..91f4532ea 100644
--- a/python/ovs/socket_util.py
+++ b/python/ovs/socket_util.py
@@ -256,6 +256,40 @@  def inet_open_active(style, target, default_port, dscp):
         return get_exception_errno(e), None
 
 
+def check_connection_completion_status(sock, target, default_port):
+    status = check_connection_completion(sock)
+    if status:
+        return status
+
+    try:
+        address = inet_parse_active(target, default_port)
+    except ValueError:
+        # It's not a valid tcp target.
+        return status
+
+    # For tcp connections, check_connection_completion function returns 0
+    # when the select for the socket fd signals. But it doesn't really
+    # verify the connection was established or not. So call connect again on
+    # the socket to get the status.
+    try:
+        err = sock.connect_ex(address)
+    except socket.error as e:
+        err = get_exception_errno(e)
+        if sys.platform == 'win32' and error == errno.WSAEWOULDBLOCK:
+            # WSAEWOULDBLOCK would be the equivalent on Windows
+            # for EINPROGRESS on Unix.
+            err = err.EINPROGRESS
+
+    if err == errno.EINPROGRESS or err == errno.EALREADY:
+        return errno.EINPROGRESS
+
+    if err == 0 or err == errno.EISCONN:
+        return 0
+
+    sock.close()
+    return err
+
+
 def get_exception_errno(e):
     """A lot of methods on Python socket objects raise socket.error, but that
     exception is documented as having two completely different forms of
diff --git a/python/ovs/stream.py b/python/ovs/stream.py
index 5996497a5..7d5227469 100644
--- a/python/ovs/stream.py
+++ b/python/ovs/stream.py
@@ -119,6 +119,7 @@  class Stream(object):
                                                       bInitialState=False)
 
         self.name = name
+        self.suffix = name.split(":", 1)[1]
         if status == errno.EAGAIN:
             self.state = Stream.__S_CONNECTING
         elif status == 0:
@@ -191,8 +192,16 @@  class Stream(object):
         if error:
             return error, None
         else:
-            status = ovs.socket_util.check_connection_completion(sock)
-            return 0, cls(sock, name, status)
+            err = ovs.socket_util.check_connection_completion_status(
+                sock, suffix, 0)
+            if err == errno.EAGAIN or err == errno.EINPROGRESS:
+                status = errno.EAGAIN
+                err = 0
+            elif err == 0:
+                status = 0
+            else:
+                status = err
+            return err, cls(sock, name, status)
 
     @staticmethod
     def _open(suffix, dscp):
@@ -246,7 +255,8 @@  class Stream(object):
 
     def __scs_connecting(self):
         if self.socket is not None:
-            retval = ovs.socket_util.check_connection_completion(self.socket)
+            retval = ovs.socket_util.check_connection_completion_status(
+                self.socket, self.suffix, 0)
             assert retval != errno.EINPROGRESS
         elif sys.platform == 'win32':
             if self.retry_connect:
diff --git a/tests/automake.mk b/tests/automake.mk
index 8224e5a4a..0abf29d47 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -421,6 +421,7 @@  CHECK_PYFILES = \
 	tests/test-l7.py \
 	tests/test-ovsdb.py \
 	tests/test-reconnect.py \
+	tests/test-stream.py \
 	tests/MockXenAPI.py \
 	tests/test-unix-socket.py \
 	tests/test-unixctl.py \
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index 58935faf3..d4d283db4 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -1720,3 +1720,19 @@  OVSDB_CHECK_IDL_COMPOUND_INDEX_WITH_REF([set, simple3 idl-compound-index-with-re
 007: check simple4: empty
 008: End test
 ]])
+
+m4_define([CHECK_STREAM_OPEN_BLOCK_PY],
+  [AT_SETUP([$1])
+   AT_SKIP_IF([test $2 = no])
+   AT_KEYWORDS([Check PY Stream open block - $3])
+   AT_CHECK([ovsdb_start_idltest "ptcp:0:127.0.0.1"])
+   PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
+   WRONG_PORT=$(($TCP_PORT+1))
+   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [0], [ignore])
+   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$WRONG_PORT], [1], [ignore])
+   OVSDB_SERVER_SHUTDOWN
+   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [1], [ignore])
+   AT_CLEANUP])
+
+CHECK_STREAM_OPEN_BLOCK_PY([Check PY2 Stream open block], [$HAVE_PYTHON], [$PYTHON])
+CHECK_STREAM_OPEN_BLOCK_PY([Check PY3 Stream open block], [$HAVE_PYTHON], [$PYTHON3])
diff --git a/tests/test-stream.py b/tests/test-stream.py
new file mode 100644
index 000000000..4a5117501
--- /dev/null
+++ b/tests/test-stream.py
@@ -0,0 +1,32 @@ 
+# Copyright (c) 2018, Red Hat Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+
+import ovs.stream
+
+
+def main(argv):
+    remote = argv[1]
+    err, stream = ovs.stream.Stream.open_block(
+            ovs.stream.Stream.open(remote))
+
+    if err or stream is None:
+        sys.exit(1)
+
+    sys.exit(0)
+
+
+if __name__ == '__main__':
+    main(sys.argv)