diff mbox series

[ovs-dev,v4,1/2] ovs python: ovs.stream.open_block() returns success even if the remote is unreachable

Message ID 20180807113758.3159-1-nusiddiq@redhat.com
State Accepted
Headers show
Series Partial cluster support in Python IDL client | expand

Commit Message

Numan Siddique Aug. 7, 2018, 11:37 a.m. UTC
From: Numan Siddique <nusiddiq@redhat.com>

The python function ovs.socket_util.check_connection_completion() uses select()
(provided by python) to monitor the socket file descriptor. The select()
returns 1 when the file descriptor becomes ready. For error cases like -
111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
expects the exceptfds list to be set by select(). But that is not the case.
As per the select() man page, writefds list will be set for POLLERR.
Please see "Correspondence between select() and poll() notifications" section of select(2)
man page.

Because of this behavior, ovs.socket_util.check_connection_completion() returns success
even if the remote is unreachable or not listening on the port.

This patch fixes this issue by using poll() to check the connection status similar to
the C implementation of check_connection_completion().

A new function 'get_system_poll() is added in ovs/poller.py which returns the
select.poll() object. If select.poll is monkey patched by eventlet/gevent, it
gets the original select.poll() and returns it.

The test cases added in this patch fails without the fix.

Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
---
 python/ovs/poller.py      | 34 ++++++++++++++++++++++++++++++++--
 python/ovs/socket_util.py |  5 +++--
 python/ovs/stream.py      | 11 +++++++++--
 tests/automake.mk         |  1 +
 tests/ovsdb-idl.at        | 16 ++++++++++++++++
 tests/test-stream.py      | 32 ++++++++++++++++++++++++++++++++
 6 files changed, 93 insertions(+), 6 deletions(-)
 create mode 100644 tests/test-stream.py

Comments

Mark Michelson Aug. 7, 2018, 9:12 p.m. UTC | #1
Hi Numan, See below.

On 08/07/2018 07:37 AM, nusiddiq@redhat.com wrote:
> From: Numan Siddique <nusiddiq@redhat.com>
> 
> The python function ovs.socket_util.check_connection_completion() uses select()
> (provided by python) to monitor the socket file descriptor. The select()
> returns 1 when the file descriptor becomes ready. For error cases like -
> 111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
> expects the exceptfds list to be set by select(). But that is not the case.
> As per the select() man page, writefds list will be set for POLLERR.
> Please see "Correspondence between select() and poll() notifications" section of select(2)
> man page.
> 
> Because of this behavior, ovs.socket_util.check_connection_completion() returns success
> even if the remote is unreachable or not listening on the port.
> 
> This patch fixes this issue by using poll() to check the connection status similar to
> the C implementation of check_connection_completion().
> 
> A new function 'get_system_poll() is added in ovs/poller.py which returns the
> select.poll() object. If select.poll is monkey patched by eventlet/gevent, it
> gets the original select.poll() and returns it.

Is this safe? My concern is that eventlet/gevent monkey patches 
select.poll() so that the green thread yields properly. Using the system 
select.poll() might lead to unexpected blocking.

I read your conversation with Ben on the previous iteration of this 
series. It looks like you attempted to use the system select.poll() and 
ran into this blocking problem (the pastebin for your patch is now 
deleted so I can't see what you actually tried). Why would this approach 
work differently?

> 
> The test cases added in this patch fails without the fix.
> 
> Suggested-by: Ben Pfaff <blp@ovn.org>
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> ---
>   python/ovs/poller.py      | 34 ++++++++++++++++++++++++++++++++--
>   python/ovs/socket_util.py |  5 +++--
>   python/ovs/stream.py      | 11 +++++++++--
>   tests/automake.mk         |  1 +
>   tests/ovsdb-idl.at        | 16 ++++++++++++++++
>   tests/test-stream.py      | 32 ++++++++++++++++++++++++++++++++
>   6 files changed, 93 insertions(+), 6 deletions(-)
>   create mode 100644 tests/test-stream.py
> 
> diff --git a/python/ovs/poller.py b/python/ovs/poller.py
> index 2f3fcf9b6..ef67e6763 100644
> --- a/python/ovs/poller.py
> +++ b/python/ovs/poller.py
> @@ -31,14 +31,21 @@ except ImportError:
>       SSL = None
>   
>   try:
> -    import eventlet.patcher
> +    from eventlet import patcher as eventlet_patcher
>   
>       def _using_eventlet_green_select():
> -        return eventlet.patcher.is_monkey_patched(select)
> +        return eventlet_patcher.is_monkey_patched(select)
>   except:
> +    eventlet_patcher = None
>       def _using_eventlet_green_select():
>           return False
>   
> +try:
> +    from gevent import monkey as gevent_monkey
> +except:
> +    gevent_monkey = None
> +
> +
>   vlog = ovs.vlog.Vlog("poller")
>   
>   POLLIN = 0x001
> @@ -257,3 +264,26 @@ class Poller(object):
>       def __reset(self):
>           self.poll = SelectPoll()
>           self.timeout = -1
> +
> +
> +"""
> +Returns the original select.poll() object. If select.poll is monkey patched
> +by eventlet or gevent library, it gets the original select.poll and returns
> +an object of it using the eventlet.patcher.original/gevent.monkey.get_original
> +functions.
> +
> +As a last resort, if there is any exception it returns the SelectPoll() object.
> +"""
> +def get_system_poll():
> +    try:
> +        if _using_eventlet_green_select():
> +            _system_poll = eventlet_patcher.original("select").poll
> +        elif gevent_monkey and gevent_monkey.is_object_patched(
> +                'select', 'poll'):
> +            _system_poll = gevent_monkey.get_original('select', 'poll')
> +        else:
> +            _system_poll = select.poll
> +    except:
> +        _system_poll = SelectPoll
> +
> +    return _system_poll()
> diff --git a/python/ovs/socket_util.py b/python/ovs/socket_util.py
> index 403104936..8e582fe91 100644
> --- a/python/ovs/socket_util.py
> +++ b/python/ovs/socket_util.py
> @@ -162,8 +162,8 @@ def make_unix_socket(style, nonblock, bind_path, connect_path, short=False):
>   
>   
>   def check_connection_completion(sock):
> -    p = ovs.poller.SelectPoll()
>       if sys.platform == "win32":
> +        p = ovs.poller.SelectPoll()
>           event = winutils.get_new_event(None, False, True, None)
>           # Receive notification of readiness for writing, of completed
>           # connection or multipoint join operation, and of socket closure.
> @@ -173,6 +173,7 @@ def check_connection_completion(sock):
>                                    win32file.FD_CLOSE)
>           p.register(event, ovs.poller.POLLOUT)
>       else:
> +        p = ovs.poller.get_system_poll()
>           p.register(sock, ovs.poller.POLLOUT)
>       pfds = p.poll(0)
>       if len(pfds) == 1:
> @@ -180,7 +181,7 @@ def check_connection_completion(sock):
>           if revents & ovs.poller.POLLERR:
>               try:
>                   # The following should raise an exception.
> -                socket.send("\0", socket.MSG_DONTWAIT)
> +                sock.send("\0".encode(), socket.MSG_DONTWAIT)
>   
>                   # (Here's where we end up if it didn't.)
>                   # XXX rate-limit
> diff --git a/python/ovs/stream.py b/python/ovs/stream.py
> index 5996497a5..ca0d84425 100644
> --- a/python/ovs/stream.py
> +++ b/python/ovs/stream.py
> @@ -191,8 +191,15 @@ class Stream(object):
>           if error:
>               return error, None
>           else:
> -            status = ovs.socket_util.check_connection_completion(sock)
> -            return 0, cls(sock, name, status)
> +            err = ovs.socket_util.check_connection_completion(sock)
> +            if err == errno.EAGAIN or err == errno.EINPROGRESS:
> +                status = errno.EAGAIN
> +                err = 0
> +            elif err == 0:
> +                status = 0
> +            else:
> +                status = err
> +            return err, cls(sock, name, status)
>   
>       @staticmethod
>       def _open(suffix, dscp):
> diff --git a/tests/automake.mk b/tests/automake.mk
> index 8224e5a4a..0abf29d47 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -421,6 +421,7 @@ CHECK_PYFILES = \
>   	tests/test-l7.py \
>   	tests/test-ovsdb.py \
>   	tests/test-reconnect.py \
> +	tests/test-stream.py \
>   	tests/MockXenAPI.py \
>   	tests/test-unix-socket.py \
>   	tests/test-unixctl.py \
> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 014382850..e8a26e971 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1720,3 +1720,19 @@ OVSDB_CHECK_IDL_COMPOUND_INDEX_WITH_REF([set, simple3 idl-compound-index-with-re
>   007: check simple4: empty
>   008: End test
>   ]])
> +
> +m4_define([CHECK_STREAM_OPEN_BLOCK_PY],
> +  [AT_SETUP([$1])
> +   AT_SKIP_IF([test $2 = no])
> +   AT_KEYWORDS([Check PY Stream open block - $3])
> +   AT_CHECK([ovsdb_start_idltest "ptcp:0:127.0.0.1"])
> +   PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
> +   WRONG_PORT=$(($TCP_PORT+1))
> +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [0], [ignore])
> +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$WRONG_PORT], [1], [ignore])
> +   OVSDB_SERVER_SHUTDOWN
> +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [1], [ignore])
> +   AT_CLEANUP])
> +
> +CHECK_STREAM_OPEN_BLOCK_PY([Check PY2 Stream open block], [$HAVE_PYTHON], [$PYTHON])
> +CHECK_STREAM_OPEN_BLOCK_PY([Check PY3 Stream open block], [$HAVE_PYTHON], [$PYTHON3])
> diff --git a/tests/test-stream.py b/tests/test-stream.py
> new file mode 100644
> index 000000000..4a5117501
> --- /dev/null
> +++ b/tests/test-stream.py
> @@ -0,0 +1,32 @@
> +# Copyright (c) 2018, Red Hat Inc.
> +#
> +# Licensed under the Apache License, Version 2.0 (the "License");
> +# you may not use this file except in compliance with the License.
> +# You may obtain a copy of the License at:
> +#
> +#     http://www.apache.org/licenses/LICENSE-2.0
> +#
> +# Unless required by applicable law or agreed to in writing, software
> +# distributed under the License is distributed on an "AS IS" BASIS,
> +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> +# See the License for the specific language governing permissions and
> +# limitations under the License.
> +
> +import sys
> +
> +import ovs.stream
> +
> +
> +def main(argv):
> +    remote = argv[1]
> +    err, stream = ovs.stream.Stream.open_block(
> +            ovs.stream.Stream.open(remote))
> +
> +    if err or stream is None:
> +        sys.exit(1)
> +
> +    sys.exit(0)
> +
> +
> +if __name__ == '__main__':
> +    main(sys.argv)
>
Ben Pfaff Aug. 7, 2018, 9:27 p.m. UTC | #2
On Tue, Aug 07, 2018 at 05:12:32PM -0400, Mark Michelson wrote:
> On 08/07/2018 07:37 AM, nusiddiq@redhat.com wrote:
> >From: Numan Siddique <nusiddiq@redhat.com>
> >
> >The python function ovs.socket_util.check_connection_completion() uses select()
> >(provided by python) to monitor the socket file descriptor. The select()
> >returns 1 when the file descriptor becomes ready. For error cases like -
> >111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
> >expects the exceptfds list to be set by select(). But that is not the case.
> >As per the select() man page, writefds list will be set for POLLERR.
> >Please see "Correspondence between select() and poll() notifications" section of select(2)
> >man page.
> >
> >Because of this behavior, ovs.socket_util.check_connection_completion() returns success
> >even if the remote is unreachable or not listening on the port.
> >
> >This patch fixes this issue by using poll() to check the connection status similar to
> >the C implementation of check_connection_completion().
> >
> >A new function 'get_system_poll() is added in ovs/poller.py which returns the
> >select.poll() object. If select.poll is monkey patched by eventlet/gevent, it
> >gets the original select.poll() and returns it.
> 
> Is this safe? My concern is that eventlet/gevent monkey patches
> select.poll() so that the green thread yields properly. Using the system
> select.poll() might lead to unexpected blocking.

gevent monkey patches Python to get rid of poll() because poll() might
block and thereby stall Python.  This use of poll() is OK because it
will never block (because it uses a timeout of 0).
 
> I read your conversation with Ben on the previous iteration of this series.
> It looks like you attempted to use the system select.poll() and ran into
> this blocking problem (the pastebin for your patch is now deleted so I can't
> see what you actually tried). Why would this approach work differently?

I don't recall what was in the pastebin, but the use of poll() here
should be sound.  We use the same approach in C and we don't want to
block in C either.
Numan Siddique Aug. 8, 2018, 7:24 a.m. UTC | #3
On Wed, Aug 8, 2018 at 2:58 AM Ben Pfaff <blp@ovn.org> wrote:

> On Tue, Aug 07, 2018 at 05:12:32PM -0400, Mark Michelson wrote:
> > On 08/07/2018 07:37 AM, nusiddiq@redhat.com wrote:
> > >From: Numan Siddique <nusiddiq@redhat.com>
> > >
> > >The python function ovs.socket_util.check_connection_completion() uses
> select()
> > >(provided by python) to monitor the socket file descriptor. The select()
> > >returns 1 when the file descriptor becomes ready. For error cases like -
> > >111 (Connection refused) and 113 (No route to host) (POLLERR),
> ovs.poller._SelectSelect.poll()
> > >expects the exceptfds list to be set by select(). But that is not the
> case.
> > >As per the select() man page, writefds list will be set for POLLERR.
> > >Please see "Correspondence between select() and poll() notifications"
> section of select(2)
> > >man page.
> > >
> > >Because of this behavior, ovs.socket_util.check_connection_completion()
> returns success
> > >even if the remote is unreachable or not listening on the port.
> > >
> > >This patch fixes this issue by using poll() to check the connection
> status similar to
> > >the C implementation of check_connection_completion().
> > >
> > >A new function 'get_system_poll() is added in ovs/poller.py which
> returns the
> > >select.poll() object. If select.poll is monkey patched by
> eventlet/gevent, it
> > >gets the original select.poll() and returns it.
> >
> > Is this safe? My concern is that eventlet/gevent monkey patches
> > select.poll() so that the green thread yields properly. Using the system
> > select.poll() might lead to unexpected blocking.
>
> gevent monkey patches Python to get rid of poll() because poll() might
> block and thereby stall Python.  This use of poll() is OK because it
> will never block (because it uses a timeout of 0).
>
> > I read your conversation with Ben on the previous iteration of this
> series.
> > It looks like you attempted to use the system select.poll() and ran into
> > this blocking problem (the pastebin for your patch is now deleted so I
> can't
> > see what you actually tried). Why would this approach work differently?
>
> I don't recall what was in the pastebin, but the use of poll() here
> should be sound.  We use the same approach in C and we don't want to
> block in C either.
>

When I tried last time, I modified the class 'SelectPoll' class here
- https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L145
to use system.poll and this blocked the whole process. But as Ben suggested
earlier
and mentioned above , we use select.poll only when checking the connection
status
with timeout of 0. So we are fine. I tested this patch with openstack
neutron and it worked
fine.

Thanks
Numan
Mark Michelson Aug. 8, 2018, 1 p.m. UTC | #4
On 08/08/2018 03:24 AM, Numan Siddique wrote:
> 
> 
> On Wed, Aug 8, 2018 at 2:58 AM Ben Pfaff <blp@ovn.org 
> <mailto:blp@ovn.org>> wrote:
> 
>     On Tue, Aug 07, 2018 at 05:12:32PM -0400, Mark Michelson wrote:
>      > On 08/07/2018 07:37 AM, nusiddiq@redhat.com
>     <mailto:nusiddiq@redhat.com> wrote:
>      > >From: Numan Siddique <nusiddiq@redhat.com
>     <mailto:nusiddiq@redhat.com>>
>      > >
>      > >The python function
>     ovs.socket_util.check_connection_completion() uses select()
>      > >(provided by python) to monitor the socket file descriptor. The
>     select()
>      > >returns 1 when the file descriptor becomes ready. For error
>     cases like -
>      > >111 (Connection refused) and 113 (No route to host) (POLLERR),
>     ovs.poller._SelectSelect.poll()
>      > >expects the exceptfds list to be set by select(). But that is
>     not the case.
>      > >As per the select() man page, writefds list will be set for POLLERR.
>      > >Please see "Correspondence between select() and poll()
>     notifications" section of select(2)
>      > >man page.
>      > >
>      > >Because of this behavior,
>     ovs.socket_util.check_connection_completion() returns success
>      > >even if the remote is unreachable or not listening on the port.
>      > >
>      > >This patch fixes this issue by using poll() to check the
>     connection status similar to
>      > >the C implementation of check_connection_completion().
>      > >
>      > >A new function 'get_system_poll() is added in ovs/poller.py
>     which returns the
>      > >select.poll() object. If select.poll is monkey patched by
>     eventlet/gevent, it
>      > >gets the original select.poll() and returns it.
>      >
>      > Is this safe? My concern is that eventlet/gevent monkey patches
>      > select.poll() so that the green thread yields properly. Using the
>     system
>      > select.poll() might lead to unexpected blocking.
> 
>     gevent monkey patches Python to get rid of poll() because poll() might
>     block and thereby stall Python.  This use of poll() is OK because it
>     will never block (because it uses a timeout of 0).
> 
>      > I read your conversation with Ben on the previous iteration of
>     this series.
>      > It looks like you attempted to use the system select.poll() and
>     ran into
>      > this blocking problem (the pastebin for your patch is now deleted
>     so I can't
>      > see what you actually tried). Why would this approach work
>     differently?
> 
>     I don't recall what was in the pastebin, but the use of poll() here
>     should be sound.  We use the same approach in C and we don't want to
>     block in C either.
> 
> 
> When I tried last time, I modified the class 'SelectPoll' class here
> - https://github.com/openvswitch/ovs/blob/master/python/ovs/poller.py#L145
> to use system.poll and this blocked the whole process. But as Ben 
> suggested earlier
> and mentioned above , we use select.poll only when checking the 
> connection status
> with timeout of 0. So we are fine. I tested this patch with openstack 
> neutron and it worked
> fine.
> 
> Thanks
> Numan
> 

Thanks, guys. I'll give the patchset a more thorough review now.
Ben Pfaff Aug. 14, 2018, 6:47 p.m. UTC | #5
On Tue, Aug 07, 2018 at 05:07:58PM +0530, nusiddiq@redhat.com wrote:
> From: Numan Siddique <nusiddiq@redhat.com>
> 
> The python function ovs.socket_util.check_connection_completion() uses select()
> (provided by python) to monitor the socket file descriptor. The select()
> returns 1 when the file descriptor becomes ready. For error cases like -
> 111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
> expects the exceptfds list to be set by select(). But that is not the case.
> As per the select() man page, writefds list will be set for POLLERR.
> Please see "Correspondence between select() and poll() notifications" section of select(2)
> man page.
> 
> Because of this behavior, ovs.socket_util.check_connection_completion() returns success
> even if the remote is unreachable or not listening on the port.
> 
> This patch fixes this issue by using poll() to check the connection status similar to
> the C implementation of check_connection_completion().
> 
> A new function 'get_system_poll() is added in ovs/poller.py which returns the
> select.poll() object. If select.poll is monkey patched by eventlet/gevent, it
> gets the original select.poll() and returns it.
> 
> The test cases added in this patch fails without the fix.
> 
> Suggested-by: Ben Pfaff <blp@ovn.org>
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>

Thanks.

I had to fold in the following to placate flake8:

diff --git a/python/ovs/poller.py b/python/ovs/poller.py
index ef67e6763237..9c6892d98e97 100644
--- a/python/ovs/poller.py
+++ b/python/ovs/poller.py
@@ -37,6 +37,7 @@ try:
         return eventlet_patcher.is_monkey_patched(select)
 except:
     eventlet_patcher = None
+
     def _using_eventlet_green_select():
         return False
 
@@ -266,15 +267,15 @@ class Poller(object):
         self.timeout = -1
 
 
-"""
-Returns the original select.poll() object. If select.poll is monkey patched
-by eventlet or gevent library, it gets the original select.poll and returns
-an object of it using the eventlet.patcher.original/gevent.monkey.get_original
-functions.
-
-As a last resort, if there is any exception it returns the SelectPoll() object.
-"""
 def get_system_poll():
+    """Returns the original select.poll() object. If select.poll is
+    monkey patched by eventlet or gevent library, it gets the original
+    select.poll and returns an object of it using the
+    eventlet.patcher.original/gevent.monkey.get_original functions.
+
+    As a last resort, if there is any exception it returns the
+    SelectPoll() object.
+    """
     try:
         if _using_eventlet_green_select():
             _system_poll = eventlet_patcher.original("select").poll
Simon Horman Aug. 18, 2018, 7:40 p.m. UTC | #6
On Tue, Aug 07, 2018 at 05:07:58PM +0530, nusiddiq@redhat.com wrote:
> From: Numan Siddique <nusiddiq@redhat.com>
> 
> The python function ovs.socket_util.check_connection_completion() uses select()
> (provided by python) to monitor the socket file descriptor. The select()
> returns 1 when the file descriptor becomes ready. For error cases like -
> 111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
> expects the exceptfds list to be set by select(). But that is not the case.
> As per the select() man page, writefds list will be set for POLLERR.
> Please see "Correspondence between select() and poll() notifications" section of select(2)
> man page.
> 
> Because of this behavior, ovs.socket_util.check_connection_completion() returns success
> even if the remote is unreachable or not listening on the port.
> 
> This patch fixes this issue by using poll() to check the connection status similar to
> the C implementation of check_connection_completion().
> 
> A new function 'get_system_poll() is added in ovs/poller.py which returns the
> select.poll() object. If select.poll is monkey patched by eventlet/gevent, it
> gets the original select.poll() and returns it.
> 
> The test cases added in this patch fails without the fix.
> 
> Suggested-by: Ben Pfaff <blp@ovn.org>
> Signed-off-by: Numan Siddique <nusiddiq@redhat.com>

I believe that this patch is the cause of the testsuite failure currently
exhibited in the master branch when tested using travis-ci.

https://travis-ci.org/openvswitch/ovs/jobs/417506303

...

> diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> index 014382850..e8a26e971 100644
> --- a/tests/ovsdb-idl.at
> +++ b/tests/ovsdb-idl.at
> @@ -1720,3 +1720,19 @@ OVSDB_CHECK_IDL_COMPOUND_INDEX_WITH_REF([set, simple3 idl-compound-index-with-re
>  007: check simple4: empty
>  008: End test
>  ]])
> +
> +m4_define([CHECK_STREAM_OPEN_BLOCK_PY],
> +  [AT_SETUP([$1])
> +   AT_SKIP_IF([test $2 = no])
> +   AT_KEYWORDS([Check PY Stream open block - $3])
> +   AT_CHECK([ovsdb_start_idltest "ptcp:0:127.0.0.1"])
> +   PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
> +   WRONG_PORT=$(($TCP_PORT+1))
> +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [0], [ignore])
> +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$WRONG_PORT], [1], [ignore])
> +   OVSDB_SERVER_SHUTDOWN
> +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [1], [ignore])
> +   AT_CLEANUP])
> +
> +CHECK_STREAM_OPEN_BLOCK_PY([Check PY2 Stream open block], [$HAVE_PYTHON], [$PYTHON])
> +CHECK_STREAM_OPEN_BLOCK_PY([Check PY3 Stream open block], [$HAVE_PYTHON], [$PYTHON3])

I wonder if the problem exhibited by travis-ci would be resolved
by changing $HAVE_PYTHON to $HAVE_PYTHON3 on the "PY3 Stream" line.

Also, I wonder if s/PYTHON/PYTHON2/ is appropriate on the "PY2 Stream" line.

...
Ben Pfaff Aug. 18, 2018, 10:12 p.m. UTC | #7
On Sat, Aug 18, 2018 at 09:40:05PM +0200, Simon Horman wrote:
> On Tue, Aug 07, 2018 at 05:07:58PM +0530, nusiddiq@redhat.com wrote:
> > From: Numan Siddique <nusiddiq@redhat.com>
> > 
> > The python function ovs.socket_util.check_connection_completion() uses select()
> > (provided by python) to monitor the socket file descriptor. The select()
> > returns 1 when the file descriptor becomes ready. For error cases like -
> > 111 (Connection refused) and 113 (No route to host) (POLLERR), ovs.poller._SelectSelect.poll()
> > expects the exceptfds list to be set by select(). But that is not the case.
> > As per the select() man page, writefds list will be set for POLLERR.
> > Please see "Correspondence between select() and poll() notifications" section of select(2)
> > man page.
> > 
> > Because of this behavior, ovs.socket_util.check_connection_completion() returns success
> > even if the remote is unreachable or not listening on the port.
> > 
> > This patch fixes this issue by using poll() to check the connection status similar to
> > the C implementation of check_connection_completion().
> > 
> > A new function 'get_system_poll() is added in ovs/poller.py which returns the
> > select.poll() object. If select.poll is monkey patched by eventlet/gevent, it
> > gets the original select.poll() and returns it.
> > 
> > The test cases added in this patch fails without the fix.
> > 
> > Suggested-by: Ben Pfaff <blp@ovn.org>
> > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> 
> I believe that this patch is the cause of the testsuite failure currently
> exhibited in the master branch when tested using travis-ci.
> 
> https://travis-ci.org/openvswitch/ovs/jobs/417506303
> 
> ...
> 
> > diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> > index 014382850..e8a26e971 100644
> > --- a/tests/ovsdb-idl.at
> > +++ b/tests/ovsdb-idl.at
> > @@ -1720,3 +1720,19 @@ OVSDB_CHECK_IDL_COMPOUND_INDEX_WITH_REF([set, simple3 idl-compound-index-with-re
> >  007: check simple4: empty
> >  008: End test
> >  ]])
> > +
> > +m4_define([CHECK_STREAM_OPEN_BLOCK_PY],
> > +  [AT_SETUP([$1])
> > +   AT_SKIP_IF([test $2 = no])
> > +   AT_KEYWORDS([Check PY Stream open block - $3])
> > +   AT_CHECK([ovsdb_start_idltest "ptcp:0:127.0.0.1"])
> > +   PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
> > +   WRONG_PORT=$(($TCP_PORT+1))
> > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [0], [ignore])
> > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$WRONG_PORT], [1], [ignore])
> > +   OVSDB_SERVER_SHUTDOWN
> > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [1], [ignore])
> > +   AT_CLEANUP])
> > +
> > +CHECK_STREAM_OPEN_BLOCK_PY([Check PY2 Stream open block], [$HAVE_PYTHON], [$PYTHON])
> > +CHECK_STREAM_OPEN_BLOCK_PY([Check PY3 Stream open block], [$HAVE_PYTHON], [$PYTHON3])
> 
> I wonder if the problem exhibited by travis-ci would be resolved
> by changing $HAVE_PYTHON to $HAVE_PYTHON3 on the "PY3 Stream" line.
> 
> Also, I wonder if s/PYTHON/PYTHON2/ is appropriate on the "PY2 Stream" line.

You're right, thanks.  I sent a fix:
        https://patchwork.ozlabs.org/patch/959273/
Simon Horman Aug. 20, 2018, 2:12 p.m. UTC | #8
On 19 August 2018 at 00:12, Ben Pfaff <blp@ovn.org> wrote:

> On Sat, Aug 18, 2018 at 09:40:05PM +0200, Simon Horman wrote:
> > On Tue, Aug 07, 2018 at 05:07:58PM +0530, nusiddiq@redhat.com wrote:
> > > From: Numan Siddique <nusiddiq@redhat.com>
> > >
> > > The python function ovs.socket_util.check_connection_completion()
> uses select()
> > > (provided by python) to monitor the socket file descriptor. The
> select()
> > > returns 1 when the file descriptor becomes ready. For error cases like
> -
> > > 111 (Connection refused) and 113 (No route to host) (POLLERR),
> ovs.poller._SelectSelect.poll()
> > > expects the exceptfds list to be set by select(). But that is not the
> case.
> > > As per the select() man page, writefds list will be set for POLLERR.
> > > Please see "Correspondence between select() and poll() notifications"
> section of select(2)
> > > man page.
> > >
> > > Because of this behavior, ovs.socket_util.check_connection_completion()
> returns success
> > > even if the remote is unreachable or not listening on the port.
> > >
> > > This patch fixes this issue by using poll() to check the connection
> status similar to
> > > the C implementation of check_connection_completion().
> > >
> > > A new function 'get_system_poll() is added in ovs/poller.py which
> returns the
> > > select.poll() object. If select.poll is monkey patched by
> eventlet/gevent, it
> > > gets the original select.poll() and returns it.
> > >
> > > The test cases added in this patch fails without the fix.
> > >
> > > Suggested-by: Ben Pfaff <blp@ovn.org>
> > > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
> >
> > I believe that this patch is the cause of the testsuite failure currently
> > exhibited in the master branch when tested using travis-ci.
> >
> > https://travis-ci.org/openvswitch/ovs/jobs/417506303
> >
> > ...
> >
> > > diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
> > > index 014382850..e8a26e971 100644
> > > --- a/tests/ovsdb-idl.at
> > > +++ b/tests/ovsdb-idl.at
> > > @@ -1720,3 +1720,19 @@ OVSDB_CHECK_IDL_COMPOUND_INDEX_WITH_REF([set,
> simple3 idl-compound-index-with-re
> > >  007: check simple4: empty
> > >  008: End test
> > >  ]])
> > > +
> > > +m4_define([CHECK_STREAM_OPEN_BLOCK_PY],
> > > +  [AT_SETUP([$1])
> > > +   AT_SKIP_IF([test $2 = no])
> > > +   AT_KEYWORDS([Check PY Stream open block - $3])
> > > +   AT_CHECK([ovsdb_start_idltest "ptcp:0:127.0.0.1"])
> > > +   PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
> > > +   WRONG_PORT=$(($TCP_PORT+1))
> > > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT],
> [0], [ignore])
> > > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$WRONG_PORT],
> [1], [ignore])
> > > +   OVSDB_SERVER_SHUTDOWN
> > > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT],
> [1], [ignore])
> > > +   AT_CLEANUP])
> > > +
> > > +CHECK_STREAM_OPEN_BLOCK_PY([Check PY2 Stream open block],
> [$HAVE_PYTHON], [$PYTHON])
> > > +CHECK_STREAM_OPEN_BLOCK_PY([Check PY3 Stream open block],
> [$HAVE_PYTHON], [$PYTHON3])
> >
> > I wonder if the problem exhibited by travis-ci would be resolved
> > by changing $HAVE_PYTHON to $HAVE_PYTHON3 on the "PY3 Stream" line.
> >
> > Also, I wonder if s/PYTHON/PYTHON2/ is appropriate on the "PY2 Stream"
> line.
>
> You're right, thanks.  I sent a fix:
>         https://patchwork.ozlabs.org/patch/959273/


Thanks Ben, much appreciated.
Numan Siddique Aug. 20, 2018, 3:39 p.m. UTC | #9
On Mon, Aug 20, 2018 at 7:43 PM Simon Horman <simon.horman@netronome.com>
wrote:

>
>
> On 19 August 2018 at 00:12, Ben Pfaff <blp@ovn.org> wrote:
>
>> On Sat, Aug 18, 2018 at 09:40:05PM +0200, Simon Horman wrote:
>> > On Tue, Aug 07, 2018 at 05:07:58PM +0530, nusiddiq@redhat.com wrote:
>> > > From: Numan Siddique <nusiddiq@redhat.com>
>> > >
>> > > The python function ovs.socket_util.check_connection_completion()
>> uses select()
>> > > (provided by python) to monitor the socket file descriptor. The
>> select()
>> > > returns 1 when the file descriptor becomes ready. For error cases
>> like -
>> > > 111 (Connection refused) and 113 (No route to host) (POLLERR),
>> ovs.poller._SelectSelect.poll()
>> > > expects the exceptfds list to be set by select(). But that is not the
>> case.
>> > > As per the select() man page, writefds list will be set for POLLERR.
>> > > Please see "Correspondence between select() and poll() notifications"
>> section of select(2)
>> > > man page.
>> > >
>> > > Because of this behavior,
>> ovs.socket_util.check_connection_completion() returns success
>> > > even if the remote is unreachable or not listening on the port.
>> > >
>> > > This patch fixes this issue by using poll() to check the connection
>> status similar to
>> > > the C implementation of check_connection_completion().
>> > >
>> > > A new function 'get_system_poll() is added in ovs/poller.py which
>> returns the
>> > > select.poll() object. If select.poll is monkey patched by
>> eventlet/gevent, it
>> > > gets the original select.poll() and returns it.
>> > >
>> > > The test cases added in this patch fails without the fix.
>> > >
>> > > Suggested-by: Ben Pfaff <blp@ovn.org>
>> > > Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
>> >
>> > I believe that this patch is the cause of the testsuite failure
>> currently
>> > exhibited in the master branch when tested using travis-ci.
>> >
>> > https://travis-ci.org/openvswitch/ovs/jobs/417506303
>> >
>> > ...
>> >
>> > > diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
>> > > index 014382850..e8a26e971 100644
>> > > --- a/tests/ovsdb-idl.at
>> > > +++ b/tests/ovsdb-idl.at
>> > > @@ -1720,3 +1720,19 @@ OVSDB_CHECK_IDL_COMPOUND_INDEX_WITH_REF([set,
>> simple3 idl-compound-index-with-re
>> > >  007: check simple4: empty
>> > >  008: End test
>> > >  ]])
>> > > +
>> > > +m4_define([CHECK_STREAM_OPEN_BLOCK_PY],
>> > > +  [AT_SETUP([$1])
>> > > +   AT_SKIP_IF([test $2 = no])
>> > > +   AT_KEYWORDS([Check PY Stream open block - $3])
>> > > +   AT_CHECK([ovsdb_start_idltest "ptcp:0:127.0.0.1"])
>> > > +   PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
>> > > +   WRONG_PORT=$(($TCP_PORT+1))
>> > > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT],
>> [0], [ignore])
>> > > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$WRONG_PORT],
>> [1], [ignore])
>> > > +   OVSDB_SERVER_SHUTDOWN
>> > > +   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT],
>> [1], [ignore])
>> > > +   AT_CLEANUP])
>> > > +
>> > > +CHECK_STREAM_OPEN_BLOCK_PY([Check PY2 Stream open block],
>> [$HAVE_PYTHON], [$PYTHON])
>> > > +CHECK_STREAM_OPEN_BLOCK_PY([Check PY3 Stream open block],
>> [$HAVE_PYTHON], [$PYTHON3])
>> >
>> > I wonder if the problem exhibited by travis-ci would be resolved
>> > by changing $HAVE_PYTHON to $HAVE_PYTHON3 on the "PY3 Stream" line.
>> >
>> > Also, I wonder if s/PYTHON/PYTHON2/ is appropriate on the "PY2 Stream"
>> line.
>>
>> You're right, thanks.  I sent a fix:
>>         https://patchwork.ozlabs.org/patch/959273/
>
>
>
Oops. Thanks Ben for the fix.



> Thanks Ben, much appreciated.
>
>
diff mbox series

Patch

diff --git a/python/ovs/poller.py b/python/ovs/poller.py
index 2f3fcf9b6..ef67e6763 100644
--- a/python/ovs/poller.py
+++ b/python/ovs/poller.py
@@ -31,14 +31,21 @@  except ImportError:
     SSL = None
 
 try:
-    import eventlet.patcher
+    from eventlet import patcher as eventlet_patcher
 
     def _using_eventlet_green_select():
-        return eventlet.patcher.is_monkey_patched(select)
+        return eventlet_patcher.is_monkey_patched(select)
 except:
+    eventlet_patcher = None
     def _using_eventlet_green_select():
         return False
 
+try:
+    from gevent import monkey as gevent_monkey
+except:
+    gevent_monkey = None
+
+
 vlog = ovs.vlog.Vlog("poller")
 
 POLLIN = 0x001
@@ -257,3 +264,26 @@  class Poller(object):
     def __reset(self):
         self.poll = SelectPoll()
         self.timeout = -1
+
+
+"""
+Returns the original select.poll() object. If select.poll is monkey patched
+by eventlet or gevent library, it gets the original select.poll and returns
+an object of it using the eventlet.patcher.original/gevent.monkey.get_original
+functions.
+
+As a last resort, if there is any exception it returns the SelectPoll() object.
+"""
+def get_system_poll():
+    try:
+        if _using_eventlet_green_select():
+            _system_poll = eventlet_patcher.original("select").poll
+        elif gevent_monkey and gevent_monkey.is_object_patched(
+                'select', 'poll'):
+            _system_poll = gevent_monkey.get_original('select', 'poll')
+        else:
+            _system_poll = select.poll
+    except:
+        _system_poll = SelectPoll
+
+    return _system_poll()
diff --git a/python/ovs/socket_util.py b/python/ovs/socket_util.py
index 403104936..8e582fe91 100644
--- a/python/ovs/socket_util.py
+++ b/python/ovs/socket_util.py
@@ -162,8 +162,8 @@  def make_unix_socket(style, nonblock, bind_path, connect_path, short=False):
 
 
 def check_connection_completion(sock):
-    p = ovs.poller.SelectPoll()
     if sys.platform == "win32":
+        p = ovs.poller.SelectPoll()
         event = winutils.get_new_event(None, False, True, None)
         # Receive notification of readiness for writing, of completed
         # connection or multipoint join operation, and of socket closure.
@@ -173,6 +173,7 @@  def check_connection_completion(sock):
                                  win32file.FD_CLOSE)
         p.register(event, ovs.poller.POLLOUT)
     else:
+        p = ovs.poller.get_system_poll()
         p.register(sock, ovs.poller.POLLOUT)
     pfds = p.poll(0)
     if len(pfds) == 1:
@@ -180,7 +181,7 @@  def check_connection_completion(sock):
         if revents & ovs.poller.POLLERR:
             try:
                 # The following should raise an exception.
-                socket.send("\0", socket.MSG_DONTWAIT)
+                sock.send("\0".encode(), socket.MSG_DONTWAIT)
 
                 # (Here's where we end up if it didn't.)
                 # XXX rate-limit
diff --git a/python/ovs/stream.py b/python/ovs/stream.py
index 5996497a5..ca0d84425 100644
--- a/python/ovs/stream.py
+++ b/python/ovs/stream.py
@@ -191,8 +191,15 @@  class Stream(object):
         if error:
             return error, None
         else:
-            status = ovs.socket_util.check_connection_completion(sock)
-            return 0, cls(sock, name, status)
+            err = ovs.socket_util.check_connection_completion(sock)
+            if err == errno.EAGAIN or err == errno.EINPROGRESS:
+                status = errno.EAGAIN
+                err = 0
+            elif err == 0:
+                status = 0
+            else:
+                status = err
+            return err, cls(sock, name, status)
 
     @staticmethod
     def _open(suffix, dscp):
diff --git a/tests/automake.mk b/tests/automake.mk
index 8224e5a4a..0abf29d47 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -421,6 +421,7 @@  CHECK_PYFILES = \
 	tests/test-l7.py \
 	tests/test-ovsdb.py \
 	tests/test-reconnect.py \
+	tests/test-stream.py \
 	tests/MockXenAPI.py \
 	tests/test-unix-socket.py \
 	tests/test-unixctl.py \
diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at
index 014382850..e8a26e971 100644
--- a/tests/ovsdb-idl.at
+++ b/tests/ovsdb-idl.at
@@ -1720,3 +1720,19 @@  OVSDB_CHECK_IDL_COMPOUND_INDEX_WITH_REF([set, simple3 idl-compound-index-with-re
 007: check simple4: empty
 008: End test
 ]])
+
+m4_define([CHECK_STREAM_OPEN_BLOCK_PY],
+  [AT_SETUP([$1])
+   AT_SKIP_IF([test $2 = no])
+   AT_KEYWORDS([Check PY Stream open block - $3])
+   AT_CHECK([ovsdb_start_idltest "ptcp:0:127.0.0.1"])
+   PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
+   WRONG_PORT=$(($TCP_PORT+1))
+   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [0], [ignore])
+   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$WRONG_PORT], [1], [ignore])
+   OVSDB_SERVER_SHUTDOWN
+   AT_CHECK([$3 $srcdir/test-stream.py tcp:127.0.0.1:$TCP_PORT], [1], [ignore])
+   AT_CLEANUP])
+
+CHECK_STREAM_OPEN_BLOCK_PY([Check PY2 Stream open block], [$HAVE_PYTHON], [$PYTHON])
+CHECK_STREAM_OPEN_BLOCK_PY([Check PY3 Stream open block], [$HAVE_PYTHON], [$PYTHON3])
diff --git a/tests/test-stream.py b/tests/test-stream.py
new file mode 100644
index 000000000..4a5117501
--- /dev/null
+++ b/tests/test-stream.py
@@ -0,0 +1,32 @@ 
+# Copyright (c) 2018, Red Hat Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import sys
+
+import ovs.stream
+
+
+def main(argv):
+    remote = argv[1]
+    err, stream = ovs.stream.Stream.open_block(
+            ovs.stream.Stream.open(remote))
+
+    if err or stream is None:
+        sys.exit(1)
+
+    sys.exit(0)
+
+
+if __name__ == '__main__':
+    main(sys.argv)