Message ID | 20231118010703.4154866-4-ihrachys@redhat.com |
---|---|
State | Changes Requested |
Delegated to: | Simon Horman |
Headers | show |
Series | Improve unixctl AF_UNIX backlog handling | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | warning | apply and check: warning |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote: > In nonblocking mode, a POSIX socket may temporarily fail to connect(), > which is indicated by returning EINPROGRESS. This patch will loop > waiting for connect() to succeed, or until EINPROGRESS happens. > > An alternative - and probably a better - path to deal with this error > would be to extend unix stream_class to support connect API, that would > allow the caller to repeat connect() attempts asynchronously. Hi Ihar, Yes, that probably would be better. But I expect it is also more complex. So perhaps such work is appropriate as a follow-up at some point. Because, assuming the patch is correct (I have some questions), it does seem to improve things. And I'm all for improving things even if incrementally. > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> > --- > lib/socket-util-unix.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c > index 0053a61b1..230705ba3 100644 > --- a/lib/socket-util-unix.c > +++ b/lib/socket-util-unix.c > @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock, > int dirfd; > > error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd, linkname); > - if (!error > - && connect(fd, (struct sockaddr*) &un, un_len) > - && errno != EINPROGRESS) { > - error = errno; > + if (!error) { > + for (;;) { > + if (!connect(fd, (struct sockaddr *)&un, un_len)) { > + break; > + } > + if (errno != EINPROGRESS) { > + error = errno; > + break; > + } > + } I have a few questions about this. 1. I see that patch 4/4 also takes into account EAGAIN as an alternative for EINPROGRESS, which is consistent with my reading of connect(2) wrt Unix domain sockets on Linux. And, reading connect(3p) I see that EINPROGRESS is the correct error to check for wrt POSIX. But I am not entirely clear that calling connect() again is the correct behaviour. Is this clear to you? 2. Is it possible that this loop never exits? Or is otherwise abusive resource-wise. > } > free_sockaddr_un(dirfd, linkname); > > -- > 2.41.0 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Mon, Nov 20, 2023 at 11:22 AM Simon Horman <horms@ovn.org> wrote: > On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote: > > In nonblocking mode, a POSIX socket may temporarily fail to connect(), > > which is indicated by returning EINPROGRESS. This patch will loop > > waiting for connect() to succeed, or until EINPROGRESS happens. > > > > An alternative - and probably a better - path to deal with this error > > would be to extend unix stream_class to support connect API, that would > > allow the caller to repeat connect() attempts asynchronously. > > Hi Ihar, > > Yes, that probably would be better. But I expect it is also more complex. > So perhaps such work is appropriate as a follow-up at some point. Because, > assuming the patch is correct (I have some questions), it does seem to > improve things. And I'm all for improving things even if incrementally. > This patch is clearly controversial (and I hate it), thanks for starting this discussion. (Perhaps I should have posted this patch from the series as RFC not to suggest I am happy about it. I am not.) > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> > > --- > > lib/socket-util-unix.c | 14 ++++++++++---- > > 1 file changed, 10 insertions(+), 4 deletions(-) > > > > diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c > > index 0053a61b1..230705ba3 100644 > > --- a/lib/socket-util-unix.c > > +++ b/lib/socket-util-unix.c > > @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock, > > int dirfd; > > > > error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd, > linkname); > > - if (!error > > - && connect(fd, (struct sockaddr*) &un, un_len) > > - && errno != EINPROGRESS) { > > - error = errno; > > + if (!error) { > > + for (;;) { > > + if (!connect(fd, (struct sockaddr *)&un, un_len)) { > > + break; > > + } > > + if (errno != EINPROGRESS) { > > + error = errno; > > + break; > > + } > > + } > > I have a few questions about this. > > 1. I see that patch 4/4 also takes into account EAGAIN as an > alternative for EINPROGRESS, which is consistent with my > reading of connect(2) wrt Unix domain sockets on Linux. > And, reading connect(3p) I see that EINPROGRESS is the > correct error to check for wrt POSIX. > > But I am not entirely clear that calling connect() again > is the correct behaviour. Is this clear to you? > It's not. Perhaps there are bugs in the implementation beyond non-graceful handling of EAGAIN for Unix sockets. For example, AFAIU, POSIX implies that if connect() returns EINPROGRESS, then the code should select() to check if FD is ready for write. (Alternatively, consequent calls to connect() may return EALREADY.) This particular snippet from the Linux connect 2 man page describes the proper way to deal with EINPROGRESS: https://man7.org/linux/man-pages/man2/connect.2.html ``` After select(2) indicates writability, use getsockopt(2) to read the SO_ERROR option at level SOL_SOCKET to determine whether connect() completed successfully (SO_ERROR is zero) or unsuccessfully (SO_ERROR is one of the usual error codes listed here, explaining the reason for the failure). ``` I don't see any references to SO_ERROR in the openvswitch tree. Then there's the Linux case for AF_INET sockets that returns EAGAIN instead of EINPROGRESS. In this case, I think consequent connect() calls may require different treatment (repeating connect attempts). What I didn't consider in this patch, I think, is that this EAGAIN behavior is Linux specific, and - probably - other Unixes may decide to return EINPROGRESS in the same case. --- Is it safe to call connect() after EAGAIN? Depends on the definition of "safe". I think it's "safe" to do it with the current implementation of AF_UNIX stream_connect in kernel, see: https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/net/unix/af_unix.c#L1559 Here, once the connect() call times out and bails out to "goto out" with errno set to EAGAIN, it unix_release_sock(newsk, 0); the allocated structures for the socket. Does it mean we can safely rely on this behavior not to change in Linux AND be consistent with any other systems that may decide to EAGAIN on AF_UNIX sockets? I dunno. :( > > 2. Is it possible that this loop never exits? > Or is otherwise abusive resource-wise. > Abuse could be dealt with usleep() on each iteration (perhaps with a backoff). The infinite loop situation would require a mechanism to timeout after a particular number of attempts. We could definitely implement this. But - perhaps at this point it would be better to implement the "connect" unix class endpoint. I will take a look into it in the next revision. --- I suggest we do the following for the next revision. 1. In unix_make_connect, call connect() once. If it returns 0, tag the stream connected (stream->state). 2. If connect() returns EAGAIN or EINPROGRESS, bail out from unix_make_connect but return fd. Otherwise return error. 3. Implement connect() endpoint to unconditionally call connect(). If it returns EAGAIN (on Linux) or EINPROGRESS (POSIX), consider the stream is still connecting. If it's a different error, tag the stream disconnected. AFAIR (1) is already done. (2) will need to know that EAGAIN is a legit answer from connect(), but no loop will be introduced. (3) will need to be written from scratch. I think check_connection_completion may also need to be adjusted to return EAGAIN when Linux AF_UNIX socket is still connecting. Thoughts? > > > } > > free_sockaddr_un(dirfd, linkname); > > > > -- > > 2.41.0 > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > >
On Mon, Nov 20, 2023 at 12:40:58PM -0500, Ihar Hrachyshka wrote: > On Mon, Nov 20, 2023 at 11:22 AM Simon Horman <horms@ovn.org> wrote: > > > On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote: > > > In nonblocking mode, a POSIX socket may temporarily fail to connect(), > > > which is indicated by returning EINPROGRESS. This patch will loop > > > waiting for connect() to succeed, or until EINPROGRESS happens. > > > > > > An alternative - and probably a better - path to deal with this error > > > would be to extend unix stream_class to support connect API, that would > > > allow the caller to repeat connect() attempts asynchronously. > > > > Hi Ihar, > > > > Yes, that probably would be better. But I expect it is also more complex. > > So perhaps such work is appropriate as a follow-up at some point. Because, > > assuming the patch is correct (I have some questions), it does seem to > > improve things. And I'm all for improving things even if incrementally. > > > > This patch is clearly controversial (and I hate it), thanks for starting > this discussion. (Perhaps I should have posted this patch from the series > as RFC not to suggest I am happy about it. I am not.) > > > > > > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> > > > --- > > > lib/socket-util-unix.c | 14 ++++++++++---- > > > 1 file changed, 10 insertions(+), 4 deletions(-) > > > > > > diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c > > > index 0053a61b1..230705ba3 100644 > > > --- a/lib/socket-util-unix.c > > > +++ b/lib/socket-util-unix.c > > > @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock, > > > int dirfd; > > > > > > error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd, > > linkname); > > > - if (!error > > > - && connect(fd, (struct sockaddr*) &un, un_len) > > > - && errno != EINPROGRESS) { > > > - error = errno; > > > + if (!error) { > > > + for (;;) { > > > + if (!connect(fd, (struct sockaddr *)&un, un_len)) { > > > + break; > > > + } > > > + if (errno != EINPROGRESS) { > > > + error = errno; > > > + break; > > > + } > > > + } > > > > I have a few questions about this. > > > > 1. I see that patch 4/4 also takes into account EAGAIN as an > > alternative for EINPROGRESS, which is consistent with my > > reading of connect(2) wrt Unix domain sockets on Linux. > > And, reading connect(3p) I see that EINPROGRESS is the > > correct error to check for wrt POSIX. > > > > But I am not entirely clear that calling connect() again > > is the correct behaviour. Is this clear to you? > > > > It's not. Perhaps there are bugs in the implementation beyond non-graceful > handling of EAGAIN for Unix sockets. > > For example, AFAIU, POSIX implies that if connect() returns EINPROGRESS, > then the code should select() to check if FD is ready for write. > (Alternatively, consequent calls to connect() may return EALREADY.) > > This particular snippet from the Linux connect 2 man page describes the > proper way to deal with EINPROGRESS: > https://man7.org/linux/man-pages/man2/connect.2.html > > ``` > After > select(2) indicates writability, use getsockopt(2) to read > the SO_ERROR option at level SOL_SOCKET to determine > whether connect() completed successfully (SO_ERROR is > zero) or unsuccessfully (SO_ERROR is one of the usual > error codes listed here, explaining the reason for the > failure). > ``` > > I don't see any references to SO_ERROR in the openvswitch tree. > > Then there's the Linux case for AF_INET sockets that returns EAGAIN instead > of EINPROGRESS. In this case, I think consequent connect() calls may > require different treatment (repeating connect attempts). What I didn't > consider in this patch, I think, is that this EAGAIN behavior is Linux > specific, and - probably - other Unixes may decide to return EINPROGRESS in > the same case. > > --- > > Is it safe to call connect() after EAGAIN? Depends on the definition of > "safe". I think it's "safe" to do it with the current implementation of > AF_UNIX stream_connect in kernel, see: > https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/net/unix/af_unix.c#L1559 > > Here, once the connect() call times out and bails out to "goto out" with > errno set to EAGAIN, it unix_release_sock(newsk, 0); the allocated > structures for the socket. > > Does it mean we can safely rely on this behavior not to change in Linux AND > be consistent with any other systems that may decide to EAGAIN on AF_UNIX > sockets? I dunno. :( I think we are on the same page here :( > > 2. Is it possible that this loop never exits? > > Or is otherwise abusive resource-wise. > > > > Abuse could be dealt with usleep() on each iteration (perhaps with a > backoff). The infinite loop situation would require a mechanism to timeout > after a particular number of attempts. We could definitely implement this. > But - perhaps at this point it would be better to implement the "connect" > unix class endpoint. I will take a look into it in the next revision. Yes, that is a good point. Perhaps rather than bandage things up, in ways that we are unclear of the consequences, a new approach is worth investigating. > > --- > > I suggest we do the following for the next revision. > > 1. In unix_make_connect, call connect() once. If it returns 0, tag the > stream connected (stream->state). > 2. If connect() returns EAGAIN or EINPROGRESS, bail out from > unix_make_connect but return fd. Otherwise return error. > 3. Implement connect() endpoint to unconditionally call connect(). If it > returns EAGAIN (on Linux) or EINPROGRESS (POSIX), consider the stream is > still connecting. If it's a different error, tag the stream disconnected. > > AFAIR (1) is already done. (2) will need to know that EAGAIN is a legit > answer from connect(), but no loop will be introduced. (3) will need to be > written from scratch. I think check_connection_completion may also need to > be adjusted to return EAGAIN when Linux AF_UNIX socket is still connecting. 1. Agreed 2. That is my reading of connect(2) [*] 3. I'm a little clear how the "still connecting case" will be handled [*] https://www.man7.org/linux/man-pages/man2/connect.2.html
diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c index 0053a61b1..230705ba3 100644 --- a/lib/socket-util-unix.c +++ b/lib/socket-util-unix.c @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock, int dirfd; error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd, linkname); - if (!error - && connect(fd, (struct sockaddr*) &un, un_len) - && errno != EINPROGRESS) { - error = errno; + if (!error) { + for (;;) { + if (!connect(fd, (struct sockaddr *)&un, un_len)) { + break; + } + if (errno != EINPROGRESS) { + error = errno; + break; + } + } } free_sockaddr_un(dirfd, linkname);
In nonblocking mode, a POSIX socket may temporarily fail to connect(), which is indicated by returning EINPROGRESS. This patch will loop waiting for connect() to succeed, or until EINPROGRESS happens. An alternative - and probably a better - path to deal with this error would be to extend unix stream_class to support connect API, that would allow the caller to repeat connect() attempts asynchronously. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> --- lib/socket-util-unix.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)