chardev: report the handshake error

Message ID 20230510072531.3937189-1-marcandre.lureau@redhat.com
State New
Series chardev: report the handshake error

Commit Message

Marc-André Lureau May 10, 2023, 7:25 a.m. UTC
From: Marc-André Lureau <marcandre.lureau@redhat.com>

This can help to debug connection issues.

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=2196182

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 chardev/char-socket.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Daniel P. Berrangé May 10, 2023, 9:21 a.m. UTC | #1
On Wed, May 10, 2023 at 11:25:31AM +0400, marcandre.lureau@redhat.com wrote:
> From: Marc-André Lureau <marcandre.lureau@redhat.com>
> 
> This can help to debug connection issues.
> 
> Related to:
> https://bugzilla.redhat.com/show_bug.cgi?id=2196182
> 
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  chardev/char-socket.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
Markus Armbruster May 10, 2023, 9:31 a.m. UTC | #2
marcandre.lureau@redhat.com writes:

> From: Marc-André Lureau <marcandre.lureau@redhat.com>
>
> This can help to debug connection issues.
>
> Related to:
> https://bugzilla.redhat.com/show_bug.cgi?id=2196182
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  chardev/char-socket.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 8c58532171..e8e3a743d5 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -742,8 +742,12 @@ static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
>  {
>      Chardev *chr = user_data;
>      SocketChardev *s = user_data;
> +    Error *err = NULL;
>  
> -    if (qio_task_propagate_error(task, NULL)) {
> +    if (qio_task_propagate_error(task, &err)) {
> +        error_reportf_err(err,
> +                          "websock handshake of character device %s failed: ",
> +                          chr->label);

Code smell: reports an error without failing the function.

Should it be a warning instead?

>          tcp_chr_disconnect(chr);
>      } else {
>          if (s->do_telnetopt) {
> @@ -778,8 +782,12 @@ static void tcp_chr_tls_handshake(QIOTask *task,
>  {
>      Chardev *chr = user_data;
>      SocketChardev *s = user_data;
> +    Error *err = NULL;
>  
> -    if (qio_task_propagate_error(task, NULL)) {
> +    if (qio_task_propagate_error(task, &err)) {
> +        error_reportf_err(err,
> +                          "TLS handshake of character device %s failed: ",
> +                          chr->label);
>          tcp_chr_disconnect(chr);
>      } else {
>          if (s->is_websock) {

Likewise.
Marc-André Lureau May 10, 2023, 9:33 a.m. UTC | #3
Hi

On Wed, May 10, 2023 at 1:31 PM Markus Armbruster <armbru@redhat.com> wrote:

> marcandre.lureau@redhat.com writes:
>
> > From: Marc-André Lureau <marcandre.lureau@redhat.com>
> >
> > This can help to debug connection issues.
> >
> > Related to:
> > https://bugzilla.redhat.com/show_bug.cgi?id=2196182
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> >  chardev/char-socket.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index 8c58532171..e8e3a743d5 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -742,8 +742,12 @@ static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
> >  {
> >      Chardev *chr = user_data;
> >      SocketChardev *s = user_data;
> > +    Error *err = NULL;
> >
> > -    if (qio_task_propagate_error(task, NULL)) {
> > +    if (qio_task_propagate_error(task, &err)) {
> > +        error_reportf_err(err,
> > +                          "websock handshake of character device %s failed: ",
> > +                          chr->label);
>
> Code smell: reports an error without failing the function.
>
> Should it be a warning instead?
>
>
Makes sense. I just did the same as check_report_connect_error(), but I
think they should all be warnings too.

> >          tcp_chr_disconnect(chr);
> >      } else {
> >          if (s->do_telnetopt) {
> > @@ -778,8 +782,12 @@ static void tcp_chr_tls_handshake(QIOTask *task,
> >  {
> >      Chardev *chr = user_data;
> >      SocketChardev *s = user_data;
> > +    Error *err = NULL;
> >
> > -    if (qio_task_propagate_error(task, NULL)) {
> > +    if (qio_task_propagate_error(task, &err)) {
> > +        error_reportf_err(err,
> > +                          "TLS handshake of character device %s failed: ",
> > +                          chr->label);
> >          tcp_chr_disconnect(chr);
> >      } else {
> >          if (s->is_websock) {
>
> Likewise.
>
Daniel P. Berrangé May 10, 2023, 9:38 a.m. UTC | #4
On Wed, May 10, 2023 at 11:31:40AM +0200, Markus Armbruster wrote:
> marcandre.lureau@redhat.com writes:
> 
> > From: Marc-André Lureau <marcandre.lureau@redhat.com>
> >
> > This can help to debug connection issues.
> >
> > Related to:
> > https://bugzilla.redhat.com/show_bug.cgi?id=2196182
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> >  chardev/char-socket.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index 8c58532171..e8e3a743d5 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -742,8 +742,12 @@ static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
> >  {
> >      Chardev *chr = user_data;
> >      SocketChardev *s = user_data;
> > +    Error *err = NULL;
> >  
> > -    if (qio_task_propagate_error(task, NULL)) {
> > +    if (qio_task_propagate_error(task, &err)) {
> > +        error_reportf_err(err,
> > +                          "websock handshake of character device %s failed: ",
> > +                          chr->label);
> 
> Code smell: reports an error without failing the function.
> 
> Should it be a warning instead?

Well it isn't a warning, this is a fatal error wrt continued use
of the chardev.

Not failing the function is expected in this particular code
pattern. These tcp_chr_(tls,websock)_handshake functions are
callbacks that are used to handle an async operation's progress.
From the caller's POV, it doesn't matter whether there is an
error or success. It is up to this function to do whatever is
required based on the status, hence the call to disconnect
the chardev on error:

> >          tcp_chr_disconnect(chr);
> >      } else {
> >          if (s->do_telnetopt) {
> > @@ -778,8 +782,12 @@ static void tcp_chr_tls_handshake(QIOTask *task,
> >  {
> >      Chardev *chr = user_data;
> >      SocketChardev *s = user_data;
> > +    Error *err = NULL;
> >  
> > -    if (qio_task_propagate_error(task, NULL)) {
> > +    if (qio_task_propagate_error(task, &err)) {
> > +        error_reportf_err(err,
> > +                          "TLS handshake of character device %s failed: ",
> > +                          chr->label);
> >          tcp_chr_disconnect(chr);
> >      } else {
> >          if (s->is_websock) {
> 
> Likewise.
> 

With regards,
Daniel
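
The callback pattern described in the message above can be illustrated outside QEMU. The following is a minimal, self-contained sketch, not QEMU code: `Task`, `Conn`, `conn_disconnect` and `handshake_done` are hypothetical stand-ins for `QIOTask`, the chardev state, `tcp_chr_disconnect()` and the handshake callbacks. It only shows why a `void` completion callback has to report the error and clean up by itself.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for QIOTask: carries the outcome of an async operation. */
typedef struct Task {
    const char *error;          /* NULL on success, message on failure */
} Task;

/* Hypothetical stand-in for the chardev state. */
typedef struct Conn {
    const char *label;
    bool connected;
    int errors_reported;
} Conn;

static void conn_disconnect(Conn *c)
{
    c->connected = false;
}

/*
 * Completion callback. It returns void, so there is no caller to
 * propagate a failure to: the callback reports the error and then
 * performs the cleanup (here: disconnect) on its own.
 */
static void handshake_done(Task *task, void *user_data)
{
    Conn *c = user_data;

    if (task->error) {
        fprintf(stderr, "handshake of character device %s failed: %s\n",
                c->label, task->error);
        c->errors_reported++;
        conn_disconnect(c);
    } else {
        c->connected = true;
    }
}
```

Calling `handshake_done()` with a `Task` whose `error` is set leaves the connection disconnected and bumps `errors_reported`; a `Task` with `error == NULL` marks the connection as connected.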
Marc-André Lureau May 10, 2023, 9:48 a.m. UTC | #5
Hi

On Wed, May 10, 2023 at 1:39 PM Daniel P. Berrangé <berrange@redhat.com>
wrote:

> On Wed, May 10, 2023 at 11:31:40AM +0200, Markus Armbruster wrote:
> > marcandre.lureau@redhat.com writes:
> >
> > > From: Marc-André Lureau <marcandre.lureau@redhat.com>
> > >
> > > This can help to debug connection issues.
> > >
> > > Related to:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=2196182
> > >
> > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > ---
> > >  chardev/char-socket.c | 12 ++++++++++--
> > >  1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > index 8c58532171..e8e3a743d5 100644
> > > --- a/chardev/char-socket.c
> > > +++ b/chardev/char-socket.c
> > > @@ -742,8 +742,12 @@ static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
> > >  {
> > >      Chardev *chr = user_data;
> > >      SocketChardev *s = user_data;
> > > +    Error *err = NULL;
> > >
> > > -    if (qio_task_propagate_error(task, NULL)) {
> > > +    if (qio_task_propagate_error(task, &err)) {
> > > +        error_reportf_err(err,
> > > +                          "websock handshake of character device %s failed: ",
> > > +                          chr->label);
> >
> > Code smell: reports an error without failing the function.
> >
> > Should it be a warning instead?
>
> Well it isn't a warning, this is a fatal error wrt continued use
> of the chardev
>
> Not failing the function is expected in this particular code
> pattern. These tcp_chr_(tls,websock)_handshake functions are
> callbacks that are used to handle an async operations progress.
> From the caller's POV, it doesn't matter whether there is an
> error or success. It is upto this function to do whatever is
> required based on the status, hence the call to disconnect
> the chardev on error:
>

I guess it depends on usage: if you have a reconnect= option, then the
failure can be considered non-fatal and a warning is fine.

Should we check whether reconnect is set to decide whether to print an
error or a warning? No strong opinion.
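
One possible reading of this suggestion, sketched as standalone C. This is an assumption-laden illustration and not part of the patch: `Severity`, `handshake_failure_severity` and `format_handshake_failure` are hypothetical names, and `reconnect_secs` stands in for whatever the chardev stores for its reconnect= option. The thread does not settle whether this distinction should be made at all.

```c
#include <stdio.h>

typedef enum { SEV_WARNING, SEV_ERROR } Severity;

/*
 * Hypothetical policy: if a reconnect interval is configured, a retry
 * will follow, so the failure is treated as a warning; otherwise the
 * chardev stays unusable and the failure is treated as an error.
 */
static Severity handshake_failure_severity(long reconnect_secs)
{
    return reconnect_secs > 0 ? SEV_WARNING : SEV_ERROR;
}

/* Format the message with a severity prefix; returns snprintf()'s result. */
static int format_handshake_failure(char *buf, size_t len, const char *label,
                                    long reconnect_secs, const char *reason)
{
    const char *prefix =
        handshake_failure_severity(reconnect_secs) == SEV_WARNING
        ? "warning" : "error";

    return snprintf(buf, len,
                    "%s: TLS handshake of character device %s failed: %s",
                    prefix, label, reason);
}
```

In QEMU itself the two branches would presumably map to `warn_reportf_err()` and `error_reportf_err()`; the sketch only models the decision.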


> > >          tcp_chr_disconnect(chr);
> > >      } else {
> > >          if (s->do_telnetopt) {
> > > @@ -778,8 +782,12 @@ static void tcp_chr_tls_handshake(QIOTask *task,
> > >  {
> > >      Chardev *chr = user_data;
> > >      SocketChardev *s = user_data;
> > > +    Error *err = NULL;
> > >
> > > -    if (qio_task_propagate_error(task, NULL)) {
> > > +    if (qio_task_propagate_error(task, &err)) {
> > > +        error_reportf_err(err,
> > > +                          "TLS handshake of character device %s failed: ",
> > > +                          chr->label);
> > >          tcp_chr_disconnect(chr);
> > >      } else {
> > >          if (s->is_websock) {
> >
> > Likewise.
> >
>
> With regards,
> Daniel
Markus Armbruster May 10, 2023, 10:34 a.m. UTC | #6
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, May 10, 2023 at 11:31:40AM +0200, Markus Armbruster wrote:
>> marcandre.lureau@redhat.com writes:
>> 
>> > From: Marc-André Lureau <marcandre.lureau@redhat.com>
>> >
>> > This can help to debug connection issues.
>> >
>> > Related to:
>> > https://bugzilla.redhat.com/show_bug.cgi?id=2196182
>> >
>> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> > ---
>> >  chardev/char-socket.c | 12 ++++++++++--
>> >  1 file changed, 10 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
>> > index 8c58532171..e8e3a743d5 100644
>> > --- a/chardev/char-socket.c
>> > +++ b/chardev/char-socket.c
>> > @@ -742,8 +742,12 @@ static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
>> >  {
>> >      Chardev *chr = user_data;
>> >      SocketChardev *s = user_data;
>> > +    Error *err = NULL;
>> >  
>> > -    if (qio_task_propagate_error(task, NULL)) {
>> > +    if (qio_task_propagate_error(task, &err)) {
>> > +        error_reportf_err(err,
>> > +                          "websock handshake of character device %s failed: ",
>> > +                          chr->label);
>> 
>> Code smell: reports an error without failing the function.
>> 
>> Should it be a warning instead?
>
> Well it isn't a warning, this is a fatal error wrt continued use
> of the chardev
>
> Not failing the function is expected in this particular code
> pattern. These tcp_chr_(tls,websock)_handshake functions are
> callbacks that are used to handle an async operations progress.
> From the caller's POV, it doesn't matter whether there is an
> error or success. It is upto this function to do whatever is
> required based on the status, hence the call to disconnect
> the chardev on error:
>
>> >          tcp_chr_disconnect(chr);

Can this asynchronous task be started from QMP?

If yes, how is this error reported back to the QMP client?

[...]
Daniel P. Berrangé May 10, 2023, 10:43 a.m. UTC | #7
On Wed, May 10, 2023 at 12:34:59PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Wed, May 10, 2023 at 11:31:40AM +0200, Markus Armbruster wrote:
> >> marcandre.lureau@redhat.com writes:
> >> 
> >> > From: Marc-André Lureau <marcandre.lureau@redhat.com>
> >> >
> >> > This can help to debug connection issues.
> >> >
> >> > Related to:
> >> > https://bugzilla.redhat.com/show_bug.cgi?id=2196182
> >> >
> >> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >> > ---
> >> >  chardev/char-socket.c | 12 ++++++++++--
> >> >  1 file changed, 10 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> >> > index 8c58532171..e8e3a743d5 100644
> >> > --- a/chardev/char-socket.c
> >> > +++ b/chardev/char-socket.c
> >> > @@ -742,8 +742,12 @@ static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
> >> >  {
> >> >      Chardev *chr = user_data;
> >> >      SocketChardev *s = user_data;
> >> > +    Error *err = NULL;
> >> >  
> >> > -    if (qio_task_propagate_error(task, NULL)) {
> >> > +    if (qio_task_propagate_error(task, &err)) {
> >> > +        error_reportf_err(err,
> >> > +                          "websock handshake of character device %s failed: ",
> >> > +                          chr->label);
> >> 
> >> Code smell: reports an error without failing the function.
> >> 
> >> Should it be a warning instead?
> >
> > Well it isn't a warning, this is a fatal error wrt continued use
> > of the chardev
> >
> > Not failing the function is expected in this particular code
> > pattern. These tcp_chr_(tls,websock)_handshake functions are
> > callbacks that are used to handle an async operations progress.
> > From the caller's POV, it doesn't matter whether there is an
> > error or success. It is upto this function to do whatever is
> > required based on the status, hence the call to disconnect
> > the chardev on error:
> >
> >> >          tcp_chr_disconnect(chr);
> 
> Can this asynchronous task be started from QMP?

Yes, from chardev-add.

> If yes, how is this error reported back to the QMP client?

It isn't, as chardev-add has already completed and returned
"success" to the client at this point IIRC.


With regards,
Daniel
Markus Armbruster May 10, 2023, 12:33 p.m. UTC | #8
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, May 10, 2023 at 12:34:59PM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> > On Wed, May 10, 2023 at 11:31:40AM +0200, Markus Armbruster wrote:
>> >> marcandre.lureau@redhat.com writes:
>> >> 
>> >> > From: Marc-André Lureau <marcandre.lureau@redhat.com>
>> >> >
>> >> > This can help to debug connection issues.
>> >> >
>> >> > Related to:
>> >> > https://bugzilla.redhat.com/show_bug.cgi?id=2196182
>> >> >
>> >> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> >> > ---
>> >> >  chardev/char-socket.c | 12 ++++++++++--
>> >> >  1 file changed, 10 insertions(+), 2 deletions(-)
>> >> >
>> >> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
>> >> > index 8c58532171..e8e3a743d5 100644
>> >> > --- a/chardev/char-socket.c
>> >> > +++ b/chardev/char-socket.c
>> >> > @@ -742,8 +742,12 @@ static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
>> >> >  {
>> >> >      Chardev *chr = user_data;
>> >> >      SocketChardev *s = user_data;
>> >> > +    Error *err = NULL;
>> >> >  
>> >> > -    if (qio_task_propagate_error(task, NULL)) {
>> >> > +    if (qio_task_propagate_error(task, &err)) {
>> >> > +        error_reportf_err(err,
>> >> > +                          "websock handshake of character device %s failed: ",
>> >> > +                          chr->label);
>> >> 
>> >> Code smell: reports an error without failing the function.
>> >> 
>> >> Should it be a warning instead?
>> >
>> > Well it isn't a warning, this is a fatal error wrt continued use
>> > of the chardev
>> >
>> > Not failing the function is expected in this particular code
>> > pattern. These tcp_chr_(tls,websock)_handshake functions are
>> > callbacks that are used to handle an async operations progress.
>> > From the caller's POV, it doesn't matter whether there is an
>> > error or success. It is upto this function to do whatever is
>> > required based on the status, hence the call to disconnect
>> > the chardev on error:
>> >
>> >> >          tcp_chr_disconnect(chr);
>> 
>> Can this asynchronous task be started from QMP?
>
> Yes, from chardev-add.
>
>> If yes, how is this error reported back to the QMP client?
>
> It isn't, as chardev-add has already completed and returned
> "success" to the client at this point IIRC.

chardev-add's documentation doesn't even hint at this.  It should.

Is there really no need for the QMP client to know?

"QMP command merely kicks off a task, returns success before the task
is done, and while the task can still fail" isn't unusual.  When the
task can take a long / unbounded time, it's necessary to keep QMP
available.

We have a few flavors of such commands, mostly for historical reasons.

There are ad hoc solutions like "command kicks off, event on successful
completion".  If you're lucky, there's even "event on unsuccessful
completion".  Example: device_del, DEVICE_DELETED,
DEVICE_UNPLUG_GUEST_ERROR.  The latter is a recent addition.

A much better developed solution is the Job abstraction.  Provides
commands to query and control jobs in flight, and an event on status
change.  Any error from the asynchronous part gets propagated to the
(synchronous) query.

Migration is another long-running task, and a world of its own.  I wish
it was a Job instead.

When we add another asynchronous task, and decide against use of Jobs
for whatever reasons, we should at least make our ad hoc solution as
good as the better existing ad hoc solutions: properly documented, and
with suitable error reporting.
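
The "command kicks off, event on completion" shape discussed above can be sketched in a few lines of standalone C. This is a hypothetical model and not QEMU's Job API or QMP event machinery: `AsyncTask`, `command_start`, `task_complete`, `task_query_error` and `count_event` are names invented for the illustration. It shows a command that returns success immediately, an event emitted on both successful and unsuccessful completion, and a synchronous query that surfaces the stored error.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { TASK_RUNNING, TASK_SUCCEEDED, TASK_FAILED } TaskStatus;

typedef struct AsyncTask AsyncTask;
typedef void (*TaskEventFn)(const AsyncTask *task, void *opaque);

struct AsyncTask {
    TaskStatus status;
    const char *error;          /* set only when status == TASK_FAILED */
    TaskEventFn notify;         /* models the completion "event" */
    void *opaque;
};

/* Synchronous part: start the task and return success right away. */
static bool command_start(AsyncTask *t, TaskEventFn notify, void *opaque)
{
    t->status = TASK_RUNNING;
    t->error = NULL;
    t->notify = notify;
    t->opaque = opaque;
    return true;
}

/* Asynchronous part: record the outcome and emit the event either way. */
static void task_complete(AsyncTask *t, const char *error)
{
    t->status = error ? TASK_FAILED : TASK_SUCCEEDED;
    t->error = error;
    if (t->notify) {
        t->notify(t, t->opaque);
    }
}

/* Synchronous query: the error from the async part stays retrievable. */
static const char *task_query_error(const AsyncTask *t)
{
    return t->status == TASK_FAILED ? t->error : NULL;
}

/* Example event sink: counts how many events were emitted. */
static void count_event(const AsyncTask *task, void *opaque)
{
    (void)task;
    (*(int *)opaque)++;
}
```

A client that starts the task with `count_event` as the sink sees exactly one event when `task_complete()` runs, and can still retrieve the failure reason afterwards through `task_query_error()`.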
Patch

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 8c58532171..e8e3a743d5 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -742,8 +742,12 @@  static void tcp_chr_websock_handshake(QIOTask *task, gpointer user_data)
 {
     Chardev *chr = user_data;
     SocketChardev *s = user_data;
+    Error *err = NULL;
 
-    if (qio_task_propagate_error(task, NULL)) {
+    if (qio_task_propagate_error(task, &err)) {
+        error_reportf_err(err,
+                          "websock handshake of character device %s failed: ",
+                          chr->label);
         tcp_chr_disconnect(chr);
     } else {
         if (s->do_telnetopt) {
@@ -778,8 +782,12 @@  static void tcp_chr_tls_handshake(QIOTask *task,
 {
     Chardev *chr = user_data;
     SocketChardev *s = user_data;
+    Error *err = NULL;
 
-    if (qio_task_propagate_error(task, NULL)) {
+    if (qio_task_propagate_error(task, &err)) {
+        error_reportf_err(err,
+                          "TLS handshake of character device %s failed: ",
+                          chr->label);
         tcp_chr_disconnect(chr);
     } else {
         if (s->is_websock) {