diff mbox

[2/4] os-posix: report error message when lock file failed

Message ID 1411638384-5844-3-git-send-email-arei.gonglei@huawei.com
State New
Headers show

Commit Message

Gonglei (Arei) Sept. 25, 2014, 9:46 a.m. UTC
From: Gonglei <arei.gonglei@huawei.com>

It will cause that create vm failed When manager
tool is killed forcibly (kill -9 libvirtd_pid),
the file not was unlink, and unlock. It's better
that report the error message for users.

Signed-off-by: Huangweidong <weidong.huang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
---
 os-posix.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Markus Armbruster Oct. 1, 2014, 8:45 a.m. UTC | #1
<arei.gonglei@huawei.com> writes:

> From: Gonglei <arei.gonglei@huawei.com>
>
> It will cause that create vm failed When manager
> tool is killed forcibly (kill -9 libvirtd_pid),
> the file not was unlink, and unlock. It's better
> that report the error message for users.
>
> Signed-off-by: Huangweidong <weidong.huang@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  os-posix.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/os-posix.c b/os-posix.c
> index 9d5ae70..89831dc 100644
> --- a/os-posix.c
> +++ b/os-posix.c
> @@ -316,6 +316,7 @@ int qemu_create_pidfile(const char *filename)
>          return -1;
>      }
>      if (lockf(fd, F_TLOCK, 0) == -1) {
> +        error_report("lock file '%s' failed: %s", filename, strerror(errno));
>          close(fd);
>          return -1;
>      }

Only called from main():

    if (pid_file && qemu_create_pidfile(pid_file) != 0) {
        os_pidfile_error();
        exit(1);
    }

I suspect the error reporting is os_pidfile_error()'s job.  And it
actually does it (POSIX version shown, W32 is simpler):

void os_pidfile_error(void)
{
    if (daemonize) {
        uint8_t status = 1;
        if (write(fds[1], &status, 1) != 1) {
            perror("daemonize. Writing to pipe\n");
        }
    } else
        fprintf(stderr, "Could not acquire pid file: %s\n", strerror(errno));
}

Are you sure your patch makes sense?
Gonglei Oct. 1, 2014, 9:12 a.m. UTC | #2
Hi,

> Subject: Re: [Qemu-devel] [PATCH 2/4] os-posix: report error message when
> lock file failed
> 
> <arei.gonglei@huawei.com> writes:
> 
> > From: Gonglei <arei.gonglei@huawei.com>
> >
> > It will cause that create vm failed When manager
> > tool is killed forcibly (kill -9 libvirtd_pid),
> > the file not was unlink, and unlock. It's better
> > that report the error message for users.
> >
> > Signed-off-by: Huangweidong <weidong.huang@huawei.com>
> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> > ---
> >  os-posix.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/os-posix.c b/os-posix.c
> > index 9d5ae70..89831dc 100644
> > --- a/os-posix.c
> > +++ b/os-posix.c
> > @@ -316,6 +316,7 @@ int qemu_create_pidfile(const char *filename)
> >          return -1;
> >      }
> >      if (lockf(fd, F_TLOCK, 0) == -1) {
> > +        error_report("lock file '%s' failed: %s", filename, strerror(errno));
> >          close(fd);
> >          return -1;
> >      }
> 
> Only called from main():
> 
Yes, indeed. 

>     if (pid_file && qemu_create_pidfile(pid_file) != 0) {
>         os_pidfile_error();
>         exit(1);
>     }
> 
> I suspect the error reporting is os_pidfile_error()'s job.  And it
> actually does it (POSIX version shown, W32 is simpler):
> 
> void os_pidfile_error(void)
> {
>     if (daemonize) {
>         uint8_t status = 1;
>         if (write(fds[1], &status, 1) != 1) {
>             perror("daemonize. Writing to pipe\n");
>         }
>     } else
>         fprintf(stderr, "Could not acquire pid file: %s\n", strerror(errno));
> }
> 
> Are you sure your patch makes sense?

Yes, I'm sure it make sense. And I had tested, the result is OK as expected.

When daemonize variable is true, the os_pidfile_error() usually don't report an error,
because "if (write(fds[1], &status, 1) != 1)" is always false. On the other
hand, we should assure that we can get the original error message
when lock failed but not other information, such as "Could not acquire pid file...".

Best regards,
-Gonglei
Markus Armbruster Oct. 1, 2014, 10:24 a.m. UTC | #3
Gonglei <arei.gonglei@hotmail.com> writes:

> Hi,
>
>> Subject: Re: [Qemu-devel] [PATCH 2/4] os-posix: report error message when
>> lock file failed
>> 
>> <arei.gonglei@huawei.com> writes:
>> 
>> > From: Gonglei <arei.gonglei@huawei.com>
>> >
>> > It will cause that create vm failed When manager
>> > tool is killed forcibly (kill -9 libvirtd_pid),
>> > the file not was unlink, and unlock. It's better
>> > that report the error message for users.
>> >
>> > Signed-off-by: Huangweidong <weidong.huang@huawei.com>
>> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> > ---
>> >  os-posix.c | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/os-posix.c b/os-posix.c
>> > index 9d5ae70..89831dc 100644
>> > --- a/os-posix.c
>> > +++ b/os-posix.c
>> > @@ -316,6 +316,7 @@ int qemu_create_pidfile(const char *filename)
>> >          return -1;
>> >      }
>> >      if (lockf(fd, F_TLOCK, 0) == -1) {
>> > +        error_report("lock file '%s' failed: %s", filename, strerror(errno));
>> >          close(fd);
>> >          return -1;
>> >      }
>> 
>> Only called from main():
>> 
> Yes, indeed. 
>
>>     if (pid_file && qemu_create_pidfile(pid_file) != 0) {
>>         os_pidfile_error();
>>         exit(1);
>>     }
>> 
>> I suspect the error reporting is os_pidfile_error()'s job.  And it
>> actually does it (POSIX version shown, W32 is simpler):
>> 
>> void os_pidfile_error(void)
>> {
>>     if (daemonize) {
>>         uint8_t status = 1;
>>         if (write(fds[1], &status, 1) != 1) {
>>             perror("daemonize. Writing to pipe\n");
>>         }
>>     } else
>>         fprintf(stderr, "Could not acquire pid file: %s\n", strerror(errno));
>> }
>> 
>> Are you sure your patch makes sense?
>
> Yes, I'm sure it make sense. And I had tested, the result is OK as expected.
>
> When daemonize variable is true, the os_pidfile_error() usually don't
> report an error,
> because "if (write(fds[1], &status, 1) != 1)" is always false. On the other
> hand, we should assure that we can get the original error message
> when lock failed but not other information, such as "Could not acquire
> pid file...".

Even if the new error message makes sense, reporting errors in two
places doesn't.

qemu_create_pidfile() is designed to leave the actual error reporting to
its caller, which delegates it to os_pidfile_error().

If daemonize, os_pidfile_error() signals the failure to the parent
process instead of reporting to stderr.  The parent does the reporting,
in os_daemonize().  If this isn't good enough, you're certainly welcome
to improve it.

Just noticed that commit e5048d1 "os-posix: report error message when
lock file failed" already messed this up: error reporting is spread over
three places: qemu_create_pidfile(), os_pidfile_error() and
os_daemonize().

Please pick one method to report errors: either just in
qemu_create_pidfile(), or in os_pidfile_error() (normal case) and
os_daemonize() (if daemonize).  I don't particularly care which one you
pick, as long as it works with and without -daemonize.
Gonglei Oct. 1, 2014, 11:58 a.m. UTC | #4
> Subject: Re: [Qemu-devel] [PATCH 2/4] os-posix: report error message when
> lock file failed
> 
> Gonglei <arei.gonglei@hotmail.com> writes:
> 
> > Hi,
> >
> >> Subject: Re: [Qemu-devel] [PATCH 2/4] os-posix: report error message when
> >> lock file failed
> >>
> >> <arei.gonglei@huawei.com> writes:
> >>
> >> > From: Gonglei <arei.gonglei@huawei.com>
> >> >
> >> > It will cause that create vm failed When manager
> >> > tool is killed forcibly (kill -9 libvirtd_pid),
> >> > the file not was unlink, and unlock. It's better
> >> > that report the error message for users.
> >> >
> >> > Signed-off-by: Huangweidong <weidong.huang@huawei.com>
> >> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> >> > ---
> >> >  os-posix.c | 1 +
> >> >  1 file changed, 1 insertion(+)
> >> >
> >> > diff --git a/os-posix.c b/os-posix.c
> >> > index 9d5ae70..89831dc 100644
> >> > --- a/os-posix.c
> >> > +++ b/os-posix.c
> >> > @@ -316,6 +316,7 @@ int qemu_create_pidfile(const char *filename)
> >> >          return -1;
> >> >      }
> >> >      if (lockf(fd, F_TLOCK, 0) == -1) {
> >> > +        error_report("lock file '%s' failed: %s", filename,
> strerror(errno));
> >> >          close(fd);
> >> >          return -1;
> >> >      }
> >>
> >> Only called from main():
> >>
> > Yes, indeed.
> >
> >>     if (pid_file && qemu_create_pidfile(pid_file) != 0) {
> >>         os_pidfile_error();
> >>         exit(1);
> >>     }
> >>
> >> I suspect the error reporting is os_pidfile_error()'s job.  And it
> >> actually does it (POSIX version shown, W32 is simpler):
> >>
> >> void os_pidfile_error(void)
> >> {
> >>     if (daemonize) {
> >>         uint8_t status = 1;
> >>         if (write(fds[1], &status, 1) != 1) {
> >>             perror("daemonize. Writing to pipe\n");
> >>         }
> >>     } else
> >>         fprintf(stderr, "Could not acquire pid file: %s\n", strerror(errno));
> >> }
> >>
> >> Are you sure your patch makes sense?
> >
> > Yes, I'm sure it make sense. And I had tested, the result is OK as expected.
> >
> > When daemonize variable is true, the os_pidfile_error() usually don't
> > report an error,
> > because "if (write(fds[1], &status, 1) != 1)" is always false. On the other
> > hand, we should assure that we can get the original error message
> > when lock failed but not other information, such as "Could not acquire
> > pid file...".
> 
> Even if the new error message makes sense, reporting errors in two
> places doesn't.
> 
> qemu_create_pidfile() is designed to leave the actual error reporting to
> its caller, which delegates it to os_pidfile_error().
> 
> If daemonize, os_pidfile_error() signals the failure to the parent
> process instead of reporting to stderr.  The parent does the reporting,
> in os_daemonize().  If this isn't good enough, you're certainly welcome
> to improve it.
> 
> Just noticed that commit e5048d1 "os-posix: report error message when
> lock file failed" already messed this up: error reporting is spread over
> three places: qemu_create_pidfile(), os_pidfile_error() and
> os_daemonize().
> 
> Please pick one method to report errors: either just in
> qemu_create_pidfile(), or in os_pidfile_error() (normal case) and
> os_daemonize() (if daemonize).  I don't particularly care which one you
> pick, as long as it works with and without -daemonize.

If we can get the root reason of one error easier, why not?
(I encountered the scenario described in commit message,
I only repeated that issue and eventually got the root reason is locking
pid file failed)

I don't think it is a problem that reporting errors over
two or three places. 

And I often encounter this coding style:

Int funA() {
   ...
   Printf(".....");
   Return -1;
}

Int funcB() {
    If (funA() < 0) {
        Printf("something wrong\n");
        Return -1;
    }
}

Best regards,
-Gonglei
Markus Armbruster Oct. 1, 2014, 12:38 p.m. UTC | #5
Gonglei <arei.gonglei@hotmail.com> writes:

>> Subject: Re: [Qemu-devel] [PATCH 2/4] os-posix: report error message when
>> lock file failed
>> 
>> Gonglei <arei.gonglei@hotmail.com> writes:
>> 
>> > Hi,
>> >
>> >> Subject: Re: [Qemu-devel] [PATCH 2/4] os-posix: report error message when
>> >> lock file failed
>> >>
>> >> <arei.gonglei@huawei.com> writes:
>> >>
>> >> > From: Gonglei <arei.gonglei@huawei.com>
>> >> >
>> >> > It will cause that create vm failed When manager
>> >> > tool is killed forcibly (kill -9 libvirtd_pid),
>> >> > the file not was unlink, and unlock. It's better
>> >> > that report the error message for users.
>> >> >
>> >> > Signed-off-by: Huangweidong <weidong.huang@huawei.com>
>> >> > Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> >> > ---
>> >> >  os-posix.c | 1 +
>> >> >  1 file changed, 1 insertion(+)
>> >> >
>> >> > diff --git a/os-posix.c b/os-posix.c
>> >> > index 9d5ae70..89831dc 100644
>> >> > --- a/os-posix.c
>> >> > +++ b/os-posix.c
>> >> > @@ -316,6 +316,7 @@ int qemu_create_pidfile(const char *filename)
>> >> >          return -1;
>> >> >      }
>> >> >      if (lockf(fd, F_TLOCK, 0) == -1) {
>> >> > +        error_report("lock file '%s' failed: %s", filename,
>> strerror(errno));
>> >> >          close(fd);
>> >> >          return -1;
>> >> >      }
>> >>
>> >> Only called from main():
>> >>
>> > Yes, indeed.
>> >
>> >>     if (pid_file && qemu_create_pidfile(pid_file) != 0) {
>> >>         os_pidfile_error();
>> >>         exit(1);
>> >>     }
>> >>
>> >> I suspect the error reporting is os_pidfile_error()'s job.  And it
>> >> actually does it (POSIX version shown, W32 is simpler):
>> >>
>> >> void os_pidfile_error(void)
>> >> {
>> >>     if (daemonize) {
>> >>         uint8_t status = 1;
>> >>         if (write(fds[1], &status, 1) != 1) {
>> >>             perror("daemonize. Writing to pipe\n");
>> >>         }
>> >>     } else
>> >>         fprintf(stderr, "Could not acquire pid file: %s\n", strerror(errno));
>> >> }
>> >>
>> >> Are you sure your patch makes sense?
>> >
>> > Yes, I'm sure it make sense. And I had tested, the result is OK as expected.
>> >
>> > When daemonize variable is true, the os_pidfile_error() usually don't
>> > report an error,
>> > because "if (write(fds[1], &status, 1) != 1)" is always false. On the other
>> > hand, we should assure that we can get the original error message
>> > when lock failed but not other information, such as "Could not acquire
>> > pid file...".
>> 
>> Even if the new error message makes sense, reporting errors in two
>> places doesn't.
>> 
>> qemu_create_pidfile() is designed to leave the actual error reporting to
>> its caller, which delegates it to os_pidfile_error().
>> 
>> If daemonize, os_pidfile_error() signals the failure to the parent
>> process instead of reporting to stderr.  The parent does the reporting,
>> in os_daemonize().  If this isn't good enough, you're certainly welcome
>> to improve it.
>> 
>> Just noticed that commit e5048d1 "os-posix: report error message when
>> lock file failed" already messed this up: error reporting is spread over
>> three places: qemu_create_pidfile(), os_pidfile_error() and
>> os_daemonize().
>> 
>> Please pick one method to report errors: either just in
>> qemu_create_pidfile(), or in os_pidfile_error() (normal case) and
>> os_daemonize() (if daemonize).  I don't particularly care which one you
>> pick, as long as it works with and without -daemonize.
>
> If we can get the root reason of one error easier, why not?
> (I encountered the scenario described in commit message,
> I only repeated that issue and eventually got the root reason is locking
> pid file failed)
>
> I don't think it is a problem that reporting errors over
> two or three places. 
>
> And I often encounter this coding style:
>
> Int funA() {
>    ...
>    Printf(".....");
>    Return -1;
> }
>
> Int funcB() {
>     If (funA() < 0) {
>         Printf("something wrong\n");
>         Return -1;
>     }
> }

Just because you see this often doesn't make it good code.  It produces
two error messages rather than one.  One of them is usually useless.
Michael Tokarev Oct. 24, 2014, 7:32 a.m. UTC | #6
On 09/25/2014 01:46 PM, arei.gonglei@huawei.com wrote:
> From: Gonglei <arei.gonglei@huawei.com>
> 
> It will cause that create vm failed When manager
> tool is killed forcibly (kill -9 libvirtd_pid),
> the file not was unlink, and unlock. It's better
> that report the error message for users.
> 
> Signed-off-by: Huangweidong <weidong.huang@huawei.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> ---
>  os-posix.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/os-posix.c b/os-posix.c
> index 9d5ae70..89831dc 100644
> --- a/os-posix.c
> +++ b/os-posix.c
> @@ -316,6 +316,7 @@ int qemu_create_pidfile(const char *filename)
>          return -1;
>      }
>      if (lockf(fd, F_TLOCK, 0) == -1) {
> +        error_report("lock file '%s' failed: %s", filename, strerror(errno));
>          close(fd);
>          return -1;
>      }


I think I'll just revert this patch, because indeed, it makes
no sense to do this like that.

Context:

 qemu_create_pidfile() is only created from main(), and there,
 if that function returns failure, os_pidfile_error() function
 is called, to, guess that, report error (which is done differently
 whenever we're daemonizing or not).

 qemu_create_pidfile() function has several error returns, this
 lockf() failure is one of them, there are others (another shown
 in the patch context too).

 So this patch makes whole thing inconsistent at least.

 If we need to show error message when we're daemonizing, it
 looks like we should modify os_pidfile_error() routine to always
 report error and only after that check for daemon mode.  This way
 all errors will be reported the same way.

But I really wonder if we actually need os_pidfile_error() in the
first place, why this can't be done in qemu_create_pidfile().

So, I'm reverting this change now, to be revisited very soon.

Thanks,

/mjt
Gonglei (Arei) Oct. 24, 2014, 8:56 a.m. UTC | #7
On 2014/10/24 15:32, Michael Tokarev wrote:

> On 09/25/2014 01:46 PM, arei.gonglei@huawei.com wrote:
>> From: Gonglei <arei.gonglei@huawei.com>
>>
>> It will cause that create vm failed When manager
>> tool is killed forcibly (kill -9 libvirtd_pid),
>> the file not was unlink, and unlock. It's better
>> that report the error message for users.
>>
>> Signed-off-by: Huangweidong <weidong.huang@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> ---
>>  os-posix.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/os-posix.c b/os-posix.c
>> index 9d5ae70..89831dc 100644
>> --- a/os-posix.c
>> +++ b/os-posix.c
>> @@ -316,6 +316,7 @@ int qemu_create_pidfile(const char *filename)
>>          return -1;
>>      }
>>      if (lockf(fd, F_TLOCK, 0) == -1) {
>> +        error_report("lock file '%s' failed: %s", filename, strerror(errno));
>>          close(fd);
>>          return -1;
>>      }
> 
> 
> I think I'll just revert this patch, because indeed, it makes
> no sense to do this like that.
> 
> Context:
> 
>  qemu_create_pidfile() is only created from main(), and there,
>  if that function returns failure, os_pidfile_error() function
>  is called, to, guess that, report error (which is done differently
>  whenever we're daemonizing or not).
> 
>  qemu_create_pidfile() function has several error returns, this
>  lockf() failure is one of them, there are others (another shown
>  in the patch context too).
> 

Yes, agree.

>  So this patch makes whole thing inconsistent at least.
> 
>  If we need to show error message when we're daemonizing, it
>  looks like we should modify os_pidfile_error() routine to always
>  report error and only after that check for daemon mode.  This way
>  all errors will be reported the same way.
> 

In os_pidfile_error(), like below...
if (write(fds[1], &status, 1) != 1) {
            perror("daemonize. Writing to pipe\n");
}

will call its child process to report error:

 else if (status == 1) {
                fprintf(stderr, "Could not acquire pidfile\n");
                exit(1);
            }
I just want to make the error message more clear so that
people can find the root reasons of errors as soon as possible. :)

> But I really wonder if we actually need os_pidfile_error() in the
> first place, why this can't be done in qemu_create_pidfile().
> 

> So, I'm reverting this change now, to be revisited very soon.
> 

If you think that's better, I'm fine with it.

Best regards,
-Gonglei
diff mbox

Patch

diff --git a/os-posix.c b/os-posix.c
index 9d5ae70..89831dc 100644
--- a/os-posix.c
+++ b/os-posix.c
@@ -316,6 +316,7 @@  int qemu_create_pidfile(const char *filename)
         return -1;
     }
     if (lockf(fd, F_TLOCK, 0) == -1) {
+        error_report("lock file '%s' failed: %s", filename, strerror(errno));
         close(fd);
         return -1;
     }