diff mbox series

[v2,03/10] qga: treat get-guest-fsinfo as "best effort"

Message ID 20220616142659.3184115-4-jsnow@redhat.com
State New
Headers show
Series Improve reliability of VM tests | expand

Commit Message

John Snow June 16, 2022, 2:26 p.m. UTC
In some container environments, there may be references to block devices
witnessable from a container through /proc/self/mountinfo that reference
devices we simply don't have access to in the container, and could not
provide information about.

Instead of failing the entire fsinfo command, return stub information
for these failed lookups.

This allows test-qga to pass under docker tests, which are in turn used
by the CentOS VM tests.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 qga/commands-posix.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Marc-André Lureau June 16, 2022, 2:35 p.m. UTC | #1
Hi

On Thu, Jun 16, 2022 at 6:27 PM John Snow <jsnow@redhat.com> wrote:

> In some container environments, there may be references to block devices
> witnessable from a container through /proc/self/mountinfo that reference
> devices we simply don't have access to in the container, and could not
> provide information about.
>
> Instead of failing the entire fsinfo command, return stub information
> for these failed lookups.
>
> This allows test-qga to pass under docker tests, which are in turn used
> by the CentOS VM tests.
>
> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  qga/commands-posix.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> index 0469dc409d4..5989d4dca9d 100644
> --- a/qga/commands-posix.c
> +++ b/qga/commands-posix.c
> @@ -1207,7 +1207,13 @@ static void build_guest_fsinfo_for_device(char
> const *devpath,
>
>      syspath = realpath(devpath, NULL);
>      if (!syspath) {
> -        error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
> +        if (errno == ENOENT) {
> +            /* This devpath may not exist because of container config,
> etc. */
> +            fprintf(stderr, "realpath(%s) returned NULL/ENOENT\n",
> devpath);
>

qga uses g_critical() (except for some win32 code paths atm)


> +            fs->name = g_strdup("??\?-ENOENT");
>

Hmm, maybe we should make the field optional instead.


> +        } else {
> +            error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
> +        }
>          return;
>      }
>
> --
> 2.34.3
>
>
>
John Snow June 16, 2022, 2:43 p.m. UTC | #2
On Thu, Jun 16, 2022 at 10:36 AM Marc-André Lureau
<marcandre.lureau@gmail.com> wrote:
>
> Hi
>
> On Thu, Jun 16, 2022 at 6:27 PM John Snow <jsnow@redhat.com> wrote:
>>
>> In some container environments, there may be references to block devices
>> witnessable from a container through /proc/self/mountinfo that reference
>> devices we simply don't have access to in the container, and could not
>> provide information about.
>>
>> Instead of failing the entire fsinfo command, return stub information
>> for these failed lookups.
>>
>> This allows test-qga to pass under docker tests, which are in turn used
>> by the CentOS VM tests.
>>
>> Signed-off-by: John Snow <jsnow@redhat.com>
>> ---
>>  qga/commands-posix.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>> index 0469dc409d4..5989d4dca9d 100644
>> --- a/qga/commands-posix.c
>> +++ b/qga/commands-posix.c
>> @@ -1207,7 +1207,13 @@ static void build_guest_fsinfo_for_device(char const *devpath,
>>
>>      syspath = realpath(devpath, NULL);
>>      if (!syspath) {
>> -        error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
>> +        if (errno == ENOENT) {
>> +            /* This devpath may not exist because of container config, etc. */
>> +            fprintf(stderr, "realpath(%s) returned NULL/ENOENT\n", devpath);
>
>
> qga uses g_critical() (except for some win32 code paths atm)

Whoops, this is a debugging thing that I left in by accident. I was
just so excited that after testing overnight, everything worked. :)

>
>>
>> +            fs->name = g_strdup("??\?-ENOENT");
>
>
> Hmm, maybe we should make the field optional instead.

Does that harm compatibility in a meaningful way? I'm happy to do
whatever QGA maintainers want me to do. I just did something quick and
dirty to get it working at all as a conversation starter. O:-)

>
>>
>> +        } else {
>> +            error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
>> +        }
>>          return;
>>      }
>>
>> --
>> 2.34.3
>>
>>
>
>
> --
> Marc-André Lureau
Thomas Huth June 17, 2022, 9:17 a.m. UTC | #3
On 16/06/2022 16.43, John Snow wrote:
> On Thu, Jun 16, 2022 at 10:36 AM Marc-André Lureau
> <marcandre.lureau@gmail.com> wrote:
>>
>> Hi
>>
>> On Thu, Jun 16, 2022 at 6:27 PM John Snow <jsnow@redhat.com> wrote:
>>>
>>> In some container environments, there may be references to block devices
>>> witnessable from a container through /proc/self/mountinfo that reference
>>> devices we simply don't have access to in the container, and could not
>>> provide information about.
>>>
>>> Instead of failing the entire fsinfo command, return stub information
>>> for these failed lookups.
>>>
>>> This allows test-qga to pass under docker tests, which are in turn used
>>> by the CentOS VM tests.
>>>
>>> Signed-off-by: John Snow <jsnow@redhat.com>
>>> ---
>>>   qga/commands-posix.c | 8 +++++++-
>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>>> index 0469dc409d4..5989d4dca9d 100644
>>> --- a/qga/commands-posix.c
>>> +++ b/qga/commands-posix.c
>>> @@ -1207,7 +1207,13 @@ static void build_guest_fsinfo_for_device(char const *devpath,
>>>
>>>       syspath = realpath(devpath, NULL);
>>>       if (!syspath) {
>>> -        error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
>>> +        if (errno == ENOENT) {
>>> +            /* This devpath may not exist because of container config, etc. */
>>> +            fprintf(stderr, "realpath(%s) returned NULL/ENOENT\n", devpath);
>>
>>
>> qga uses g_critical() (except for some win32 code paths atm)
> 
> Whoops, this is a debugging thing that I left in by accident. I was
> just so excited that after testing overnight, everything worked. :)
> 
>>
>>>
>>> +            fs->name = g_strdup("??\?-ENOENT");
>>
>>
>> Hmm, maybe we should make the field optional instead.
> 
> Does that harm compatibility in a meaningful way? I'm happy to do
> whatever QGA maintainers want me to do. I just did something quick and
> dirty to get it working at all as a conversation starter. O:-)

Should the device get ignored instead of returning up a dummy device? ... at 
least that's what I'd expect at a quick glance at the problem...

  Thomas
Daniel P. Berrangé June 17, 2022, 9:49 a.m. UTC | #4
On Thu, Jun 16, 2022 at 06:35:44PM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Thu, Jun 16, 2022 at 6:27 PM John Snow <jsnow@redhat.com> wrote:
> 
> > In some container environments, there may be references to block devices
> > witnessable from a container through /proc/self/mountinfo that reference
> > devices we simply don't have access to in the container, and could not
> > provide information about.
> >
> > Instead of failing the entire fsinfo command, return stub information
> > for these failed lookups.
> >
> > This allows test-qga to pass under docker tests, which are in turn used
> > by the CentOS VM tests.
> >
> > Signed-off-by: John Snow <jsnow@redhat.com>
> > ---
> >  qga/commands-posix.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> > index 0469dc409d4..5989d4dca9d 100644
> > --- a/qga/commands-posix.c
> > +++ b/qga/commands-posix.c
> > @@ -1207,7 +1207,13 @@ static void build_guest_fsinfo_for_device(char
> > const *devpath,
> >
> >      syspath = realpath(devpath, NULL);
> >      if (!syspath) {
> > -        error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
> > +        if (errno == ENOENT) {
> > +            /* This devpath may not exist because of container config,
> > etc. */
> > +            fprintf(stderr, "realpath(%s) returned NULL/ENOENT\n",
> > devpath);
> >
> 
> qga uses g_critical() (except for some win32 code paths atm)
> 
> 
> > +            fs->name = y
> >
> 
> Hmm, maybe we should make the field optional instead.

In my own testing, this method is called in various scenarios.
Some example:

  devpath==/sys/dev/block/253:0
  syspath==/sys/devices/virtual/block/dm-0

    => fs->name == dm-0

  devpath==/sys/devices/virtual/block/dm-0/slaves/nvme0n1p4
  syspath==/sys/devices/pci0000:00/0000:00:1d.0/0000:02:00.0/nvme/nvme0/nvme0n1/nvme0n1p4

    => fs->name == nvme0n1p4

  devpath==/sys/dev/block/259:2
  syspath==/sys/devices/pci0000:00/0000:00:1d.0/0000:02:00.0/nvme/nvme0/nvme0n1/nvme0n1p2

    => fs->name == nvme0n1p2

We set fs->name from  basename(syspath)

If the realpath call fails, we could use  basename(devpath). That
would sometimes give the correct answer, and in other types it
would at least give the major:minor number, which an admin can
manually correlate if desired via /proc/partitions.

If we want to be really advanced, we could just open /proc/partitions
and resolve the proper name ourselves, but that's probably overkill

  basename(sysfspath)

is better than g_strdup("??\?-ENOENT")  IMHO

With regards,
Daniel
John Snow June 17, 2022, 2:04 p.m. UTC | #5
On Fri, Jun 17, 2022, 5:49 AM Daniel P. Berrangé <berrange@redhat.com>
wrote:

> On Thu, Jun 16, 2022 at 06:35:44PM +0400, Marc-André Lureau wrote:
> > Hi
> >
> > On Thu, Jun 16, 2022 at 6:27 PM John Snow <jsnow@redhat.com> wrote:
> >
> > > In some container environments, there may be references to block
> devices
> > > witnessable from a container through /proc/self/mountinfo that
> reference
> > > devices we simply don't have access to in the container, and could not
> > > provide information about.
> > >
> > > Instead of failing the entire fsinfo command, return stub information
> > > for these failed lookups.
> > >
> > > This allows test-qga to pass under docker tests, which are in turn used
> > > by the CentOS VM tests.
> > >
> > > Signed-off-by: John Snow <jsnow@redhat.com>
> > > ---
> > >  qga/commands-posix.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> > > index 0469dc409d4..5989d4dca9d 100644
> > > --- a/qga/commands-posix.c
> > > +++ b/qga/commands-posix.c
> > > @@ -1207,7 +1207,13 @@ static void build_guest_fsinfo_for_device(char
> > > const *devpath,
> > >
> > >      syspath = realpath(devpath, NULL);
> > >      if (!syspath) {
> > > -        error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
> > > +        if (errno == ENOENT) {
> > > +            /* This devpath may not exist because of container config,
> > > etc. */
> > > +            fprintf(stderr, "realpath(%s) returned NULL/ENOENT\n",
> > > devpath);
> > >
> >
> > qga uses g_critical() (except for some win32 code paths atm)
> >
> >
> > > +            fs->name = y
> > >
> >
> > Hmm, maybe we should make the field optional instead.
>
> In my own testing, this method is called in various scenarios.
> Some example:
>
>   devpath==/sys/dev/block/253:0
>   syspath==/sys/devices/virtual/block/dm-0
>
>     => fs->name == dm-0
>
>   devpath==/sys/devices/virtual/block/dm-0/slaves/nvme0n1p4
>
> syspath==/sys/devices/pci0000:00/0000:00:1d.0/0000:02:00.0/nvme/nvme0/nvme0n1/nvme0n1p4
>
>     => fs->name == nvme0n1p4
>
>   devpath==/sys/dev/block/259:2
>
> syspath==/sys/devices/pci0000:00/0000:00:1d.0/0000:02:00.0/nvme/nvme0/nvme0n1/nvme0n1p2
>
>     => fs->name == nvme0n1p2
>
> We set fs->name from  basename(syspath)
>
> If the realpath call fails, we could use  basename(devpath). That
> would sometimes give the correct answer, and in other types it
> would at least give the major:minor number, which an admin can
> manually correlate if desired via /proc/partitions.
>
> If we want to be really advanced, we could just open /proc/partitions
> and resolve the proper name ourselves, but that's probably overkill
>
>   basename(sysfspath)
>
> is better than g_strdup("??\?-ENOENT")  IMHO
>

Sure! I had something like that initially, but chickened out specifically
because I thought major:minor was a nonsense kind of reply, so I opted for
more egregiously obvious nonsense. I figured I'd find strong opinions that
way ;)

I'm just not sure how this data is used in practice so I had no insight as
to what would be best. I can use the basename, sure.

(Should I also add an optional flag field that indicates the path was not
resolvable, do you think? I guess we can always add it later if needed, but
not sure if i need to head that one off at the pass.)

As for Thomas' comment: I wasn't entirely clear on precisely when we'd run
into this scenario and I didn't know if it was a good idea to skip the
entries entirely. Maybe getting platform mount information even if we can't
access it is still important when working with containers? I don't know one
way or the other TBQH. I'm not very well traveled with devices,
filesystems, and permissions where containers are concerned.

/shrug


> With regards,
> Daniel
> --
> |: https://berrange.com      -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-
> https://www.instagram.com/dberrange :|
>
>
Daniel P. Berrangé June 17, 2022, 2:29 p.m. UTC | #6
On Fri, Jun 17, 2022 at 10:04:14AM -0400, John Snow wrote:
> On Fri, Jun 17, 2022, 5:49 AM Daniel P. Berrangé <berrange@redhat.com>
> wrote:
> 
> > On Thu, Jun 16, 2022 at 06:35:44PM +0400, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Thu, Jun 16, 2022 at 6:27 PM John Snow <jsnow@redhat.com> wrote:
> > >
> > > > In some container environments, there may be references to block
> > devices
> > > > witnessable from a container through /proc/self/mountinfo that
> > reference
> > > > devices we simply don't have access to in the container, and could not
> > > > provide information about.
> > > >
> > > > Instead of failing the entire fsinfo command, return stub information
> > > > for these failed lookups.
> > > >
> > > > This allows test-qga to pass under docker tests, which are in turn used
> > > > by the CentOS VM tests.
> > > >
> > > > Signed-off-by: John Snow <jsnow@redhat.com>
> > > > ---
> > > >  qga/commands-posix.c | 8 +++++++-
> > > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> > > > index 0469dc409d4..5989d4dca9d 100644
> > > > --- a/qga/commands-posix.c
> > > > +++ b/qga/commands-posix.c
> > > > @@ -1207,7 +1207,13 @@ static void build_guest_fsinfo_for_device(char
> > > > const *devpath,
> > > >
> > > >      syspath = realpath(devpath, NULL);
> > > >      if (!syspath) {
> > > > -        error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
> > > > +        if (errno == ENOENT) {
> > > > +            /* This devpath may not exist because of container config,
> > > > etc. */
> > > > +            fprintf(stderr, "realpath(%s) returned NULL/ENOENT\n",
> > > > devpath);
> > > >
> > >
> > > qga uses g_critical() (except for some win32 code paths atm)
> > >
> > >
> > > > +            fs->name = y
> > > >
> > >
> > > Hmm, maybe we should make the field optional instead.
> >
> > In my own testing, this method is called in various scenarios.
> > Some example:
> >
> >   devpath==/sys/dev/block/253:0
> >   syspath==/sys/devices/virtual/block/dm-0
> >
> >     => fs->name == dm-0
> >
> >   devpath==/sys/devices/virtual/block/dm-0/slaves/nvme0n1p4
> >
> > syspath==/sys/devices/pci0000:00/0000:00:1d.0/0000:02:00.0/nvme/nvme0/nvme0n1/nvme0n1p4
> >
> >     => fs->name == nvme0n1p4
> >
> >   devpath==/sys/dev/block/259:2
> >
> > syspath==/sys/devices/pci0000:00/0000:00:1d.0/0000:02:00.0/nvme/nvme0/nvme0n1/nvme0n1p2
> >
> >     => fs->name == nvme0n1p2
> >
> > We set fs->name from  basename(syspath)
> >
> > If the realpath call fails, we could use  basename(devpath). That
> > would sometimes give the correct answer, and in other types it
> > would at least give the major:minor number, which an admin can
> > manually correlate if desired via /proc/partitions.
> >
> > If we want to be really advanced, we could just open /proc/partitions
> > and resolve the proper name ourselves, but that's probably overkill
> >
> >   basename(sysfspath)
> >
> > is better than g_strdup("??\?-ENOENT")  IMHO
> >
> 
> Sure! I had something like that initially, but chickened out specifically
> because I thought major:minor was a nonsense kind of reply, so I opted for
> more egregiously obvious nonsense. I figured I'd find strong opinions that
> way ;)

It is a different format but it is semantically giving similar info.

If we want to just leave it empty though that's fine too.

> 
> I'm just not sure how this data is used in practice so I had no insight as
> to what would be best. I can use the basename, sure.
> 
> (Should I also add an optional flag field that indicates the path was not
> resolvable, do you think? I guess we can always add it later if needed, but
> not sure if i need to head that one off at the pass.)
> 
> As for Thomas' comment: I wasn't entirely clear on precisely when we'd run
> into this scenario and I didn't know if it was a good idea to skip the
> entries entirely. Maybe getting platform mount information even if we can't
> access it is still important when working with containers? I don't know one
> way or the other TBQH. I'm not very well traveled with devices,
> filesystems, and permissions where containers are concerned.

I view the primary purpose of this command to be offering a way to
enumerate filesystems. Whether we report what block device the FS
on host is a secondary purpose.  So as long as we can fullfill the
primary purpose, its sufficient IMHO.

With regards,
Daniel
diff mbox series

Patch

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 0469dc409d4..5989d4dca9d 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -1207,7 +1207,13 @@  static void build_guest_fsinfo_for_device(char const *devpath,
 
     syspath = realpath(devpath, NULL);
     if (!syspath) {
-        error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
+        if (errno == ENOENT) {
+            /* This devpath may not exist because of container config, etc. */
+            fprintf(stderr, "realpath(%s) returned NULL/ENOENT\n", devpath);
+            fs->name = g_strdup("??\?-ENOENT");
+        } else {
+            error_setg_errno(errp, errno, "realpath(\"%s\")", devpath);
+        }
         return;
     }