diff mbox series

qmp: add query-qemu-capabilities

Message ID 20190222153254.22739-1-stefanha@redhat.com
State New
Headers show
Series qmp: add query-qemu-capabilities | expand

Commit Message

Stefan Hajnoczi Feb. 22, 2019, 3:32 p.m. UTC
QMP clients can usually detect the presence of features via schema
introspection.  There are rare features that do not involve schema
changes and are therefore impossible to detect with schema
introspection.

This patch adds the query-qemu-capabilities command.  It returns a list
of capabilities that this QEMU supports.

The decision to make this a command rather than something statically
defined in the schema is intentional.  It allows QEMU to decide which
capabilities are available at runtime, if necessary.

This new interface is necessary so that QMP clients can discover that
migrating disk image files is safe with cache.direct=off on Linux.
There is no other way to detect whether or not QEMU supports this.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 qapi/misc.json | 42 ++++++++++++++++++++++++++++++++++++++++++
 qmp.c          | 18 ++++++++++++++++++
 2 files changed, 60 insertions(+)

Comments

Markus Armbruster Feb. 25, 2019, 8:50 a.m. UTC | #1
Stefan Hajnoczi <stefanha@redhat.com> writes:

> QMP clients can usually detect the presence of features via schema
> introspection.  There are rare features that do not involve schema
> changes and are therefore impossible to detect with schema
> introspection.
>
> This patch adds the query-qemu-capabilities command.  It returns a list
> of capabilities that this QEMU supports.

The name "capabilities" could be confusing, because we already have QMP
capabilities, complete with command qmp_capabilities.  Would "features"
work?

> The decision to make this a command rather than something statically
> defined in the schema is intentional.  It allows QEMU to decide which
> capabilities are available at runtime, if necessary.
>
> This new interface is necessary so that QMP clients can discover that
> migrating disk image files is safe with cache.direct=off on Linux.
> There is no other way to detect whether or not QEMU supports this.

I think what's unsaid here is that we don't want to make a completely
arbitrary schema change just to carry this bit of information.  We
could, but we don't want to.  Correct?

>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Peter Krempa <pkrempa@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  qapi/misc.json | 42 ++++++++++++++++++++++++++++++++++++++++++
>  qmp.c          | 18 ++++++++++++++++++
>  2 files changed, 60 insertions(+)
>
> diff --git a/qapi/misc.json b/qapi/misc.json
> index 8b3ca4fdd3..9cf24919a3 100644
> --- a/qapi/misc.json
> +++ b/qapi/misc.json
> @@ -3051,3 +3051,45 @@
>    'data': 'NumaOptions',
>    'allow-preconfig': true
>  }
> +
> +##
> +# @QemuCapability:
> +#
> +# Capabilities that are not otherwise introspectable.
> +#
> +# @migrate-with-cache-direct-off-on-linux:
> +#  It is safe to migrate disk image files with cache.direct=off on Linux.
> +#  Previously only cache.direct=on was supported for live migration.

Somewhat unusual formatting due to long name.  I double-checked the doc
comment gets processed correctly.  Good.

> +#
> +# Since: 4.0
> +##
> +{ 'enum': 'QemuCapability',
> +  'data': [
> +    'migrate-with-cache-direct-off-on-linux'
> +  ]
> +}
> +
> +##
> +# @QemuCapabilities:
> +#
> +# Capability information.
> +#
> +# @capabilities: supported capabilities
> +#
> +# Since: 4.0
> +##
> +{ 'struct': 'QemuCapabilities',
> +  'data': { 'capabilities': ['QemuCapability'] } }
> +
> +##
> +# @query-qemu-capabilities:
> +#
> +# Return the capabilities supported by this QEMU.  Most features can be
> +# detected via schema introspection but some are not observable from the
> +# schema.  This command offers a way to check for the presence of such
> +# features.
> +#
> +# Since: 4.0
> +##
> +{ 'command': 'query-qemu-capabilities',
> +  'returns': 'QemuCapabilities' }

Capabilities are flags: either present or absent, thus effectively
boolean.  I don't have a specific reason to doubt limiting capabilities
to boolean.  However, doing QemuCapabilities like

   { 'struct': 'QemuCapabilities',
     'data': { 'migrate-with-cache-direct-off-on-linux': 'bool' } }

could grow to support other values more easily.  What do you think?

> diff --git a/qmp.c b/qmp.c
> index b92d62cd5f..91a1228be7 100644
> --- a/qmp.c
> +++ b/qmp.c
> @@ -717,3 +717,21 @@ MemoryInfo *qmp_query_memory_size_summary(Error **errp)
>  
>      return mem_info;
>  }
> +
> +QemuCapabilities *qmp_query_qemu_capabilities(Error **errp)
> +{
> +    QemuCapabilities *caps = g_new0(QemuCapabilities, 1);
> +    QemuCapabilityList **prev = &caps->capabilities;
> +    QemuCapability cap_val;
> +
> +    /* Add all QemuCapability enum values defined in the schema */
> +    for (cap_val = 0; cap_val < QEMU_CAPABILITY__MAX; cap_val++) {
> +        QemuCapabilityList *cap = g_new0(QemuCapabilityList, 1);
> +
> +        cap->value = cap_val;
> +        *prev = cap;
> +        prev = &cap->next;
> +    }
> +
> +    return caps;
> +}

All capabilities known to this build of QEMU are always present.  Okay;
no need to complicate things until we need to.  I just didn't expect it
to be this simple after the commit message's "It allows QEMU to decide
which capabilities are available at runtime, if necessary."
Peter Krempa Feb. 25, 2019, 9:28 a.m. UTC | #2
On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > QMP clients can usually detect the presence of features via schema
> > introspection.  There are rare features that do not involve schema
> > changes and are therefore impossible to detect with schema
> > introspection.
> >
> > This patch adds the query-qemu-capabilities command.  It returns a list
> > of capabilities that this QEMU supports.
> 
> The name "capabilities" could be confusing, because we already have QMP
> capabilities, complete with command qmp_capabilities.  Would "features"
> work?
> 
> > The decision to make this a command rather than something statically
> > defined in the schema is intentional.  It allows QEMU to decide which
> > capabilities are available at runtime, if necessary.
> >
> > This new interface is necessary so that QMP clients can discover that
> > migrating disk image files is safe with cache.direct=off on Linux.
> > There is no other way to detect whether or not QEMU supports this.
> 
> I think what's unsaid here is that we don't want to make a completely
> arbitrary schema change just to carry this bit of information.  We
> could, but we don't want to.  Correct?

One example of such 'unrelated' change was a recent query- command which
is used by libvirt just to detect presence of the feature but the
queried result is never used and not very useful.

> > Cc: Markus Armbruster <armbru@redhat.com>
> > Cc: Peter Krempa <pkrempa@redhat.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  qapi/misc.json | 42 ++++++++++++++++++++++++++++++++++++++++++
> >  qmp.c          | 18 ++++++++++++++++++
> >  2 files changed, 60 insertions(+)

[...]

> > +QemuCapabilities *qmp_query_qemu_capabilities(Error **errp)
> > +{
> > +    QemuCapabilities *caps = g_new0(QemuCapabilities, 1);
> > +    QemuCapabilityList **prev = &caps->capabilities;
> > +    QemuCapability cap_val;
> > +
> > +    /* Add all QemuCapability enum values defined in the schema */
> > +    for (cap_val = 0; cap_val < QEMU_CAPABILITY__MAX; cap_val++) {
> > +        QemuCapabilityList *cap = g_new0(QemuCapabilityList, 1);
> > +
> > +        cap->value = cap_val;
> > +        *prev = cap;
> > +        prev = &cap->next;
> > +    }
> > +
> > +    return caps;
> > +}
> 
> All capabilities known to this build of QEMU are always present.  Okay;
> no need to complicate things until we need to.  I just didn't expect it
> to be this simple after the commit message's "It allows QEMU to decide
> which capabilities are available at runtime, if necessary."

I'm slightly worried of misuse of the possibility to change the behavior
on runtime. In libvirt we cache the capabilities to prevent a "chicken
and egg" problem where we need to know what qemu is able to do when
generating the command line which is obviously prior to starting qemu.
This means that we will want to cache even information determined by
interpreting results of this API.

If any further addition is not as simple as this one it may be
challenging to be able to interpret the result correctly and be able to
cache it.

Libvirt's use of capabilities is split to pre-startup steps and runtime
usage. For the pre-startup case [A]  we obviously want to use the cached
capabilities which are obtained by running same qemu with a different
configuration that will be used later. After qemu is started we use
QMP to interact [B] with it also depending to the capabilities
determined from the cache. Scenario [B] also allows us to re-probe some
things but they need to be strictly usable only with QMP afterwards.

The proposed API with the 'runtime' behaviour allows for these 3
scenarios for a capability:
1) Static capability (as this patch shows)
    This is easy to cache and also supports both [A] and [B]

2) Capability depending on configuration
    [A] is out of the window but if QMP only use is necessary we can
    adapt.

3) Capability depending on host state.
    Same commandline might not result in same behaviour. Obviously can't
    be cached at all, but libvirt would not do it anyways. [B] is
    possible, but it's unpleasant.

I propose that the docs for the function filling the result (and perhaps
also the documentation for the QMP interface) clarify and/or guide devs
to avoid situations 2) and 3) if possible and motivate them to document
the limitations on when the capability is detactable.

Additionally a new field could be added that e.g. pledges that the given
capability is of type 1) as described above and thus can be easily
cached. That way we can make sure programatically that we pre-cache only
the 'good' capabilities.

Other than the above, this is a welcome improvement as I've personally
ran into scenarios where a feature in qemu was fixed but it was
impossible to detect whether given qemu version contains the fix as it
did not have any influence on the QMP schema.
Stefan Hajnoczi Feb. 25, 2019, 5:06 p.m. UTC | #3
On Mon, Feb 25, 2019 at 09:50:26AM +0100, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > QMP clients can usually detect the presence of features via schema
> > introspection.  There are rare features that do not involve schema
> > changes and are therefore impossible to detect with schema
> > introspection.
> >
> > This patch adds the query-qemu-capabilities command.  It returns a list
> > of capabilities that this QEMU supports.
> 
> The name "capabilities" could be confusing, because we already have QMP
> capabilities, complete with command qmp_capabilities.  Would "features"
> work?

Sure, will fix.

> > The decision to make this a command rather than something statically
> > defined in the schema is intentional.  It allows QEMU to decide which
> > capabilities are available at runtime, if necessary.
> >
> > This new interface is necessary so that QMP clients can discover that
> > migrating disk image files is safe with cache.direct=off on Linux.
> > There is no other way to detect whether or not QEMU supports this.
> 
> I think what's unsaid here is that we don't want to make a completely
> arbitrary schema change just to carry this bit of information.  We
> could, but we don't want to.  Correct?

Yes, exactly.

> > +{ 'struct': 'QemuCapabilities',
> > +  'data': { 'capabilities': ['QemuCapability'] } }
> > +
> > +##
> > +# @query-qemu-capabilities:
> > +#
> > +# Return the capabilities supported by this QEMU.  Most features can be
> > +# detected via schema introspection but some are not observable from the
> > +# schema.  This command offers a way to check for the presence of such
> > +# features.
> > +#
> > +# Since: 4.0
> > +##
> > +{ 'command': 'query-qemu-capabilities',
> > +  'returns': 'QemuCapabilities' }
> 
> Capabilities are flags: either present or absent, thus effectively
> boolean.  I don't have a specific reason to doubt limiting capabilities
> to boolean.  However, doing QemuCapabilities like
> 
>    { 'struct': 'QemuCapabilities',
>      'data': { 'migrate-with-cache-direct-off-on-linux': 'bool' } }
> 
> could grow to support other values more easily.  What do you think?

Oops, I didn't think of the obvious solution!  Thank you, will fix.
Stefan Hajnoczi Feb. 25, 2019, 5:40 p.m. UTC | #4
On Mon, Feb 25, 2019 at 10:28:46AM +0100, Peter Krempa wrote:
> On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:
> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> > 
> > > QMP clients can usually detect the presence of features via schema
> > > introspection.  There are rare features that do not involve schema
> > > changes and are therefore impossible to detect with schema
> > > introspection.
> > >
> > > This patch adds the query-qemu-capabilities command.  It returns a list
> > > of capabilities that this QEMU supports.
> > 
> > The name "capabilities" could be confusing, because we already have QMP
> > capabilities, complete with command qmp_capabilities.  Would "features"
> > work?
> > 
> > > The decision to make this a command rather than something statically
> > > defined in the schema is intentional.  It allows QEMU to decide which
> > > capabilities are available at runtime, if necessary.
> > >
> > > This new interface is necessary so that QMP clients can discover that
> > > migrating disk image files is safe with cache.direct=off on Linux.
> > > There is no other way to detect whether or not QEMU supports this.
> > 
> > I think what's unsaid here is that we don't want to make a completely
> > arbitrary schema change just to carry this bit of information.  We
> > could, but we don't want to.  Correct?
> 
> One example of such 'unrelated' change was a recent query- command which
> is used by libvirt just to detect presence of the feature but the
> queried result is never used and not very useful.
> 
> > > Cc: Markus Armbruster <armbru@redhat.com>
> > > Cc: Peter Krempa <pkrempa@redhat.com>
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > ---
> > >  qapi/misc.json | 42 ++++++++++++++++++++++++++++++++++++++++++
> > >  qmp.c          | 18 ++++++++++++++++++
> > >  2 files changed, 60 insertions(+)
> 
> [...]
> 
> > > +QemuCapabilities *qmp_query_qemu_capabilities(Error **errp)
> > > +{
> > > +    QemuCapabilities *caps = g_new0(QemuCapabilities, 1);
> > > +    QemuCapabilityList **prev = &caps->capabilities;
> > > +    QemuCapability cap_val;
> > > +
> > > +    /* Add all QemuCapability enum values defined in the schema */
> > > +    for (cap_val = 0; cap_val < QEMU_CAPABILITY__MAX; cap_val++) {
> > > +        QemuCapabilityList *cap = g_new0(QemuCapabilityList, 1);
> > > +
> > > +        cap->value = cap_val;
> > > +        *prev = cap;
> > > +        prev = &cap->next;
> > > +    }
> > > +
> > > +    return caps;
> > > +}
> > 
> > All capabilities known to this build of QEMU are always present.  Okay;
> > no need to complicate things until we need to.  I just didn't expect it
> > to be this simple after the commit message's "It allows QEMU to decide
> > which capabilities are available at runtime, if necessary."
> 
> I'm slightly worried of misuse of the possibility to change the behavior
> on runtime. In libvirt we cache the capabilities to prevent a "chicken
> and egg" problem where we need to know what qemu is able to do when
> generating the command line which is obviously prior to starting qemu.
> This means that we will want to cache even information determined by
> interpreting results of this API.
> 
> If any further addition is not as simple as this one it may be
> challenging to be able to interpret the result correctly and be able to
> cache it.
> 
> Libvirt's use of capabilities is split to pre-startup steps and runtime
> usage. For the pre-startup case [A]  we obviously want to use the cached
> capabilities which are obtained by running same qemu with a different
> configuration that will be used later. After qemu is started we use
> QMP to interact [B] with it also depending to the capabilities
> determined from the cache. Scenario [B] also allows us to re-probe some
> things but they need to be strictly usable only with QMP afterwards.
> 
> The proposed API with the 'runtime' behaviour allows for these 3
> scenarios for a capability:
> 1) Static capability (as this patch shows)
>     This is easy to cache and also supports both [A] and [B]
> 
> 2) Capability depending on configuration
>     [A] is out of the window but if QMP only use is necessary we can
>     adapt.

Does "configuration" mean "QEMU command-line"?  The result of the query
command should not depend on command-line arguments.

> 3) Capability depending on host state.
>     Same commandline might not result in same behaviour. Obviously can't
>     be cached at all, but libvirt would not do it anyways. [B] is
>     possible, but it's unpleasant.

Say the kernel or a library dependency is updated, and this enables a
feature that QEMU was capable of but couldn't advertise before.  I guess
this might happen and this is why I noted that the features could be
selected at runtime.

> I propose that the docs for the function filling the result (and perhaps
> also the documentation for the QMP interface) clarify and/or guide devs
> to avoid situations 2) and 3) if possible and motivate them to document
> the limitations on when the capability is detactable.
> 
> Additionally a new field could be added that e.g. pledges that the given
> capability is of type 1) as described above and thus can be easily
> cached. That way we can make sure programatically that we pre-cache only
> the 'good' capabilities.
> 
> Other than the above, this is a welcome improvement as I've personally
> ran into scenarios where a feature in qemu was fixed but it was
> impossible to detect whether given qemu version contains the fix as it
> did not have any influence on the QMP schema.

I'd like to make things as simpler as possible, but no simpler :).

The simplest option is that the advertised features are set in stone at
build time (e.g. selected with #ifdef if necessary).  But then we have
no way of selecting features at runtime (e.g. based on kernel features).

What do you think?

Stefan
Markus Armbruster Feb. 26, 2019, 7:23 a.m. UTC | #5
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Mon, Feb 25, 2019 at 09:50:26AM +0100, Markus Armbruster wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> 
>> > QMP clients can usually detect the presence of features via schema
>> > introspection.  There are rare features that do not involve schema
>> > changes and are therefore impossible to detect with schema
>> > introspection.
>> >
>> > This patch adds the query-qemu-capabilities command.  It returns a list
>> > of capabilities that this QEMU supports.
>> 
>> The name "capabilities" could be confusing, because we already have QMP
>> capabilities, complete with command qmp_capabilities.  Would "features"
>> work?
>
> Sure, will fix.
>
>> > The decision to make this a command rather than something statically
>> > defined in the schema is intentional.  It allows QEMU to decide which
>> > capabilities are available at runtime, if necessary.
>> >
>> > This new interface is necessary so that QMP clients can discover that
>> > migrating disk image files is safe with cache.direct=off on Linux.
>> > There is no other way to detect whether or not QEMU supports this.
>> 
>> I think what's unsaid here is that we don't want to make a completely
>> arbitrary schema change just to carry this bit of information.  We
>> could, but we don't want to.  Correct?
>
> Yes, exactly.

Then let's rephrase a little:

     QMP clients can usually detect the presence of features via schema
     introspection.  There are rare features that do not involve schema
     changes.  To make them detectable with schema introspection, we'd
     have to make some arbitrary schema change.  Annoying.

     The new query-qemu-features command lets us avoid that.  It returns
     a list of features supported by this QEMU.

     The decision to make this a command rather than something statically
     defined in the schema is intentional.  It allows QEMU to decide which
     capabilities are available at runtime, if necessary.

     Use the new command to declare migrating disk image files is safe
     with cache.direct=off on Linux.

[...]
Peter Krempa Feb. 26, 2019, 7:44 a.m. UTC | #6
On Mon, Feb 25, 2019 at 17:40:01 +0000, Stefan Hajnoczi wrote:
> On Mon, Feb 25, 2019 at 10:28:46AM +0100, Peter Krempa wrote:
> > On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:

[...]

> > I'm slightly worried of misuse of the possibility to change the behavior
> > on runtime. In libvirt we cache the capabilities to prevent a "chicken
> > and egg" problem where we need to know what qemu is able to do when
> > generating the command line which is obviously prior to starting qemu.
> > This means that we will want to cache even information determined by
> > interpreting results of this API.
> > 
> > If any further addition is not as simple as this one it may be
> > challenging to be able to interpret the result correctly and be able to
> > cache it.
> > 
> > Libvirt's use of capabilities is split to pre-startup steps and runtime
> > usage. For the pre-startup case [A]  we obviously want to use the cached
> > capabilities which are obtained by running same qemu with a different
> > configuration that will be used later. After qemu is started we use
> > QMP to interact [B] with it also depending to the capabilities
> > determined from the cache. Scenario [B] also allows us to re-probe some
> > things but they need to be strictly usable only with QMP afterwards.
> > 
> > The proposed API with the 'runtime' behaviour allows for these 3
> > scenarios for a capability:
> > 1) Static capability (as this patch shows)
> >     This is easy to cache and also supports both [A] and [B]
> > 
> > 2) Capability depending on configuration
> >     [A] is out of the window but if QMP only use is necessary we can
> >     adapt.
> 
> Does "configuration" mean "QEMU command-line"?  The result of the query
> command should not depend on command-line arguments.

Yes exactly. There probably is possibility that something would be
detectable only after a shared library load which in turn would depend
on the command line ...

> 
> > 3) Capability depending on host state.
> >     Same commandline might not result in same behaviour. Obviously can't
> >     be cached at all, but libvirt would not do it anyways. [B] is
> >     possible, but it's unpleasant.
> 
> Say the kernel or a library dependency is updated, and this enables a
> feature that QEMU was capable of but couldn't advertise before.  I guess
> this might happen and this is why I noted that the features could be
> selected at runtime.

Yeah, such scenario is really breaking our caching even now. I don't
want to say it's bad. It may even be necessary in some scenarios. Both
this and the above scenario may be necessary eventually. Libvirt
certainly can make use of the detection for QMP use if there is such a
thing.

> > I propose that the docs for the function filling the result (and perhaps
> > also the documentation for the QMP interface) clarify and/or guide devs
> > to avoid situations 2) and 3) if possible and motivate them to document
> > the limitations on when the capability is detactable.
> > 
> > Additionally a new field could be added that e.g. pledges that the given
> > capability is of type 1) as described above and thus can be easily
> > cached. That way we can make sure programatically that we pre-cache only
> > the 'good' capabilities.
> > 
> > Other than the above, this is a welcome improvement as I've personally
> > ran into scenarios where a feature in qemu was fixed but it was
> > impossible to detect whether given qemu version contains the fix as it
> > did not have any influence on the QMP schema.
> 
> I'd like to make things as simpler as possible, but no simpler :).
> 
> The simplest option is that the advertised features are set in stone at
> build time (e.g. selected with #ifdef if necessary).  But then we have
> no way of selecting features at runtime (e.g. based on kernel features).
> 
> What do you think?

I really don't want to limit the possibilities for the API. My goal is
only that it's obvious both from the docs and preferably also from the
returned data (so that we can filter what to cache to prevent mistakes)
that a given capability bit is static or dynamic.

I think adding the field to the returned data which will be set
according to how the capability was detected should be simple enough,
but I will be okay with just documenting any caveats along with the
inidividual capabilities. In that case an comment should encourage those
coming after you to document it properly.
Markus Armbruster Feb. 26, 2019, 9:09 a.m. UTC | #7
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Mon, Feb 25, 2019 at 10:28:46AM +0100, Peter Krempa wrote:
>> On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:
>> > Stefan Hajnoczi <stefanha@redhat.com> writes:
>> > 
>> > > QMP clients can usually detect the presence of features via schema
>> > > introspection.  There are rare features that do not involve schema
>> > > changes and are therefore impossible to detect with schema
>> > > introspection.
>> > >
>> > > This patch adds the query-qemu-capabilities command.  It returns a list
>> > > of capabilities that this QEMU supports.
>> > 
>> > The name "capabilities" could be confusing, because we already have QMP
>> > capabilities, complete with command qmp_capabilities.  Would "features"
>> > work?
>> > 
>> > > The decision to make this a command rather than something statically
>> > > defined in the schema is intentional.  It allows QEMU to decide which
>> > > capabilities are available at runtime, if necessary.
>> > >
>> > > This new interface is necessary so that QMP clients can discover that
>> > > migrating disk image files is safe with cache.direct=off on Linux.
>> > > There is no other way to detect whether or not QEMU supports this.
>> > 
>> > I think what's unsaid here is that we don't want to make a completely
>> > arbitrary schema change just to carry this bit of information.  We
>> > could, but we don't want to.  Correct?
>> 
>> One example of such 'unrelated' change was a recent query- command which
>> is used by libvirt just to detect presence of the feature but the
>> queried result is never used and not very useful.

If that query- command really has no other use, we might want to
deprecate it in favor of a query-qemu-features feature.

>> > > Cc: Markus Armbruster <armbru@redhat.com>
>> > > Cc: Peter Krempa <pkrempa@redhat.com>
>> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> > > ---
>> > >  qapi/misc.json | 42 ++++++++++++++++++++++++++++++++++++++++++
>> > >  qmp.c          | 18 ++++++++++++++++++
>> > >  2 files changed, 60 insertions(+)
>> 
>> [...]
>> 
>> > > +QemuCapabilities *qmp_query_qemu_capabilities(Error **errp)
>> > > +{
>> > > +    QemuCapabilities *caps = g_new0(QemuCapabilities, 1);
>> > > +    QemuCapabilityList **prev = &caps->capabilities;
>> > > +    QemuCapability cap_val;
>> > > +
>> > > +    /* Add all QemuCapability enum values defined in the schema */
>> > > +    for (cap_val = 0; cap_val < QEMU_CAPABILITY__MAX; cap_val++) {
>> > > +        QemuCapabilityList *cap = g_new0(QemuCapabilityList, 1);
>> > > +
>> > > +        cap->value = cap_val;
>> > > +        *prev = cap;
>> > > +        prev = &cap->next;
>> > > +    }
>> > > +
>> > > +    return caps;
>> > > +}
>> > 
>> > All capabilities known to this build of QEMU are always present.  Okay;
>> > no need to complicate things until we need to.  I just didn't expect it
>> > to be this simple after the commit message's "It allows QEMU to decide
>> > which capabilities are available at runtime, if necessary."
>> 
>> I'm slightly worried of misuse of the possibility to change the behavior
>> on runtime. In libvirt we cache the capabilities to prevent a "chicken
>> and egg" problem where we need to know what qemu is able to do when
>> generating the command line which is obviously prior to starting qemu.
>> This means that we will want to cache even information determined by
>> interpreting results of this API.

Good point.

>> If any further addition is not as simple as this one it may be
>> challenging to be able to interpret the result correctly and be able to
>> cache it.
>> 
>> Libvirt's use of capabilities is split to pre-startup steps and runtime
>> usage. For the pre-startup case [A]  we obviously want to use the cached
>> capabilities which are obtained by running same qemu with a different
>> configuration that will be used later. After qemu is started we use
>> QMP to interact [B] with it also depending to the capabilities
>> determined from the cache. Scenario [B] also allows us to re-probe some
>> things but they need to be strictly usable only with QMP afterwards.

Let's examine possible extents of features.  Here are a few obvious
ones:

* Compile-time static

  Probe results remain valid until the QEMU binary changes.  Even across
  hosts (but libvirt doesn't care for that).

* Load-time static, i.e. doesn't change once QEMU runs, but the next run
  might see another value.

  Caching probe results beyond a single run is wrong.

* Dynamic, i.e. can change while QEMU runs

  Probing is wrong (TOCTTOU).

Note that compile-time static is stronger than necessary for caching
probe results, and load-time static already too weak.  Is there
something useful in between?

>> The proposed API with the 'runtime' behaviour allows for these 3
>> scenarios for a capability:
>> 1) Static capability (as this patch shows)
>>     This is easy to cache and also supports both [A] and [B]
>> 
>> 2) Capability depending on configuration
>>     [A] is out of the window but if QMP only use is necessary we can
>>     adapt.
>
> Does "configuration" mean "QEMU command-line"?  The result of the query
> command should not depend on command-line arguments.
>
>> 3) Capability depending on host state.
>>     Same commandline might not result in same behaviour. Obviously can't
>>     be cached at all, but libvirt would not do it anyways. [B] is
>>     possible, but it's unpleasant.
>
> Say the kernel or a library dependency is updated, and this enables a
> feature that QEMU was capable of but couldn't advertise before.  I guess
> this might happen and this is why I noted that the features could be
> selected at runtime.

A library update should not affect its running users, so it's still
load-time static.

A full kernel update requires a reboot.  Load-time static.

A kernel module update could conceivably change a feature's status.
That means dynamic.  If we treat it as load-time static anyway, we miss
the feature's appearance.  Tolerable.  But if an update pulls the rug
out from the feature... less so.

Same when a feature depends on a local daemon or a remote service.

>> I propose that the docs for the function filling the result (and perhaps
>> also the documentation for the QMP interface) clarify and/or guide devs
>> to avoid situations 2) and 3) if possible and motivate them to document
>> the limitations on when the capability is detactable.

Makes sense to me.

>> Additionally a new field could be added that e.g. pledges that the given
>> capability is of type 1) as described above and thus can be easily
>> cached. That way we can make sure programatically that we pre-cache only
>> the 'good' capabilities.

Less error prone than relying on just documentation, I guess.

>> Other than the above, this is a welcome improvement as I've personally
>> ran into scenarios where a feature in qemu was fixed but it was
>> impossible to detect whether given qemu version contains the fix as it
>> did not have any influence on the QMP schema.
>
> I'd like to make things as simpler as possible, but no simpler :).
>
> The simplest option is that the advertised features are set in stone at
> build time (e.g. selected with #ifdef if necessary).  But then we have
> no way of selecting features at runtime (e.g. based on kernel features).
>
> What do you think?

I think we should map the problem space, identify cases with actual
uses, then design a solution that covers them and can plausibly grow to
cover cases we anticipate.

The migrate-with-cache-direct-off-on-linux case is compile-time static.

Do we have or have we had a case that is not compile-time static?
Stefan Hajnoczi Feb. 27, 2019, 2:51 p.m. UTC | #8
On Tue, Feb 26, 2019 at 10:09:19AM +0100, Markus Armbruster wrote:
> The migrate-with-cache-direct-off-on-linux case is compile-time static.

By the way, I will change "migrate-with-cache-direct-off-on-linux" to
"migrate-with-cache-direct-off".  Linux builds will report this feature
while non-Linux builds will not.  If we ever add non-Linux support, then
we can report this feature in those builds too.

Stefan
Stefan Hajnoczi Feb. 27, 2019, 3:15 p.m. UTC | #9
On Tue, Feb 26, 2019 at 10:09:19AM +0100, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> > On Mon, Feb 25, 2019 at 10:28:46AM +0100, Peter Krempa wrote:
> >> On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:
> >> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> >> Other than the above, this is a welcome improvement as I've personally
> >> ran into scenarios where a feature in qemu was fixed but it was
> >> impossible to detect whether given qemu version contains the fix as it
> >> did not have any influence on the QMP schema.
> >
> > I'd like to make things as simpler as possible, but no simpler :).
> >
> > The simplest option is that the advertised features are set in stone at
> > build time (e.g. selected with #ifdef if necessary).  But then we have
> > no way of selecting features at runtime (e.g. based on kernel features).
> >
> > What do you think?
> 
> I think we should map the problem space, identify cases with actual
> uses, then design a solution that covers them and can plausibly grow to
> cover cases we anticipate.
> 
> The migrate-with-cache-direct-off-on-linux case is compile-time static.
> 
> Do we have or have we had a case that is not compile-time static?

I can't think of something that both requires query-qemu-features and
isn't compile-time static.  But that could be because not many things
require query-qemu-features :).

The philosophy on Linux seems to be: distros compile QEMU against
headers that accurately reflect the system features that are available.
Therefore there isn't a lot of runtime fallback logic in case a new QEMU
is running on an old host.  This means we can get by with compile-time
static features only.

One possible exception is the ENOSYS/ENOTSUP fallback code in
block/file-posix.c:handle_aiocb_write_zeroes() and
handle_aiocb_copy_range().

Stefan
Daniel P. Berrangé Feb. 27, 2019, 3:25 p.m. UTC | #10
On Wed, Feb 27, 2019 at 03:15:35PM +0000, Stefan Hajnoczi wrote:
> On Tue, Feb 26, 2019 at 10:09:19AM +0100, Markus Armbruster wrote:
> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> > > On Mon, Feb 25, 2019 at 10:28:46AM +0100, Peter Krempa wrote:
> > >> On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:
> > >> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> > >> Other than the above, this is a welcome improvement as I've personally
> > >> ran into scenarios where a feature in qemu was fixed but it was
> > >> impossible to detect whether given qemu version contains the fix as it
> > >> did not have any influence on the QMP schema.
> > >
> > > I'd like to make things as simpler as possible, but no simpler :).
> > >
> > > The simplest option is that the advertised features are set in stone at
> > > build time (e.g. selected with #ifdef if necessary).  But then we have
> > > no way of selecting features at runtime (e.g. based on kernel features).
> > >
> > > What do you think?
> > 
> > I think we should map the problem space, identify cases with actual
> > uses, then design a solution that covers them and can plausibly grow to
> > cover cases we anticipate.
> > 
> > The migrate-with-cache-direct-off-on-linux case is compile-time static.
> > 
> > Do we have or have we had a case that is not compile-time static?
> 
> I can't think of something that both requires query-qemu-features and
> isn't compile-time static.  But that could be because not many things
> require query-qemu-features :).
> 
> The philosophy on Linux seems to be: distros compile QEMU against
> headers that accurately reflect the system features that are available.
> Therefore there isn't a lot of runtime fallback logic in case a new QEMU
> is running on an old host.  This means we can get by with compile-time
> static features only.

This assumption is probably going to be more troublesome in future. It
is increasingly common to deploy libvirt & QEMU inside docker containers,
and there's not a strict requirement that the docker container image
is the same distro / version as the docker host. ie a Fedora docker
image could be run on a RHEL host & vica-verca. IOW we could see an older
kernel than what we compiled on. I don't know how common such mismatches
will be in practice, but at the Docker level there's nothing preventing
such mismatches, which means some people will inevitably end up running
in such scenarios whether we like it or not. 

Regards,
Daniel
Stefan Hajnoczi Feb. 27, 2019, 5:12 p.m. UTC | #11
On Wed, Feb 27, 2019 at 03:25:25PM +0000, Daniel P. Berrangé wrote:
> On Wed, Feb 27, 2019 at 03:15:35PM +0000, Stefan Hajnoczi wrote:
> > On Tue, Feb 26, 2019 at 10:09:19AM +0100, Markus Armbruster wrote:
> > > Stefan Hajnoczi <stefanha@redhat.com> writes:
> > > > On Mon, Feb 25, 2019 at 10:28:46AM +0100, Peter Krempa wrote:
> > > >> On Mon, Feb 25, 2019 at 09:50:26 +0100, Markus Armbruster wrote:
> > > >> > Stefan Hajnoczi <stefanha@redhat.com> writes:
> > > >> Other than the above, this is a welcome improvement as I've personally
> > > >> ran into scenarios where a feature in qemu was fixed but it was
> > > >> impossible to detect whether given qemu version contains the fix as it
> > > >> did not have any influence on the QMP schema.
> > > >
> > > > I'd like to make things as simpler as possible, but no simpler :).
> > > >
> > > > The simplest option is that the advertised features are set in stone at
> > > > build time (e.g. selected with #ifdef if necessary).  But then we have
> > > > no way of selecting features at runtime (e.g. based on kernel features).
> > > >
> > > > What do you think?
> > > 
> > > I think we should map the problem space, identify cases with actual
> > > uses, then design a solution that covers them and can plausibly grow to
> > > cover cases we anticipate.
> > > 
> > > The migrate-with-cache-direct-off-on-linux case is compile-time static.
> > > 
> > > Do we have or have we had a case that is not compile-time static?
> > 
> > I can't think of something that both requires query-qemu-features and
> > isn't compile-time static.  But that could be because not many things
> > require query-qemu-features :).
> > 
> > The philosophy on Linux seems to be: distros compile QEMU against
> > headers that accurately reflect the system features that are available.
> > Therefore there isn't a lot of runtime fallback logic in case a new QEMU
> > is running on an old host.  This means we can get by with compile-time
> > static features only.
> 
> This assumption is probably going to be more troublesome in future. It
> is increasingly common to deploy libvirt & QEMU inside docker containers,
> and there's not a strict requirement that the docker container image
> is the same distro / version as the docker host. ie a Fedora docker
> image could be run on a RHEL host & vica-verca. IOW we could see an older

Off-topic, the standard spelling seems to be "vice versa".  Is this a
typo or another accepted spelling?

> kernel than what we compiled on. I don't know how common such mismatches
> will be in practice, but at the Docker level there's nothing preventing
> such mismatches, which means some people will inevitably end up running
> in such scenarios whether we like it or not. 

Good point.

Peter's suggestion for a cacheable flag sounds useful.  I can propose
that in the next revision.

Stefan
Markus Armbruster Feb. 27, 2019, 7:25 p.m. UTC | #12
Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Tue, Feb 26, 2019 at 10:09:19AM +0100, Markus Armbruster wrote:
>> The migrate-with-cache-direct-off-on-linux case is compile-time static.
>
> By the way, I will change "migrate-with-cache-direct-off-on-linux" to
> "migrate-with-cache-direct-off".  Linux builds will report this feature
> while non-Linux builds will not.  If we ever add non-Linux support, then
> we can report this feature in those builds too.

Good idea.
Stefan Hajnoczi March 8, 2019, 10:35 a.m. UTC | #13
On Fri, Feb 22, 2019 at 03:32:54PM +0000, Stefan Hajnoczi wrote:
> QMP clients can usually detect the presence of features via schema
> introspection.  There are rare features that do not involve schema
> changes and are therefore impossible to detect with schema
> introspection.
> 
> This patch adds the query-qemu-capabilities command.  It returns a list
> of capabilities that this QEMU supports.
> 
> The decision to make this a command rather than something statically
> defined in the schema is intentional.  It allows QEMU to decide which
> capabilities are available at runtime, if necessary.
> 
> This new interface is necessary so that QMP clients can discover that
> migrating disk image files is safe with cache.direct=off on Linux.
> There is no other way to detect whether or not QEMU supports this.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Peter Krempa <pkrempa@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  qapi/misc.json | 42 ++++++++++++++++++++++++++++++++++++++++++
>  qmp.c          | 18 ++++++++++++++++++
>  2 files changed, 60 insertions(+)

Markus: query-qemu-features is no longer needed by the page cache
invalidation feature since a related QMP schema change is being made now
and clients can detect the feature via that change.

I'm inclined to leave this patch series for now.  Later someone can pick
it up again if there is a feature that cannot be detected via a schema
change.
Markus Armbruster March 8, 2019, 12:15 p.m. UTC | #14
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Fri, Feb 22, 2019 at 03:32:54PM +0000, Stefan Hajnoczi wrote:
>> QMP clients can usually detect the presence of features via schema
>> introspection.  There are rare features that do not involve schema
>> changes and are therefore impossible to detect with schema
>> introspection.
>> 
>> This patch adds the query-qemu-capabilities command.  It returns a list
>> of capabilities that this QEMU supports.
>> 
>> The decision to make this a command rather than something statically
>> defined in the schema is intentional.  It allows QEMU to decide which
>> capabilities are available at runtime, if necessary.
>> 
>> This new interface is necessary so that QMP clients can discover that
>> migrating disk image files is safe with cache.direct=off on Linux.
>> There is no other way to detect whether or not QEMU supports this.
>> 
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Peter Krempa <pkrempa@redhat.com>
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> ---
>>  qapi/misc.json | 42 ++++++++++++++++++++++++++++++++++++++++++
>>  qmp.c          | 18 ++++++++++++++++++
>>  2 files changed, 60 insertions(+)
>
> Markus: query-qemu-features is no longer needed by the page cache
> invalidation feature since a related QMP schema change is being made now
> and clients can detect the feature via that change.
>
> I'm inclined to leave this patch series for now.  Later someone can pick
> it up again if there is a feature that cannot be detected via a schema
> change.

Mekse sense.  I figure we'll likely want the command at some point.  And
that's when we should add it.  Hold on to your git branch until then.
diff mbox series

Patch

diff --git a/qapi/misc.json b/qapi/misc.json
index 8b3ca4fdd3..9cf24919a3 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -3051,3 +3051,45 @@ 
   'data': 'NumaOptions',
   'allow-preconfig': true
 }
+
+##
+# @QemuCapability:
+#
+# Capabilities that are not otherwise introspectable.
+#
+# @migrate-with-cache-direct-off-on-linux:
+#  It is safe to migrate disk image files with cache.direct=off on Linux.
+#  Previously only cache.direct=on was supported for live migration.
+#
+# Since: 4.0
+##
+{ 'enum': 'QemuCapability',
+  'data': [
+    'migrate-with-cache-direct-off-on-linux'
+  ]
+}
+
+##
+# @QemuCapabilities:
+#
+# Capability information.
+#
+# @capabilities: supported capabilities
+#
+# Since: 4.0
+##
+{ 'struct': 'QemuCapabilities',
+  'data': { 'capabilities': ['QemuCapability'] } }
+
+##
+# @query-qemu-capabilities:
+#
+# Return the capabilities supported by this QEMU.  Most features can be
+# detected via schema introspection but some are not observable from the
+# schema.  This command offers a way to check for the presence of such
+# features.
+#
+# Since: 4.0
+##
+{ 'command': 'query-qemu-capabilities',
+  'returns': 'QemuCapabilities' }
diff --git a/qmp.c b/qmp.c
index b92d62cd5f..91a1228be7 100644
--- a/qmp.c
+++ b/qmp.c
@@ -717,3 +717,21 @@  MemoryInfo *qmp_query_memory_size_summary(Error **errp)
 
     return mem_info;
 }
+
+QemuCapabilities *qmp_query_qemu_capabilities(Error **errp)
+{
+    QemuCapabilities *caps = g_new0(QemuCapabilities, 1);
+    QemuCapabilityList **prev = &caps->capabilities;
+    QemuCapability cap_val;
+
+    /* Add all QemuCapability enum values defined in the schema */
+    for (cap_val = 0; cap_val < QEMU_CAPABILITY__MAX; cap_val++) {
+        QemuCapabilityList *cap = g_new0(QemuCapabilityList, 1);
+
+        cap->value = cap_val;
+        *prev = cap;
+        prev = &cap->next;
+    }
+
+    return caps;
+}