[RFC,v3,0/5] QMP support for cold-plugging devices

Message ID	20211117144703.16305-1-damien.hedde@greensocs.com
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> From: Damien Hedde <damien.hedde@greensocs.com> To: qemu-devel@nongnu.org Subject: [RFC PATCH v3 0/5] QMP support for cold-plugging devices Date: Wed, 17 Nov 2021 15:46:58 +0100 Message-Id: <20211117144703.16305-1-damien.hedde@greensocs.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=5.135.226.135; envelope-from=damien.hedde@greensocs.com; helo=beetle.greensocs.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action Precedence: list Cc: Damien Hedde <damien.hedde@greensocs.com>, edgar.iglesias@xilinx.com, =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= <berrange@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Eric Blake <eblake@redhat.com>, Mark Burton <mark.burton@greensocs.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Markus Armbruster <armbru@redhat.com>, Eric Auger <eric.auger@redhat.com>, Mirela Grujic <mirela.grujic@greensocs.com>, Alistair Francis <alistair.francis@wdc.com>, Gerd Hoffmann <kraxel@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= <philmd@redhat.com> Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>
Series	QMP support for cold-plugging devices \| expand [RFC,v3,0/5] QMP support for cold-plugging devices [RFC,v3,1/5] rename MachineInitPhase enum constants for QAPI compatibility [RFC,v3,2/5] qapi: Implement query-machine-phase QMP command [RFC,v3,3/5] qapi: Implement x-machine-init QMP command [RFC,v3,4/5] qapi: Allow device_add to execute in machine initialized phase [RFC,v3,5/5] docs/system: improve doc about preconfig

Damien Hedde Nov. 17, 2021, 2:46 p.m. UTC

Hi all,

This series adds support for cold-plugging devices using QMP
commands. It is a step towards machine configuration using QMP, but
it does not allow the user to add more devices than he could do with
the CLI options before.

Right now we can add a device using 2 ways:
+ giving "-device" CLI option. In that case the device is
  cold-plugged: it is created before the machine becomes ready.
+ issuing device_add QMP command. In that case the device is
  hot-plugged (the command can not be issued before the machine is
  ready).

This series allows to issue device_add QMP to cold-plug a device
like we do with "-device" CLI option. All added QMP commands are
marked as 'unstable' in qapi as they are part of preconfig.
We achieve this by introducing a new 'x-machine-init' command to
stop the VM creation at a point where we can cold-plug devices.

We are aware of the ongoing discussion about preconfig future (see
[1]). This patchset includes no major modifications from the v2 (but
the scope is reduced) and "x-machine-init" simply stops the
configuration between qemu_board_init() and qemu_create_cli_devices()
function calls.

As a consequence, in the current state, the timeline is:
+ "x-machine-init" command
+ "device_add" cold-plug commands (no fw_cfg legacy order support)
+ "x-exit-preconfig" command will then trigger the following
+ "-soundhw" CLI options
+ "-fw_cfg" CLI options
+ usb devices creation
+ "-device" CLI options (with fw_cfg legacy order support)
+ some other devices creation (with fw_cfg legacy order support)

We don't know if the differences between -device/device_add are
acceptable or not. To reduce/remove them we could move the
"x-machine-init" stopping point. What do you think ?

Patches 1, 3 and 5 miss a review.

The series is organized as follow:

+ Patches 1 and 2 converts the MachinePhase enum to a qapi definition
  and add the 'query-machine-phase'. It allows to introspect the
  current machine phase during preconfig as we will now be able to
  reach several machine phases using QMP.
+ Patch 3 adds the 'x-machine-init' QMP command to stop QEMU at
  machine-initialized phase during preconfig.
+ Patch 4 allows issuing device_add QMP command during the
  machine-initialized phase.
+ Patch 5 improves the doc about preconfig in consequence. 

[1]: https://lore.kernel.org/qemu-devel/b31f442d28920447690a6b8cee865bdbacde1283.1635160056.git.mprivozn@redhat.com

Thanks for your feedback.

---
Damien

v3:
 + extracted patches related to cold-plugging devices from the v2 rfc
 + updated for rebase (qapi 'unstable' feature addition and 7.0 version bump)
 + fix check for x-machine-init command (and drop Alistair's ack-by on
   patch 4)
 + extracted only a bit of the doc patch
 + drop qdev_set_id patch because it was merged in Kevin's
   "qdev: Add JSON -device" series which did a lot of cleanups in
   device_add related functions:
   https://lore.kernel.org/qemu-devel/20211008133442.141332-1-kwolf@redhat.com

v2 was part of this rfc:
https://lore.kernel.org/qemu-devel/20210922161405.140018-1-damien.hedde@greensocs.com

Damien Hedde (1):
  docs/system: improve doc about preconfig

Mirela Grujic (4):
  rename MachineInitPhase enum constants for QAPI compatibility
  qapi: Implement query-machine-phase QMP command
  qapi: Implement x-machine-init QMP command
  qapi: Allow device_add to execute in machine initialized phase

 docs/system/managed-startup.rst | 20 +++++++-
 qapi/machine.json               | 87 +++++++++++++++++++++++++++++++++
 qapi/qdev.json                  |  3 +-
 include/hw/qdev-core.h          | 30 +-----------
 hw/core/machine-qmp-cmds.c      | 11 ++++-
 hw/core/machine.c               |  6 +--
 hw/core/qdev.c                  |  7 ++-
 hw/pci/pci.c                    |  2 +-
 hw/usb/core.c                   |  2 +-
 hw/virtio/virtio-iommu.c        |  2 +-
 monitor/hmp.c                   |  2 +-
 monitor/misc.c                  |  2 +-
 softmmu/qdev-monitor.c          | 15 ++++--
 softmmu/vl.c                    | 23 ++++++---
 ui/console.c                    |  3 +-
 15 files changed, 164 insertions(+), 51 deletions(-)

Markus Armbruster Nov. 20, 2021, 9 a.m. UTC | #1

Damien Hedde <damien.hedde@greensocs.com> writes:

> Hi all,
>
> This series adds support for cold-plugging devices using QMP
> commands. It is a step towards machine configuration using QMP, but
> it does not allow the user to add more devices than he could do with
> the CLI options before.
>
> Right now we can add a device using 2 ways:
> + giving "-device" CLI option. In that case the device is
>   cold-plugged: it is created before the machine becomes ready.
> + issuing device_add QMP command. In that case the device is
>   hot-plugged (the command can not be issued before the machine is
>   ready).
>
> This series allows to issue device_add QMP to cold-plug a device
> like we do with "-device" CLI option. All added QMP commands are
> marked as 'unstable' in qapi as they are part of preconfig.
> We achieve this by introducing a new 'x-machine-init' command to
> stop the VM creation at a point where we can cold-plug devices.
>
> We are aware of the ongoing discussion about preconfig future (see
> [1]). This patchset includes no major modifications from the v2 (but
> the scope is reduced) and "x-machine-init" simply stops the
> configuration between qemu_board_init() and qemu_create_cli_devices()
> function calls.
>
> As a consequence, in the current state, the timeline is:

"current state" = with this series applied?

> + "x-machine-init" command
> + "device_add" cold-plug commands (no fw_cfg legacy order support)
> + "x-exit-preconfig" command will then trigger the following
> + "-soundhw" CLI options
> + "-fw_cfg" CLI options
> + usb devices creation
> + "-device" CLI options (with fw_cfg legacy order support)
> + some other devices creation (with fw_cfg legacy order support)
>
> We don't know if the differences between -device/device_add are
> acceptable or not. To reduce/remove them we could move the
> "x-machine-init" stopping point. What do you think ?

I'm not sure I understand this paragraph.

I understand the difference between -device and device_add in master:
cold vs. hot plug.

Your patch series makes "cold" device_add possible, i.e. it reduces
(eliminates?) the difference between -device and device_add when the
latter is "cold".

What difference remains that moving 'the "x-machine-init" stopping
point' would 'reduce/remove'?

> Patches 1, 3 and 5 miss a review.
>
> The series is organized as follow:
>
> + Patches 1 and 2 converts the MachinePhase enum to a qapi definition
>   and add the 'query-machine-phase'. It allows to introspect the
>   current machine phase during preconfig as we will now be able to
>   reach several machine phases using QMP.

If we fold MachinePhase into RunState, we can reuse query-status.

Having two state machines run one after the other feels like one too
many.

> + Patch 3 adds the 'x-machine-init' QMP command to stop QEMU at
>   machine-initialized phase during preconfig.
> + Patch 4 allows issuing device_add QMP command during the
>   machine-initialized phase.
> + Patch 5 improves the doc about preconfig in consequence. 

I understand you want to make progress towards machine configuration
with QMP.  However, QEMU startup is (in my educated opinion) a hole, and
we should be wary of digging deeper.

The "timeline" you gave above illustrates this.  It's a complicated
shuffling of command line options and QMP commands that basically nobody
can keep in working memory.  We have reshuffled it / made it more
complicated quite a few times already to support new features.  Based on
your cover letter, I figure you're making it more complicated once more.

At some point, we need to stop digging us deeper into the hole.  This is
not an objection to merging your work.  It's a call to stop and think.

Let me quote the sketch I posted to the "Stabilize preconfig" thread:

1. Start event loop

2. Feed it CLI left to right.  Each option runs a handler just like each
   QMP command does.

   Options that read a configuration file inject the file into the feed.

   Options that create a monitor create it suspended.

   Options may advance the phase / run state, and they may require
   certain phase(s).

3. When we're done with CLI, resume any monitors we created.

4. Monitors now feed commands to the event loop.  Commands may advance
   the phase / run state, and they may require certain phase(s).

Users can do as much or as little with the CLI as they want.  You'd
probably want to set up a QMP monitor and no more.

device_add becomes possible at a certain state of the phase / run state
machine.  It changes from cold to hot plug at a certain later state.

> [1]: https://lore.kernel.org/qemu-devel/b31f442d28920447690a6b8cee865bdbacde1283.1635160056.git.mprivozn@redhat.com
>
> Thanks for your feedback.

Damien Hedde Nov. 23, 2021, 4:11 p.m. UTC | #2

On 11/20/21 10:00, Markus Armbruster wrote:
> Damien Hedde <damien.hedde@greensocs.com> writes:
> 
>> Hi all,
>>
>> This series adds support for cold-plugging devices using QMP
>> commands. It is a step towards machine configuration using QMP, but
>> it does not allow the user to add more devices than he could do with
>> the CLI options before.
>>
>> Right now we can add a device using 2 ways:
>> + giving "-device" CLI option. In that case the device is
>>    cold-plugged: it is created before the machine becomes ready.
>> + issuing device_add QMP command. In that case the device is
>>    hot-plugged (the command can not be issued before the machine is
>>    ready).
>>
>> This series allows to issue device_add QMP to cold-plug a device
>> like we do with "-device" CLI option. All added QMP commands are
>> marked as 'unstable' in qapi as they are part of preconfig.
>> We achieve this by introducing a new 'x-machine-init' command to
>> stop the VM creation at a point where we can cold-plug devices.
>>
>> We are aware of the ongoing discussion about preconfig future (see
>> [1]). This patchset includes no major modifications from the v2 (but
>> the scope is reduced) and "x-machine-init" simply stops the
>> configuration between qemu_board_init() and qemu_create_cli_devices()
>> function calls.
>>
>> As a consequence, in the current state, the timeline is:
> 
> "current state" = with this series applied?

yes. this patchset adds the first two steps.

> 
>> + "x-machine-init" command
>> + "device_add" cold-plug commands (no fw_cfg legacy order support)
>> + "x-exit-preconfig" command will then trigger the following
>> + "-soundhw" CLI options
>> + "-fw_cfg" CLI options
>> + usb devices creation
>> + "-device" CLI options (with fw_cfg legacy order support)
>> + some other devices creation (with fw_cfg legacy order support)
>>
>> We don't know if the differences between -device/device_add are
>> acceptable or not. To reduce/remove them we could move the
>> "x-machine-init" stopping point. What do you think ?
> 
> I'm not sure I understand this paragraph.
> 
> I understand the difference between -device and device_add in master:
> cold vs. hot plug.
> 
> Your patch series makes "cold" device_add possible, i.e. it reduces
> (eliminates?) the difference between -device and device_add when the
> latter is "cold".

Yes.
Apart, before this patchset cold-plugging with device_add was not 
possible at all.

So, any difference between -device and a cold device_add is added here. 
(no bad intention, the patch did not move since v1 and this part of the 
code is just really tricky to understand...)

> 
> What difference remains that moving 'the "x-machine-init" stopping
> point' would 'reduce/remove'?

To answer this, let's take a look at qemu_create_cli_devices() (I 
removed some comments).

| 1 static void qemu_create_cli_devices(void)
| 2 {
| 3     DeviceOption *opt;
| 4
| 5     soundhw_init();
| 6
| 7     qemu_opts_foreach(qemu_find_opts("fw_cfg"),
| 8                       parse_fw_cfg, fw_cfg_find(), &error_fatal);
| 9
|10     /* init USB devices */
|11     if (machine_usb(current_machine)) {
|12         if (foreach_device_config(DEV_USB, usb_parse) < 0)
|13             exit(1);
|14     }
|15
|16     /* init generic devices */
|17     rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
|18     qemu_opts_foreach(qemu_find_opts("device"),
|19                       device_init_func, NULL, &error_fatal);
|20     QTAILQ_FOREACH(opt, &device_opts, next) {
|21         loc_push_restore(&opt->loc);
|22         qdev_device_add_from_qdict(opt->opts, true, &error_fatal);
|23         loc_pop(&opt->loc);
|24     }
|25     rom_reset_order_override();
|26 }

The configuration timeline is:
Line 3    : handle "-soundhw" (deprecated).
Line 7-8  : handle "-fw_cfg"
Line 10-14: related to USB devices
Line 18-19: handle "-device" CLI options (legacy cli format)
Line 20-24: handle "-device" CLI options (json format)

With this patchset implementation, we pause just before calling this 
function (it seemed logical to stop here, given the machine phase). But 
the above timeline happens after we paused and issued device_add to cold 
plug devices. As a consequence there is a difference between (1) giving 
a -device option and (2) issuing a device_add at this pause point.

The biggest difference is the fw_cfg option I think: it is related with 
the rom_set_order_override()/rom_reset_order_override() (line 17 and 
25). There is also the usb devices parts in between. I lack the 
knowledge about fw_cfg/usb to tell if it is important or not.

What I wanted to say is I don't know if the difference is acceptable. If 
we want device_add to support all -device use cases, it is not. In that 
case we need to stop either in the middle of this function (line 15) or 
at the end (better with your sketch in mind).

Note that rom_set_order_override()/rom_reset_order_override() only 
set/reset a switch variable that changes how fw_cfg files are sorted. It 
could be integrated into device_add code (and removed from the above 
function) without changing the behavior.

> 
>> Patches 1, 3 and 5 miss a review.
>>
>> The series is organized as follow:
>>
>> + Patches 1 and 2 converts the MachinePhase enum to a qapi definition
>>    and add the 'query-machine-phase'. It allows to introspect the
>>    current machine phase during preconfig as we will now be able to
>>    reach several machine phases using QMP.
> 
> If we fold MachinePhase into RunState, we can reuse query-status.
> 
> Having two state machines run one after the other feels like one too
> many.
> 
>> + Patch 3 adds the 'x-machine-init' QMP command to stop QEMU at
>>    machine-initialized phase during preconfig.
>> + Patch 4 allows issuing device_add QMP command during the
>>    machine-initialized phase.
>> + Patch 5 improves the doc about preconfig in consequence.
> 
> I understand you want to make progress towards machine configuration
> with QMP.  However, QEMU startup is (in my educated opinion) a hole, and
> we should be wary of digging deeper.
> 
> The "timeline" you gave above illustrates this.  It's a complicated
> shuffling of command line options and QMP commands that basically nobody
> can keep in working memory.  We have reshuffled it / made it more
> complicated quite a few times already to support new features.  Based on
> your cover letter, I figure you're making it more complicated once more.
> 
> At some point, we need to stop digging us deeper into the hole.  This is
> not an objection to merging your work.  It's a call to stop and think.

That's why we re-posted this as RFC. Reading the preconfig thread, I had 
the feeling what we've initially proposed 6 months ago was not going 
into the direction discussed in the thread. We don't want to put more 
effort in a dead-end but we are committed into fixing it so that it fits 
into the good direction.

Do you mean we should wait for "stabilize preconfig" thread to arrive to 
some conclusion before we continue to work on this ?

Thanks,
Damien

> 
> Let me quote the sketch I posted to the "Stabilize preconfig" thread:
> 
> 1. Start event loop
> 
> 2. Feed it CLI left to right.  Each option runs a handler just like each
>     QMP command does.
> 
>     Options that read a configuration file inject the file into the feed.
> 
>     Options that create a monitor create it suspended.
> 
>     Options may advance the phase / run state, and they may require
>     certain phase(s).
> 
> 3. When we're done with CLI, resume any monitors we created.
> 
> 4. Monitors now feed commands to the event loop.  Commands may advance
>     the phase / run state, and they may require certain phase(s).
> 
> Users can do as much or as little with the CLI as they want.  You'd
> probably want to set up a QMP monitor and no more.
> 
> device_add becomes possible at a certain state of the phase / run state
> machine.  It changes from cold to hot plug at a certain later state.
> 
>> [1]: https://lore.kernel.org/qemu-devel/b31f442d28920447690a6b8cee865bdbacde1283.1635160056.git.mprivozn@redhat.com
>>
>> Thanks for your feedback.
>

Markus Armbruster Nov. 24, 2021, 1:50 p.m. UTC | #3

Damien Hedde <damien.hedde@greensocs.com> writes:

> On 11/20/21 10:00, Markus Armbruster wrote:
>> Damien Hedde <damien.hedde@greensocs.com> writes:
>> 
>>> Hi all,
>>>
>>> This series adds support for cold-plugging devices using QMP
>>> commands. It is a step towards machine configuration using QMP, but
>>> it does not allow the user to add more devices than he could do with
>>> the CLI options before.
>>>
>>> Right now we can add a device using 2 ways:
>>> + giving "-device" CLI option. In that case the device is
>>>    cold-plugged: it is created before the machine becomes ready.
>>> + issuing device_add QMP command. In that case the device is
>>>    hot-plugged (the command can not be issued before the machine is
>>>    ready).
>>>
>>> This series allows to issue device_add QMP to cold-plug a device
>>> like we do with "-device" CLI option. All added QMP commands are
>>> marked as 'unstable' in qapi as they are part of preconfig.
>>> We achieve this by introducing a new 'x-machine-init' command to
>>> stop the VM creation at a point where we can cold-plug devices.
>>>
>>> We are aware of the ongoing discussion about preconfig future (see
>>> [1]). This patchset includes no major modifications from the v2 (but
>>> the scope is reduced) and "x-machine-init" simply stops the
>>> configuration between qemu_board_init() and qemu_create_cli_devices()
>>> function calls.
>>>
>>> As a consequence, in the current state, the timeline is:
>> 
>> "current state" = with this series applied?
>
> yes. this patchset adds the first two steps.
>
>>> + "x-machine-init" command
>>> + "device_add" cold-plug commands (no fw_cfg legacy order support)
>>> + "x-exit-preconfig" command will then trigger the following
>>> + "-soundhw" CLI options
>>> + "-fw_cfg" CLI options
>>> + usb devices creation
>>> + "-device" CLI options (with fw_cfg legacy order support)
>>> + some other devices creation (with fw_cfg legacy order support)
>>>
>>> We don't know if the differences between -device/device_add are
>>> acceptable or not. To reduce/remove them we could move the
>>> "x-machine-init" stopping point. What do you think ?
>> 
>> I'm not sure I understand this paragraph.
>> I understand the difference between -device and device_add in master:
>> cold vs. hot plug.
>> 
>> Your patch series makes "cold" device_add possible, i.e. it reduces
>> (eliminates?) the difference between -device and device_add when the
>> latter is "cold".
>
> Yes.
> Apart, before this patchset cold-plugging with device_add was not
> possible at all.
>
> So, any difference between -device and a cold device_add is added
> here. (no bad intention, the patch did not move since v1 and this part
> of the code is just really tricky to understand...)
>
>> What difference remains that moving 'the "x-machine-init" stopping
>> point' would 'reduce/remove'?
>
> To answer this, let's take a look at qemu_create_cli_devices() (I
> removed some comments).
>
> | 1 static void qemu_create_cli_devices(void)
> | 2 {
> | 3     DeviceOption *opt;
> | 4
> | 5     soundhw_init();
> | 6
> | 7     qemu_opts_foreach(qemu_find_opts("fw_cfg"),
> | 8                       parse_fw_cfg, fw_cfg_find(), &error_fatal);
> | 9
> |10     /* init USB devices */
> |11     if (machine_usb(current_machine)) {
> |12         if (foreach_device_config(DEV_USB, usb_parse) < 0)
> |13             exit(1);
> |14     }
> |15
> |16     /* init generic devices */
> |17     rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
> |18     qemu_opts_foreach(qemu_find_opts("device"),
> |19                       device_init_func, NULL, &error_fatal);
> |20     QTAILQ_FOREACH(opt, &device_opts, next) {
> |21         loc_push_restore(&opt->loc);
> |22         qdev_device_add_from_qdict(opt->opts, true, &error_fatal);
> |23         loc_pop(&opt->loc);
> |24     }
> |25     rom_reset_order_override();
> |26 }
>
> The configuration timeline is:
> Line 3    : handle "-soundhw" (deprecated).
> Line 7-8  : handle "-fw_cfg"
> Line 10-14: related to USB devices
> Line 18-19: handle "-device" CLI options (legacy cli format)
> Line 20-24: handle "-device" CLI options (json format)
>
> With this patchset implementation, we pause just before calling this
> function (it seemed logical to stop here, given the machine
> phase). But the above timeline happens after we paused and issued
> device_add to cold plug devices. As a consequence there is a
> difference between (1) giving a -device option and (2) issuing a
> device_add at this pause point.

I see.

> The biggest difference is the fw_cfg option I think: it is related
> with the rom_set_order_override()/rom_reset_order_override() (line 17
> and 25). There is also the usb devices parts in between. I lack the 
> knowledge about fw_cfg/usb to tell if it is important or not.
>
> What I wanted to say is I don't know if the difference is
> acceptable. If we want device_add to support all -device use cases, it
> is not. In that case we need to stop either in the middle of this
> function (line 15) or at the end (better with your sketch in mind).
>
> Note that rom_set_order_override()/rom_reset_order_override() only
> set/reset a switch variable that changes how fw_cfg files are
> sorted. It could be integrated into device_add code (and removed from
> the above function) without changing the behavior.

For me, the part that puts me off is interleaving CLI and QMP.

We process the CLI in an order few people understand, and only while
staring at the code.  That's bad.

Injecting QMP at certain points in that sequence can only make it worse.

If I understand your proposal correctly, we need special QMP commands to
opt into / manage QMP command injection, not least to avoid incompatible
change.

For instance, this needs to keep working:

1. Plug a SCSI HBA with CLI, say -device virtio-scsi,id=scsi-hba0

2. Plug a SCSI device with QMP, say {"execute": "device_add",
   "arguments": {"driver": "scsi-cd"}}

>>> Patches 1, 3 and 5 miss a review.
>>>
>>> The series is organized as follow:
>>>
>>> + Patches 1 and 2 converts the MachinePhase enum to a qapi definition
>>>    and add the 'query-machine-phase'. It allows to introspect the
>>>    current machine phase during preconfig as we will now be able to
>>>    reach several machine phases using QMP.
>> 
>> If we fold MachinePhase into RunState, we can reuse query-status.
>> 
>> Having two state machines run one after the other feels like one too
>> many.
>> 
>>> + Patch 3 adds the 'x-machine-init' QMP command to stop QEMU at
>>>    machine-initialized phase during preconfig.
>>> + Patch 4 allows issuing device_add QMP command during the
>>>    machine-initialized phase.
>>> + Patch 5 improves the doc about preconfig in consequence.
>> 
>> I understand you want to make progress towards machine configuration
>> with QMP.  However, QEMU startup is (in my educated opinion) a hole, and
>> we should be wary of digging deeper.
>> 
>> The "timeline" you gave above illustrates this.  It's a complicated
>> shuffling of command line options and QMP commands that basically nobody
>> can keep in working memory.  We have reshuffled it / made it more
>> complicated quite a few times already to support new features.  Based on
>> your cover letter, I figure you're making it more complicated once more.
>> 
>> At some point, we need to stop digging us deeper into the hole.  This is
>> not an objection to merging your work.  It's a call to stop and think.
>
> That's why we re-posted this as RFC. Reading the preconfig thread, I
> had the feeling what we've initially proposed 6 months ago was not
> going into the direction discussed in the thread. We don't want to put
> more effort in a dead-end but we are committed into fixing it so that
> it fits into the good direction.

Appreciated!

> Do you mean we should wait for "stabilize preconfig" thread to arrive
> to some conclusion before we continue to work on this ?

"Wait for" feels dangerously passive.  "Push for"?

> Thanks,
> Damien
>
>> Let me quote the sketch I posted to the "Stabilize preconfig" thread:
>> 
>> 1. Start event loop
>> 
>> 2. Feed it CLI left to right.  Each option runs a handler just like each
>>     QMP command does.
>> 
>>     Options that read a configuration file inject the file into the feed.
>> 
>>     Options that create a monitor create it suspended.
>> 
>>     Options may advance the phase / run state, and they may require
>>     certain phase(s).
>> 
>> 3. When we're done with CLI, resume any monitors we created.
>> 
>> 4. Monitors now feed commands to the event loop.  Commands may advance
>>     the phase / run state, and they may require certain phase(s).
>> 
>> Users can do as much or as little with the CLI as they want.  You'd
>> probably want to set up a QMP monitor and no more.
>> 
>> device_add becomes possible at a certain state of the phase / run state
>> machine.  It changes from cold to hot plug at a certain later state.
>> 
>>> [1]: https://lore.kernel.org/qemu-devel/b31f442d28920447690a6b8cee865bdbacde1283.1635160056.git.mprivozn@redhat.com
>>>
>>> Thanks for your feedback.
>>

Daniel P. Berrangé Nov. 24, 2021, 2:07 p.m. UTC | #4

On Wed, Nov 24, 2021 at 02:50:11PM +0100, Markus Armbruster wrote:
> Damien Hedde <damien.hedde@greensocs.com> writes:
> 
> > The biggest difference is the fw_cfg option I think: it is related
> > with the rom_set_order_override()/rom_reset_order_override() (line 17
> > and 25). There is also the usb devices parts in between. I lack the 
> > knowledge about fw_cfg/usb to tell if it is important or not.
> >
> > What I wanted to say is I don't know if the difference is
> > acceptable. If we want device_add to support all -device use cases, it
> > is not. In that case we need to stop either in the middle of this
> > function (line 15) or at the end (better with your sketch in mind).
> >
> > Note that rom_set_order_override()/rom_reset_order_override() only
> > set/reset a switch variable that changes how fw_cfg files are
> > sorted. It could be integrated into device_add code (and removed from
> > the above function) without changing the behavior.
> 
> For me, the part that puts me off is interleaving CLI and QMP.
> 
> We process the CLI in an order few people understand, and only while
> staring at the code.  That's bad.
> 
> Injecting QMP at certain points in that sequence can only make it worse.

Yep, I share your unease here.. especially wrt this quoted text
from later:

  > >> Users can do as much or as little with the CLI as they want.  You'd
  > >> probably want to set up a QMP monitor and no more.

I would say that is a case of overkill. It can only make our
lives harder as maintainers in the long term, if we have to
worry about such arbitrary mixing of QMP and CLI. This is
also why I'm pretty uneasy about the 'preconfig' stuff as
implemented today in general.

It is a half-way house that doesn't really give mgmt apps
what they want, which is a 100% QAPI-only config. If mgmt
apps start using preconfig, it won't make life any better
for them and will also lock QEMU maintainers into supporting
this half-way house.

We have a bit of a track record with QEMU of introducing
partial solutions and never quite finishing the job. There's
little strong incentive to ever finish it, if you can freely
mix both old and new style forever, and thus maintainers are
burdened forever with both.

IMHO, we should only try to support the non-mixed scenarios

  - 100% of hardware configured via CLI args
  - 100% of hardware configured via QAPI (whether live in
    QMP, or fed in via a QAPI based JSON/YAML config file)

so that we only have two clear cases we need to worry about
dealing with.

Focus our efforts 100% of the 100% QAPI scenario and don't
divert energy into short term hybrid solutions.


> >> Let me quote the sketch I posted to the "Stabilize preconfig" thread:
> >> 
> >> 1. Start event loop
> >> 
> >> 2. Feed it CLI left to right.  Each option runs a handler just like each
> >>     QMP command does.
> >> 
> >>     Options that read a configuration file inject the file into the feed.
> >> 
> >>     Options that create a monitor create it suspended.
> >> 
> >>     Options may advance the phase / run state, and they may require
> >>     certain phase(s).
> >> 
> >> 3. When we're done with CLI, resume any monitors we created.
> >> 
> >> 4. Monitors now feed commands to the event loop.  Commands may advance
> >>     the phase / run state, and they may require certain phase(s).
> >> 
> >> Users can do as much or as little with the CLI as they want.  You'd
> >> probably want to set up a QMP monitor and no more.
> >> 
> >> device_add becomes possible at a certain state of the phase / run state
> >> machine.  It changes from cold to hot plug at a certain later state.
> >> 
> >>> [1]: https://lore.kernel.org/qemu-devel/b31f442d28920447690a6b8cee865bdbacde1283.1635160056.git.mprivozn@redhat.com
> >>>
> >>> Thanks for your feedback.
> >> 
> 

Regards,
Daniel

Markus Armbruster Nov. 24, 2021, 2:51 p.m. UTC | #5

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Nov 24, 2021 at 02:50:11PM +0100, Markus Armbruster wrote:
>> Damien Hedde <damien.hedde@greensocs.com> writes:
>> 
>> > The biggest difference is the fw_cfg option I think: it is related
>> > with the rom_set_order_override()/rom_reset_order_override() (line 17
>> > and 25). There is also the usb devices parts in between. I lack the 
>> > knowledge about fw_cfg/usb to tell if it is important or not.
>> >
>> > What I wanted to say is I don't know if the difference is
>> > acceptable. If we want device_add to support all -device use cases, it
>> > is not. In that case we need to stop either in the middle of this
>> > function (line 15) or at the end (better with your sketch in mind).
>> >
>> > Note that rom_set_order_override()/rom_reset_order_override() only
>> > set/reset a switch variable that changes how fw_cfg files are
>> > sorted. It could be integrated into device_add code (and removed from
>> > the above function) without changing the behavior.
>> 
>> For me, the part that puts me off is interleaving CLI and QMP.
>> 
>> We process the CLI in an order few people understand, and only while
>> staring at the code.  That's bad.
>> 
>> Injecting QMP at certain points in that sequence can only make it worse.
>
> Yep, I share your unease here.. especially wrt this quoted text
> from later:
>
>   > >> Users can do as much or as little with the CLI as they want.  You'd
>   > >> probably want to set up a QMP monitor and no more.
>
> I would say that is a case of overkill. It can only make our
> lives harder as maintainers in the long term, if we have to
> worry about such arbitrary mixing of QMP and CLI. This is
> also why I'm pretty uneasy about the 'preconfig' stuff as
> implemented today in general.
>
> It is a half-way house that doesn't really give mgmt apps
> what they want, which is a 100% QAPI-only config. If mgmt
> apps start using preconfig, it won't make life any better
> for them and will also lock QEMU maintainers into supporting
> this half-way house.

Misunderstanding?  The paragraph you quoted is about this design:

    1. Start event loop
    
    2. Feed it CLI left to right.  Each option runs a handler just like each
        QMP command does.
    
        Options that read a configuration file inject the file into the feed.
    
        Options that create a monitor create it suspended.
    
        Options may advance the phase / run state, and they may require
        certain phase(s).
    
    3. When we're done with CLI, resume any monitors we created.
    
    4. Monitors now feed commands to the event loop.  Commands may advance
        the phase / run state, and they may require certain phase(s).
    
    Users can do as much or as little with the CLI as they want.  You'd
    probably want to set up a QMP monitor and no more.
    
    device_add becomes possible at a certain state of the phase / run state
    machine.  It changes from cold to hot plug at a certain later state.

Certainly enables 100% QAPI-only config.  It just doesn't *force* you to
100%.  Feature.

> We have a bit of a track record with QEMU of introducing
> partial solutions and never quite finishing the job. There's
> little strong incentive to ever finish it, if you can freely
> mix both old and new style forever, and thus maintainers are
> burdened forever with both.
>
> IMHO, we should only try to support the non-mixed scenarios
>
>   - 100% of hardware configured via CLI args
>   - 100% of hardware configured via QAPI (whether live in
>     QMP, or fed in via a QAPI based JSON/YAML config file)
>
> so that we only have two clear cases we need to worry about
> dealing with.
>
> Focus our efforts 100% of the 100% QAPI scenario and don't
> divert energy into short term hybrid solutions.

The design above pretty much requires 100% QAPI.

It's based on the notion that there's no real difference between a CLI
option and a QMP command that doesn't return a value.  So treat the CLI
more like a monitor.

For sanity's sake, make it not race with the other monitors by starting
them suspended.

This design is arguably *less* hybrid than one that treats a (severely
dumbed down) CLI unlike a monitor.

Damien Hedde Nov. 25, 2021, 12:42 p.m. UTC | #6

On 11/24/21 15:51, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
>> On Wed, Nov 24, 2021 at 02:50:11PM +0100, Markus Armbruster wrote:
>>> Damien Hedde <damien.hedde@greensocs.com> writes:
>>>
>>>> The biggest difference is the fw_cfg option I think: it is related
>>>> with the rom_set_order_override()/rom_reset_order_override() (line 17
>>>> and 25). There is also the usb devices parts in between. I lack the
>>>> knowledge about fw_cfg/usb to tell if it is important or not.
>>>>
>>>> What I wanted to say is I don't know if the difference is
>>>> acceptable. If we want device_add to support all -device use cases, it
>>>> is not. In that case we need to stop either in the middle of this
>>>> function (line 15) or at the end (better with your sketch in mind).
>>>>
>>>> Note that rom_set_order_override()/rom_reset_order_override() only
>>>> set/reset a switch variable that changes how fw_cfg files are
>>>> sorted. It could be integrated into device_add code (and removed from
>>>> the above function) without changing the behavior.
>>>
>>> For me, the part that puts me off is interleaving CLI and QMP.
>>>
>>> We process the CLI in an order few people understand, and only while
>>> staring at the code.  That's bad.
>>>
>>> Injecting QMP at certain points in that sequence can only make it worse.
>>
>> Yep, I share your unease here.. especially wrt this quoted text
>> from later:
>>
>>    > >> Users can do as much or as little with the CLI as they want.  You'd
>>    > >> probably want to set up a QMP monitor and no more.
>>
>> I would say that is a case of overkill. It can only make our
>> lives harder as maintainers in the long term, if we have to
>> worry about such arbitrary mixing of QMP and CLI. This is
>> also why I'm pretty uneasy about the 'preconfig' stuff as
>> implemented today in general.
>>
>> It is a half-way house that doesn't really give mgmt apps
>> what they want, which is a 100% QAPI-only config. If mgmt
>> apps start using preconfig, it won't make life any better
>> for them and will also lock QEMU maintainers into supporting
>> this half-way house.
> 
> Misunderstanding?  The paragraph you quoted is about this design:
> 
>      1. Start event loop
>      
>      2. Feed it CLI left to right.  Each option runs a handler just like each
>          QMP command does.
>      
>          Options that read a configuration file inject the file into the feed.
>      
>          Options that create a monitor create it suspended.
>      
>          Options may advance the phase / run state, and they may require
>          certain phase(s).
>      
>      3. When we're done with CLI, resume any monitors we created.
>      
>      4. Monitors now feed commands to the event loop.  Commands may advance
>          the phase / run state, and they may require certain phase(s).
>      
>      Users can do as much or as little with the CLI as they want.  You'd
>      probably want to set up a QMP monitor and no more.
>      
>      device_add becomes possible at a certain state of the phase / run state
>      machine.  It changes from cold to hot plug at a certain later state.
> 
> Certainly enables 100% QAPI-only config.  It just doesn't *force* you to
> 100%.  Feature.
> 
>> We have a bit of a track record with QEMU of introducing
>> partial solutions and never quite finishing the job. There's
>> little strong incentive to ever finish it, if you can freely
>> mix both old and new style forever, and thus maintainers are
>> burdened forever with both.
>>
>> IMHO, we should only try to support the non-mixed scenarios
>>
>>    - 100% of hardware configured via CLI args
>>    - 100% of hardware configured via QAPI (whether live in
>>      QMP, or fed in via a QAPI based JSON/YAML config file)
>>
>> so that we only have two clear cases we need to worry about
>> dealing with.
>>
>> Focus our efforts 100% of the 100% QAPI scenario and don't
>> divert energy into short term hybrid solutions.
> 
> The design above pretty much requires 100% QAPI.
> 
> It's based on the notion that there's no real difference between a CLI
> option and a QMP command that doesn't return a value.  So treat the CLI
> more like a monitor.
> 
> For sanity's sake, make it not race with the other monitors by starting
> them suspended.
> 
> This design is arguably *less* hybrid than one that treats a (severely
> dumbed down) CLI unlike a monitor.
>

It seems there is a big gap from where we are now to a full QAPI startup 
support.
Could we adopt a plan which would allow us to progress from where we are 
to full QAPI support in small steps ?

For example, the following:

  1. CLI/QMP interleaving seems to be big issue right now. We could 
solve this by making -preconfig stop only after all CLI options are 
"consumed".
For example if you give -preconfig and some -device, qemu won't stop 
before having created the devices.

Meaning you would do
$qemu [out-of-order CLI with -preconfig] then jump into the monitors.

Depending on your use case, you would have to give a few CLI options so 
that -preconfig stops early enough.

  2. Then we can enable QMP commands one by one corresponding to 
unsupported and needed/cleaned up CLI options. They will check and/or 
advance the phase/runstate.

Basically this would mean we have to first convert CLI options that are 
used last in the startup procedure.
Along the road we will be able to make -preconfig stop earlier and earlier.

You could do a ~0% CLI startup at any point during the development
$qemu -monitor... --preconfig
but you would have a reduced set of configuration possibilities due the 
lack of QAPI support.

In addition, optionally, if we want to go with the left-to-right CLI 
parsing and reading a CLI script file like Markus proposed, we could 
have something like:
$qemu [out-of-order CLI] --preconfig [in-order CLI]
with [in-order CLI] being a 1:1 equivalent of QMP commands. [in-order 
CLI] will still be parsed and executed before jumping in the monitors.

Damien

Daniel P. Berrangé Nov. 25, 2021, 12:55 p.m. UTC | #7

On Wed, Nov 24, 2021 at 03:51:23PM +0100, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Wed, Nov 24, 2021 at 02:50:11PM +0100, Markus Armbruster wrote:
> >> Damien Hedde <damien.hedde@greensocs.com> writes:
> >> 
> >> > The biggest difference is the fw_cfg option I think: it is related
> >> > with the rom_set_order_override()/rom_reset_order_override() (line 17
> >> > and 25). There is also the usb devices parts in between. I lack the 
> >> > knowledge about fw_cfg/usb to tell if it is important or not.
> >> >
> >> > What I wanted to say is I don't know if the difference is
> >> > acceptable. If we want device_add to support all -device use cases, it
> >> > is not. In that case we need to stop either in the middle of this
> >> > function (line 15) or at the end (better with your sketch in mind).
> >> >
> >> > Note that rom_set_order_override()/rom_reset_order_override() only
> >> > set/reset a switch variable that changes how fw_cfg files are
> >> > sorted. It could be integrated into device_add code (and removed from
> >> > the above function) without changing the behavior.
> >> 
> >> For me, the part that puts me off is interleaving CLI and QMP.
> >> 
> >> We process the CLI in an order few people understand, and only while
> >> staring at the code.  That's bad.
> >> 
> >> Injecting QMP at certain points in that sequence can only make it worse.
> >
> > Yep, I share your unease here.. especially wrt this quoted text
> > from later:
> >
> >   > >> Users can do as much or as little with the CLI as they want.  You'd
> >   > >> probably want to set up a QMP monitor and no more.
> >
> > I would say that is a case of overkill. It can only make our
> > lives harder as maintainers in the long term, if we have to
> > worry about such arbitrary mixing of QMP and CLI. This is
> > also why I'm pretty uneasy about the 'preconfig' stuff as
> > implemented today in general.
> >
> > It is a half-way house that doesn't really give mgmt apps
> > what they want, which is a 100% QAPI-only config. If mgmt
> > apps start using preconfig, it won't make life any better
> > for them and will also lock QEMU maintainers into supporting
> > this half-way house.
> 
> Misunderstanding?  The paragraph you quoted is about this design:
> 
>     1. Start event loop
>     
>     2. Feed it CLI left to right.  Each option runs a handler just like each
>         QMP command does.
>     
>         Options that read a configuration file inject the file into the feed.
>     
>         Options that create a monitor create it suspended.
>     
>         Options may advance the phase / run state, and they may require
>         certain phase(s).
>     
>     3. When we're done with CLI, resume any monitors we created.
>     
>     4. Monitors now feed commands to the event loop.  Commands may advance
>         the phase / run state, and they may require certain phase(s).
>     
>     Users can do as much or as little with the CLI as they want.  You'd
>     probably want to set up a QMP monitor and no more.
>     
>     device_add becomes possible at a certain state of the phase / run state
>     machine.  It changes from cold to hot plug at a certain later state.
> 
> Certainly enables 100% QAPI-only config.  It just doesn't *force* you to
> 100%.  Feature.

This is far away from how our CLI handling works today, since we don't
require left-to-right args. Converting existing binaries to this
approach is going to be hard with a high risk of regressions. This
is especiall true if we try todo an incremental conversion, of
different pieces of the CLI and allow arbitrary mixing of CLI and
QMP throughout.

IMHO a pre-requisite for changing CLI arg processing ordering, is
a fresh binary that leaves QemuOpts behind for its CLI, so any
usage is consistent with QAPI. 

> > We have a bit of a track record with QEMU of introducing
> > partial solutions and never quite finishing the job. There's
> > little strong incentive to ever finish it, if you can freely
> > mix both old and new style forever, and thus maintainers are
> > burdened forever with both.
> >
> > IMHO, we should only try to support the non-mixed scenarios
> >
> >   - 100% of hardware configured via CLI args
> >   - 100% of hardware configured via QAPI (whether live in
> >     QMP, or fed in via a QAPI based JSON/YAML config file)
> >
> > so that we only have two clear cases we need to worry about
> > dealing with.
> >
> > Focus our efforts 100% of the 100% QAPI scenario and don't
> > divert energy into short term hybrid solutions.
> 
> The design above pretty much requires 100% QAPI.
> 
> It's based on the notion that there's no real difference between a CLI
> option and a QMP command that doesn't return a value.  So treat the CLI
> more like a monitor.
> 
> For sanity's sake, make it not race with the other monitors by starting
> them suspended.
> 
> This design is arguably *less* hybrid than one that treats a (severely
> dumbed down) CLI unlike a monitor.

Yes, my concern is more about how that gets introduced into the code
for the existing binaries, vs having a clean break where we're 100%
QAPI and no back compat CLI options exist.

Regards,
Daniel

Dr. David Alan Gilbert Nov. 29, 2021, 7:55 p.m. UTC | #8

* Markus Armbruster (armbru@redhat.com) wrote:
> Damien Hedde <damien.hedde@greensocs.com> writes:

> > Patches 1, 3 and 5 miss a review.
> >
> > The series is organized as follow:
> >
> > + Patches 1 and 2 converts the MachinePhase enum to a qapi definition
> >   and add the 'query-machine-phase'. It allows to introspect the
> >   current machine phase during preconfig as we will now be able to
> >   reach several machine phases using QMP.
> 
> If we fold MachinePhase into RunState, we can reuse query-status.
> 
> Having two state machines run one after the other feels like one too
> many.

Be careful, the RunState is API and things watch for events on it, so
any changes to it are delicate.

Dave

> > + Patch 3 adds the 'x-machine-init' QMP command to stop QEMU at
> >   machine-initialized phase during preconfig.
> > + Patch 4 allows issuing device_add QMP command during the
> >   machine-initialized phase.
> > + Patch 5 improves the doc about preconfig in consequence. 
> 
> I understand you want to make progress towards machine configuration
> with QMP.  However, QEMU startup is (in my educated opinion) a hole, and
> we should be wary of digging deeper.
> 
> The "timeline" you gave above illustrates this.  It's a complicated
> shuffling of command line options and QMP commands that basically nobody
> can keep in working memory.  We have reshuffled it / made it more
> complicated quite a few times already to support new features.  Based on
> your cover letter, I figure you're making it more complicated once more.
> 
> At some point, we need to stop digging us deeper into the hole.  This is
> not an objection to merging your work.  It's a call to stop and think.
> 
> Let me quote the sketch I posted to the "Stabilize preconfig" thread:
> 
> 1. Start event loop
> 
> 2. Feed it CLI left to right.  Each option runs a handler just like each
>    QMP command does.
> 
>    Options that read a configuration file inject the file into the feed.
> 
>    Options that create a monitor create it suspended.
> 
>    Options may advance the phase / run state, and they may require
>    certain phase(s).
> 
> 3. When we're done with CLI, resume any monitors we created.
> 
> 4. Monitors now feed commands to the event loop.  Commands may advance
>    the phase / run state, and they may require certain phase(s).
> 
> Users can do as much or as little with the CLI as they want.  You'd
> probably want to set up a QMP monitor and no more.
> 
> device_add becomes possible at a certain state of the phase / run state
> machine.  It changes from cold to hot plug at a certain later state.
> 
> > [1]: https://lore.kernel.org/qemu-devel/b31f442d28920447690a6b8cee865bdbacde1283.1635160056.git.mprivozn@redhat.com
> >
> > Thanks for your feedback.
>

[RFC,v3,0/5] QMP support for cold-plugging devices

Message

Comments