diff mbox

Fix exit on 'pci_add' Monitor command

Message ID 20090924111601.1d3668d2@doriath
State Superseded
Headers show

Commit Message

Luiz Capitulino Sept. 24, 2009, 2:16 p.m. UTC
If the user issues one of the following commands to the Monitor:

pci_add pci_addr=auto nic model=None
pci_add pci_addr=auto nic model=?

QEMU will exit, because the function used to perform sanity
checks (qemu_check_nic_model_list()) exits on error.

This function is used by the startup code, where it makes
sense to exit on error, but in the Monitor it doesn't.

Changing qemu_check_nic_model_list() to not exit on error
is not possible though, as it's used by the board init
code (the PC one), where all board specific code must have
void return.

The way I've chosen to fix this was to introduce a new function
called pci_nic_supported(), which checks if the NIC is supported
and returns true or false accordingly.

The new function is used only by the Monitor, it performs the
necessary check and returns an error in case the NIC is not
supported, thus qemu_check_nic_model_list()'s exit is never trigged.

The following should be observed:

1. Only the specified NIC is checked, the default one is assumed
to be supported

2. The NIC query command (model=?) won't work with pci_add, the
right way to do this with the Monitor is to add a new command

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
---
 hw/pci-hotplug.c |    4 ++++
 hw/pci.c         |   11 +++++++++++
 hw/pci.h         |    1 +
 3 files changed, 16 insertions(+), 0 deletions(-)

Comments

Markus Armbruster Sept. 24, 2009, 6:12 p.m. UTC | #1
Luiz Capitulino <lcapitulino@redhat.com> writes:

> If the user issues one of the following commands to the Monitor:
>
> pci_add pci_addr=auto nic model=None
> pci_add pci_addr=auto nic model=?
>
> QEMU will exit, because the function used to perform sanity
> checks (qemu_check_nic_model_list()) exits on error.

Yes.  I meant to fix this, but you beat me to the line.

There might be more bugs like this one.

> This function is used by the startup code, where it makes
> sense to exit on error, but in the Monitor it doesn't.
>
> Changing qemu_check_nic_model_list() to not exit on error
> is not possible though, as it's used by the board init
> code (the PC one), where all board specific code must have
> void return.
>
> The way I've chosen to fix this was to introduce a new function
> called pci_nic_supported(), which checks if the NIC is supported
> and returns true or false accordingly.
>
> The new function is used only by the Monitor, it performs the
> necessary check and returns an error in case the NIC is not
> supported, thus qemu_check_nic_model_list()'s exit is never trigged.
>
> The following should be observed:
>
> 1. Only the specified NIC is checked, the default one is assumed
> to be supported
>
> 2. The NIC query command (model=?) won't work with pci_add, the
> right way to do this with the Monitor is to add a new command
>
> Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

It's a minimal fix, and I trust it works.  But is it the proper fix?

It works by checking the model before calling pci_nic_init() on behalf
of monitor cmmand "pci_add ... nic ...", so that when pci_nic_init()
checks the model again, it always succeeds, and thus never exits.

My minor complaint is that the new model check pci_nic_supported()
duplicates the existing check in qemu_check_nic_model_list().

My major complaint is that I'd rather see the code cleaned up there.
It's perfectly fine for code that can run only during startup to
terminate the program on configuration error.  Code to be used after
startup (used from monitor, in particular) must not do that.  Instead,
it should return failure up the call chain, until we reach either
startup code or monitor code, where the policy how to handle the error
resides.
Luiz Capitulino Sept. 24, 2009, 7:31 p.m. UTC | #2
On Thu, 24 Sep 2009 20:12:30 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Luiz Capitulino <lcapitulino@redhat.com> writes:
> 
> > If the user issues one of the following commands to the Monitor:
> >
> > pci_add pci_addr=auto nic model=None
> > pci_add pci_addr=auto nic model=?
> >
> > QEMU will exit, because the function used to perform sanity
> > checks (qemu_check_nic_model_list()) exits on error.
> 
> Yes.  I meant to fix this, but you beat me to the line.
> 
> There might be more bugs like this one.

 Yeah.

> > This function is used by the startup code, where it makes
> > sense to exit on error, but in the Monitor it doesn't.
> >
> > Changing qemu_check_nic_model_list() to not exit on error
> > is not possible though, as it's used by the board init
> > code (the PC one), where all board specific code must have
> > void return.
> >
> > The way I've chosen to fix this was to introduce a new function
> > called pci_nic_supported(), which checks if the NIC is supported
> > and returns true or false accordingly.
> >
> > The new function is used only by the Monitor, it performs the
> > necessary check and returns an error in case the NIC is not
> > supported, thus qemu_check_nic_model_list()'s exit is never trigged.
> >
> > The following should be observed:
> >
> > 1. Only the specified NIC is checked, the default one is assumed
> > to be supported
> >
> > 2. The NIC query command (model=?) won't work with pci_add, the
> > right way to do this with the Monitor is to add a new command
> >
> > Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
> 
> It's a minimal fix, and I trust it works.  But is it the proper fix?

 Well, given the current situation I agree it's not perfect but
I think it's good enough.

> It works by checking the model before calling pci_nic_init() on behalf
> of monitor cmmand "pci_add ... nic ...", so that when pci_nic_init()
> checks the model again, it always succeeds, and thus never exits.
> 
> My minor complaint is that the new model check pci_nic_supported()
> duplicates the existing check in qemu_check_nic_model_list().

 Yes, I don't like this either.. Although qemu_check_nic_model_list()
also checks for the default model, this is a bonus check. :)

> My major complaint is that I'd rather see the code cleaned up there.
> It's perfectly fine for code that can run only during startup to
> terminate the program on configuration error.  Code to be used after
> startup (used from monitor, in particular) must not do that.  Instead,
> it should return failure up the call chain, until we reach either
> startup code or monitor code, where the policy how to handle the error
> resides.

 What cleanup do you suggest?

 Note that it's not only about exit(), the function also has some
fprintf()s. If a big refactor is needed to properly fix this,
I guess we will have to live with the bug for a long time...
Mark McLoughlin Sept. 24, 2009, 8:07 p.m. UTC | #3
On Thu, 2009-09-24 at 20:12 +0200, Markus Armbruster wrote:
> My major complaint is that I'd rather see the code cleaned up there.
> It's perfectly fine for code that can run only during startup to
> terminate the program on configuration error.  Code to be used after
> startup (used from monitor, in particular) must not do that.  Instead,
> it should return failure up the call chain, until we reach either
> startup code or monitor code, where the policy how to handle the error
> resides.

Agree, I'd like to see it cleaned up.

However, Luiz's patch fixes the most serious side effect without a major
re-factoring, so I'd like to see that go in first (and stable-0.11) and
do the re-factoring later.

Cheers,
Mark.
Markus Armbruster Sept. 25, 2009, 1:50 a.m. UTC | #4
Luiz Capitulino <lcapitulino@redhat.com> writes:

> On Thu, 24 Sep 2009 20:12:30 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
[...]
>> My major complaint is that I'd rather see the code cleaned up there.
>> It's perfectly fine for code that can run only during startup to
>> terminate the program on configuration error.  Code to be used after
>> startup (used from monitor, in particular) must not do that.  Instead,
>> it should return failure up the call chain, until we reach either
>> startup code or monitor code, where the policy how to handle the error
>> resides.
>
>  What cleanup do you suggest?
>
>  Note that it's not only about exit(), the function also has some
> fprintf()s. If a big refactor is needed to properly fix this,
> I guess we will have to live with the bug for a long time...

See the patch series I'm going to post.
Markus Armbruster Sept. 25, 2009, 1:51 a.m. UTC | #5
Mark McLoughlin <markmc@redhat.com> writes:

> On Thu, 2009-09-24 at 20:12 +0200, Markus Armbruster wrote:
>> My major complaint is that I'd rather see the code cleaned up there.
>> It's perfectly fine for code that can run only during startup to
>> terminate the program on configuration error.  Code to be used after
>> startup (used from monitor, in particular) must not do that.  Instead,
>> it should return failure up the call chain, until we reach either
>> startup code or monitor code, where the policy how to handle the error
>> resides.
>
> Agree, I'd like to see it cleaned up.
>
> However, Luiz's patch fixes the most serious side effect without a major
> re-factoring, so I'd like to see that go in first (and stable-0.11) and
> do the re-factoring later.

Certainly fine with me if my cleanup is deemed to invasive for stable.
diff mbox

Patch

diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c
index f3dc421..89974a0 100644
--- a/hw/pci-hotplug.c
+++ b/hw/pci-hotplug.c
@@ -46,6 +46,10 @@  static PCIDevice *qemu_pci_hot_add_nic(Monitor *mon,
         monitor_printf(mon, "Parameter addr not supported\n");
         return NULL;
     }
+
+    if (nd_table[ret].model && !pci_nic_supported(nd_table[ret].model))
+        return NULL;
+
     return pci_nic_init(&nd_table[ret], "rtl8139", devaddr);
 }
 
diff --git a/hw/pci.c b/hw/pci.c
index 64d70ed..c5975bc 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -806,6 +806,17 @@  static const char * const pci_nic_names[] = {
     NULL
 };
 
+int pci_nic_supported(const char *model)
+{
+    int i;
+
+    for (i = 0; pci_nic_names[i]; i++)
+        if (strcmp(model, pci_nic_names[i]) == 0)
+            return 1;
+
+    return 0;
+}
+
 /* Initialize a PCI NIC.  */
 PCIDevice *pci_nic_init(NICInfo *nd, const char *default_model,
                         const char *default_devaddr)
diff --git a/hw/pci.h b/hw/pci.h
index caba5c8..ba748ff 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -252,6 +252,7 @@  PCIBus *pci_register_bus(DeviceState *parent, const char *name,
                          pci_set_irq_fn set_irq, pci_map_irq_fn map_irq,
                          void *irq_opaque, int devfn_min, int nirq);
 
+int pci_nic_supported(const char *model);
 PCIDevice *pci_nic_init(NICInfo *nd, const char *default_model,
                         const char *default_devaddr);
 void pci_data_write(void *opaque, uint32_t addr, uint32_t val, int len);