Message ID | 7d54c33fa2a928fd317f3a95af61f854f43ba23c.1435195913.git.zhugh.fnst@cn.fujitsu.com |
---|---|
State | New |
Headers | show |
Am 25.06.2015 um 04:17 schrieb Zhu Guihua: > Add a wrapper to specify reset order when registering reset handler, > instead of non-obvious initiazation code ordering. "initialization", and this sentence is not really telling to me. What issue is this solving or, more likely, working around? In the next patch the APIC is being moved from ICC bus (which, as it has a device parent, would seem to not register its own reset handler in qdev.c) to no bus and thus seemingly no implicit reset handler either. We've been trying to get away from reset handlers and move to reset callbacks that propagate through devices and buses for a long time. This patch feels like a step backwards. If there is indeed no simpler solution, this at least deserves a better justification. If more people start using values like 0x1, we'll have the same ordering issues unless we use some global enum to coordinate them. > > Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> > --- > include/hw/hw.h | 4 ++++ > vl.c | 18 +++++++++++++++++- > 2 files changed, 21 insertions(+), 1 deletion(-) Regards, Andreas
On 25/06/2015 04:17, Zhu Guihua wrote: > Add a wrapper to specify reset order when registering reset handler, > instead of non-obvious initiazation code ordering. > > Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> I'm sorry, this is not really acceptable. The initialization code ordering is good because it should be okay to run reset handlers in the same order as code is run. If there are dependencies between reset handlers, a random integer is not a maintainable way to maintain them. Instead, you should have a single reset handler that calls the reset handlers in the right order; for example a qdev bus such as icc_bus always resets children before parents. Are you sure that you want to remove the icc_bus?... What are you gaining exactly by doing so? Paolo > --- > include/hw/hw.h | 4 ++++ > vl.c | 18 +++++++++++++++++- > 2 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/include/hw/hw.h b/include/hw/hw.h > index c78adae..d9375e7 100644 > --- a/include/hw/hw.h > +++ b/include/hw/hw.h > @@ -37,7 +37,11 @@ > #endif > > typedef void QEMUResetHandler(void *opaque); > +typedef uint64_t QEMUResetOrder; > +#define default_reset_order 0x0 > > +void qemu_register_reset_common(QEMUResetHandler *func, void *opaque, > + QEMUResetOrder reset_order); > void qemu_register_reset(QEMUResetHandler *func, void *opaque); > void qemu_unregister_reset(QEMUResetHandler *func, void *opaque); > > diff --git a/vl.c b/vl.c > index 69ad90c..b205a9b 100644 > --- a/vl.c > +++ b/vl.c > @@ -1589,6 +1589,7 @@ typedef struct QEMUResetEntry { > QTAILQ_ENTRY(QEMUResetEntry) entry; > QEMUResetHandler *func; > void *opaque; > + QEMUResetOrder reset_order; > } QEMUResetEntry; > > static QTAILQ_HEAD(reset_handlers, QEMUResetEntry) reset_handlers = > @@ -1672,15 +1673,30 @@ static int qemu_debug_requested(void) > return r; > } > > -void qemu_register_reset(QEMUResetHandler *func, void *opaque) > +void qemu_register_reset_common(QEMUResetHandler *func, void *opaque, > + QEMUResetOrder reset_order) > { > + QEMUResetEntry *item; > QEMUResetEntry *re = g_malloc0(sizeof(QEMUResetEntry)); > > re->func = func; > re->opaque = opaque; > + re->reset_order = reset_order; > + > + QTAILQ_FOREACH(item, &reset_handlers, entry) { > + if (re->reset_order >= item->reset_order) > + continue; > + QTAILQ_INSERT_BEFORE(item, re, entry); > + return; > + } > QTAILQ_INSERT_TAIL(&reset_handlers, re, entry); > } > > +void qemu_register_reset(QEMUResetHandler *func, void *opaque) > +{ > + qemu_register_reset_common(func, opaque, default_reset_order); > +} > + > void qemu_unregister_reset(QEMUResetHandler *func, void *opaque) > { > QEMUResetEntry *re; >
Am 25.06.2015 um 19:00 schrieb Paolo Bonzini: > On 25/06/2015 04:17, Zhu Guihua wrote: >> Add a wrapper to specify reset order when registering reset handler, >> instead of non-obvious initiazation code ordering. >> >> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> > > I'm sorry, this is not really acceptable. The initialization code > ordering is good because it should be okay to run reset handlers in the > same order as code is run. If there are dependencies between reset > handlers, a random integer is not a maintainable way to maintain them. > > Instead, you should have a single reset handler that calls the reset > handlers in the right order; for example a qdev bus such as icc_bus > always resets children before parents. > > Are you sure that you want to remove the icc_bus?... What are you > gaining exactly by doing so? From my view we would be gaining by making the APIC an integral part (child<>) of the CPU in a follow-up step (there's a TODO to make things link<>s). But either way the CPU's existing reset handler should be able to handle CPU/APIC interdependencies just fine, somehow. Which is what Eduardo said on v6 and v7. (Another alternative he raised was a machine init notifier, but I see no code for that after its mention on v7?) Cheers, Andreas
On Thu, 25 Jun 2015 19:28:51 +0200 Andreas Färber <afaerber@suse.de> wrote: > Am 25.06.2015 um 19:00 schrieb Paolo Bonzini: > > On 25/06/2015 04:17, Zhu Guihua wrote: > >> Add a wrapper to specify reset order when registering reset handler, > >> instead of non-obvious initiazation code ordering. > >> > >> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> > > > > I'm sorry, this is not really acceptable. The initialization code > > ordering is good because it should be okay to run reset handlers in the > > same order as code is run. If there are dependencies between reset > > handlers, a random integer is not a maintainable way to maintain them. > > > > Instead, you should have a single reset handler that calls the reset > > handlers in the right order; for example a qdev bus such as icc_bus > > always resets children before parents. > > > > Are you sure that you want to remove the icc_bus?... What are you > > gaining exactly by doing so? > > From my view we would be gaining by making the APIC an integral part > (child<>) of the CPU in a follow-up step (there's a TODO to make things > link<>s). That's one of the reasons, I've asked to get rid of icc-bus. Another reason is to move default APIC MMIO mapping to CPU from board and handle region remapping from CPU itself as it's supposed to be if CPU would program another APIC base. Paolo, would following change be acceptable: x86_cpu_realize() { if (tcg) { memory_region_add_subregion_overlap(cpu->as, apic->mr) } else { if (map_once) memory_region_add_subregion_overlap(get_system_memory(), apic->mr) } } still a hack but a localized to CPU > > But either way the CPU's existing reset handler should be able to handle > CPU/APIC interdependencies just fine, somehow. Which is what Eduardo > said on v6 and v7. (Another alternative he raised was a machine init > notifier, but I see no code for that after its mention on v7?) > > Cheers, > Andreas >
On 26/06/2015 11:19, Igor Mammedov wrote: > That's one of the reasons, I've asked to get rid of icc-bus. > > Another reason is to move default APIC MMIO mapping to CPU from board > and handle region remapping from CPU itself as it's supposed to be if CPU > would program another APIC base. > > Paolo, > would following change be acceptable: > > x86_cpu_realize() { > if (tcg) { > memory_region_add_subregion_overlap(cpu->as, apic->mr) > } else { > if (map_once) > memory_region_add_subregion_overlap(get_system_memory(), apic->mr) > } > } > still a hack but a localized to CPU Yes, this is better. Paolo
On 06/26/2015 01:28 AM, Andreas Färber wrote: > Am 25.06.2015 um 19:00 schrieb Paolo Bonzini: >> On 25/06/2015 04:17, Zhu Guihua wrote: >>> Add a wrapper to specify reset order when registering reset handler, >>> instead of non-obvious initiazation code ordering. >>> >>> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> >> I'm sorry, this is not really acceptable. The initialization code >> ordering is good because it should be okay to run reset handlers in the >> same order as code is run. If there are dependencies between reset >> handlers, a random integer is not a maintainable way to maintain them. >> >> Instead, you should have a single reset handler that calls the reset >> handlers in the right order; for example a qdev bus such as icc_bus >> always resets children before parents. >> >> Are you sure that you want to remove the icc_bus?... What are you >> gaining exactly by doing so? > >From my view we would be gaining by making the APIC an integral part > (child<>) of the CPU in a follow-up step (there's a TODO to make things > link<>s). > > But either way the CPU's existing reset handler should be able to handle > CPU/APIC interdependencies just fine, somehow. Which is what Eduardo > said on v6 and v7. (Another alternative he raised was a machine init > notifier, but I see no code for that after its mention on v7?) According to Eduardo's suggestions on v7, the simpler way is to add a ordering parameter to qemu_register_reset(), so that we can ensure the order of apic reset handler(apic reset must be after the other devices' reset on x86). This way will not influence the initialization code ordering expect apic reset. Can we take this way? or someone have a better one? Thanks, Zhu > > Cheers, > Andreas >
On Tue, 30 Jun 2015 14:31:50 +0800 Zhu Guihua <zhugh.fnst@cn.fujitsu.com> wrote: > > On 06/26/2015 01:28 AM, Andreas Färber wrote: > > Am 25.06.2015 um 19:00 schrieb Paolo Bonzini: > >> On 25/06/2015 04:17, Zhu Guihua wrote: > >>> Add a wrapper to specify reset order when registering reset handler, > >>> instead of non-obvious initiazation code ordering. > >>> > >>> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> > >> I'm sorry, this is not really acceptable. The initialization code > >> ordering is good because it should be okay to run reset handlers in the > >> same order as code is run. If there are dependencies between reset > >> handlers, a random integer is not a maintainable way to maintain them. > >> > >> Instead, you should have a single reset handler that calls the reset > >> handlers in the right order; for example a qdev bus such as icc_bus > >> always resets children before parents. > >> > >> Are you sure that you want to remove the icc_bus?... What are you > >> gaining exactly by doing so? > > >From my view we would be gaining by making the APIC an integral part > > (child<>) of the CPU in a follow-up step (there's a TODO to make things > > link<>s). > > > > But either way the CPU's existing reset handler should be able to handle > > CPU/APIC interdependencies just fine, somehow. Which is what Eduardo > > said on v6 and v7. (Another alternative he raised was a machine init > > notifier, but I see no code for that after its mention on v7?) > > According to Eduardo's suggestions on v7, the simpler way is to add a > ordering parameter > to qemu_register_reset(), so that we can ensure the order of apic reset > handler(apic reset > must be after the other devices' reset on x86). > > This way will not influence the initialization code ordering expect > apic reset. > Can we take this way? or someone have a better one? could you explain once more why apic->reset() doesn't work when it's called from cpu->reset(), please? > > Thanks, > Zhu > > > > > Cheers, > > Andreas > > >
Am 30.06.2015 um 08:31 schrieb Zhu Guihua: > On 06/26/2015 01:28 AM, Andreas Färber wrote: >> Am 25.06.2015 um 19:00 schrieb Paolo Bonzini: >>> On 25/06/2015 04:17, Zhu Guihua wrote: >>>> Add a wrapper to specify reset order when registering reset handler, >>>> instead of non-obvious initiazation code ordering. >>>> >>>> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> >>> I'm sorry, this is not really acceptable. The initialization code >>> ordering is good because it should be okay to run reset handlers in the >>> same order as code is run. If there are dependencies between reset >>> handlers, a random integer is not a maintainable way to maintain them. >>> >>> Instead, you should have a single reset handler that calls the reset >>> handlers in the right order; for example a qdev bus such as icc_bus >>> always resets children before parents. >>> >>> Are you sure that you want to remove the icc_bus?... What are you >>> gaining exactly by doing so? >> >From my view we would be gaining by making the APIC an integral part >> (child<>) of the CPU in a follow-up step (there's a TODO to make things >> link<>s). >> >> But either way the CPU's existing reset handler should be able to handle >> CPU/APIC interdependencies just fine, somehow. Which is what Eduardo >> said on v6 and v7. (Another alternative he raised was a machine init >> notifier, but I see no code for that after its mention on v7?) > > According to Eduardo's suggestions on v7, the simpler way is to add a > ordering parameter > to qemu_register_reset(), so that we can ensure the order of apic reset > handler(apic reset > must be after the other devices' reset on x86). That is a very general statement. Surely the APIC does not need to be reset after *all* other devices but after some particular device(s). Which one is that if not the X86CPU? > This way will not influence the initialization code ordering expect > apic reset. And exactly that's the criticism: You're introducing a generic mechanism to address a single very specific problem. sPAPR already has the MachineClass::reset() callback to order CPU vs. device reset. So if you want a new mechanism you'll need to explain in detail why that is needed and not just say that it solves your issue. Regards, Andreas
On 06/30/2015 05:21 PM, Igor Mammedov wrote: > On Tue, 30 Jun 2015 14:31:50 +0800 > Zhu Guihua <zhugh.fnst@cn.fujitsu.com> wrote: > >> On 06/26/2015 01:28 AM, Andreas Färber wrote: >>> Am 25.06.2015 um 19:00 schrieb Paolo Bonzini: >>>> On 25/06/2015 04:17, Zhu Guihua wrote: >>>>> Add a wrapper to specify reset order when registering reset handler, >>>>> instead of non-obvious initiazation code ordering. >>>>> >>>>> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> >>>> I'm sorry, this is not really acceptable. The initialization code >>>> ordering is good because it should be okay to run reset handlers in the >>>> same order as code is run. If there are dependencies between reset >>>> handlers, a random integer is not a maintainable way to maintain them. >>>> >>>> Instead, you should have a single reset handler that calls the reset >>>> handlers in the right order; for example a qdev bus such as icc_bus >>>> always resets children before parents. >>>> >>>> Are you sure that you want to remove the icc_bus?... What are you >>>> gaining exactly by doing so? >>> >From my view we would be gaining by making the APIC an integral part >>> (child<>) of the CPU in a follow-up step (there's a TODO to make things >>> link<>s). >>> >>> But either way the CPU's existing reset handler should be able to handle >>> CPU/APIC interdependencies just fine, somehow. Which is what Eduardo >>> said on v6 and v7. (Another alternative he raised was a machine init >>> notifier, but I see no code for that after its mention on v7?) >> According to Eduardo's suggestions on v7, the simpler way is to add a >> ordering parameter >> to qemu_register_reset(), so that we can ensure the order of apic reset >> handler(apic reset >> must be after the other devices' reset on x86). >> >> This way will not influence the initialization code ordering expect >> apic reset. >> Can we take this way? or someone have a better one? > could you explain once more why apic->reset() doesn't work > when it's called from cpu->reset(), please? Originally, there are some devices (such as hpet, rtc) reset before apic reset. When these devices reset, they would send irq to apic. As apic reset is behind these devices reset, the apic register could be set to default values. If apic->reset() is called from cpu->reset(), cpu reset is before some devices reset, it lead to apic reset is before them too, so the apic register could not be set to default values. But before guest boots up, the irq request should be rejected. So when linux enables local apic, it would find there are irr requests, then it will cause the following warn_on. [ 1.073487] ------------[ cut here ]------------ [ 1.074019] WARNING: at arch/x86/kernel/apic/apic.c:1401 setup_local_APIC+0x268/0x320() [ 1.075011] Modules linked in: [ 1.076474] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0.sort+ #100 [ 1.077012] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014 [ 1.078011] 0000000000000000 00000000d1b49dbb ffff88007c787da8 ffffffff81649983 [ 1.082011] ffff88007c787de0 ffffffff810b3241 0000000000000001 0000000000000000 [ 1.085012] 00000000000000f0 0000000000000000 00000000ffffffff ffff88007c787df0 [ 1.088012] Call Trace: [ 1.089019] [<ffffffff81649983>] dump_stack+0x19/0x1b [ 1.090017] [<ffffffff810b3241>] warn_slowpath_common+0x61/0x80 [ 1.091015] [<ffffffff810b336a>] warn_slowpath_null+0x1a/0x20 [ 1.092016] [<ffffffff81089ae8>] setup_local_APIC+0x268/0x320 [ 1.093019] [<ffffffff81ad4f02>] native_smp_prepare_cpus+0x294/0x35b [ 1.094018] [<ffffffff81ac1133>] kernel_init_freeable+0xbb/0x217 [ 1.095017] [<ffffffff81636fe0>] ? rest_init+0x80/0x80 [ 1.096015] [<ffffffff81636fee>] kernel_init+0xe/0x180 [ 1.097016] [<ffffffff816598fc>] ret_from_fork+0x7c/0xb0 [ 1.098016] [<ffffffff81636fe0>] ? rest_init+0x80/0x80 [ 1.099017] ---[ end trace d99eba50bffa17c5 ]--- > >> Thanks, >> Zhu >> >>> Cheers, >>> Andreas >>> > . >
On 30 June 2015 at 11:50, Zhu Guihua <zhugh.fnst@cn.fujitsu.com> wrote: > Originally, there are some devices (such as hpet, rtc) reset before apic > reset. > When these devices reset, they would send irq to apic. In our current (arguably broken) reset design, this is a bug in those devices -- they shouldn't be raising interrupts from their reset functions. -- PMM
On Tue, Jun 30, 2015 at 12:24:22PM +0200, Andreas Färber wrote: > Am 30.06.2015 um 08:31 schrieb Zhu Guihua: > > On 06/26/2015 01:28 AM, Andreas Färber wrote: > >> Am 25.06.2015 um 19:00 schrieb Paolo Bonzini: > >>> On 25/06/2015 04:17, Zhu Guihua wrote: > >>>> Add a wrapper to specify reset order when registering reset handler, > >>>> instead of non-obvious initiazation code ordering. > >>>> > >>>> Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> > >>> I'm sorry, this is not really acceptable. The initialization code > >>> ordering is good because it should be okay to run reset handlers in the > >>> same order as code is run. If there are dependencies between reset > >>> handlers, a random integer is not a maintainable way to maintain them. > >>> > >>> Instead, you should have a single reset handler that calls the reset > >>> handlers in the right order; for example a qdev bus such as icc_bus > >>> always resets children before parents. > >>> > >>> Are you sure that you want to remove the icc_bus?... What are you > >>> gaining exactly by doing so? > >> >From my view we would be gaining by making the APIC an integral part > >> (child<>) of the CPU in a follow-up step (there's a TODO to make things > >> link<>s). > >> > >> But either way the CPU's existing reset handler should be able to handle > >> CPU/APIC interdependencies just fine, somehow. Which is what Eduardo > >> said on v6 and v7. (Another alternative he raised was a machine init > >> notifier, but I see no code for that after its mention on v7?) > > > > According to Eduardo's suggestions on v7, the simpler way is to add a > > ordering parameter > > to qemu_register_reset(), so that we can ensure the order of apic reset > > handler(apic reset > > must be after the other devices' reset on x86). > > That is a very general statement. Surely the APIC does not need to be > reset after *all* other devices but after some particular device(s). > Which one is that if not the X86CPU? > > > This way will not influence the initialization code ordering expect > > apic reset. > > And exactly that's the criticism: You're introducing a generic mechanism > to address a single very specific problem. > > sPAPR already has the MachineClass::reset() callback to order CPU vs. > device reset. So if you want a new mechanism you'll need to explain in > detail why that is needed and not just say that it solves your issue. My main point was that relying on a specific ordering of qemu_register_reset() calls to ensure reset ordering was too fragile. Adding an ordering argument to qemu_register_reset() was a suggestion, but now I agree it is unnecessary. Having a reset handler that calls reset code in the right order (like Paolo proposed earlier in this thread) looks much simpler.
On Tue, Jun 30, 2015 at 11:55:59AM +0100, Peter Maydell wrote: > On 30 June 2015 at 11:50, Zhu Guihua <zhugh.fnst@cn.fujitsu.com> wrote: > > Originally, there are some devices (such as hpet, rtc) reset before apic > > reset. > > When these devices reset, they would send irq to apic. > > In our current (arguably broken) reset design, this is a bug in > those devices -- they shouldn't be raising interrupts from their > reset functions. Do we know exactly which devices are currently broken and need to be fixed? Should we add a warning to the interrupt code to detect those cases? It looks like we won't need a solution to ensure reset ordering if we fix the devices (or at least the solution would be just temporary until we fix them).
diff --git a/include/hw/hw.h b/include/hw/hw.h index c78adae..d9375e7 100644 --- a/include/hw/hw.h +++ b/include/hw/hw.h @@ -37,7 +37,11 @@ #endif typedef void QEMUResetHandler(void *opaque); +typedef uint64_t QEMUResetOrder; +#define default_reset_order 0x0 +void qemu_register_reset_common(QEMUResetHandler *func, void *opaque, + QEMUResetOrder reset_order); void qemu_register_reset(QEMUResetHandler *func, void *opaque); void qemu_unregister_reset(QEMUResetHandler *func, void *opaque); diff --git a/vl.c b/vl.c index 69ad90c..b205a9b 100644 --- a/vl.c +++ b/vl.c @@ -1589,6 +1589,7 @@ typedef struct QEMUResetEntry { QTAILQ_ENTRY(QEMUResetEntry) entry; QEMUResetHandler *func; void *opaque; + QEMUResetOrder reset_order; } QEMUResetEntry; static QTAILQ_HEAD(reset_handlers, QEMUResetEntry) reset_handlers = @@ -1672,15 +1673,30 @@ static int qemu_debug_requested(void) return r; } -void qemu_register_reset(QEMUResetHandler *func, void *opaque) +void qemu_register_reset_common(QEMUResetHandler *func, void *opaque, + QEMUResetOrder reset_order) { + QEMUResetEntry *item; QEMUResetEntry *re = g_malloc0(sizeof(QEMUResetEntry)); re->func = func; re->opaque = opaque; + re->reset_order = reset_order; + + QTAILQ_FOREACH(item, &reset_handlers, entry) { + if (re->reset_order >= item->reset_order) + continue; + QTAILQ_INSERT_BEFORE(item, re, entry); + return; + } QTAILQ_INSERT_TAIL(&reset_handlers, re, entry); } +void qemu_register_reset(QEMUResetHandler *func, void *opaque) +{ + qemu_register_reset_common(func, opaque, default_reset_order); +} + void qemu_unregister_reset(QEMUResetHandler *func, void *opaque) { QEMUResetEntry *re;
Add a wrapper to specify reset order when registering reset handler, instead of non-obvious initiazation code ordering. Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com> --- include/hw/hw.h | 4 ++++ vl.c | 18 +++++++++++++++++- 2 files changed, 21 insertions(+), 1 deletion(-)