Message ID | 20171002222301.22832-3-gpiccoli@linux.vnet.ibm.com |
---|---|
State | RFC |
Headers | show |
Series | Some changes in phb4 | expand |
On Mon, 2017-10-02 at 19:23 -0300, Guilherme G. Piccoli wrote: > If we are performing a PHB reset and coming for a bad situation, > (and PHB is fenced), we fail because we don't use ASB by default > in the phb4_init_hw(). > > This patch makes use of this trick when we are resetting PHBs > to workaround possible problems in PHB (aka it might be fenced) > and at least try to perform the complete initialization. > > This proved useful in kdump kernel boot on POWER9 (wspoon). > > Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> > --- > > Tested on wspoon DD 2.01, worked fine. But...after kdump is complete, > system does not reboot automatically. I'll investigate too, could be > another problem or (probably) something due to this change. > > I thought in sending it anyway, as-is, in order to get feedback > and stop my investigation in this direction if it does not worth. > Thanks in advance! > > hw/phb4.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/hw/phb4.c b/hw/phb4.c > index 99902d3a..12312a5c 100644 > --- a/hw/phb4.c > +++ b/hw/phb4.c > @@ -4312,6 +4312,10 @@ static void phb4_init_hw(struct phb4 *p, bool > first_init) > > PHBDBG(p, "Initializing PHB4...\n"); > > + /* Use ASB to try initialize even if the PHB is fenced. */ > + if (!first_init) > + p->flags |= PHB4_CFG_USE_ASB; > + I'm not sure I'm keen on this solution. We have this code in the creset path which is calling phb4_init_hw() case PHB4_SLOT_CRESET_REINIT: PHBDBG(p, "CRESET: Reinitialization\n"); p->flags &= ~PHB4_AIB_FENCED; p->flags &= ~PHB4_CAPP_RECOVERY; p->flags &= ~PHB4_CFG_USE_ASB; phb4_init_hw(p, false); pci_slot_set_state(slot, PHB4_SLOT_CRESET_FRESET); return pci_slot_set_sm_timeout(slot, msecs_to_tb(100)); It seems we are clearing it, and then you are setting it immediately after. Seems like the initial clearing would be wrong and we're just papering over the issue. Mikey > /* Init_1 - Sync reset > * > * At this point we assume the PHB has already been reset. > @@ -4387,6 +4391,9 @@ static void phb4_init_hw(struct phb4 *p, bool > first_init) > > phb4_write_reg(p, PHB_CTRLR, val); > > + if (!first_init) > + p->flags &= ~PHB4_CFG_USE_ASB; > + > /* Init_18..41 - Architected IODA3 inits */ > phb4_init_ioda3(p); >
On 10/02/2017 09:38 PM, Michael Neuling wrote: > [...] > I'm not sure I'm keen on this solution. We have this code in the creset path > which is calling phb4_init_hw() > > case PHB4_SLOT_CRESET_REINIT: > PHBDBG(p, "CRESET: Reinitialization\n"); > p->flags &= ~PHB4_AIB_FENCED; > p->flags &= ~PHB4_CAPP_RECOVERY; > p->flags &= ~PHB4_CFG_USE_ASB; > phb4_init_hw(p, false); > pci_slot_set_state(slot, PHB4_SLOT_CRESET_FRESET); > return pci_slot_set_sm_timeout(slot, msecs_to_tb(100)); > > It seems we are clearing it, and then you are setting it immediately after. > Seems like the initial clearing would be wrong and we're just papering over the > issue. > Thanks for the review Mikey! I saw this code, and you're right - we might remove it! But even if we don't clear the ASB flag...it does not mean we'll have it set, correct? So, perhaps instead of clearing we should _set it_ on the switch..tomorrow I'll do more experiments on this, since it seems a valid path of investigation. Cheers, Guilherme > Mikey > > >> /* Init_1 - Sync reset >> * >> * At this point we assume the PHB has already been reset. >> @@ -4387,6 +4391,9 @@ static void phb4_init_hw(struct phb4 *p, bool >> first_init) >> >> phb4_write_reg(p, PHB_CTRLR, val); >> >> + if (!first_init) >> + p->flags &= ~PHB4_CFG_USE_ASB; >> + >> /* Init_18..41 - Architected IODA3 inits */ >> phb4_init_ioda3(p); >>
On Mon, 2017-10-02 at 19:23 -0300, Guilherme G. Piccoli wrote: > If we are performing a PHB reset and coming for a bad situation, > (and PHB is fenced), we fail because we don't use ASB by default > in the phb4_init_hw(). > > This patch makes use of this trick when we are resetting PHBs > to workaround possible problems in PHB (aka it might be fenced) > and at least try to perform the complete initialization. > > This proved useful in kdump kernel boot on POWER9 (wspoon). We should never be calling init_hw while fenced. If we do then the bug is elsewhere. The reset code should be checking for fences already and clearing them if needed. > > Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> > --- > > Tested on wspoon DD 2.01, worked fine. But...after kdump is complete, > system does not reboot automatically. I'll investigate too, could be > another problem or (probably) something due to this change. > > I thought in sending it anyway, as-is, in order to get feedback > and stop my investigation in this direction if it does not worth. > Thanks in advance! > > hw/phb4.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/hw/phb4.c b/hw/phb4.c > index 99902d3a..12312a5c 100644 > --- a/hw/phb4.c > +++ b/hw/phb4.c > @@ -4312,6 +4312,10 @@ static void phb4_init_hw(struct phb4 *p, bool first_init) > > PHBDBG(p, "Initializing PHB4...\n"); > > + /* Use ASB to try initialize even if the PHB is fenced. */ > + if (!first_init) > + p->flags |= PHB4_CFG_USE_ASB; > + > /* Init_1 - Sync reset > * > * At this point we assume the PHB has already been reset. > @@ -4387,6 +4391,9 @@ static void phb4_init_hw(struct phb4 *p, bool first_init) > > phb4_write_reg(p, PHB_CTRLR, val); > > + if (!first_init) > + p->flags &= ~PHB4_CFG_USE_ASB; > + > /* Init_18..41 - Architected IODA3 inits */ > phb4_init_ioda3(p); >
On Tue, 2017-10-03 at 11:38 +1100, Michael Neuling wrote: > I'm not sure I'm keen on this solution. We have this code in the creset path > which is calling phb4_init_hw() > > case PHB4_SLOT_CRESET_REINIT: > PHBDBG(p, "CRESET: Reinitialization\n"); > p->flags &= ~PHB4_AIB_FENCED; > p->flags &= ~PHB4_CAPP_RECOVERY; > p->flags &= ~PHB4_CFG_USE_ASB; > phb4_init_hw(p, false); > pci_slot_set_state(slot, PHB4_SLOT_CRESET_FRESET); > return pci_slot_set_sm_timeout(slot, msecs_to_tb(100)); > > It seems we are clearing it, and then you are setting it immediately after. > Seems like the initial clearing would be wrong and we're just papering over the > issue. No. We should only enter that state after we have successfully cleared the fence already by resetting the logic. If the fence is still up, we have a problem but all I see is papering over it, not understanding and fixing it properly. We could add a proper fence check after clearing the flags and before doing a reinit, and maybe trying to go back to PHB4_SLOT_CRESET_START if we git (with a retry count) but it would be useful to understand how come we end up there fenced. You might need to work with Ric on sorting out the sequence here. Ben.
On 10/03/2017 10:50 AM, Benjamin Herrenschmidt wrote: > On Mon, 2017-10-02 at 19:23 -0300, Guilherme G. Piccoli wrote: >> If we are performing a PHB reset and coming for a bad situation, >> (and PHB is fenced), we fail because we don't use ASB by default >> in the phb4_init_hw(). >> >> This patch makes use of this trick when we are resetting PHBs >> to workaround possible problems in PHB (aka it might be fenced) >> and at least try to perform the complete initialization. >> >> This proved useful in kdump kernel boot on POWER9 (wspoon). > > We should never be calling init_hw while fenced. If we do then the bug > is elsewhere. The reset code should be checking for fences already and > clearing them if needed. Thanks for your good review, I'll forget about this path, since it's just "hiding" the issue, and follow your suggestions! Thanks, Guilherme
diff --git a/hw/phb4.c b/hw/phb4.c index 99902d3a..12312a5c 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -4312,6 +4312,10 @@ static void phb4_init_hw(struct phb4 *p, bool first_init) PHBDBG(p, "Initializing PHB4...\n"); + /* Use ASB to try initialize even if the PHB is fenced. */ + if (!first_init) + p->flags |= PHB4_CFG_USE_ASB; + /* Init_1 - Sync reset * * At this point we assume the PHB has already been reset. @@ -4387,6 +4391,9 @@ static void phb4_init_hw(struct phb4 *p, bool first_init) phb4_write_reg(p, PHB_CTRLR, val); + if (!first_init) + p->flags &= ~PHB4_CFG_USE_ASB; + /* Init_18..41 - Architected IODA3 inits */ phb4_init_ioda3(p);
If we are performing a PHB reset and coming for a bad situation, (and PHB is fenced), we fail because we don't use ASB by default in the phb4_init_hw(). This patch makes use of this trick when we are resetting PHBs to workaround possible problems in PHB (aka it might be fenced) and at least try to perform the complete initialization. This proved useful in kdump kernel boot on POWER9 (wspoon). Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> --- Tested on wspoon DD 2.01, worked fine. But...after kdump is complete, system does not reboot automatically. I'll investigate too, could be another problem or (probably) something due to this change. I thought in sending it anyway, as-is, in order to get feedback and stop my investigation in this direction if it does not worth. Thanks in advance! hw/phb4.c | 7 +++++++ 1 file changed, 7 insertions(+)