[for-5.0,0/4] ppc: Fix interrupt controller emulation

Message ID 157548861171.3650476.14824062174573272058.stgit@bahia.lan

Greg Kurz Dec. 4, 2019, 7:43 p.m. UTC
Guest hangs have been observed recently on POWER9 hosts, specifically LC92x
"Boston" systems, when the guests are being rebooted multiple times. The
issue isn't POWER9 specific though. It is caused by a very long standing bug
when using the uncommon accel=kvm,kernel-irqchip=off machine configuration
which happens to be enforced on LC92x because of a host FW limitation. This
affects both the XICS and XIVE emulated interrupt controllers.

The actual fix is in patch 1. Patch 2 is a followup cleanup. The other
patches are unrelated cleanups I came up with while investigating.

Since this bug always existed and we're already in rc4, I think it is better
to fix it in 5.0 and possibly backport it to stable and downstream if needed.

--
Greg

---

Greg Kurz (4):
      ppc: Deassert the external interrupt pin in KVM on reset
      xics: Don't deassert outputs
      ppc: Don't use CPUPPCState::irq_input_state with modern Book3s CPU models
      ppc: Ignore the CPU_INTERRUPT_EXITTB interrupt with KVM


 hw/intc/xics.c                  |    3 ---
 hw/ppc/ppc.c                    |   24 ++++++++++--------------
 include/hw/ppc/ppc.h            |    2 ++
 target/ppc/cpu.h                |    4 +++-
 target/ppc/helper_regs.h        |    5 +++++
 target/ppc/translate_init.inc.c |    1 +
 6 files changed, 21 insertions(+), 18 deletions(-)

Comments

David Gibson Dec. 9, 2019, 1:14 a.m. UTC | #1
On Wed, Dec 04, 2019 at 08:43:31PM +0100, Greg Kurz wrote:
> Guest hangs have been observed recently on POWER9 hosts, specifically LC92x
> "Boston" systems, when the guests are being rebooted multiple times. The
> issue isn't POWER9 specific though. It is caused by a very long standing bug
> when using the uncommon accel=kvm,kernel-irqchip=off machine configuration
> which happens to be enforced on LC92x because of a host FW limitation. This
> affects both the XICS and XIVE emulated interrupt controllers.
> 
> The actual fix is in patch 1. Patch 2 is a followup cleanup. The other
> patches are unrelated cleanups I came up with while investigating.
> 
> Since this bug always existed and we're already in rc4, I think it is better
> to fix it in 5.0 and possibly backport it to stable and downstream if needed.

Applied to ppc-for-5.0.

Greg Kurz Dec. 9, 2019, 10:59 a.m. UTC | #2
On Mon, 9 Dec 2019 12:14:28 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Dec 04, 2019 at 08:43:31PM +0100, Greg Kurz wrote:
> > Guest hangs have been observed recently on POWER9 hosts, specifically LC92x
> > "Boston" systems, when the guests are being rebooted multiple times. The
> > issue isn't POWER9 specific though. It is caused by a very long standing bug
> > when using the uncommon accel=kvm,kernel-irqchip=off machine configuration
> > which happens to be enforced on LC92x because of a host FW limitation. This
> > affects both the XICS and XIVE emulated interrupt controllers.
> > 
> > The actual fix is in patch 1. Patch 2 is a followup cleanup. The other
> > patches are unrelated cleanups I came up with while investigating.
> > 
> > Since this bug always existed and we're already in rc4, I think it is better
> > to fix it in 5.0 and possibly backport it to stable and downstream if needed.
> 
> Applied to ppc-for-5.0.
> 
> 

According to Cornelia's comments, it seems I need to respin this against
the s390-next branch to avoid conflicts.

Cornelia Huck Dec. 9, 2019, 11:07 a.m. UTC | #3
On Mon, 9 Dec 2019 11:59:47 +0100
Greg Kurz <groug@kaod.org> wrote:

> On Mon, 9 Dec 2019 12:14:28 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Dec 04, 2019 at 08:43:31PM +0100, Greg Kurz wrote:  
> > > Guest hangs have been observed recently on POWER9 hosts, specifically LC92x
> > > "Boston" systems, when the guests are being rebooted multiple times. The
> > > issue isn't POWER9 specific though. It is caused by a very long standing bug
> > > when using the uncommon accel=kvm,kernel-irqchip=off machine configuration
> > > which happens to be enforced on LC92x because of a host FW limitation. This
> > > affects both the XICS and XIVE emulated interrupt controllers.
> > > 
> > > The actual fix is in patch 1. Patch 2 is a followup cleanup. The other
> > > patches are unrelated cleanups I came up with while investigating.
> > > 
> > > Since this bug always existed and we're already in rc4, I think it is better
> > > to fix it in 5.0 and possibly backport it to stable and downstream if needed.  
> > 
> > Applied to ppc-for-5.0.
> > 
> >   
> 
> According to Cornelia's comments, it seems I need to respin this against
> the s390-next branch to avoid conflicts.


Aren't these ppc-only patches, though? Confused.

Greg Kurz Dec. 9, 2019, 11:14 a.m. UTC | #4
On Mon, 9 Dec 2019 12:07:35 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Mon, 9 Dec 2019 11:59:47 +0100
> Greg Kurz <groug@kaod.org> wrote:
> 
> > On Mon, 9 Dec 2019 12:14:28 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> > 
> > > On Wed, Dec 04, 2019 at 08:43:31PM +0100, Greg Kurz wrote:  
> > > > Guest hangs have been observed recently on POWER9 hosts, specifically LC92x
> > > > "Boston" systems, when the guests are being rebooted multiple times. The
> > > > issue isn't POWER9 specific though. It is caused by a very long standing bug
> > > > when using the uncommon accel=kvm,kernel-irqchip=off machine configuration
> > > > which happens to be enforced on LC92x because of a host FW limitation. This
> > > > affects both the XICS and XIVE emulated interrupt controllers.
> > > > 
> > > > The actual fix is in patch 1. Patch 2 is a followup cleanup. The other
> > > > patches are unrelated cleanups I came up with while investigating.
> > > > 
> > > > Since this bug always existed and we're already in rc4, I think it is better
> > > > to fix it in 5.0 and possibly backport it to stable and downstream if needed.  
> > > 
> > > Applied to ppc-for-5.0.
> > > 
> > >   
> > 
> > According to Cornelia's comments, it seems I need to respin this against
> > the s390-next branch to avoid conflicts.
> 
> 
> Aren't these ppc-only patches, though? Confused.

Oops... I've mixed this up with the CPUReset series, sorry for the confusion :-)