[2/3] pseries: Fix bug with reset of VIO CRQs

Message ID 4F7C7330.1080400@suse.de
State New

Commit Message

Andreas Färber April 4, 2012, 4:13 p.m. UTC
On 28.03.2012 23:39, David Gibson wrote:
> PAPR specifies a Command Response Queue (CRQ) mechanism used for virtual
> IO, which we implement.  However, we don't correctly clean up registered
> CRQs when we reset the system.
> 
> This patch adds a reset handler to fix this bug.  While we're at it, add
> in some of the extra debug messages that were used to track the problem
> down.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---

As discussed on IRC, I've applied the following diff on my local branch
to drop the h_reg_crq that my __func__ comment was about:

     dev->crq.qladdr = queue_addr;

However, I'm having trouble testing reset. Whether on vanilla master or
using this patch on top of ppc-next or this whole series on top of
ppc-next, using `ppc64-softmmu/qemu-system-ppc64 -M pseries -m 1G`:

a) 0 > reset-all
results in: "reboot not available Aborted"
Do you need to update SLOF to actually use the newly added RTAS call?

b) (qemu) system_reset
results in:
 exception 700
SRR0 = 0000000000000000  SRR1 = 800000008000000000080000
SPRG2 = 0000000000000000  SPRG3 = 000000003DCD1AD4

Could you please look into the two above issues? How did you test?

Thanks,
Andreas

>  hw/spapr_vio.c |   33 +++++++++++++++++++++++++--------
>  1 files changed, 25 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
> index 1f67e64..97d029a 100644
> --- a/hw/spapr_vio.c
> +++ b/hw/spapr_vio.c
> @@ -431,12 +431,13 @@ static target_ulong h_reg_crq(CPUPPCState *env, sPAPREnvironment *spapr,
>  
>      /* Check if device supports CRQs */
>      if (!dev->crq.SendFunc) {
> +        hcall_dprintf("h_reg_crq, device does not support CRQ\n");
>          return H_NOT_FOUND;
>      }
>  
> -
>      /* Already a queue ? */
>      if (dev->crq.qsize) {
> +        hcall_dprintf("h_reg_crq, CRQ already registered\n");
>          return H_RESOURCE;
>      }
>      dev->crq.qladdr = queue_addr;
> @@ -449,6 +450,17 @@ static target_ulong h_reg_crq(CPUPPCState *env, sPAPREnvironment *spapr,
>      return H_SUCCESS;
>  }
>  
> +static target_ulong free_crq(VIOsPAPRDevice *dev)
> +{
> +    dev->crq.qladdr = 0;
> +    dev->crq.qsize = 0;
> +    dev->crq.qnext = 0;
> +
> +    dprintf("CRQ for dev 0x%" PRIx32 " freed\n", dev->reg);
> +
> +    return H_SUCCESS;
> +}
> +
>  static target_ulong h_free_crq(CPUPPCState *env, sPAPREnvironment *spapr,
>                                 target_ulong opcode, target_ulong *args)
>  {
> @@ -460,13 +472,7 @@ static target_ulong h_free_crq(CPUPPCState *env, sPAPREnvironment *spapr,
>          return H_PARAMETER;
>      }
>  
> -    dev->crq.qladdr = 0;
> -    dev->crq.qsize = 0;
> -    dev->crq.qnext = 0;
> -
> -    dprintf("CRQ for dev 0x" TARGET_FMT_lx " freed\n", reg);
> -
> -    return H_SUCCESS;
> +    return free_crq(dev);
>  }
>  
>  static target_ulong h_send_crq(CPUPPCState *env, sPAPREnvironment *spapr,
> @@ -642,6 +648,15 @@ static int spapr_vio_check_reg(VIOsPAPRDevice *sdev)
>      return 0;
>  }
>  
> +static void spapr_vio_busdev_reset(void *opaque)
> +{
> +    VIOsPAPRDevice *dev = (VIOsPAPRDevice *)opaque;
> +
> +    if (dev->crq.qsize) {
> +        free_crq(dev);
> +    }
> +}
> +
>  static int spapr_vio_busdev_init(DeviceState *qdev)
>  {
>      VIOsPAPRDevice *dev = (VIOsPAPRDevice *)qdev;
> @@ -670,6 +685,8 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
>  
>      rtce_init(dev);
>  
> +    qemu_register_reset(spapr_vio_busdev_reset, dev);
> +
>      return pc->init(dev);
>  }
>
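
For readers who want to see the effect of the reset handler in isolation, here is a minimal, self-contained C sketch of the behaviour the patch describes. It is not QEMU code: the `Demo*` types, the `demo_*` functions, the fixed-size handler table standing in for `qemu_register_reset`, and the numeric return values are all assumptions made for illustration. Only the field names (`qladdr`, `qsize`, `qnext`) and the check on `qsize` follow the quoted patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Result names taken from the patch; the values are placeholders. */
enum { H_SUCCESS = 0, H_RESOURCE = -1 };

typedef struct {
    uint64_t qladdr;  /* guest address of the queue */
    uint32_t qsize;   /* nonzero means a CRQ is registered */
    uint32_t qnext;   /* next slot to use */
} DemoCrq;

typedef struct {
    uint32_t reg;
    DemoCrq crq;
} DemoVioDevice;

/* Hypothetical stand-in for a reset-handler registry. */
typedef void DemoResetHandler(void *opaque);
#define DEMO_MAX_HANDLERS 8
static struct {
    DemoResetHandler *fn;
    void *opaque;
} demo_handlers[DEMO_MAX_HANDLERS];
static int demo_nhandlers;

static void demo_register_reset(DemoResetHandler *fn, void *opaque)
{
    assert(demo_nhandlers < DEMO_MAX_HANDLERS);
    demo_handlers[demo_nhandlers].fn = fn;
    demo_handlers[demo_nhandlers].opaque = opaque;
    demo_nhandlers++;
}

/* Models a system reset: every registered handler runs once. */
static void demo_system_reset(void)
{
    for (int i = 0; i < demo_nhandlers; i++) {
        demo_handlers[i].fn(demo_handlers[i].opaque);
    }
}

/* Registration refuses a second queue, as h_reg_crq does. */
static long demo_reg_crq(DemoVioDevice *dev, uint64_t addr, uint32_t size)
{
    if (dev->crq.qsize) {
        return H_RESOURCE;
    }
    dev->crq.qladdr = addr;
    dev->crq.qsize = size;
    dev->crq.qnext = 0;
    return H_SUCCESS;
}

static void demo_free_crq(DemoVioDevice *dev)
{
    dev->crq.qladdr = 0;
    dev->crq.qsize = 0;
    dev->crq.qnext = 0;
    printf("CRQ for dev 0x%" PRIx32 " freed\n", dev->reg);
}

/* Reset handler: drop the queue only if one is registered. */
static void demo_busdev_reset(void *opaque)
{
    DemoVioDevice *dev = opaque;

    if (dev->crq.qsize) {
        demo_free_crq(dev);
    }
}

/* Register, fail to re-register, reset, then register again.
 * Returns 0 if each step behaves as expected. */
static int demo_reset_scenario(void)
{
    DemoVioDevice dev = { .reg = 0x1000 };

    demo_nhandlers = 0;  /* start from an empty registry */
    demo_register_reset(demo_busdev_reset, &dev);

    if (demo_reg_crq(&dev, 0x4000, 4096) != H_SUCCESS) {
        return 1;
    }
    if (demo_reg_crq(&dev, 0x8000, 4096) != H_RESOURCE) {
        return 2;  /* stale queue should block a second registration */
    }
    demo_system_reset();
    if (demo_reg_crq(&dev, 0x8000, 4096) != H_SUCCESS) {
        return 3;  /* after reset the device should accept a new queue */
    }
    return 0;
}
```

Without the call to `demo_system_reset()`, the third registration in this sketch would be refused with `H_RESOURCE`. That is presumably the symptom behind the "CRQ already registered" message in the patch: a guest that registers its CRQ again after a reset finds the old one still in place.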

Comments

David Gibson April 5, 2012, 1:12 a.m. UTC | #1
On Wed, Apr 04, 2012 at 06:13:36PM +0200, Andreas Färber wrote:
> On 28.03.2012 23:39, David Gibson wrote:
> > PAPR specifies a Command Response Queue (CRQ) mechanism used for virtual
> > IO, which we implement.  However, we don't correctly clean up registered
> > CRQs when we reset the system.
> > 
> > This patch adds a reset handler to fix this bug.  While we're at it, add
> > in some of the extra debug messages that were used to track the problem
> > down.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> 
> As discussed on IRC, I've applied the following diff on my local branch
> to drop the h_reg_crq that my __func__ comment was about:
> 
> diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
> index 0bf2c31..97d029a 100644
> --- a/hw/spapr_vio.c
> +++ b/hw/spapr_vio.c
> @@ -431,13 +431,13 @@ static target_ulong h_reg_crq(CPUPPCState *env, sPAPREnvironment *spapr,
> 
>      /* Check if device supports CRQs */
>      if (!dev->crq.SendFunc) {
> -        hcall_dprintf("Device does not support CRQ\n");
> +        hcall_dprintf("h_reg_crq, device does not support CRQ\n");
>          return H_NOT_FOUND;
>      }
> 
>      /* Already a queue ? */
>      if (dev->crq.qsize) {
> -        hcall_dprintf("CRQ already registered\n");
> +        hcall_dprintf("h_reg_crq, CRQ already registered\n");
>          return H_RESOURCE;
>      }
>      dev->crq.qladdr = queue_addr;
> 
> However, I'm having trouble testing reset. Whether on vanilla master or
> using this patch on top of ppc-next or this whole series on top of
> ppc-next, using `ppc64-softmmu/qemu-system-ppc64 -M pseries -m 1G`:
> 
> a) 0 > reset-all
> results in: "reboot not available Aborted"
> Do you need to update SLOF to actually use the newly added RTAS call?
> 
> b) (qemu) system_reset
> results in:
>  exception 700
> SRR0 = 0000000000000000  SRR1 = 800000008000000000080000
> SPRG2 = 0000000000000000  SPRG3 = 000000003DCD1AD4
> 
> Could you please look into the two above issues? How did you test?

Ah.  I used "reboot" from within the guest Linux.  I'll look at the
others; the first could just be a SLOF bug.

David Gibson April 5, 2012, 2:30 a.m. UTC | #2
On Wed, Apr 04, 2012 at 06:13:36PM +0200, Andreas Färber wrote:
> On 28.03.2012 23:39, David Gibson wrote:
[snip]
> However, I'm having trouble testing reset. Whether on vanilla master or
> using this patch on top of ppc-next or this whole series on top of
> ppc-next, using `ppc64-softmmu/qemu-system-ppc64 -M pseries -m 1G`:
> 
> a) 0 > reset-all
> results in: "reboot not available Aborted"
> Do you need to update SLOF to actually use the newly added RTAS call?

Maybe, I'll have to check that.

> b) (qemu) system_reset
> results in:
>  exception 700
> SRR0 = 0000000000000000  SRR1 = 800000008000000000080000
> SPRG2 = 0000000000000000  SPRG3 = 000000003DCD1AD4
> 
> Could you please look into the two above issues? How did you test?

Hrm.  I don't get that, at least with a fully booted kernel, although
it does fail to boot completely after the reset, which I'll debug.

Patch

diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index 0bf2c31..97d029a 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -431,13 +431,13 @@ static target_ulong h_reg_crq(CPUPPCState *env, sPAPREnvironment *spapr,

     /* Check if device supports CRQs */
     if (!dev->crq.SendFunc) {
-        hcall_dprintf("Device does not support CRQ\n");
+        hcall_dprintf("h_reg_crq, device does not support CRQ\n");
         return H_NOT_FOUND;
     }

     /* Already a queue ? */
     if (dev->crq.qsize) {
-        hcall_dprintf("CRQ already registered\n");
+        hcall_dprintf("h_reg_crq, CRQ already registered\n");
         return H_RESOURCE;
     }
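
The "h_reg_crq, " prefix in these messages is only redundant if the debug macro already reports the calling function, which the reference to the `__func__` comment suggests but this thread does not show. As a rough illustration of that idea, the sketch below uses a hypothetical `demo_dprintf` macro (not QEMU's `hcall_dprintf`) that prepends `__func__` and writes into a caller-supplied buffer so the result can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Formats "<func>: <message>" into buf. Hypothetical helper. */
static int demo_log(char *buf, size_t size, const char *func,
                    const char *fmt, ...)
{
    va_list ap;
    int n, m;

    assert(size > 0);
    n = snprintf(buf, size, "%s: ", func);
    if (n < 0 || (size_t)n >= size) {
        return n;  /* error, or prefix alone filled the buffer */
    }
    va_start(ap, fmt);
    m = vsnprintf(buf + n, size - (size_t)n, fmt, ap);
    va_end(ap);
    return m < 0 ? m : n + m;
}

/* The macro supplies __func__, so callers never spell out their name. */
#define demo_dprintf(buf, size, ...) \
    demo_log(buf, size, __func__, __VA_ARGS__)

/* Stand-in for an hcall handler that emits a debug message. */
static const char *demo_h_reg_crq(char *buf, size_t size)
{
    demo_dprintf(buf, size, "CRQ already registered\n");
    return buf;
}

/* Returns 1 if the message carries the function name as its prefix. */
static int demo_prefix_matches(void)
{
    char buf[64];
    const char *msg = demo_h_reg_crq(buf, sizeof(buf));

    return strcmp(msg, "demo_h_reg_crq: CRQ already registered\n") == 0;
}
```

With a macro of this shape, a message such as "CRQ already registered" stays identifiable even if the function is renamed, because the prefix is generated at the call site.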