Patchwork kvm segfaulting

login
register
mail settings
Submitter Paolo Bonzini
Date Feb. 11, 2013, 2:42 p.m.
Message ID <51190352.2030708@redhat.com>
Download mbox | patch
Permalink /patch/219615/
State New
Headers show

Comments

Paolo Bonzini - Feb. 11, 2013, 2:42 p.m.
Il 11/02/2013 15:18, Stefan Priebe - Profihost AG ha scritto:
>> > Some trace that a request was actually cancelled, but I think I
>> > believe
> Ah but that must be in guest not on host right? How to grab that from
> client when it is crashing?

Serial console could have something like "sda: aborting command".  It is actually interesting to see what is causing commands to be aborted (typically a timeout, but what causes the timeout? :).

>> > that.  This seems to be the same issue as commits
>> > 1bd075f29ea6d11853475c7c42734595720c3ac6 (iSCSI) and
>> > 473c7f0255920bcaf37411990a3725898772817f (rbd), where the "cancelled"
>> > callback is called before the "complete" callback.
> If there is the same code in virtio-scsi it might be.

No, virtio-scsi is relying on the backends (including scsi-disk)
doing it correctly.  The RBD code looks okay, so it's still my
fault :) but not virtio-scsi's.

I think this happens when a request is split into multiple parts,
and one of them is canceled.  Then the next part is fired, but
virtio-scsi's cancellation callbacks have fired already.

You can test this patch:


Paolo
Stefan Priebe - Profihost AG - Feb. 12, 2013, 1:46 p.m.
Hi,

thanks - i applied the patch to the latest master. I hope that this will
solve my issue. Will this one get integrated in 1.4 final?

Greets,
Stefan

Am 11.02.2013 15:42, schrieb Paolo Bonzini:
> Il 11/02/2013 15:18, Stefan Priebe - Profihost AG ha scritto:
>>>> Some trace that a request was actually cancelled, but I think I
>>>> believe
>> Ah but that must be in guest not on host right? How to grab that from
>> client when it is crashing?
> 
> Serial console could have something like "sda: aborting command".  It is actually interesting to see what is causing commands to be aborted (typically a timeout, but what causes the timeout? :).
> 
>>>> that.  This seems to be the same issue as commits
>>>> 1bd075f29ea6d11853475c7c42734595720c3ac6 (iSCSI) and
>>>> 473c7f0255920bcaf37411990a3725898772817f (rbd), where the "cancelled"
>>>> callback is called before the "complete" callback.
>> If there is the same code in virtio-scsi it might be.
> 
> No, virtio-scsi is relying on the backends (including scsi-disk)
> doing it correctly.  The RBD code looks okay, so it's still my
> fault :) but not virtio-scsi's.
> 
> I think this happens when a request is split into multiple parts,
> and one of them is canceled.  Then the next part is fired, but
> virtio-scsi's cancellation callbacks have fired already.
> 
> You can test this patch:
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 07220e4..1d8289c 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -221,6 +221,10 @@ static void scsi_write_do_fua(SCSIDiskReq *r)
>  {
>      SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
>  
> +    if (r->req.io_canceled) {
> +        return;
> +    }
> +
>      if (scsi_is_cmd_fua(&r->req.cmd)) {
>          bdrv_acct_start(s->qdev.conf.bs, &r->acct, 0, BDRV_ACCT_FLUSH);
>          r->req.aiocb = bdrv_aio_flush(s->qdev.conf.bs, scsi_aio_complete, r);
> @@ -352,6 +356,10 @@ static void scsi_read_data(SCSIRequest *req)
>      /* No data transfer may already be in progress */
>      assert(r->req.aiocb == NULL);
>  
> +    if (r->req.io_canceled) {
> +        return;
> +    }
> +
>      /* The request is used as the AIO opaque value, so add a ref.  */
>      scsi_req_ref(&r->req);
>      if (r->req.cmd.mode == SCSI_XFER_TO_DEV) {
> @@ -455,6 +463,10 @@ static void scsi_write_data(SCSIRequest *req)
>      /* No data transfer may already be in progress */
>      assert(r->req.aiocb == NULL);
>  
> +    if (r->req.io_canceled) {
> +        return;
> +    }
> +
>      /* The request is used as the AIO opaque value, so add a ref.  */
>      scsi_req_ref(&r->req);
>      if (r->req.cmd.mode != SCSI_XFER_TO_DEV) {
> 
> Paolo
>
Paolo Bonzini - Feb. 12, 2013, 2:34 p.m.
Il 12/02/2013 14:46, Stefan Priebe - Profihost AG ha scritto:
> Hi,
> 
> thanks - i applied the patch to the latest master. I hope that this will
> solve my issue. Will this one get integrated in 1.4 final?

No, only 1.4.1 and 1.5 unfortunately.  Let's give it a week for you to
test it.

Paolo
Stefan Priebe - Profihost AG - Feb. 13, 2013, 7:49 a.m.
Hi Paolo,

sadly no luck. A VM crashed again.

[ ~]# addr2line -e /usr/lib/debug/usr/bin/kvm -f 24040c
virtio_scsi_command_complete
hw/virtio-scsi.c:429

Same point like last time:
static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
                                         size_t resid)
{
    VirtIOSCSIReq *req = r->hba_private;
    uint32_t sense_len;

429   =====>    req->resp.cmd->response = VIRTIO_SCSI_S_OK;
    req->resp.cmd->status = status;
    if (req->resp.cmd->status == GOOD) {
        req->resp.cmd->resid = tswap32(resid);
    } else {
        req->resp.cmd->resid = 0;
        sense_len = scsi_req_get_sense(r, req->resp.cmd->sense,
                                       VIRTIO_SCSI_SENSE_SIZE);
        req->resp.cmd->sense_len = tswap32(sense_len);
    }
    virtio_scsi_complete_req(req);

Greets
Stefan

Am 12.02.2013 15:34, schrieb Paolo Bonzini:
> Il 12/02/2013 14:46, Stefan Priebe - Profihost AG ha scritto:
>> Hi,
>>
>> thanks - i applied the patch to the latest master. I hope that this will
>> solve my issue. Will this one get integrated in 1.4 final?
Stefan Priebe - Profihost AG - Feb. 13, 2013, 8:01 a.m.
Another hint: I've never seens this using qemu 1.3.1

Stefan

Am 13.02.2013 08:49, schrieb Stefan Priebe - Profihost AG:
> Hi Paolo,
> 
> sadly no luck. A VM crashed again.
> 
> [ ~]# addr2line -e /usr/lib/debug/usr/bin/kvm -f 24040c
> virtio_scsi_command_complete
> hw/virtio-scsi.c:429
> 
> Same point like last time:
> static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
>                                          size_t resid)
> {
>     VirtIOSCSIReq *req = r->hba_private;
>     uint32_t sense_len;
> 
> 429   =====>    req->resp.cmd->response = VIRTIO_SCSI_S_OK;
>     req->resp.cmd->status = status;
>     if (req->resp.cmd->status == GOOD) {
>         req->resp.cmd->resid = tswap32(resid);
>     } else {
>         req->resp.cmd->resid = 0;
>         sense_len = scsi_req_get_sense(r, req->resp.cmd->sense,
>                                        VIRTIO_SCSI_SENSE_SIZE);
>         req->resp.cmd->sense_len = tswap32(sense_len);
>     }
>     virtio_scsi_complete_req(req);
> 
> Greets
> Stefan
> 
> Am 12.02.2013 15:34, schrieb Paolo Bonzini:
>> Il 12/02/2013 14:46, Stefan Priebe - Profihost AG ha scritto:
>>> Hi,
>>>
>>> thanks - i applied the patch to the latest master. I hope that this will
>>> solve my issue. Will this one get integrated in 1.4 final?
Stefan Priebe - Profihost AG - Feb. 13, 2013, 8:19 a.m.
Hi,

could this be this one?

commit 47a150a4bbb06e45ef439a8222e9f46a7c4cca3f
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Jan 10 15:49:08 2013 +0100

    virtio-scsi: abort in-flight I/O when the device is reset

    When the device is reset, the SCSI bus should also be reset so
    that in-flight I/O is cancelled.

    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

Greets,
Stefan
Am 13.02.2013 09:01, schrieb Stefan Priebe - Profihost AG:
> Another hint: I've never seens this using qemu 1.3.1
> 
> Stefan
> 
> Am 13.02.2013 08:49, schrieb Stefan Priebe - Profihost AG:
>> Hi Paolo,
>>
>> sadly no luck. A VM crashed again.
>>
>> [ ~]# addr2line -e /usr/lib/debug/usr/bin/kvm -f 24040c
>> virtio_scsi_command_complete
>> hw/virtio-scsi.c:429
>>
>> Same point like last time:
>> static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
>>                                          size_t resid)
>> {
>>     VirtIOSCSIReq *req = r->hba_private;
>>     uint32_t sense_len;
>>
>> 429   =====>    req->resp.cmd->response = VIRTIO_SCSI_S_OK;
>>     req->resp.cmd->status = status;
>>     if (req->resp.cmd->status == GOOD) {
>>         req->resp.cmd->resid = tswap32(resid);
>>     } else {
>>         req->resp.cmd->resid = 0;
>>         sense_len = scsi_req_get_sense(r, req->resp.cmd->sense,
>>                                        VIRTIO_SCSI_SENSE_SIZE);
>>         req->resp.cmd->sense_len = tswap32(sense_len);
>>     }
>>     virtio_scsi_complete_req(req);
>>
>> Greets
>> Stefan
>>
>> Am 12.02.2013 15:34, schrieb Paolo Bonzini:
>>> Il 12/02/2013 14:46, Stefan Priebe - Profihost AG ha scritto:
>>>> Hi,
>>>>
>>>> thanks - i applied the patch to the latest master. I hope that this will
>>>> solve my issue. Will this one get integrated in 1.4 final?
Paolo Bonzini - Feb. 13, 2013, 8:57 a.m.
Il 13/02/2013 09:19, Stefan Priebe - Profihost AG ha scritto:
> Hi,
> 
> could this be this one?
> 
> commit 47a150a4bbb06e45ef439a8222e9f46a7c4cca3f
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Thu Jan 10 15:49:08 2013 +0100
> 
>     virtio-scsi: abort in-flight I/O when the device is reset
> 
>     When the device is reset, the SCSI bus should also be reset so
>     that in-flight I/O is cancelled.
> 
>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

You can certainly try reverting it, but this patch is fixing a real bug.

Paolo

> Greets,
> Stefan
> Am 13.02.2013 09:01, schrieb Stefan Priebe - Profihost AG:
>> Another hint: I've never seens this using qemu 1.3.1
>>
>> Stefan
>>
>> Am 13.02.2013 08:49, schrieb Stefan Priebe - Profihost AG:
>>> Hi Paolo,
>>>
>>> sadly no luck. A VM crashed again.
>>>
>>> [ ~]# addr2line -e /usr/lib/debug/usr/bin/kvm -f 24040c
>>> virtio_scsi_command_complete
>>> hw/virtio-scsi.c:429
>>>
>>> Same point like last time:
>>> static void virtio_scsi_command_complete(SCSIRequest *r, uint32_t status,
>>>                                          size_t resid)
>>> {
>>>     VirtIOSCSIReq *req = r->hba_private;
>>>     uint32_t sense_len;
>>>
>>> 429   =====>    req->resp.cmd->response = VIRTIO_SCSI_S_OK;
>>>     req->resp.cmd->status = status;
>>>     if (req->resp.cmd->status == GOOD) {
>>>         req->resp.cmd->resid = tswap32(resid);
>>>     } else {
>>>         req->resp.cmd->resid = 0;
>>>         sense_len = scsi_req_get_sense(r, req->resp.cmd->sense,
>>>                                        VIRTIO_SCSI_SENSE_SIZE);
>>>         req->resp.cmd->sense_len = tswap32(sense_len);
>>>     }
>>>     virtio_scsi_complete_req(req);
>>>
>>> Greets
>>> Stefan
>>>
>>> Am 12.02.2013 15:34, schrieb Paolo Bonzini:
>>>> Il 12/02/2013 14:46, Stefan Priebe - Profihost AG ha scritto:
>>>>> Hi,
>>>>>
>>>>> thanks - i applied the patch to the latest master. I hope that this will
>>>>> solve my issue. Will this one get integrated in 1.4 final?
> 
>
Stefan Priebe - Profihost AG - Feb. 13, 2013, 9:07 a.m.
Hi,
Am 13.02.2013 09:57, schrieb Paolo Bonzini:
> Il 13/02/2013 09:19, Stefan Priebe - Profihost AG ha scritto:
>> Hi,
>>
>> could this be this one?
>>
>> commit 47a150a4bbb06e45ef439a8222e9f46a7c4cca3f
...
> You can certainly try reverting it, but this patch is fixing a real bug.

Will try that. Yes but even if it fixes a bug and raises another one
(kvm segfault) which is the worst one. It should be fixed.

Greets,
Stefan
Paolo Bonzini - Feb. 13, 2013, 11:36 a.m.
Il 13/02/2013 10:07, Stefan Priebe - Profihost AG ha scritto:
>>> >>
>>> >> commit 47a150a4bbb06e45ef439a8222e9f46a7c4cca3f
> ...
>> > You can certainly try reverting it, but this patch is fixing a real bug.
> Will try that. Yes but even if it fixes a bug and raises another one
> (kvm segfault) which is the worst one. It should be fixed.

The KVM segfault is exposing a potential consistency problem.  What is
worse is not obvious.  Also, it is happening at reset time if this is
the culprit.  Reset usually happens at places where no data loss is caused.

Can you find out what the VM was doing when it segfaulted?  (Or even,
can you place the corefile and kvm executable somewhere where I can
download it?)

I'll prepare a test program that resets the adapter while doing I/O and
try to reproduce it myself, in the meanwhile: can you grep the VM's
/var/log/messages with kernel messages regarding the storage ("aborting
cmd" and other things after it)?  If not, do your VMs reset themselves
often for example?  Can you reproduce it on non-rbd storage?

Paolo
Stefan Priebe - Profihost AG - Feb. 13, 2013, 12:55 p.m.
Hi,
Am 13.02.2013 12:36, schrieb Paolo Bonzini:
> Il 13/02/2013 10:07, Stefan Priebe - Profihost AG ha scritto:
>>>>>>
>>>>>> commit 47a150a4bbb06e45ef439a8222e9f46a7c4cca3f
>> ...
>>>> You can certainly try reverting it, but this patch is fixing a real bug.
>> Will try that. Yes but even if it fixes a bug and raises another one
>> (kvm segfault) which is the worst one. It should be fixed.
> 
> The KVM segfault is exposing a potential consistency problem.  What is
> worse is not obvious.  Also, it is happening at reset time if this is
> the culprit.  Reset usually happens at places where no data loss is caused.
> 
> Can you find out what the VM was doing when it segfaulted?  (Or even,
> can you place the corefile and kvm executable somewhere where I can
> download it?)

Yes it was doing an fstrim -v / which resulted in:

[45648.453698] end_request: I/O error, dev sda, sector 9066952

> I'll prepare a test program that resets the adapter while doing I/O and
> try to reproduce it myself, in the meanwhile: can you grep the VM's
> /var/log/messages with kernel messages regarding the storage ("aborting
> cmd" and other things after it)? 
Sadly not as i don't have acore dump. The kvm processes are started
through variuos Daemons and there seems no way to activate core dumps
for an already running process and i don't know which VM will crash next.

> If not, do your VMs reset themselves
> often for example?
No

> Can you reproduce it on non-rbd storage?
I don't have another storage type. ;-(

Stefan

Patch

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 07220e4..1d8289c 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -221,6 +221,10 @@  static void scsi_write_do_fua(SCSIDiskReq *r)
 {
     SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
 
+    if (r->req.io_canceled) {
+        return;
+    }
+
     if (scsi_is_cmd_fua(&r->req.cmd)) {
         bdrv_acct_start(s->qdev.conf.bs, &r->acct, 0, BDRV_ACCT_FLUSH);
         r->req.aiocb = bdrv_aio_flush(s->qdev.conf.bs, scsi_aio_complete, r);
@@ -352,6 +356,10 @@  static void scsi_read_data(SCSIRequest *req)
     /* No data transfer may already be in progress */
     assert(r->req.aiocb == NULL);
 
+    if (r->req.io_canceled) {
+        return;
+    }
+
     /* The request is used as the AIO opaque value, so add a ref.  */
     scsi_req_ref(&r->req);
     if (r->req.cmd.mode == SCSI_XFER_TO_DEV) {
@@ -455,6 +463,10 @@  static void scsi_write_data(SCSIRequest *req)
     /* No data transfer may already be in progress */
     assert(r->req.aiocb == NULL);
 
+    if (r->req.io_canceled) {
+        return;
+    }
+
     /* The request is used as the AIO opaque value, so add a ref.  */
     scsi_req_ref(&r->req);
     if (r->req.cmd.mode != SCSI_XFER_TO_DEV) {