From patchwork Mon Feb 11 14:42:26 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 219615
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17])
 (using TLSv1 with cipher AES256-SHA (256/256 bits))
 (Client did not present a certificate)
 by ozlabs.org (Postfix) with ESMTPS id 17C212C02EA
 for ; Tue, 12 Feb 2013 01:42:44 +1100 (EST)
Received: from localhost ([::1]:58390 helo=lists.gnu.org)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1U4uag-0007mO-9v
 for incoming@patchwork.ozlabs.org; Mon, 11 Feb 2013 09:42:42 -0500
Received: from eggs.gnu.org ([208.118.235.92]:50910)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1U4uaY-0007mF-Pq
 for qemu-devel@nongnu.org; Mon, 11 Feb 2013 09:42:36 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1U4uaX-0006lo-KB
 for qemu-devel@nongnu.org; Mon, 11 Feb 2013 09:42:34 -0500
Received: from mx1.redhat.com ([209.132.183.28]:32867)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1U4uaX-0006lQ-BW
 for qemu-devel@nongnu.org; Mon, 11 Feb 2013 09:42:33 -0500
Received: from int-mx01.intmail.prod.int.phx2.redhat.com
 (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])
 by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r1BEgUhE026411
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
 Mon, 11 Feb 2013 09:42:31 -0500
Received: from yakj.usersys.redhat.com (ovpn-112-16.ams2.redhat.com
 [10.36.112.16])
 by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP
 id r1BEgQYH031218; Mon, 11 Feb 2013 09:42:28 -0500
Message-ID: <51190352.2030708@redhat.com>
Date: Mon, 11 Feb 2013 15:42:26 +0100
From: Paolo Bonzini
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110
 Thunderbird/17.0.2
MIME-Version: 1.0
To: Stefan Priebe - Profihost AG
References: <5118A1BB.9090408@profihost.ag>
 <20130211094029.GD29986@stefanha-thinkpad.redhat.com>
 <5118BE75.5020702@profihost.ag> <5118E88B.80604@redhat.com>
 <5118ED68.1030605@profihost.ag> <5118F38D.9000106@profihost.ag>
 <5118F87E.4070702@redhat.com> <5118F8ED.9010707@profihost.ag>
 <5118F938.7020306@redhat.com> <5118F9F3.9090909@profihost.ag>
 <5118FC4C.3040901@redhat.com> <5118FDCB.6070509@profihost.ag>
In-Reply-To: <5118FDCB.6070509@profihost.ag>
X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 209.132.183.28
Cc: Stefan Hajnoczi, qemu-devel
Subject: Re: [Qemu-devel] kvm segfaulting
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

On 11/02/2013 15:18, Stefan Priebe - Profihost AG wrote:
>> > Some trace that a request was actually cancelled, but I think I
>> > believe
> Ah but that must be in guest not on host right? How to grab that from
> client when it is crashing?

The serial console could have something like "sda: aborting command".
It is actually interesting to see what is causing commands to be
aborted (typically a timeout, but what causes the timeout? :).

>> > that. This seems to be the same issue as commits
>> > 1bd075f29ea6d11853475c7c42734595720c3ac6 (iSCSI) and
>> > 473c7f0255920bcaf37411990a3725898772817f (rbd), where the "cancelled"
>> > callback is called before the "complete" callback.
> If there is the same code in virtio-scsi it might be.

No, virtio-scsi is relying on the backends (including scsi-disk)
doing it correctly. The RBD code looks okay, so it's still my
fault :) but not virtio-scsi's.

I think this happens when a request is split into multiple parts,
and one of them is canceled.
Then the next part is fired, but virtio-scsi's cancellation callbacks
have fired already.

You can test this patch:

Paolo

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 07220e4..1d8289c 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -221,6 +221,10 @@ static void scsi_write_do_fua(SCSIDiskReq *r)
 {
     SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
 
+    if (r->req.io_canceled) {
+        return;
+    }
+
     if (scsi_is_cmd_fua(&r->req.cmd)) {
         bdrv_acct_start(s->qdev.conf.bs, &r->acct, 0, BDRV_ACCT_FLUSH);
         r->req.aiocb = bdrv_aio_flush(s->qdev.conf.bs, scsi_aio_complete, r);
@@ -352,6 +356,10 @@ static void scsi_read_data(SCSIRequest *req)
     /* No data transfer may already be in progress */
     assert(r->req.aiocb == NULL);
 
+    if (r->req.io_canceled) {
+        return;
+    }
+
     /* The request is used as the AIO opaque value, so add a ref. */
     scsi_req_ref(&r->req);
     if (r->req.cmd.mode == SCSI_XFER_TO_DEV) {
@@ -455,6 +463,10 @@ static void scsi_write_data(SCSIRequest *req)
     /* No data transfer may already be in progress */
     assert(r->req.aiocb == NULL);
 
+    if (r->req.io_canceled) {
+        return;
+    }
+
     /* The request is used as the AIO opaque value, so add a ref. */
     scsi_req_ref(&r->req);
     if (r->req.cmd.mode != SCSI_XFER_TO_DEV) {
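
For anyone following the thread, the suspected sequence (a request split
into several parts, one part canceled, then the next part fired anyway)
can be modeled with a small stand-alone sketch. This is only a toy under
stated assumptions, not QEMU code: struct toy_req and the toy_* functions
are hypothetical names invented for illustration. Only the io_canceled
check mirrors what the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a SCSI request; NOT QEMU's SCSIDiskReq. */
struct toy_req {
    bool io_canceled;   /* set once the command has been aborted */
    bool in_flight;     /* true while one part of the transfer is pending */
    int parts_left;     /* parts of the split transfer not yet issued */
    int parts_issued;   /* parts actually handed to the (imaginary) backend */
};

/* Issue the next part of the transfer. Returns true if a part was issued. */
static bool toy_issue_next_part(struct toy_req *r)
{
    /* No data transfer may already be in progress */
    assert(!r->in_flight);

    /* Same shape as the guard in the patch: never start new I/O on a
     * request whose cancellation has already been processed. */
    if (r->io_canceled) {
        return false;
    }
    if (r->parts_left == 0) {
        return false;
    }
    r->parts_left--;
    r->parts_issued++;
    r->in_flight = true;
    return true;
}

/* Normal completion of the part currently in flight. */
static void toy_part_done(struct toy_req *r)
{
    r->in_flight = false;
}

/* Cancellation: the pending part is dropped and the request is marked. */
static void toy_cancel(struct toy_req *r)
{
    r->io_canceled = true;
    r->in_flight = false;
}

/* Drive a request made of `parts` parts. If `cancel_after` is positive,
 * the request is canceled while that part is in flight. Returns how many
 * parts were issued in total. */
int toy_run(int parts, int cancel_after)
{
    struct toy_req r = { false, false, parts, 0 };

    while (toy_issue_next_part(&r)) {
        if (r.parts_issued == cancel_after) {
            toy_cancel(&r);
        } else {
            toy_part_done(&r);
        }
    }
    return r.parts_issued;
}
```

In this model toy_run(3, -1) issues all three parts, while toy_run(3, 1)
stops after the first one because the guard refuses to fire the second.
Without the io_canceled check the loop would keep issuing parts for a
request that has already been canceled, which is the situation the patch
is meant to rule out.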