From patchwork Sat Jul 20 14:48:28 2013
From: Mike Christie <MChristie@fusionio.com>
Date: Sat, 20 Jul 2013 09:48:28 -0500
To: "Nicholas A. Bellinger"
Cc: James Bottomley, Jens Axboe, Alexander Gordeev, Tejun Heo, Jeff Garzik, linux-scsi, linux-ide
Subject: Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing
Message-ID: <51EAA33C.9010405@fusionio.com>
In-Reply-To: <1374296162.7397.1098.camel@haakon3.risingtidesystems.com>
List-ID: linux-ide@vger.kernel.org
On 07/19/2013 11:56 PM, Nicholas A. Bellinger wrote:
> On Fri, 2013-07-19 at 14:01 -0700, Nicholas A. Bellinger wrote:
>> On Fri, 2013-07-19 at 08:33 -0700, James Bottomley wrote:
>>> On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
>>>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>>>> index 0101af5..191bc15 100644
>>>> --- a/drivers/ata/libata-scsi.c
>>>> +++ b/drivers/ata/libata-scsi.c
>>>> @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
>>>>  			"sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
>>>>  			sdev->sector_size);
>>>>  
>>>> -	blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
>>>> +	if (!q->mq_ops) {
>>>> +		blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
>>>> +	} else {
>>>> +		printk("Skipping dma_alignment for libata w/ scsi-mq\n");
>>>> +	}
>>>
>>> Amazingly enough there is a reason for the dma alignment, and it wasn't
>>> just to annoy you, so you can't blindly do this.
>>>
>>> The email thread is probably lost in the mists of time, but if I
>>> remember correctly the problem is that some ahci DMA controllers barf if
>>> the sector they're doing DMA on crosses a page boundary. Some are
>>> annoying enough to actually cause silent data corruption. You won't
>>> find every ahci DMA controller doing this, so the change will work for
>>> some, but it will be hard to identify those it won't work for until
>>> people start losing data.
>>
>> Thanks for the extra background.
>>
>> So at least from what I gather thus far this shouldn't be an issue for
>> initial testing with scsi-mq <-> libata w/ ata_piix.
>>
>>> The correct fix, obviously, is to do the bio copy on the kernel path for
>>> unaligned data. It is OK to assume that REQ_TYPE_FS data is correctly
>>> aligned (because of the block to page alignment).
>>
>> Indeed.
>> Looking into the bio_copy_kern() breakage next..
>
> OK, after further investigation the root cause is actually a missing
> bio->bio_end_io() -> bio_copy_kern_endio() -> bio_put() from the
> blk_end_sync_rq() callback path that scsi-mq REQ_TYPE_BLOCK_PC is
> currently using.
>
> Including the following patch into the scsi-mq working branch now, and
> reverting the libata dma_alignment=0x03 hack.
>
> Alexander, care to give this a try..?
>
> --nab
>
> diff --git a/block/blk-exec.c b/block/blk-exec.c
> index 0761c89..70303d2 100644
> --- a/block/blk-exec.c
> +++ b/block/blk-exec.c
> @@ -25,7 +25,10 @@ static void blk_end_sync_rq(struct request *rq, int error)
>  	struct completion *waiting = rq->end_io_data;
>  
>  	rq->end_io_data = NULL;
> -	if (!rq->q->mq_ops) {
> +	if (rq->q->mq_ops) {
> +		if (rq->bio)
> +			bio_endio(rq->bio, error);
> +	} else {
>  		__blk_put_request(rq->q, rq);
>  	}

This does not handle requests with multiple bios, and for the mq-style
passthrough insertion completions you actually want to call
blk_mq_finish_request in scsi_execute. The same goes for all the other
passthrough code in your scsi-mq tree. That is your root bug.

Instead of doing that, though, I think we want to have the block layer
free the bios like before. For the non-mq calls, the blk_end_request
type of calls will complete the bios when blk_finish_request is called
from that path. It will then call the rq end_io callback. I think the
blk-mq code assumes that if the end_io callback is set, the end_io
function will do the bio cleanup. See __blk_mq_end_io. Also see how
blk_mq_execute_rq calls blk_mq_finish_request for an example of how rq
passthrough execution and cleanup is done in the mq paths.

Now with the scsi-mq changes, when blk_execute_rq_nowait calls
blk_mq_insert_request, it calls it with an old non-mq style end_io
function that does not complete the bios.

What about the attached, only compile tested, patch? It makes the mq
block code work like the non-mq code for bio cleanups.
blk-mq: blk-mq should free bios in pass through case

For non-mq calls, the block layer will free the bios when
blk_finish_request is called. For mq calls, the blk-mq code wants the
caller to do this. This patch has the blk-mq code work like the non-mq
code and has the block layer free the bios.

Signed-off-by: Mike Christie

diff --git a/block/blk-flush.c b/block/blk-flush.c
index c56c37d..3e4cc9c 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -231,7 +231,7 @@ static void flush_end_io(struct request *flush_rq, int error)
 	unsigned long flags = 0;
 
 	if (q->mq_ops) {
-		blk_mq_finish_request(flush_rq, error);
+		blk_mq_free_request(flush_rq);
 		spin_lock_irqsave(&q->mq_flush_lock, flags);
 	}
 	running = &q->flush_queue[q->flush_running_idx];
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 799d305..5489b5a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -270,7 +270,7 @@ void blk_mq_free_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_free_request);
 
-void blk_mq_finish_request(struct request *rq, int error)
+static void blk_mq_finish_request(struct request *rq, int error)
 {
 	struct bio *bio = rq->bio;
 	unsigned int bytes = 0;
@@ -286,22 +286,17 @@ void blk_mq_finish_request(struct request *rq, int error)
 
 	blk_account_io_completion(rq, bytes);
 	blk_account_io_done(rq);
-	blk_mq_free_request(rq);
 }
 
 void blk_mq_complete_request(struct request *rq, int error)
 {
 	trace_block_rq_complete(rq->q, rq);
+	blk_mq_finish_request(rq, error);
 
-	/*
-	 * If ->end_io is set, it's responsible for doing the rest of the
-	 * completion.
-	 */
 	if (rq->end_io)
 		rq->end_io(rq, error);
 	else
-		blk_mq_finish_request(rq, error);
-
+		blk_mq_free_request(rq);
 }
 
 void __blk_mq_end_io(struct request *rq, int error)
@@ -973,8 +968,7 @@ int blk_mq_execute_rq(struct request_queue *q, struct request *rq)
 
 	if (rq->errors)
 		err = -EIO;
 
-	blk_mq_finish_request(rq, rq->errors);
-
+	blk_mq_free_request(rq);
 	return err;
 }
 EXPORT_SYMBOL(blk_mq_execute_rq);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 42d0110..52bf1f9 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -27,7 +27,6 @@ void blk_mq_complete_request(struct request *rq, int error);
 void blk_mq_run_request(struct request *rq, bool run_queue, bool async);
 void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
 void blk_mq_init_flush(struct request_queue *q);
-void blk_mq_finish_request(struct request *rq, int error);
 
 /*
  * CPU hotplug helpers