Patchwork [RESEND,0/1] AHCI: Optimize interrupt processing

login
register
mail settings
Submitter Nicholas A. Bellinger
Date July 19, 2013, 6:34 a.m.
Message ID <1374215660.7397.1041.camel@haakon3.risingtidesystems.com>
Download mbox | patch
Permalink /patch/260188/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Nicholas A. Bellinger - July 19, 2013, 6:34 a.m.
On Thu, 2013-07-18 at 18:03 -0700, Nicholas A. Bellinger wrote:
> On Thu, 2013-07-18 at 18:30 -0600, Jens Axboe wrote:
> > On Thu, Jul 18 2013, Nicholas A. Bellinger wrote:
> > > On Thu, 2013-07-18 at 13:12 -0600, Mike Christie wrote:
> > > > On 07/18/2013 12:51 PM, Nicholas A. Bellinger wrote:

<SNIP>

> > > <nod>, just noticed the blk_queue_bounce() in blk_rq_map_kern().  
> > > 
> > > Not sure why this doesn't seem to be doing what it's supposed to for
> > > libata just yet..
> > 
> > How are you make the request from the bio? It'd be pretty trivial to
> > ensure that it gets bounced properly... blk_mq_execute_rq() assumes a
> > fully complete request, so it wont bounce anything.
> > 
> 
> From what I gather for REQ_TYPE_BLOCK_PC, scsi_execute() ->
> blk_rq_map_kern() -> blk_rq_append_bio() -> blk_rq_bio_prep() is what
> does the request setup from the bios returned by bio_[copy,map]_kern()
> in blk_rq_map_kern() code.
> 
> blk_queue_bounce() is called immediately after blk_rq_append_bio() here,
> which AFAICT looks like it's doing the correct thing for scsi-mq..
> 
> What is strange here is that libata-scsi.c CDB emulation code is doing
> the same stuff for both INQUIRY (that seems to be OK) and READ_CAPACITY
> (that is returning zeros), which makes me think that something else is
> going on..
> 
> Alexander, where you able to re-test using sdev->sdev_mq_reg.queue_depth
> = 1 in scsi_mq_alloc_queue()..?
> 

So after a bit more hacking tonight it appears the explicit setting of
dma_alignment by libata (sdev->sector_size - 1) is what is making
blk_rq_map_kern() invoke blk_copy_kern() instead of blk_map_kern(), and
triggering this scsi-mq specific bug with libata.  I'm able to confirm
with QEMU IDE emulation the bug is occurring only after INQUIRY and
before READ_CAPACITY, as reported by Alexander.

Below is a quick hack to skip this setting in ata_scsi_dev_config() for
blk-mq, and leaves the default dma_alignment=0x03 for REQ_TYPE_BLOCK_PC
requests as initially set by scsi-core in scsi_init_request_queue().

Also included is the change for using queue_depth = min(SHT->can_queue,
SHT->cmd_per_lun) during scsi-mq request_queue initialization, along
with a very basic ata_piix conversion for testing purposes.

With these three changes in place, I'm able to register a single 1GB
ata_piix LUN using QEMU IDE emulation, and successfully run simple fio
writeverify tests with blocksize=4k @ queue_depth=1.

Obviously this is not a proper bugfix for unaligned blk_copy_kern() with
scsi-mq + REQ_TYPE_BLOCK_PC case, but should be enough to at least get
libata LUN scanning to work.  Need to take a deeper look at the problem
here..

Alexander, care to give this work-around a shot on your bare-metal setup
using ata_piix..?

--nab


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
James Bottomley - July 19, 2013, 3:33 p.m.
On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 0101af5..191bc15 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
>                         "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
>                         sdev->sector_size);
>  
> -       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> +       if (!q->mq_ops) {
> +               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> +       } else {
> +               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
> +       }

Amazingly enough there is a reason for the dma alignment, and it wasn't
just to annoy you, so you can't blindly do this.

The email thread is probably lost in the mists of time, but if I
remember correctly the problem is that some ahci DMA controllers barf if
the sector they're doing DMA on crosses a page boundary.  Some are
annoying enough to actually cause silent data corruption.  You won't
find every ahci DMA controller doing this, so the change will work for
some, but it will be hard to identify those it won't work for until
people start losing data.

The correct fix, obviously, is to do the bio copy on the kernel path for
unaligned data.  It is OK to assume that REQ_TYPE_FS data is correctly
aligned (because of the block to page alignment).

James


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nicholas A. Bellinger - July 19, 2013, 9:01 p.m.
On Fri, 2013-07-19 at 08:33 -0700, James Bottomley wrote:
> On Thu, 2013-07-18 at 23:34 -0700, Nicholas A. Bellinger wrote:
> > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> > index 0101af5..191bc15 100644
> > --- a/drivers/ata/libata-scsi.c
> > +++ b/drivers/ata/libata-scsi.c
> > @@ -1144,7 +1144,11 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
> >                         "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
> >                         sdev->sector_size);
> >  
> > -       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> > +       if (!q->mq_ops) {
> > +               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
> > +       } else {
> > +               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
> > +       }
> 
> Amazingly enough there is a reason for the dma alignment, and it wasn't
> just to annoy you, so you can't blindly do this.
> 
> The email thread is probably lost in the mists of time, but if I
> remember correctly the problem is that some ahci DMA controllers barf if
> the sector they're doing DMA on crosses a page boundary.  Some are
> annoying enough to actually cause silent data corruption.  You won't
> find every ahci DMA controller doing this, so the change will work for
> some, but it will be hard to identify those it won't work for until
> people start losing data.

Thanks for the extra background.

So at least from what I gather thus far this shouldn't be an issue for
initial testing with scsi-mq <-> libata w/ ata_piix.

> 
> The correct fix, obviously, is to do the bio copy on the kernel path for
> unaligned data.  It is OK to assume that REQ_TYPE_FS data is correctly
> aligned (because of the block to page alignment).
> 

Indeed.  Looking into the bio_copy_kern() breakage next..

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index 9a8a674..ac05cd6 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -1066,6 +1066,8 @@  static u8 piix_vmw_bmdma_status(struct ata_port *ap)
 
 static struct scsi_host_template piix_sht = {
        ATA_BMDMA_SHT(DRV_NAME),
+       .scsi_mq        = true,
+       .queuecommand_mq = ata_scsi_queuecmd,
 };
 
 static struct ata_port_operations piix_sata_ops = {
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 0101af5..191bc15 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1144,7 +1144,11 @@  static int ata_scsi_dev_config(struct scsi_device *sdev,
                        "sector_size=%u > PAGE_SIZE, PIO may malfunction\n",
                        sdev->sector_size);
 
-       blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
+       if (!q->mq_ops) {
+               blk_queue_update_dma_alignment(q, sdev->sector_size - 1);
+       } else {
+               printk("Skipping dma_alignment for libata w/ scsi-mq\n");
+       }
 
        if (dev->flags & ATA_DFLAG_AN)
                set_bit(SDEV_EVT_MEDIA_CHANGE, sdev->supported_events);
diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
index ca6ff67..81b2633 100644
--- a/drivers/scsi/scsi-mq.c
+++ b/drivers/scsi/scsi-mq.c
@@ -199,11 +199,11 @@  int scsi_mq_alloc_queue(struct Scsi_Host *sh, struct scsi_device *sdev)
        int i, j;
 
        sdev->sdev_mq_reg.ops = &scsi_mq_ops;
-       sdev->sdev_mq_reg.queue_depth = sdev->queue_depth;
+       sdev->sdev_mq_reg.queue_depth = min((short)sh->hostt->can_queue,
+                                           sh->hostt->cmd_per_lun);
        sdev->sdev_mq_reg.cmd_size = sizeof(struct scsi_cmnd) + sh->hostt->cmd_size;
        sdev->sdev_mq_reg.numa_node = NUMA_NO_NODE;
        sdev->sdev_mq_reg.nr_hw_queues = 1;
-       sdev->sdev_mq_reg.queue_depth = 64;
        sdev->sdev_mq_reg.flags = BLK_MQ_F_SHOULD_MERGE;
 
        printk("Calling blk_mq_init_queue: scsi_mq_ops: %p, queue_depth: %d,"