
[SRU,Xenial,Yakkety,Zesty,Artful,1/1] blk-mq: NVMe 512B/4K+T10 DIF/DIX format returns I/O error on dd with split op

Message ID 78ae926b5006d76c03c61007d45c59a07aa82a30.1494865742.git.joseph.salisbury@canonical.com
State New

Commit Message

Joseph Salisbury July 14, 2017, 3:46 p.m. UTC
From: Wen Xiong <wenxiong@linux.vnet.ibm.com>

BugLink: http://bugs.launchpad.net/bugs/1689946

When formatting NVMe to 512B/4K + T10 DIF/DIX, dd with split op returns
"Input/output error". It looks like the block layer splits the bio after
calling bio_integrity_prep(bio). This patch fixes the issue by calling
blk_queue_split() before bio_integrity_prep().

Below is how we debugged this issue:
(1) Format the NVMe device to 4K block size with type 2 DIF.
(2) Run dd with a block size bigger than 1024k and oflag=direct:
dd: error writing '/dev/nvme0n1': Input/output error

We added some debug code to the nvme device driver. It showed us that
the first op and the second op have the same bi and pi address, which is
not correct.

1st op: nvme0n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
	dsmgmt=0x0, AT=0x0 & RT=0x505
	Guard 0x00b1, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828

2nd op: nvme0n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
	AT=0x0 & RT=0x605  ==> This op fails, as do the subsequent 5 retries.
	Guard 0x00b1, AT 0x0000, RT physical 0x00000605 RT virtual 0x00002828

With the fix, the debug output shows that the first op and the second op
both have correct bi and pi addresses.

1st op: nvme2n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
	dsmgmt=0x0, AT=0x0 & RT=0x505
	Guard 0x5ccb, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828

2nd op: nvme2n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
	AT=0x0 & RT=0x605
	Guard 0xab4c, AT 0x0000, RT physical 0x00000605 RT virtual 0x00003028

Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit f36ea50ca0043e7b1204feaf1d2ba6bd68c08d36)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
---
 block/blk-mq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Stefan Bader July 17, 2017, 10:37 a.m. UTC | #1
On 14.07.2017 17:46, Joseph Salisbury wrote:
> From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> 
> BugLink: http://bugs.launchpad.net/bugs/1689946
> 
> When formatting NVMe to 512B/4K + T10 DIF/DIX, dd with split op returns
> "Input/output error". It looks like the block layer splits the bio after
> calling bio_integrity_prep(bio). This patch fixes the issue by calling
> blk_queue_split() before bio_integrity_prep().
> 
> Below is how we debugged this issue:
> (1) Format the NVMe device to 4K block size with type 2 DIF.
> (2) Run dd with a block size bigger than 1024k and oflag=direct:
> dd: error writing '/dev/nvme0n1': Input/output error
> 
> We added some debug code to the nvme device driver. It showed us that
> the first op and the second op have the same bi and pi address, which is
> not correct.
> 
> 1st op: nvme0n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
> 	dsmgmt=0x0, AT=0x0 & RT=0x505
> 	Guard 0x00b1, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828
> 
> 2nd op: nvme0n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
> 	AT=0x0 & RT=0x605  ==> This op fails, as do the subsequent 5 retries.
> 	Guard 0x00b1, AT 0x0000, RT physical 0x00000605 RT virtual 0x00002828
> 
> With the fix, the debug output shows that the first op and the second op
> both have correct bi and pi addresses.
> 
> 1st op: nvme2n1 Op:Wr slba 0x505 length 0x100, PI ctrl=0x1400,
> 	dsmgmt=0x0, AT=0x0 & RT=0x505
> 	Guard 0x5ccb, AT 0x0000, RT physical 0x00000505 RT virtual 0x00002828
> 
> 2nd op: nvme2n1 Op:Wr slba 0x605 length 0x1, PI ctrl=0x1400, dsmgmt=0x0,
> 	AT=0x0 & RT=0x605
> 	Guard 0xab4c, AT 0x0000, RT physical 0x00000605 RT virtual 0x00003028
> 
> Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> Signed-off-by: Jens Axboe <axboe@fb.com>
> (cherry picked from commit f36ea50ca0043e7b1204feaf1d2ba6bd68c08d36)
> Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>

> ---

SRU justification needs to be added to bug report.
>  block/blk-mq.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 567a3ed..d45989b 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1295,13 +1295,13 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  
>  	blk_queue_bounce(q, &bio);
>  
> +	blk_queue_split(q, &bio, q->bio_split);
> +
>  	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
>  		bio_io_error(bio);
>  		return BLK_QC_T_NONE;
>  	}
>  
> -	blk_queue_split(q, &bio, q->bio_split);
> -
>  	if (!is_flush_fua && !blk_queue_nomerges(q) &&
>  	    blk_attempt_plug_merge(q, bio, &request_count, &same_queue_rq))
>  		return BLK_QC_T_NONE;
>

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 567a3ed..d45989b 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1295,13 +1295,13 @@  static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 	blk_queue_bounce(q, &bio);
 
+	blk_queue_split(q, &bio, q->bio_split);
+
 	if (bio_integrity_enabled(bio) && bio_integrity_prep(bio)) {
 		bio_io_error(bio);
 		return BLK_QC_T_NONE;
 	}
 
-	blk_queue_split(q, &bio, q->bio_split);
-
 	if (!is_flush_fua && !blk_queue_nomerges(q) &&
 	    blk_attempt_plug_merge(q, bio, &request_count, &same_queue_rq))
 		return BLK_QC_T_NONE;