[2/2,SRU,ARTFUL] block: cope with WRITE ZEROES failing in blkdev_issue_zeroout()

Message ID 20171121170528.20743-3-colin.king@canonical.com
State New
Headers show
Series
  • Untitled series #14584
Related show

Commit Message

Colin King Nov. 21, 2017, 5:05 p.m.
From: Ilya Dryomov <idryomov@gmail.com>

BugLink: https://bugs.launchpad.net/bugs/1726818

sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to
permit trying WRITE SAME on older SCSI devices, unless ->no_write_same
is set.  Because REQ_OP_WRITE_ZEROES is implemented in terms of WRITE
SAME, blkdev_issue_zeroout() may fail with -EREMOTEIO:

  $ fallocate -zn -l 1k /dev/sdg
  fallocate: fallocate failed: Remote I/O error
  $ fallocate -zn -l 1k /dev/sdg  # OK
  $ fallocate -zn -l 1k /dev/sdg  # OK

The following calls succeed because sd_done() sets ->no_write_same in
response to a sense that would become BLK_STS_TARGET/-EREMOTEIO, causing
__blkdev_issue_zeroout() to fall back to generating ZERO_PAGE bios.

This means blkdev_issue_zeroout() must cope with WRITE ZEROES failing
and fall back to manually zeroing, unless BLKDEV_ZERO_NOFALLBACK is
specified.  For BLKDEV_ZERO_NOFALLBACK case, return -EOPNOTSUPP if
sd_done() has just set ->no_write_same thus indicating lack of offload
support.

Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing")
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-lib.c | 45 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 35 insertions(+), 10 deletions(-)

Comments

Kleber Souza Jan. 5, 2018, 3:22 p.m. | #1
On 11/21/17 18:05, Colin King wrote:
> From: Ilya Dryomov <idryomov@gmail.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1726818
> 
> sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to
> permit trying WRITE SAME on older SCSI devices, unless ->no_write_same
> is set.  Because REQ_OP_WRITE_ZEROES is implemented in terms of WRITE
> SAME, blkdev_issue_zeroout() may fail with -EREMOTEIO:
> 
>   $ fallocate -zn -l 1k /dev/sdg
>   fallocate: fallocate failed: Remote I/O error
>   $ fallocate -zn -l 1k /dev/sdg  # OK
>   $ fallocate -zn -l 1k /dev/sdg  # OK
> 
> The following calls succeed because sd_done() sets ->no_write_same in
> response to a sense that would become BLK_STS_TARGET/-EREMOTEIO, causing
> __blkdev_issue_zeroout() to fall back to generating ZERO_PAGE bios.
> 
> This means blkdev_issue_zeroout() must cope with WRITE ZEROES failing
> and fall back to manually zeroing, unless BLKDEV_ZERO_NOFALLBACK is
> specified.  For BLKDEV_ZERO_NOFALLBACK case, return -EOPNOTSUPP if
> sd_done() has just set ->no_write_same thus indicating lack of offload
> support.
> 
> Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing")
> Cc: Hannes Reinecke <hare@suse.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Hi Colin,

This patch is missing the "backported/cherry picked from" tag and your SOB.

The first patch of the series has you as the author, but mainline commit
425a4dba7953e35ffd096771973add6d2f40d2ed has "Ilya Dryomov
<idryomov@gmail.com>" as author. This we can fix when applying the
patch, but the SOB for the second one needs to come from you.

Thanks,
Kleber

> ---
>  block/blk-lib.c | 45 +++++++++++++++++++++++++++++++++++----------
>  1 file changed, 35 insertions(+), 10 deletions(-)
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 4ef942c..595517c 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -321,12 +321,6 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
>   *  Zero-fill a block range, either using hardware offload or by explicitly
>   *  writing zeroes to the device.
>   *
> - *  Note that this function may fail with -EOPNOTSUPP if the driver signals
> - *  zeroing offload support, but the device fails to process the command (for
> - *  some devices there is no non-destructive way to verify whether this
> - *  operation is actually supported).  In this case the caller should call
> - *  retry the call to blkdev_issue_zeroout() and the fallback path will be used.
> - *
>   *  If a device is using logical block provisioning, the underlying space will
>   *  not be released if %flags contains BLKDEV_ZERO_NOUNMAP.
>   *
> @@ -370,18 +364,49 @@ EXPORT_SYMBOL(__blkdev_issue_zeroout);
>  int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>  		sector_t nr_sects, gfp_t gfp_mask, unsigned flags)
>  {
> -	int ret;
> -	struct bio *bio = NULL;
> +	int ret = 0;
> +	sector_t bs_mask;
> +	struct bio *bio;
>  	struct blk_plug plug;
> +	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev);
>  
> +	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
> +	if ((sector | nr_sects) & bs_mask)
> +		return -EINVAL;
> +
> +retry:
> +	bio = NULL;
>  	blk_start_plug(&plug);
> -	ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
> -			&bio, flags);
> +	if (try_write_zeroes) {
> +		ret = __blkdev_issue_write_zeroes(bdev, sector, nr_sects,
> +						  gfp_mask, &bio, flags);
> +	} else if (!(flags & BLKDEV_ZERO_NOFALLBACK)) {
> +		ret = __blkdev_issue_zero_pages(bdev, sector, nr_sects,
> +						gfp_mask, &bio);
> +	} else {
> +		/* No zeroing offload support */
> +		ret = -EOPNOTSUPP;
> +	}
>  	if (ret == 0 && bio) {
>  		ret = submit_bio_wait(bio);
>  		bio_put(bio);
>  	}
>  	blk_finish_plug(&plug);
> +	if (ret && try_write_zeroes) {
> +		if (!(flags & BLKDEV_ZERO_NOFALLBACK)) {
> +			try_write_zeroes = false;
> +			goto retry;
> +		}
> +		if (!bdev_write_zeroes_sectors(bdev)) {
> +			/*
> +			 * Zeroing offload support was indicated, but the
> +			 * device reported ILLEGAL REQUEST (for some devices
> +			 * there is no non-destructive way to verify whether
> +			 * WRITE ZEROES is actually supported).
> +			 */
> +			ret = -EOPNOTSUPP;
> +		}
> +	}
>  
>  	return ret;
>  }
>
Colin King Jan. 5, 2018, 3:29 p.m. | #2
On 05/01/18 15:22, Kleber Souza wrote:
> On 11/21/17 18:05, Colin King wrote:
>> From: Ilya Dryomov <idryomov@gmail.com>
>>
>> BugLink: https://bugs.launchpad.net/bugs/1726818
>>
>> sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to
>> permit trying WRITE SAME on older SCSI devices, unless ->no_write_same
>> is set.  Because REQ_OP_WRITE_ZEROES is implemented in terms of WRITE
>> SAME, blkdev_issue_zeroout() may fail with -EREMOTEIO:
>>
>>   $ fallocate -zn -l 1k /dev/sdg
>>   fallocate: fallocate failed: Remote I/O error
>>   $ fallocate -zn -l 1k /dev/sdg  # OK
>>   $ fallocate -zn -l 1k /dev/sdg  # OK
>>
>> The following calls succeed because sd_done() sets ->no_write_same in
>> response to a sense that would become BLK_STS_TARGET/-EREMOTEIO, causing
>> __blkdev_issue_zeroout() to fall back to generating ZERO_PAGE bios.
>>
>> This means blkdev_issue_zeroout() must cope with WRITE ZEROES failing
>> and fall back to manually zeroing, unless BLKDEV_ZERO_NOFALLBACK is
>> specified.  For BLKDEV_ZERO_NOFALLBACK case, return -EOPNOTSUPP if
>> sd_done() has just set ->no_write_same thus indicating lack of offload
>> support.
>>
>> Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing")
>> Cc: Hannes Reinecke <hare@suse.com>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
>> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> Hi Colin,
> 
> This patch is missing the "backported/cherry picked from" tag and your SOB.

Do you mind adding the following lines before applying patch:

(cherry picked from upstream commit
d5ce4c31d6df518dd8f63bbae20d7423c5018a6c)
Signed-off-by: Colin Ian King <colin.king@canonical.com>

> 
> The first patch of the series has you as the author, but mainline commit
> 425a4dba7953e35ffd096771973add6d2f40d2ed has "Ilya Dryomov
> <idryomov@gmail.com>" as author. This we can fix when applying the
> patch, but the SOB for the second one needs to come from you.
> 
> Thanks,
> Kleber
> 
>> ---
>>  block/blk-lib.c | 45 +++++++++++++++++++++++++++++++++++----------
>>  1 file changed, 35 insertions(+), 10 deletions(-)
>>
>> diff --git a/block/blk-lib.c b/block/blk-lib.c
>> index 4ef942c..595517c 100644
>> --- a/block/blk-lib.c
>> +++ b/block/blk-lib.c
>> @@ -321,12 +321,6 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
>>   *  Zero-fill a block range, either using hardware offload or by explicitly
>>   *  writing zeroes to the device.
>>   *
>> - *  Note that this function may fail with -EOPNOTSUPP if the driver signals
>> - *  zeroing offload support, but the device fails to process the command (for
>> - *  some devices there is no non-destructive way to verify whether this
>> - *  operation is actually supported).  In this case the caller should call
>> - *  retry the call to blkdev_issue_zeroout() and the fallback path will be used.
>> - *
>>   *  If a device is using logical block provisioning, the underlying space will
>>   *  not be released if %flags contains BLKDEV_ZERO_NOUNMAP.
>>   *
>> @@ -370,18 +364,49 @@ EXPORT_SYMBOL(__blkdev_issue_zeroout);
>>  int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
>>  		sector_t nr_sects, gfp_t gfp_mask, unsigned flags)
>>  {
>> -	int ret;
>> -	struct bio *bio = NULL;
>> +	int ret = 0;
>> +	sector_t bs_mask;
>> +	struct bio *bio;
>>  	struct blk_plug plug;
>> +	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev);
>>  
>> +	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
>> +	if ((sector | nr_sects) & bs_mask)
>> +		return -EINVAL;
>> +
>> +retry:
>> +	bio = NULL;
>>  	blk_start_plug(&plug);
>> -	ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
>> -			&bio, flags);
>> +	if (try_write_zeroes) {
>> +		ret = __blkdev_issue_write_zeroes(bdev, sector, nr_sects,
>> +						  gfp_mask, &bio, flags);
>> +	} else if (!(flags & BLKDEV_ZERO_NOFALLBACK)) {
>> +		ret = __blkdev_issue_zero_pages(bdev, sector, nr_sects,
>> +						gfp_mask, &bio);
>> +	} else {
>> +		/* No zeroing offload support */
>> +		ret = -EOPNOTSUPP;
>> +	}
>>  	if (ret == 0 && bio) {
>>  		ret = submit_bio_wait(bio);
>>  		bio_put(bio);
>>  	}
>>  	blk_finish_plug(&plug);
>> +	if (ret && try_write_zeroes) {
>> +		if (!(flags & BLKDEV_ZERO_NOFALLBACK)) {
>> +			try_write_zeroes = false;
>> +			goto retry;
>> +		}
>> +		if (!bdev_write_zeroes_sectors(bdev)) {
>> +			/*
>> +			 * Zeroing offload support was indicated, but the
>> +			 * device reported ILLEGAL REQUEST (for some devices
>> +			 * there is no non-destructive way to verify whether
>> +			 * WRITE ZEROES is actually supported).
>> +			 */
>> +			ret = -EOPNOTSUPP;
>> +		}
>> +	}
>>  
>>  	return ret;
>>  }
>>
>

Patch

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 4ef942c..595517c 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -321,12 +321,6 @@  static int __blkdev_issue_zero_pages(struct block_device *bdev,
  *  Zero-fill a block range, either using hardware offload or by explicitly
  *  writing zeroes to the device.
  *
- *  Note that this function may fail with -EOPNOTSUPP if the driver signals
- *  zeroing offload support, but the device fails to process the command (for
- *  some devices there is no non-destructive way to verify whether this
- *  operation is actually supported).  In this case the caller should call
- *  retry the call to blkdev_issue_zeroout() and the fallback path will be used.
- *
  *  If a device is using logical block provisioning, the underlying space will
  *  not be released if %flags contains BLKDEV_ZERO_NOUNMAP.
  *
@@ -370,18 +364,49 @@  EXPORT_SYMBOL(__blkdev_issue_zeroout);
 int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, unsigned flags)
 {
-	int ret;
-	struct bio *bio = NULL;
+	int ret = 0;
+	sector_t bs_mask;
+	struct bio *bio;
 	struct blk_plug plug;
+	bool try_write_zeroes = !!bdev_write_zeroes_sectors(bdev);
 
+	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
+	if ((sector | nr_sects) & bs_mask)
+		return -EINVAL;
+
+retry:
+	bio = NULL;
 	blk_start_plug(&plug);
-	ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask,
-			&bio, flags);
+	if (try_write_zeroes) {
+		ret = __blkdev_issue_write_zeroes(bdev, sector, nr_sects,
+						  gfp_mask, &bio, flags);
+	} else if (!(flags & BLKDEV_ZERO_NOFALLBACK)) {
+		ret = __blkdev_issue_zero_pages(bdev, sector, nr_sects,
+						gfp_mask, &bio);
+	} else {
+		/* No zeroing offload support */
+		ret = -EOPNOTSUPP;
+	}
 	if (ret == 0 && bio) {
 		ret = submit_bio_wait(bio);
 		bio_put(bio);
 	}
 	blk_finish_plug(&plug);
+	if (ret && try_write_zeroes) {
+		if (!(flags & BLKDEV_ZERO_NOFALLBACK)) {
+			try_write_zeroes = false;
+			goto retry;
+		}
+		if (!bdev_write_zeroes_sectors(bdev)) {
+			/*
+			 * Zeroing offload support was indicated, but the
+			 * device reported ILLEGAL REQUEST (for some devices
+			 * there is no non-destructive way to verify whether
+			 * WRITE ZEROES is actually supported).
+			 */
+			ret = -EOPNOTSUPP;
+		}
+	}
 
 	return ret;
 }