[4/4] block: Cater to iscsi with non-power-of-2 discard

Message ID 1469129688-22848-5-git-send-email-eblake@redhat.com
State New

Commit Message

Eric Blake July 21, 2016, 7:34 p.m. UTC
Dell Equallogic iSCSI SANs have a very unusual advertised geometry:

$ iscsi-inq -e 1 -c $((0xb0)) iscsi://XXX/0
wsnz:0
maximum compare and write length:1
optimal transfer length granularity:0
maximum transfer length:0
optimal transfer length:0
maximum prefetch xdread xdwrite transfer length:0
maximum unmap lba count:30720
maximum unmap block descriptor count:2
optimal unmap granularity:30720
ugavalid:1
unmap granularity alignment:0
maximum write same length:30720

which says that both the maximum and the optimal discard size
are 15M.  It is not immediately apparent if the device allows
discard requests not aligned to the optimal size, nor if it
allows discards at a finer granularity than the optimal size.

I tried to find details in the SCSI Commands Reference Manual
Rev. A on what valid values of maximum and optimal sizes are
permitted, but while that document mentions a "Block Limits
VPD Page", I couldn't actually find documentation of that page
or what values it would have, or if a SCSI device has an
advertisement of its minimal unmap granularity.  So it is not
obvious to me whether the Dell Equallogic device is compliant
with the SCSI specification.

Fortunately, it is easy enough to support non-power-of-2 sizing,
even if it means we are less efficient than truly possible when
targeting that device (for example, it means that we refuse to
unmap anything that is not a multiple of 15M and aligned to a
15M boundary, even if the device truly does support a smaller
granularity where unmapping actually works).
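
The clamping behaviour described above can be sketched as follows. This is a hedged, self-contained sketch (the function name and structure are illustrative, not the actual patch code); it shows why plain `%` replaces the old power-of-2 mask arithmetic, and why an unaligned request like 'discard 1M 15M' with a 15M granularity collapses to a no-op:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: clamp a discard request to a non-power-of-2 alignment by
 * dropping any unaligned head and tail.  Uses integer modulo rather
 * than bitmasks, since the alignment need not be a power of 2. */
static void clamp_to_alignment(int64_t offset, int64_t count,
                               int64_t alignment,
                               int64_t *out_offset, int64_t *out_count)
{
    int64_t head = offset % alignment;
    if (head) {
        /* skip forward to the next aligned boundary */
        int64_t skip = alignment - head;
        skip = skip < count ? skip : count;
        offset += skip;
        count -= skip;
    }
    count -= count % alignment;     /* drop the unaligned tail */
    *out_offset = offset;
    *out_count = count > 0 ? count : 0;
}
```

With a 15M alignment, a request covering [1M, 16M) has no aligned middle at all and is dropped entirely, while [0, 40M) is trimmed down to [0, 30M).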

Reported-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Eric Blake <eblake@redhat.com>

---
Help in locating the actual specs on what SCSI requires for
page 0xb0 would be nice. But this should at least avoid the
assertion failures that Peter is hitting.  I was able to
test this patch using NBD on a hacked up qemu where I made
block/nbd.c report the same block limits, and could confirm
the assert under qemu-io 'w -z 0 40m' and 'discard 0 40m'
pre-patch, as well as the post-patch behavior of splitting
things to 15M alignment ('discard 1M 15M' becomes a no-op
because it is not aligned).  But obviously it needs to be
tested on the actual iscsi SAN that triggered the original
report.
---
 include/block/block_int.h | 37 ++++++++++++++++++++-----------------
 block/io.c                | 15 +++++++++------
 2 files changed, 29 insertions(+), 23 deletions(-)

Comments

Stefan Hajnoczi July 26, 2016, 1:28 p.m. UTC | #1
On Thu, Jul 21, 2016 at 01:34:48PM -0600, Eric Blake wrote:
> Dell Equallogic iSCSI SANs have a very unusual advertised geometry:
> 
> $ iscsi-inq -e 1 -c $((0xb0)) iscsi://XXX/0
> wsnz:0
> maximum compare and write length:1
> optimal transfer length granularity:0
> maximum transfer length:0
> optimal transfer length:0
> maximum prefetch xdread xdwrite transfer length:0
> maximum unmap lba count:30720
> maximum unmap block descriptor count:2
> optimal unmap granularity:30720
> ugavalid:1
> unmap granularity alignment:0
> maximum write same length:30720
> 
> which says that both the maximum and the optimal discard size
> is 15M.  It is not immediately apparent if the device allows
> discard requests not aligned to the optimal size, nor if it
> allows discards at a finer granularity than the optimal size.
> 
> I tried to find details in the SCSI Commands Reference Manual
> Rev. A on what valid values of maximum and optimal sizes are
> permitted, but while that document mentions a "Block Limits
> VPD Page", I couldn't actually find documentation of that page
> or what values it would have, or if a SCSI device has an
> advertisement of its minimal unmap granularity.  So it is not
> obvious to me whether the Dell Equallogic device is compliance
> with the SCSI specification.
> 
> Fortunately, it is easy enough to support non-power-of-2 sizing,
> even if it means we are less efficient than truly possible when
> targetting that device (for example, it means that we refuse to
> unmap anything that is not a multiple of 15M and aligned to a
> 15M boundary, even if the device truly does support a smaller
> granularity where unmapping actually works).
> 
> Reported-by: Peter Lieven <pl@kamp.de>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
> ---
> Help in locating the actual specs on what SCSI requires for
> page 0xb0 would be nice. But this should at least avoid the
> assertion failures that Peter is hitting.  I was able to
> test this patch using NBD on a hacked up qemu where I made
> block/nbd.c report the same block limits, and could confirm
> the assert under qemu-io 'w -z 0 40m' and 'discard 0 40m'
> pre-patch, as well as the post-patch behavior of splitting
> things to 15M alignment ('discard 1M 15M' becomes a no-op
> because it is not aligned).  But obviously it needs to be
> tested on the actual iscsi SAN that triggered the original
> report.
> ---
>  include/block/block_int.h | 37 ++++++++++++++++++++-----------------
>  block/io.c                | 15 +++++++++------
>  2 files changed, 29 insertions(+), 23 deletions(-)

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Fam Zheng July 27, 2016, 7:25 a.m. UTC | #2
On Thu, 07/21 13:34, Eric Blake wrote:
> +    max_write_zeroes = max_write_zeroes / alignment * alignment;

Not using QEMU_ALIGN_DOWN despite patch 3?
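
For reference, the open-coded expression Fam flags and the QEMU_ALIGN_DOWN macro compute the same rounding; the macro merely names the intent. A minimal sketch (macro body reproduced here as it is defined in qemu/osdep.h; equivalence assumed for non-negative values):

```c
#include <assert.h>
#include <stdint.h>

/* Round n down to the nearest multiple of m; m need not be a power of 2. */
#define QEMU_ALIGN_DOWN(n, m) ((n) / (m) * (m))

/* The open-coded spelling used in the hunk under discussion. */
static int32_t align_down_open_coded(int32_t n, int32_t m)
{
    return n / m * m;
}
```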
Eric Blake July 28, 2016, 2:39 a.m. UTC | #3
On 07/27/2016 01:25 AM, Fam Zheng wrote:
> On Thu, 07/21 13:34, Eric Blake wrote:
>> +    max_write_zeroes = max_write_zeroes / alignment * alignment;
> 
> Not using QEMU_ALIGN_DOWN despite patch 3?

Looks like I missed that on the rebase. Can fix if there is a reason for
a respin.
Peter Lieven Oct. 25, 2016, 12:03 p.m. UTC | #4
Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:
>
> On 28/07/2016 04:39, Eric Blake wrote:
>> On 07/27/2016 01:25 AM, Fam Zheng wrote:
>>> On Thu, 07/21 13:34, Eric Blake wrote:
>>>> +    max_write_zeroes = max_write_zeroes / alignment * alignment;
>>> Not using QEMU_ALIGN_DOWN despite patch 3?
>> Looks like I missed that on the rebase. Can fix if there is a reason for
>> a respin.
> Since Stefan acked this, I'm applying the patch and fixing it to use
> QEMU_ALIGN_DOWN.
>
> Paolo

Hi,

I came across a sort of regression we introduced by dropping the head
and tail of an unaligned discard.

The discard alignment that we use to trim the discard request is just a hint.

I learned on the Equallogics that a page (which is this unusual 15MB
large) is unallocated even if the discard happens in pieces, e.g. in
slices of 1MB requests.

From my point of view, I would like to restore the old behaviour. What do you think?

Thanks,
Peter
Paolo Bonzini Oct. 25, 2016, 12:09 p.m. UTC | #5
On 25/10/2016 14:03, Peter Lieven wrote:
> Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:
>>
>> On 28/07/2016 04:39, Eric Blake wrote:
>>> On 07/27/2016 01:25 AM, Fam Zheng wrote:
>>>> On Thu, 07/21 13:34, Eric Blake wrote:
>>>>> +    max_write_zeroes = max_write_zeroes / alignment * alignment;
>>>> Not using QEMU_ALIGN_DOWN despite patch 3?
>>> Looks like I missed that on the rebase. Can fix if there is a reason for
>>> a respin.
>> Since Stefan acked this, I'm applying the patch and fixing it to use
>> QEMU_ALIGN_DOWN.
>>
>> Paolo
> 
> Hi,
> 
> I came across a sort of regression we introduced with the dropping of
> head and tail
> of an unaligned discard.
> 
> The discard alignment that we use to trim the discard request is just a
> hint.
> 
> I learned on the equallogics that a page (which is this unusal 15MB
> large) is
> unallocated even if the discard happens in pieces. E.g. in slices of 1MB
> requests.
> 
> From my point of view I would like to restore the old behaviour. What do
> you think?

The right logic should be the one in Linux: if splitting a request, and
the next starting sector would be misaligned, stop the discard at the
previous aligned sector.  Otherwise leave everything alone.

Paolo
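
The splitting policy Paolo describes can be sketched like this (a hedged illustration modeled on Linux's discard splitting, not actual QEMU or kernel code; the function name and parameters are made up for the example). A request that fits within the maximum is passed through untouched, however unaligned; only when a request must be split is the fragment shortened so that the next fragment starts on an aligned boundary:

```c
#include <assert.h>
#include <stdint.h>

/* Length of the next fragment of a discard being split.  If the whole
 * remaining request fits in max_len, pass it through unchanged.
 * Otherwise, if cutting at max_len would leave the next fragment
 * starting misaligned, stop at the previous aligned boundary instead. */
static int64_t next_fragment_len(int64_t offset, int64_t remaining,
                                 int64_t max_len, int64_t granularity)
{
    if (remaining <= max_len) {
        return remaining;           /* no split needed: pass through */
    }
    int64_t len = max_len;
    int64_t end_rem = (offset + len) % granularity;
    if (end_rem && len > end_rem) {
        len -= end_rem;             /* stop at previous aligned sector */
    }
    return len;
}
```

With a 15M granularity and 15M maximum, a 40M discard starting at 1M gets a first fragment of 14M (so the next fragment starts aligned at 15M), while a lone 1M discard is passed straight through.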
Peter Lieven Oct. 25, 2016, 12:12 p.m. UTC | #6
Am 25.10.2016 um 14:09 schrieb Paolo Bonzini:
>
> On 25/10/2016 14:03, Peter Lieven wrote:
>> Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:
>>> On 28/07/2016 04:39, Eric Blake wrote:
>>>> On 07/27/2016 01:25 AM, Fam Zheng wrote:
>>>>> On Thu, 07/21 13:34, Eric Blake wrote:
>>>>>> +    max_write_zeroes = max_write_zeroes / alignment * alignment;
>>>>> Not using QEMU_ALIGN_DOWN despite patch 3?
>>>> Looks like I missed that on the rebase. Can fix if there is a reason for
>>>> a respin.
>>> Since Stefan acked this, I'm applying the patch and fixing it to use
>>> QEMU_ALIGN_DOWN.
>>>
>>> Paolo
>> Hi,
>>
>> I came across a sort of regression we introduced with the dropping of
>> head and tail
>> of an unaligned discard.
>>
>> The discard alignment that we use to trim the discard request is just a
>> hint.
>>
>> I learned on the equallogics that a page (which is this unusal 15MB
>> large) is
>> unallocated even if the discard happens in pieces. E.g. in slices of 1MB
>> requests.
>>
>>  From my point of view I would like to restore the old behaviour. What do
>> you think?
> The right logic should be the one in Linux: if splitting a request, and
> the next starting sector would be misaligned, stop the discard at the
> previous aligned sector.  Otherwise leave everything alone.

Just to clarify: I mean if the guest sends incremental 1MB discards,
we would now drop all of them if the alignment is 15MB. Previously,
we sent all of the 1MB requests.

Peter
Paolo Bonzini Oct. 25, 2016, 12:19 p.m. UTC | #7
On 25/10/2016 14:12, Peter Lieven wrote:
> Am 25.10.2016 um 14:09 schrieb Paolo Bonzini:
>>
>> On 25/10/2016 14:03, Peter Lieven wrote:
>>> Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:
>>>> On 28/07/2016 04:39, Eric Blake wrote:
>>>>> On 07/27/2016 01:25 AM, Fam Zheng wrote:
>>>>>> On Thu, 07/21 13:34, Eric Blake wrote:
>>>>>>> +    max_write_zeroes = max_write_zeroes / alignment * alignment;
>>>>>> Not using QEMU_ALIGN_DOWN despite patch 3?
>>>>> Looks like I missed that on the rebase. Can fix if there is a
>>>>> reason for
>>>>> a respin.
>>>> Since Stefan acked this, I'm applying the patch and fixing it to use
>>>> QEMU_ALIGN_DOWN.
>>>>
>>>> Paolo
>>> Hi,
>>>
>>> I came across a sort of regression we introduced with the dropping of
>>> head and tail
>>> of an unaligned discard.
>>>
>>> The discard alignment that we use to trim the discard request is just a
>>> hint.
>>>
>>> I learned on the equallogics that a page (which is this unusal 15MB
>>> large) is
>>> unallocated even if the discard happens in pieces. E.g. in slices of 1MB
>>> requests.
>>>
>>>  From my point of view I would like to restore the old behaviour.
>>> What do
>>> you think?
>> The right logic should be the one in Linux: if splitting a request, and
>> the next starting sector would be misaligned, stop the discard at the
>> previous aligned sector.  Otherwise leave everything alone.
> 
> Just to clarify. I mean the guest would send incremental 1MB discards
> we would now drop all of them if the alignment is 15MB. Previously,
> we have sent all of the 1MB requests.

Yes.  In this case there would be no need at all to split the request,
so each request should be passed through.

But hey, that firmware is seriously weird. :)

Paolo
Peter Lieven Oct. 25, 2016, 12:42 p.m. UTC | #8
Am 25.10.2016 um 14:19 schrieb Paolo Bonzini:
>
> On 25/10/2016 14:12, Peter Lieven wrote:
>> Am 25.10.2016 um 14:09 schrieb Paolo Bonzini:
>>> On 25/10/2016 14:03, Peter Lieven wrote:
>>>> Am 01.08.2016 um 11:22 schrieb Paolo Bonzini:
>>>>> On 28/07/2016 04:39, Eric Blake wrote:
>>>>>> On 07/27/2016 01:25 AM, Fam Zheng wrote:
>>>>>>> On Thu, 07/21 13:34, Eric Blake wrote:
>>>>>>>> +    max_write_zeroes = max_write_zeroes / alignment * alignment;
>>>>>>> Not using QEMU_ALIGN_DOWN despite patch 3?
>>>>>> Looks like I missed that on the rebase. Can fix if there is a
>>>>>> reason for
>>>>>> a respin.
>>>>> Since Stefan acked this, I'm applying the patch and fixing it to use
>>>>> QEMU_ALIGN_DOWN.
>>>>>
>>>>> Paolo
>>>> Hi,
>>>>
>>>> I came across a sort of regression we introduced with the dropping of
>>>> head and tail
>>>> of an unaligned discard.
>>>>
>>>> The discard alignment that we use to trim the discard request is just a
>>>> hint.
>>>>
>>>> I learned on the equallogics that a page (which is this unusal 15MB
>>>> large) is
>>>> unallocated even if the discard happens in pieces. E.g. in slices of 1MB
>>>> requests.
>>>>
>>>>   From my point of view I would like to restore the old behaviour.
>>>> What do
>>>> you think?
>>> The right logic should be the one in Linux: if splitting a request, and
>>> the next starting sector would be misaligned, stop the discard at the
>>> previous aligned sector.  Otherwise leave everything alone.
>> Just to clarify. I mean the guest would send incremental 1MB discards
>> we would now drop all of them if the alignment is 15MB. Previously,
>> we have sent all of the 1MB requests.
> Yes.  In this case there would be no need at all to split the request,
> so each request should be passed through.
>
> But hey, that firmware is seriously weird. :)

Yes, so you would not change the new implementation?

Even if the discard is e.g. 1MB, it could theoretically be that internally
the device has a finer granularity. It's an optimal discard alignment,
not the minimum required discard size. I think that's a difference.
It does not say "I can't handle smaller discards".

Peter
Eric Blake Oct. 25, 2016, 1:59 p.m. UTC | #9
On 10/25/2016 07:42 AM, Peter Lieven wrote:
>>
>> But hey, that firmware is seriously weird. :)
> 
> Yes, so you would not change the new implementation?
> 
> Even if the discard is e.g. 1MB it could theretically be that internally
> the device has a finer granularity. Its an optimal discard alignment
> not the minimum required discard size. I think thats a difference.
> It does not say I can't handle smaller discards.

The firmware is probably technically buggy for advertising too large of
a minimum granularity, if it can piece together smaller requests into a
larger discard.  If discards need to happen at a smaller granularity,
the firmware (or kernel quirk system) should fix the advertisement to
the actual granularity that it will honor.  I don't see a reason to
change qemu's current behavior.
Peter Lieven Oct. 25, 2016, 2:20 p.m. UTC | #10
Am 25.10.2016 um 15:59 schrieb Eric Blake:
> On 10/25/2016 07:42 AM, Peter Lieven wrote:
>>> But hey, that firmware is seriously weird. :)
>> Yes, so you would not change the new implementation?
>>
>> Even if the discard is e.g. 1MB it could theretically be that internally
>> the device has a finer granularity. Its an optimal discard alignment
>> not the minimum required discard size. I think thats a difference.
>> It does not say I can't handle smaller discards.
> The firmware is probably technically buggy for advertising too large of
> a minimum granularity, if it can piece together smaller requests into a
> larger discard.  If discards need to happen at a smaller granularity,
> the firmware (or kernel quirk system) should fix the advertisement to
> the actual granularity that it will honor.  I don't see a reason to
> change qemu's current behavior.
>

The issue is that the optimal unmap granularity is only a hint.
There is no evidence of what happens with unaligned requests or requests
that are smaller. They could still lead to a discard operation in the
storage device. It just says "if you can send me discards of that size,
that's optimal for me". That does not mean that smaller or unaligned
requests have no effect.

Peter
Eric Blake Oct. 25, 2016, 2:35 p.m. UTC | #11
On 10/25/2016 09:20 AM, Peter Lieven wrote:
>> The firmware is probably technically buggy for advertising too large of
>> a minimum granularity, if it can piece together smaller requests into a
>> larger discard.  If discards need to happen at a smaller granularity,
>> the firmware (or kernel quirk system) should fix the advertisement to
>> the actual granularity that it will honor.  I don't see a reason to
>> change qemu's current behavior.
>>
> 
> The issue is that the optimal unmap granularity is only a hint.
> There is no evidence what happens with unaligned requests or requests
> that are smaller. They could still lead to a discard operation in the
> storage device. It just says if you can send me discards of that size
> thats optimal for me. Thats not said that smaller or unaligned requests
> have no effect.

So your argument is that we should always pass down every unaligned
less-than-optimum discard request all the way to the hardware, rather
than dropping it higher in the stack, even though discard requests are
already advisory, in order to leave the hardware as the ultimate
decision on whether to ignore the unaligned request?
Paolo Bonzini Oct. 25, 2016, 2:36 p.m. UTC | #12
On 25/10/2016 16:35, Eric Blake wrote:
> So your argument is that we should always pass down every unaligned
> less-than-optimum discard request all the way to the hardware, rather
> than dropping it higher in the stack, even though discard requests are
> already advisory, in order to leave the hardware as the ultimate
> decision on whether to ignore the unaligned request?

Yes, I agree with Peter as to this.

Paolo
Eric Blake Oct. 25, 2016, 4:12 p.m. UTC | #13
On 10/25/2016 09:36 AM, Paolo Bonzini wrote:
> 
> 
> On 25/10/2016 16:35, Eric Blake wrote:
>> So your argument is that we should always pass down every unaligned
>> less-than-optimum discard request all the way to the hardware, rather
>> than dropping it higher in the stack, even though discard requests are
>> already advisory, in order to leave the hardware as the ultimate
>> decision on whether to ignore the unaligned request?
> 
> Yes, I agree with Peter as to this.

Okay, I'll work on patches. I think it counts as bug fix, so appropriate
even if I miss soft freeze (I'd still like to get NBD write zero support
into 2.8, since it already missed 2.7, but that one is still awaiting
review with not much time left).
Peter Lieven Nov. 8, 2016, 11:03 a.m. UTC | #14
Am 25.10.2016 um 18:12 schrieb Eric Blake:
> On 10/25/2016 09:36 AM, Paolo Bonzini wrote:
>>
>> On 25/10/2016 16:35, Eric Blake wrote:
>>> So your argument is that we should always pass down every unaligned
>>> less-than-optimum discard request all the way to the hardware, rather
>>> than dropping it higher in the stack, even though discard requests are
>>> already advisory, in order to leave the hardware as the ultimate
>>> decision on whether to ignore the unaligned request?
>> Yes, I agree with Peter as to this.
> Okay, I'll work on patches. I think it counts as bug fix, so appropriate
> even if I miss soft freeze (I'd still like to get NBD write zero support
> into 2.8, since it already missed 2.7, but that one is still awaiting
> review with not much time left).
>

Hi Eric,

have you had time to look at this?
If you need help, let me know.

Peter
Eric Blake Nov. 8, 2016, 4:43 p.m. UTC | #15
On 11/08/2016 05:03 AM, Peter Lieven wrote:
> Am 25.10.2016 um 18:12 schrieb Eric Blake:
>> On 10/25/2016 09:36 AM, Paolo Bonzini wrote:
>>>
>>> On 25/10/2016 16:35, Eric Blake wrote:
>>>> So your argument is that we should always pass down every unaligned
>>>> less-than-optimum discard request all the way to the hardware, rather
>>>> than dropping it higher in the stack, even though discard requests are
>>>> already advisory, in order to leave the hardware as the ultimate
>>>> decision on whether to ignore the unaligned request?
>>> Yes, I agree with Peter as to this.
>> Okay, I'll work on patches. I think it counts as bug fix, so appropriate
>> even if I miss soft freeze (I'd still like to get NBD write zero support
>> into 2.8, since it already missed 2.7, but that one is still awaiting
>> review with not much time left).
>>
> 
> Hi Eric,
> 
> have you had time to look at this?
> If you need help, let me know.

Still on my list. I'm not forgetting it, and it does count as a bug fix
so it is safe for inclusion, although I'm trying to get it in before
this week is out.
Eric Blake Nov. 11, 2016, 4:02 a.m. UTC | #16
On 11/08/2016 05:03 AM, Peter Lieven wrote:
> Am 25.10.2016 um 18:12 schrieb Eric Blake:
>> On 10/25/2016 09:36 AM, Paolo Bonzini wrote:
>>>
>>> On 25/10/2016 16:35, Eric Blake wrote:
>>>> So your argument is that we should always pass down every unaligned
>>>> less-than-optimum discard request all the way to the hardware, rather
>>>> than dropping it higher in the stack, even though discard requests are
>>>> already advisory, in order to leave the hardware as the ultimate
>>>> decision on whether to ignore the unaligned request?
>>> Yes, I agree with Peter as to this.
>> Okay, I'll work on patches. I think it counts as bug fix, so appropriate
>> even if I miss soft freeze (I'd still like to get NBD write zero support
>> into 2.8, since it already missed 2.7, but that one is still awaiting
>> review with not much time left).
>>
> 
> Hi Eric,
> 
> have you had time to look at this?
> If you need help, let me know.

Patches posted, but testing help would be appreciated since you have the
actual hardware that exhibits the issue.

I'm also trying to write a patch to extend the blkdebug driver to share
this "feature" of a 15M page, and write a qemu-iotest to make it harder
to regress in the future.

Patch

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 1fe0fd9..47665be 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -330,36 +330,39 @@  typedef struct BlockLimits {
      * otherwise. */
     uint32_t request_alignment;

-    /* maximum number of bytes that can be discarded at once (since it
-     * is signed, it must be < 2G, if set), should be multiple of
+    /* Maximum number of bytes that can be discarded at once (since it
+     * is signed, it must be < 2G, if set). Must be multiple of
      * pdiscard_alignment, but need not be power of 2. May be 0 if no
      * inherent 32-bit limit */
     int32_t max_pdiscard;

-    /* optimal alignment for discard requests in bytes, must be power
-     * of 2, less than max_pdiscard if that is set, and multiple of
-     * bl.request_alignment. May be 0 if bl.request_alignment is good
-     * enough */
+    /* Optimal alignment for discard requests in bytes. A power of 2
+     * is best but not mandatory.  Must be a multiple of
+     * bl.request_alignment, and must be less than max_pdiscard if
+     * that is set. May be 0 if bl.request_alignment is good enough */
     uint32_t pdiscard_alignment;

-    /* maximum number of bytes that can zeroized at once (since it is
-     * signed, it must be < 2G, if set), should be multiple of
+    /* Maximum number of bytes that can zeroized at once (since it is
+     * signed, it must be < 2G, if set). Must be multiple of
      * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */
     int32_t max_pwrite_zeroes;

-    /* optimal alignment for write zeroes requests in bytes, must be
-     * power of 2, less than max_pwrite_zeroes if that is set, and
-     * multiple of bl.request_alignment. May be 0 if
-     * bl.request_alignment is good enough */
+    /* Optimal alignment for write zeroes requests in bytes. A power
+     * of 2 is best but not mandatory.  Must be a multiple of
+     * bl.request_alignment, and must be less than max_pwrite_zeroes
+     * if that is set. May be 0 if bl.request_alignment is good
+     * enough */
     uint32_t pwrite_zeroes_alignment;

-    /* optimal transfer length in bytes (must be power of 2, and
-     * multiple of bl.request_alignment), or 0 if no preferred size */
+    /* Optimal transfer length in bytes.  A power of 2 is best but not
+     * mandatory.  Must be a multiple of bl.request_alignment, or 0 if
+     * no preferred size */
     uint32_t opt_transfer;

-    /* maximal transfer length in bytes (need not be power of 2, but
-     * should be multiple of opt_transfer), or 0 for no 32-bit limit.
-     * For now, anything larger than INT_MAX is clamped down. */
+    /* Maximal transfer length in bytes.  Need not be power of 2, but
+     * must be multiple of opt_transfer and bl.request_alignment, or 0
+     * for no 32-bit limit.  For now, anything larger than INT_MAX is
+     * clamped down. */
     uint32_t max_transfer;

     /* memory alignment, in bytes so that no bounce buffer is needed */
diff --git a/block/io.c b/block/io.c
index 7323f0f..0644a5d 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1180,10 +1180,11 @@  static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
                         bs->bl.request_alignment);

-    assert(is_power_of_2(alignment));
-    head = offset & (alignment - 1);
-    tail = (offset + count) & (alignment - 1);
-    max_write_zeroes &= ~(alignment - 1);
+    assert(alignment % bs->bl.request_alignment == 0);
+    head = offset % alignment;
+    tail = (offset + count) % alignment;
+    max_write_zeroes = max_write_zeroes / alignment * alignment;
+    assert(max_write_zeroes >= bs->bl.request_alignment);

     while (count > 0 && !ret) {
         int num = count;
@@ -2429,9 +2430,10 @@  int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,

     /* Discard is advisory, so ignore any unaligned head or tail */
     align = MAX(bs->bl.pdiscard_alignment, bs->bl.request_alignment);
-    assert(is_power_of_2(align));
-    head = MIN(count, -offset & (align - 1));
+    assert(align % bs->bl.request_alignment == 0);
+    head = offset % align;
     if (head) {
+        head = MIN(count, align - head);
         count -= head;
         offset += head;
     }
@@ -2449,6 +2451,7 @@  int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,

     max_pdiscard = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_pdiscard, INT_MAX),
                                    align);
+    assert(max_pdiscard);

     while (count > 0) {
         int ret;