diff mbox

ata: do not hard code limit in ata_set_lba_range_entries()

Message ID e35afc461c9a79f568f2bcd4ef2cd1e07fb50082.1471890326.git.tom.ty89@gmail.com
State Not Applicable
Delegated to: David Miller
Headers show

Commit Message

Tom Yan Aug. 22, 2016, 6:55 p.m. UTC
From: Tom Yan <tom.ty89@gmail.com>

In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with
n_block that exceeds limit"), it is made sure that
ata_set_lba_range_entries() will never be called with a request
size (n_block) that is larger than the number of blocks that a
512-byte block TRIM payload can describe (65535 * 64 = 4194240),
in addition to acknowlegding the SCSI/block layer with the same
limit by advertising it as the Maximum Write Same Length.

Therefore, it is unnecessary to hard code the same limit in
ata_set_lba_range_entries() itself, which would only cost extra
maintenance effort. Such effort can be noticed in, for example,
commit 2983860c7668 ("libata-scsi: avoid repeated calculation of
number of TRIM ranges").

Signed-off-by: Tom Yan <tom.ty89@gmail.com>

Comments

Shaun Tancheff Aug. 22, 2016, 7:58 p.m. UTC | #1
On Mon, Aug 22, 2016 at 1:55 PM,  <tom.ty89@gmail.com> wrote:
> From: Tom Yan <tom.ty89@gmail.com>
>
> In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with
> n_block that exceeds limit"), it is made sure that
> ata_set_lba_range_entries() will never be called with a request
> size (n_block) that is larger than the number of blocks that a
> 512-byte block TRIM payload can describe (65535 * 64 = 4194240),
> in addition to acknowlegding the SCSI/block layer with the same
> limit by advertising it as the Maximum Write Same Length.
>
> Therefore, it is unnecessary to hard code the same limit in
> ata_set_lba_range_entries() itself, which would only cost extra
> maintenance effort. Such effort can be noticed in, for example,
> commit 2983860c7668 ("libata-scsi: avoid repeated calculation of
> number of TRIM ranges").
>
> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>
> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index be9c76c..9b74ecb 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>         buf = page_address(sg_page(scsi_sglist(scmd)));
>
>         if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
> +               size = ata_set_lba_range_entries(buf, block, n_block);
>         } else {
>                 fp = 2;
>                 goto invalid_fld;
> diff --git a/include/linux/ata.h b/include/linux/ata.h
> index adbc812..5e2e9ad 100644
> --- a/include/linux/ata.h
> +++ b/include/linux/ata.h
> @@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>   * TO NV CACHE PINNED SET.
>   */
>  static inline unsigned ata_set_lba_range_entries(void *_buffer,
> -               unsigned num, u64 sector, unsigned long count)
> +               u64 sector, unsigned long count)
>  {
>         __le64 *buffer = _buffer;
>         unsigned i = 0, used_bytes;
>
> -       while (i < num) {
> -               u64 entry = sector |
> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
> +       while (count > 0) {
> +               u64 range, entry;
> +
> +               range = count > 0xffff ? 0xffff : count;
> +               entry = sector | (range << 48);
>                 buffer[i++] = __cpu_to_le64(entry);
> -               if (count <= 0xffff)
> -                       break;
> -               count -= 0xffff;
> -               sector += 0xffff;
> +               count -= range;
> +               sector += range;
>         }

I think the problem here is that I can now inject a buffer overflow
via SG_IO.

>         used_bytes = ALIGN(i * 8, 512);
> --
> 2.9.3
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Yan Aug. 22, 2016, 8:07 p.m. UTC | #2
I don't see how that's possible. count / n_block will always be
smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention
that isn't even a "buffer limit" anyway. By SG_IO do you mean like
SCSI Write Same commands that issued with sg_write_same or so? If
that's the case, that's what exactly commit 5c79097a28c2
("libata-scsi: reject WRITE SAME (16) with n_block that exceeds
limit") is for.

On 23 August 2016 at 03:58, Shaun Tancheff <shaun@tancheff.com> wrote:
> On Mon, Aug 22, 2016 at 1:55 PM,  <tom.ty89@gmail.com> wrote:
>> From: Tom Yan <tom.ty89@gmail.com>
>>
>> In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with
>> n_block that exceeds limit"), it is made sure that
>> ata_set_lba_range_entries() will never be called with a request
>> size (n_block) that is larger than the number of blocks that a
>> 512-byte block TRIM payload can describe (65535 * 64 = 4194240),
>> in addition to acknowlegding the SCSI/block layer with the same
>> limit by advertising it as the Maximum Write Same Length.
>>
>> Therefore, it is unnecessary to hard code the same limit in
>> ata_set_lba_range_entries() itself, which would only cost extra
>> maintenance effort. Such effort can be noticed in, for example,
>> commit 2983860c7668 ("libata-scsi: avoid repeated calculation of
>> number of TRIM ranges").
>>
>> Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>>
>> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
>> index be9c76c..9b74ecb 100644
>> --- a/drivers/ata/libata-scsi.c
>> +++ b/drivers/ata/libata-scsi.c
>> @@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
>>         buf = page_address(sg_page(scsi_sglist(scmd)));
>>
>>         if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
>> -               size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
>> +               size = ata_set_lba_range_entries(buf, block, n_block);
>>         } else {
>>                 fp = 2;
>>                 goto invalid_fld;
>> diff --git a/include/linux/ata.h b/include/linux/ata.h
>> index adbc812..5e2e9ad 100644
>> --- a/include/linux/ata.h
>> +++ b/include/linux/ata.h
>> @@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id)
>>   * TO NV CACHE PINNED SET.
>>   */
>>  static inline unsigned ata_set_lba_range_entries(void *_buffer,
>> -               unsigned num, u64 sector, unsigned long count)
>> +               u64 sector, unsigned long count)
>>  {
>>         __le64 *buffer = _buffer;
>>         unsigned i = 0, used_bytes;
>>
>> -       while (i < num) {
>> -               u64 entry = sector |
>> -                       ((u64)(count > 0xffff ? 0xffff : count) << 48);
>> +       while (count > 0) {
>> +               u64 range, entry;
>> +
>> +               range = count > 0xffff ? 0xffff : count;
>> +               entry = sector | (range << 48);
>>                 buffer[i++] = __cpu_to_le64(entry);
>> -               if (count <= 0xffff)
>> -                       break;
>> -               count -= 0xffff;
>> -               sector += 0xffff;
>> +               count -= range;
>> +               sector += range;
>>         }
>
> I think the problem here is that I can now inject a buffer overflow
> via SG_IO.
>
>>         used_bytes = ALIGN(i * 8, 512);
>> --
>> 2.9.3
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shaun Tancheff Aug. 22, 2016, 8:32 p.m. UTC | #3
On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> I don't see how that's possible. count / n_block will always be
> smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention
> that isn't even a "buffer limit" anyway. By SG_IO do you mean like
> SCSI Write Same commands that issued with sg_write_same or so? If
> that's the case, that's what exactly commit 5c79097a28c2
> ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds
> limit") is for.

Ah, I see. You are guarding the only user of ata_set_lba_range_entries().
Still if you are going to do that you have to alert any new user that they
must have an appropriately sized buffer to be overwriting.

Better to move it out of ata.h then the limit the scope of accidental
misuse?

Regards,
Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Yan Aug. 22, 2016, 8:53 p.m. UTC | #4
On 22 August 2016 at 20:32, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
> On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote:
>> I don't see how that's possible. count / n_block will always be
>> smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention
>> that isn't even a "buffer limit" anyway. By SG_IO do you mean like
>> SCSI Write Same commands that issued with sg_write_same or so? If
>> that's the case, that's what exactly commit 5c79097a28c2
>> ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds
>> limit") is for.
>
> Ah, I see. You are guarding the only user of ata_set_lba_range_entries().

Yup. It is the only right thing to do anyway, that we leave the
function "open" and guard per context when we use it. Say if
ata_set_lba_range_entries() is gonna be a function that is shared by
others, it would only make this commit more important. As I said, we
did not guard it with a certain buffer limit, but merely redundantly
guard it with a ("humanized") limit that applies to TRIM only.

> Still if you are going to do that you have to alert any new user that they
> must have an appropriately sized buffer to be overwriting.
>
> Better to move it out of ata.h then the limit the scope of accidental
> misuse?

I am not sure if it is really necessary but that's fine to me. I see
that you have been doing it in your SCT Write Same patch series.
Probably I can/should just leave the move to you?

>
> Regards,
> Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shaun Tancheff Aug. 23, 2016, 7:30 a.m. UTC | #5
On Mon, Aug 22, 2016 at 3:53 PM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 22 August 2016 at 20:32, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>> On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote:
>>> I don't see how that's possible. count / n_block will always be
>>> smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention
>>> that isn't even a "buffer limit" anyway. By SG_IO do you mean like
>>> SCSI Write Same commands that issued with sg_write_same or so? If
>>> that's the case, that's what exactly commit 5c79097a28c2
>>> ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds
>>> limit") is for.
>>
>> Ah, I see. You are guarding the only user of ata_set_lba_range_entries().
>
> Yup. It is the only right thing to do anyway, that we leave the
> function "open" and guard per context when we use it. Say if
> ata_set_lba_range_entries() is gonna be a function that is shared by
> others, it would only make this commit more important. As I said, we
> did not guard it with a certain buffer limit, but merely redundantly
> guard it with a ("humanized") limit that applies to TRIM only.

But  the "humanized" limit is the one you just added and proceeded to
change ata_set_lba_range_entries(). You changed from a buffer size
to use "num" instead and now you want to remove the protection
entirely?

Why not just change to put this in front of ata_set_lba_range_entries()

if (n_block > 65535 * ATA_MAX_TRIM_RNUM) {
     fp = 2;
    goto invalid_fld;
}

And then restore ata_set_lba_range_entries() to how it looked
before you changed it in commit:

2983860c7 (libata-scsi: avoid repeated calculation of number of TRIM ranges)

Then you can have ata_set_lba_range_entries() take the buffer size ...
something like the following would be fine:

      size = ata_set_lba_range_entries(buf, scmd->device->sector_size,
block, n_block);

Now things are happily protected against both exceeding the b0 limit(s) and
overflowing the sglist buffer.

--
Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Yan Aug. 23, 2016, 8:37 a.m. UTC | #6
On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote:
>
> But  the "humanized" limit is the one you just added and proceeded to
> change ata_set_lba_range_entries(). You changed from a buffer size
> to use "num" instead and now you want to remove the protection
> entirely?

I am not sure about what you are talking about here. What I did is
just to avoid setting an (inappropriate and unnecessary) hard-coded
limit (with the original while-condition) for i (the unaligned buffer
size) in this general (in the sense that TRIM payload is _not always_
of one 512-byte block, even if we may want to forever practice that in
our kernel) function.

If we really want/need to avoid hitting some real buffer limit (e.g.
maximum length of scatter/gather list?), then we should in some way
check n_block against that. If it is too large we then return
used_bytes = 0 (optionally with some follow-up to add a response to
such return value or so).

We advertise 4194240 as Maximum Write Same Length so that the
SCSI/block layer will know how to split the discard request, and then
we make sure that the SATL reject SCSI commands (that is not issued
through the discard / write same ioctl but with SG_IO /
sg_write_same). That's all we really need to do, end of story.

>
> Why not just change to put this in front of ata_set_lba_range_entries()
>
> if (n_block > 65535 * ATA_MAX_TRIM_RNUM) {
>      fp = 2;
>     goto invalid_fld;
> }

Well you can change the if-else clause to that. Apparently you've been
doing that in your patch series. But that has nothing to do with what
I am fixing here. Neither does it affect the actual behavior.

>
> And then restore ata_set_lba_range_entries() to how it looked
> before you changed it in commit:
>
> 2983860c7 (libata-scsi: avoid repeated calculation of number of TRIM ranges)

I don't see how it is relevant at all. Though I should have introduced
_this_ patch before that to save some work. Unfortunately I wasn't
aware how ugly ata_set_lba_range_entries was.

>
> Then you can have ata_set_lba_range_entries() take the buffer size ...
> something like the following would be fine:
>
>       size = ata_set_lba_range_entries(buf, scmd->device->sector_size,
> block, n_block);
>
> Now things are happily protected against both exceeding the b0 limit(s) and
> overflowing the sglist buffer.

We have no reason at all to bring the logical sector size in here.
Only if in future ACS standards it is stated that payload block are
counted in logical block size instead of statically 512, we would only
need to make the now static "512" in ALIGN() in to 4096.

For possible sglist buffer overflow, see what I mentioned above about
checking n_block and return 0. We would not want to do it with the
original while-way anyway. All it does would be to truncate larger
request and partially complete it silently.

Can you tell exactly how the size of the sglist buffer is limited?
AFAIK it has nothing to do with logical sector size. I doubt that we
will ever cause an overflow, given how conservative we have been on
the size of TRIM payload. We would probably ignore the reported value
once it is larger than 16 or so, if we would ever enable multi-block
payload in the future.

Realistically speaking, I sent this patch not really to get the
function ready for multi-block payload (neither I have mentioned that
in the commit message), but just to make the code proper and avoid
silly effort (and confusion caused) you and I spent in our patches.

>
> --
> Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Yan Aug. 23, 2016, 8:41 a.m. UTC | #7
On 23 August 2016 at 08:37, Tom Yan <tom.ty89@gmail.com> wrote:
> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote:
>>
>> But  the "humanized" limit is the one you just added and proceeded to
>> change ata_set_lba_range_entries(). You changed from a buffer size
>> to use "num" instead and now you want to remove the protection
>> entirely?
>
> I am not sure about what you are talking about here. What I did is
> just to avoid setting an (inappropriate and unnecessary) hard-coded
> limit (with the original while-condition) for i (the unaligned buffer
> size) in this general (in the sense that TRIM payload is _not always_
> of one 512-byte block, even if we may want to forever practice that in
> our kernel) function.
>
> If we really want/need to avoid hitting some real buffer limit (e.g.
> maximum length of scatter/gather list?), then we should in some way
> check n_block against that. If it is too large we then return
> used_bytes = 0 (optionally with some follow-up to add a response to
> such return value or so).
>
> We advertise 4194240 as Maximum Write Same Length so that the
> SCSI/block layer will know how to split the discard request, and then
> we make sure that the SATL reject SCSI commands (that is not issued
> through the discard / write same ioctl but with SG_IO /
> sg_write_same). That's all we really need to do, end of story.
>
>>
>> Why not just change to put this in front of ata_set_lba_range_entries()
>>
>> if (n_block > 65535 * ATA_MAX_TRIM_RNUM) {
>>      fp = 2;
>>     goto invalid_fld;
>> }
>
> Well you can change the if-else clause to that. Apparently you've been
> doing that in your patch series. But that has nothing to do with what
> I am fixing here. Neither does it affect the actual behavior.
>
>>
>> And then restore ata_set_lba_range_entries() to how it looked
>> before you changed it in commit:
>>
>> 2983860c7 (libata-scsi: avoid repeated calculation of number of TRIM ranges)
>
> I don't see how it is relevant at all. Though I should have introduced
> _this_ patch before that to save some work. Unfortunately I wasn't
> aware how ugly ata_set_lba_range_entries was.
>
>>
>> Then you can have ata_set_lba_range_entries() take the buffer size ...
>> something like the following would be fine:
>>
>>       size = ata_set_lba_range_entries(buf, scmd->device->sector_size,
>> block, n_block);
>>
>> Now things are happily protected against both exceeding the b0 limit(s) and
>> overflowing the sglist buffer.
>
> We have no reason at all to bring the logical sector size in here.
> Only if in future ACS standards it is stated that payload block are
> counted in logical block size instead of statically 512, we would only
> need to make the now static "512" in ALIGN() in to 4096.

typo. I meant from static "512" to logical sector size, and I really
think we should do it ONLY if the later ACS changes its statement.

>
> For possible sglist buffer overflow, see what I mentioned above about
> checking n_block and return 0. We would not want to do it with the
> original while-way anyway. All it does would be to truncate larger
> request and partially complete it silently.
>
> Can you tell exactly how the size of the sglist buffer is limited?
> AFAIK it has nothing to do with logical sector size. I doubt that we
> will ever cause an overflow, given how conservative we have been on
> the size of TRIM payload. We would probably ignore the reported value
> once it is larger than 16 or so, if we would ever enable multi-block
> payload in the future.
>
> Realistically speaking, I sent this patch not really to get the
> function ready for multi-block payload (neither I have mentioned that
> in the commit message), but just to make the code proper and avoid
> silly effort (and confusion caused) you and I spent in our patches.
>
>>
>> --
>> Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shaun Tancheff Aug. 23, 2016, 9:18 a.m. UTC | #8
On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote:
> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote:

> If we really want/need to avoid hitting some real buffer limit (e.g.
> maximum length of scatter/gather list?), then we should in some way
> check n_block against that. If it is too large we then return
> used_bytes = 0 (optionally with some follow-up to add a response to
> such return value or so).

Yes there is a real buffer limit, I can think of these two options:
1- Assume the setups from sd_setup_discard_cmnd() and/
   or sd_setup_write_same_cmnd() are providing an sglist of
   sdp->sector_size via scsi_init_io()

2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the
    sglist and calculate the available buffer size.

#2 sounds like more fun but I'm not sure it's what people would prefer to see.

--
Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Yan Aug. 23, 2016, 9:36 a.m. UTC | #9
On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
> On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote:
>
>> If we really want/need to avoid hitting some real buffer limit (e.g.
>> maximum length of scatter/gather list?), then we should in some way
>> check n_block against that. If it is too large we then return
>> used_bytes = 0 (optionally with some follow-up to add a response to
>> such return value or so).
>
> Yes there is a real buffer limit, I can think of these two options:
> 1- Assume the setups from sd_setup_discard_cmnd() and/
>    or sd_setup_write_same_cmnd() are providing an sglist of
>    sdp->sector_size via scsi_init_io()

That sounds completely wrong. The scatter/gather list we are talking
about here has nothing to do with the SCSI or block layer anymore. The
SATL has _already_ parsed the SCSI Write Same (16) command and is
packing ranges/payload according to that in this stage. If there is
any limit it would probably the max_segment allowed by the host driver
(e.g. ahci).

It doesn't seem to make sense to me either that we would need to
prevent sglist overflow in such level. Doesn't that mean we would need
to do the same checking (specifically, as in hard coding checks in all
kinds of procedures) in every use of scatter/gather list? That doesn't
sound right at all.

>
> 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the
>     sglist and calculate the available buffer size.
>
> #2 sounds like more fun but I'm not sure it's what people would prefer to see.

No idea if such thing exists / makes sense at all.

>
> --
> Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Yan Aug. 23, 2016, 10:17 a.m. UTC | #10
Wait a minute. I think you missed or misunderstood something when you
listen to someone's opinion in that we should switch to sglist buffer.
I think the danger people referred to is exactly what is revealed when
the ugly code is removed in this commit (it doesn't mean that the code
should be kept though).

The original buffer appears to be open:
buf = page_address(sg_page(scsi_sglist(scmd)));

While the new buffer you adopted in in ata_format_sct_write_same() and
ata_format_dsm_trim_descr() is of fixed size:

buffer = ((void *)ata_scsi_rbuf);

sctpg = ((void *)ata_scsi_rbuf);

because:

#define ATA_SCSI_RBUF_SIZE      4096
...
static u8 ata_scsi_rbuf[ATA_SCSI_RBUF_SIZE];

So the sglist buffer is always 4096 bytes.

And hence you can probably safely use ATA_SCSI_RBUF_SIZE as the buflen
param in the sg_copy_from_buffer() calls (at least in the case of
ata_format_sct_write_same()). However, the return value of
ata_format_dsm_trim_descr() should still always be used_bytes since
that is needed by the ata taskfile construction.

You may want to check (n_block / 65535 * 8 > ATA_SCSI_RBUF_SIZE). If
it is true, then perhaps we may want to return 0, and make the SATL
response with invalid CDB field if we catch that.

Though IMHO this is really NOT a reason that is strong enough to
prevent this patch from entering the repo first.

On 23 August 2016 at 09:36, Tom Yan <tom.ty89@gmail.com> wrote:
> On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>> On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>>> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote:
>>
>>> If we really want/need to avoid hitting some real buffer limit (e.g.
>>> maximum length of scatter/gather list?), then we should in some way
>>> check n_block against that. If it is too large we then return
>>> used_bytes = 0 (optionally with some follow-up to add a response to
>>> such return value or so).
>>
>> Yes there is a real buffer limit, I can think of these two options:
>> 1- Assume the setups from sd_setup_discard_cmnd() and/
>>    or sd_setup_write_same_cmnd() are providing an sglist of
>>    sdp->sector_size via scsi_init_io()
>
> That sounds completely wrong. The scatter/gather list we are talking
> about here has nothing to do with the SCSI or block layer anymore. The
> SATL has _already_ parsed the SCSI Write Same (16) command and is
> packing ranges/payload according to that in this stage. If there is
> any limit it would probably the max_segment allowed by the host driver
> (e.g. ahci).
>
> It doesn't seem to make sense to me either that we would need to
> prevent sglist overflow in such level. Doesn't that mean we would need
> to do the same checking (specifically, as in hard coding checks in all
> kinds of procedures) in every use of scatter/gather list? That doesn't
> sound right at all.
>
>>
>> 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the
>>     sglist and calculate the available buffer size.
>>
>> #2 sounds like more fun but I'm not sure it's what people would prefer to see.
>
> No idea if such thing exists / makes sense at all.
>
>>
>> --
>> Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shaun Tancheff Aug. 23, 2016, 10:45 a.m. UTC | #11
On Tue, Aug 23, 2016 at 5:17 AM, Tom Yan <tom.ty89@gmail.com> wrote:
> Wait a minute. I think you missed or misunderstood something when you
> listen to someone's opinion in that we should switch to sglist buffer.

No, I think I can trust Christoph Hellwig <hch@lst.de>

> I think the danger people referred to is exactly what is revealed when
> the ugly code is removed in this commit (it doesn't mean that the code
> should be kept though).
>
> The original buffer appears to be open:
> buf = page_address(sg_page(scsi_sglist(scmd)));

Which is unsafe.

> While the new buffer you adopted in in ata_format_sct_write_same() and
> ata_format_dsm_trim_descr() is of fixed size:

Yes ... it is a temporary response buffer for simulated commands used to
copy data to and from the command sg_list so as not to hold irqs while
modifying the buffer.

> buffer = ((void *)ata_scsi_rbuf);
>
> sctpg = ((void *)ata_scsi_rbuf);
>
> because:
>
> #define ATA_SCSI_RBUF_SIZE      4096
> ...
> static u8 ata_scsi_rbuf[ATA_SCSI_RBUF_SIZE];
>
> So the sglist buffer is always 4096 bytes.

No. The sglist buffer attached to the write same / trim command
is always sdp->sector_size

> And hence you can probably safely use ATA_SCSI_RBUF_SIZE as the buflen
> param in the sg_copy_from_buffer() calls (at least in the case of
> ata_format_sct_write_same()).

No. SCT Write Same has a fixed single 512 byte transfer.

However, the return value of
> ata_format_dsm_trim_descr() should still always be used_bytes since
> that is needed by the ata taskfile construction.

So long as it does not exceed its sglist/sector_size buffer.

> You may want to check (n_block / 65535 * 8 > ATA_SCSI_RBUF_SIZE). If
> it is true, then perhaps we may want to return 0, and make the SATL
> response with invalid CDB field if we catch that.

No that is not quite right you need to check if you are
overflowing either RBUF or sdp->sector_size.

> Though IMHO this is really NOT a reason that is strong enough to
> prevent this patch from entering the repo first.

> On 23 August 2016 at 09:36, Tom Yan <tom.ty89@gmail.com> wrote:
>> On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>>> On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>>>> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote:
>>>
>>>> If we really want/need to avoid hitting some real buffer limit (e.g.
>>>> maximum length of scatter/gather list?), then we should in some way
>>>> check n_block against that. If it is too large we then return
>>>> used_bytes = 0 (optionally with some follow-up to add a response to
>>>> such return value or so).
>>>
>>> Yes there is a real buffer limit, I can think of these two options:
>>> 1- Assume the setups from sd_setup_discard_cmnd() and/
>>>    or sd_setup_write_same_cmnd() are providing an sglist of
>>>    sdp->sector_size via scsi_init_io()
>>
>> That sounds completely wrong. The scatter/gather list we are talking
>> about here has nothing to do with the SCSI or block layer anymore. The
>> SATL has _already_ parsed the SCSI Write Same (16) command and is
>> packing ranges/payload according to that in this stage. If there is
>> any limit it would probably the max_segment allowed by the host driver
>> (e.g. ahci).
>>
>> It doesn't seem to make sense to me either that we would need to
>> prevent sglist overflow in such level. Doesn't that mean we would need
>> to do the same checking (specifically, as in hard coding checks in all
>> kinds of procedures) in every use of scatter/gather list? That doesn't
>> sound right at all.
>>
>>>
>>> 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the
>>>     sglist and calculate the available buffer size.
>>>
>>> #2 sounds like more fun but I'm not sure it's what people would prefer to see.
>>
>> No idea if such thing exists / makes sense at all.
>>
>>>
>>> --
>>> Shaun
Tom Yan Aug. 23, 2016, 11:16 a.m. UTC | #12
On 23 August 2016 at 10:45, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
> On Tue, Aug 23, 2016 at 5:17 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>> Wait a minute. I think you missed or misunderstood something when you
>> listen to someone's opinion in that we should switch to sglist buffer.
>
> No, I think I can trust Christoph Hellwig <hch@lst.de>

I didn't say that you cannot trust him, but I just wonder you might
have misinterpreted what he said. I haven't really follow your patch
series earlier though.

>
>> I think the danger people referred to is exactly what is revealed when
>> the ugly code is removed in this commit (it doesn't mean that the code
>> should be kept though).
>>
>> The original buffer appears to be open:
>> buf = page_address(sg_page(scsi_sglist(scmd)));
>
> Which is unsafe.

Yup, but the while loop is not the right way to make it safe. It
probably prevent any overflow, but does not handle / response to the
request (n_block) properly.

It would only be practically unsafe if we would ever call it with a
request that needs a multi-block payload.

>
>> While the new buffer you adopted in in ata_format_sct_write_same() and
>> ata_format_dsm_trim_descr() is of fixed size:
>
> Yes ... it is a temporary response buffer for simulated commands used to
> copy data to and from the command sg_list so as not to hold irqs while
> modifying the buffer.
>
>> buffer = ((void *)ata_scsi_rbuf);
>>
>> sctpg = ((void *)ata_scsi_rbuf);
>>
>> because:
>>
>> #define ATA_SCSI_RBUF_SIZE      4096
>> ...
>> static u8 ata_scsi_rbuf[ATA_SCSI_RBUF_SIZE];
>>
>> So the sglist buffer is always 4096 bytes.
>
> No. The sglist buffer attached to the write same / trim command

Don't you think it is in appropriate to use ata_scsi_rbuf here then?

> is always sdp->sector_size

I can hardly see how sdp->sector_size is relevant to the sglist buffer
(especially the sglist here is facing to the ATA device).
Scatter/gather list could have multiple entries of a certain size
(that is not necessarily, or unlikely, logical sector size). Although
I don't really know how scatter/gather list works, it hardly seems
making any sense by saying that the sglist buffer is always logical
sector size.

AFAIK, for example, USB Attached SCSI driver does not limit the
maximum number of entries (max_segment) so the kernel fall back to
SG_MAX_SEGMENTS (2048), while it reports a max_segment_size of 4096 in
spite of the logical sector size with dma_get_max_seg_size().

For AHCI, it reports a max_segment of 168, while it does not seem to
report any max_segment_size so dma_get_max_seg_size() fallback to
65536 (though practically it seems 4096 will still be used in normal
writing).

>
>> And hence you can probably safely use ATA_SCSI_RBUF_SIZE as the buflen
>> param in the sg_copy_from_buffer() calls (at least in the case of
>> ata_format_sct_write_same()).
>
> No. SCT Write Same has a fixed single 512 byte transfer.

Never mind on that. I have presumption that you should always fully
copy ata_scsi_rbuf to the sglist.

>
> However, the return value of
>> ata_format_dsm_trim_descr() should still always be used_bytes since
>> that is needed by the ata taskfile construction.
>
> So long as it does not exceed its sglist/sector_size buffer.
>
>> You may want to check (n_block / 65535 * 8 > ATA_SCSI_RBUF_SIZE). If
>> it is true, then perhaps we may want to return 0, and make the SATL
>> response with invalid CDB field if we catch that.
>
> No that is not quite right you need to check if you are
> overflowing either RBUF or sdp->sector_size.

Again, this does not make any sense. See above. Apparently you really
have no reason to use ata_scsi_rbuf.

>
>> Though IMHO this is really NOT a reason that is strong enough to
>> prevent this patch from entering the repo first.
>
>> On 23 August 2016 at 09:36, Tom Yan <tom.ty89@gmail.com> wrote:
>>> On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote:
>>>> On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote:
>>>>> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote:
>>>>
>>>>> If we really want/need to avoid hitting some real buffer limit (e.g.
>>>>> maximum length of scatter/gather list?), then we should in some way
>>>>> check n_block against that. If it is too large we then return
>>>>> used_bytes = 0 (optionally with some follow-up to add a response to
>>>>> such return value or so).
>>>>
>>>> Yes there is a real buffer limit, I can think of these two options:
>>>> 1- Assume the setups from sd_setup_discard_cmnd() and/
>>>>    or sd_setup_write_same_cmnd() are providing an sglist of
>>>>    sdp->sector_size via scsi_init_io()
>>>
>>> That sounds completely wrong. The scatter/gather list we are talking
>>> about here has nothing to do with the SCSI or block layer anymore. The
>>> SATL has _already_ parsed the SCSI Write Same (16) command and is
>>> packing ranges/payload according to that in this stage. If there is
>>> any limit it would probably the max_segment allowed by the host driver
>>> (e.g. ahci).
>>>
>>> It doesn't seem to make sense to me either that we would need to
>>> prevent sglist overflow in such level. Doesn't that mean we would need
>>> to do the same checking (specifically, as in hard coding checks in all
>>> kinds of procedures) in every use of scatter/gather list? That doesn't
>>> sound right at all.
>>>
>>>>
>>>> 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the
>>>>     sglist and calculate the available buffer size.
>>>>
>>>> #2 sounds like more fun but I'm not sure it's what people would prefer to see.
>>>
>>> No idea if such thing exists / makes sense at all.
>>>
>>>>
>>>> --
>>>> Shaun
>
>
>
> --
> Shaun Tancheff
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Shaun Tancheff Aug. 25, 2016, 8:42 a.m. UTC | #13
Tom,

In my opinion this patch you submitted is simply making the code less
safe against a buffer overflow without a sufficiently good reason.

In future please comment on other patches as replies to those patches.
Mixing them together is just confusing.

--Shaun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index be9c76c..9b74ecb 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -3322,7 +3322,7 @@  static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc)
 	buf = page_address(sg_page(scsi_sglist(scmd)));
 
 	if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) {
-		size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block);
+		size = ata_set_lba_range_entries(buf, block, n_block);
 	} else {
 		fp = 2;
 		goto invalid_fld;
diff --git a/include/linux/ata.h b/include/linux/ata.h
index adbc812..5e2e9ad 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -1077,19 +1077,19 @@  static inline void ata_id_to_hd_driveid(u16 *id)
  * TO NV CACHE PINNED SET.
  */
 static inline unsigned ata_set_lba_range_entries(void *_buffer,
-		unsigned num, u64 sector, unsigned long count)
+		u64 sector, unsigned long count)
 {
 	__le64 *buffer = _buffer;
 	unsigned i = 0, used_bytes;
 
-	while (i < num) {
-		u64 entry = sector |
-			((u64)(count > 0xffff ? 0xffff : count) << 48);
+	while (count > 0) {
+		u64 range, entry;
+
+		range = count > 0xffff ? 0xffff : count;
+	        entry = sector | (range << 48);
 		buffer[i++] = __cpu_to_le64(entry);
-		if (count <= 0xffff)
-			break;
-		count -= 0xffff;
-		sector += 0xffff;
+		count -= range;
+		sector += range;
 	}
 
 	used_bytes = ALIGN(i * 8, 512);