[v7,0/7] Improve libata support for FUA

Message ID 20230103051924.233796-1-damien.lemoal@opensource.wdc.com

Message

Damien Le Moal Jan. 3, 2023, 5:19 a.m. UTC
These patches clean up and improve libata support for ATA devices
supporting the FUA feature.

The first patch modifies the block layer to prevent the use of REQ_FUA
with read requests. This is necessary as the block layer code expects
REQ_FUA to be used with write requests (the flush machinery cannot
enforce access to the media for FUA read commands) and FUA is not
supported with ATA devices when NCQ is not enabled (device queue depth
set to 1).
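
As a rough illustration of the check described above, here is a minimal
standalone sketch. It is not the code from patch 1: the `sketch_*` and
`SKETCH_*` names are hypothetical stand-ins for the block layer operation
codes and request flags, and the real check lives in the bio submission
path rather than in a helper like this one.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the block layer operation codes. */
enum sketch_op {
	SKETCH_OP_READ,
	SKETCH_OP_WRITE,
	SKETCH_OP_ZONE_APPEND,
};

/* Hypothetical stand-ins for the REQ_PREFLUSH and REQ_FUA flags. */
#define SKETCH_PREFLUSH	(1u << 0)
#define SKETCH_FUA	(1u << 1)

/* Return true if the request may proceed, false if it must be rejected. */
static bool sketch_flush_fua_allowed(enum sketch_op op, unsigned int flags)
{
	/* Requests carrying neither flag are not affected by the check. */
	if (!(flags & (SKETCH_PREFLUSH | SKETCH_FUA)))
		return true;

	/* Flush/FUA semantics are only accepted for operations that write. */
	return op == SKETCH_OP_WRITE || op == SKETCH_OP_ZONE_APPEND;
}
```

With this helper, a read carrying SKETCH_FUA is rejected, while a write or
zone append carrying the same flag is accepted.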

Patches 2 and 3 are libata cleanup preparatory patches. Patch 4 cleans up
the detection for FUA support. Patch 5 fixes building a taskfile for FUA
write requests. Patch 6 prevents the use of FUA with known bad drives.

Finally, patch 7 enables FUA support by default in libata for devices
supporting this feature.

Changes from v6:
 - Modified patch 1 to include checks for REQ_OP_ZONE_APPEND
 - Addressed comments from Niklas (patch 2 -> return false, patch 3 ->
   commit message typo, patch 7 -> more verbose commit message)

Changes from v5:
 - Removed WARN for FUA reads in patch 5.
 - Added reviewed-by tags.

Changes from v4:
 - Changed patch 1 to the one suggested by Christoph.
 - Added Hannes review tag.

Changes from v3:
 - Added patch 1 to prevent any block device user from issuing a
   REQ_FUA read.
 - Changed patch 5 to remove the check for REQ_FUA read and also remove 
   support for ATA_CMD_WRITE_MULTI_FUA_EXT as this command is obsolete
   in recent ACS specifications.

Changes from v2:
 - Added patches 1 and 2 as preparatory patches
 - Added patch 4 to fix FUA writes handling for the non-ncq case. Note
   that it is possible that the drives blacklisted in patch 5 are
   actually OK since the code back in 2012 had the issue with the wrong
   use of LBA 28 commands for FUA writes.

Changes from v1:
 - Removed Maciej's patch 2. Instead, blacklist drives which are known
   to have buggy FUA support.

Christoph Hellwig (1):
  block: add a sanity check for non-write flush/fua bios

Damien Le Moal (6):
  ata: libata: Introduce ata_ncq_supported()
  ata: libata: Rename and cleanup ata_rwcmd_protocol()
  ata: libata: cleanup fua support detection
  ata: libata: Fix FUA handling in ata_build_rw_tf()
  ata: libata: blacklist FUA support for known buggy drives
  ata: libata: Enable fua support by default

 .../admin-guide/kernel-parameters.txt         |  3 +
 block/blk-core.c                              | 14 ++--
 drivers/ata/libata-core.c                     | 73 ++++++++++++++-----
 drivers/ata/libata-scsi.c                     | 30 +-------
 include/linux/libata.h                        | 36 ++++++---
 5 files changed, 94 insertions(+), 62 deletions(-)

Comments

Tejun Heo Jan. 4, 2023, 4:49 p.m. UTC | #1
Hello,

On Tue, Jan 03, 2023 at 02:19:17PM +0900, Damien Le Moal wrote:
> Finally, patch 7 enables FUA support by default in libata for devices
> supporting this feature.

These optional features tend to be broken in various and subtle ways,
especially the ones which don't show clear and notable advantages and thus
don't get used by everybody. I'm not necessarily against enabling it by
default but we should have better justifications as we might unnecessarily
cause a bunch of painful and subtle failures which can take a while to sort
out.

* Can the advantages of using FUA be demonstrated in a realistic way? IOW,
  are there workloads which clearly benefit from FUA? My memory is hazy but
  we only really use FUA from flush sequence to turn flush, write, flush
  sequence into flush, FUA-write. As all the heavy lifting is done in the
  first flush anyway, I couldn't find a case where that optimization made a
  meaningful difference but I didn't look very hard.

* Do we know how widely FUA is used now? IOW, is windows using FUA by
  default now? If so, do we know whether they have a blocklist?

Thanks.
Damien Le Moal Jan. 5, 2023, 3:43 a.m. UTC | #2
On 1/5/23 01:49, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jan 03, 2023 at 02:19:17PM +0900, Damien Le Moal wrote:
>> Finally, patch 7 enables FUA support by default in libata for devices
>> supporting this feature.
> 
> These optional features tend to be broken in various and subtle ways,

FUA is not optional for any drive that supports NCQ. The FUA bit is a
mandatory part of the FPDMA READ/WRITE commands. The optional part is
support for the non-ncq WRITE FUA EXT command.
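
To make the distinction concrete, here is a minimal sketch of how a write
command could be selected. It is not the libata implementation: the
`sketch_*` names and the `has_write_fua_ext` parameter are hypothetical,
and the opcode values and the position of the FUA bit in the device field
are quoted from my reading of the ACS specifications, so please check them
against the spec before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcodes as I understand them from ACS (assumption, see lead-in). */
#define SKETCH_CMD_FPDMA_WRITE		0x61	/* WRITE FPDMA QUEUED */
#define SKETCH_CMD_WRITE_FUA_EXT	0x3D	/* WRITE DMA FUA EXT */
#define SKETCH_CMD_WRITE_EXT		0x35	/* WRITE DMA EXT */

#define SKETCH_DEV_LBA			(1u << 6)	/* LBA addressing */
#define SKETCH_DEV_NCQ_FUA		(1u << 7)	/* FUA bit for NCQ writes */

/* Hypothetical, heavily reduced taskfile. */
struct sketch_tf {
	uint8_t command;
	uint8_t device;
};

/*
 * Pick the write command. Returns false when FUA is requested but cannot
 * be expressed: the device is not using NCQ and does not implement the
 * optional WRITE DMA FUA EXT command. tf->command is not set in that case.
 */
static bool sketch_build_write_tf(struct sketch_tf *tf, bool use_ncq,
				  bool fua, bool has_write_fua_ext)
{
	tf->device = SKETCH_DEV_LBA;

	if (use_ncq) {
		/* With NCQ, FUA is just a bit of the queued write command. */
		tf->command = SKETCH_CMD_FPDMA_WRITE;
		if (fua)
			tf->device |= SKETCH_DEV_NCQ_FUA;
		return true;
	}

	if (!fua) {
		tf->command = SKETCH_CMD_WRITE_EXT;
		return true;
	}

	/* Without NCQ, FUA needs a dedicated, optional command. */
	if (!has_write_fua_ext)
		return false;

	tf->command = SKETCH_CMD_WRITE_FUA_EXT;
	return true;
}
```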

> especially the ones which don't show clear and notable advantages and thus
> don't get used by everybody. I'm not necessarily against enabling it by
> default but we should have better justifications as we might unnecessarily
> cause a bunch of painful and subtle failures which can take a while to sort
> out.

Avoiding regressions is always my highest priority. I know that there
are a lot of cheap ATA devices out there that have questionable ACS spec
compliance.

> * Can the advantages of using FUA be demonstrated in a realistic way? IOW,
>   are there workloads which clearly benefit from FUA? My memory is hazy but
>   we only really use FUA from flush sequence to turn flush, write, flush
>   sequence into flush, FUA-write. As all the heavy lifting is done in the
>   first flush anyway, I couldn't find a case where that optimization made a
>   meaningful difference but I didn't look very hard.

The main users in kernel are file systems, when committing
transactions/metadata journaling. Given that this is generally not
generating a lot of traffic, I do not think we can measure any
difference for HDDs. The devices are too slow to start with, so saving
one command will not matter much, unless the application is fsync()
crazy (and even then, not sure we'll see any difference). Even for SATA
SSDs it likely will be hard to see a difference I think.

Then we have applications using the drive block device file directly.
For these, it is hard to tell how much it matters. Enabling it by
default with a drive correctly supporting it will very much likely not
hurt though.

Maciej,

Maybe you did some experiments before asking for enabling FUA by
default ? Any interesting performance data you can share ?

> * Do we know how widely FUA is used now? IOW, is windows using FUA by
>   default now? If so, do we know whether they have a blocklist?

You mean "blacklist" ? I do not have any information about Windows, but
I can try to find out, at least for my employer's devices. But that will
not be very useful as I know these drives behave correctly.

More than Windows or the kernel, I think that looking at SAS HBAs is
more important here. SATA HDDs are the most widely used type of devices
with these, by far. These may have a SAT translating FUA scsi writes to
FUA NCQ FPDMA writes, resulting in FUA being extensively used. Modulo a
blacklist that results in the same as the kernel with a
flush/write/flush sequence. Hard to know as HBA's FW are not open. A bus
analyzer could tell us that though, but again I can look at that only
with the drives I have, which I know are working well with FUA.

I am OK with attempting to enable FUA by default for the following reasons:
1) The vast majority of drives in libata blacklist (all features) are
old models that are not sold anymore.
2) We are restricting FUA support to drives that also support NCQ, that
is, modern-ish ones that are supposed to process the FUA NCQ read/write
commands correctly, per specs.
3) For HDDs, which is the vast majority of ATA devices out there these
days, all recent drives I have tested are OK. Even older ones with NCQ
support that I have access to are fine.
4) We are at rc2, which gives us time to revert patch 7 if we see too
many bug reports.

One thing we could add to the patch series is an additional restriction
to enabling FUA by default to drives that support a recent standard. Say
ACS-4 and above. That will restrict this to recent devices, thus
reducing the risk of hitting bad apples. Thoughts ?
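
A possible shape for such a restriction is sketched below. This is only an
illustration under stated assumptions, not a proposed patch: the
`sketch_*` names are hypothetical, and the assignment of bit 11 of
IDENTIFY DEVICE word 80 to ACS-4 is my reading of the specification and
should be double-checked.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ID_MAJOR_VER	80	/* IDENTIFY DEVICE major version word */
#define SKETCH_MAJOR_ACS4	11	/* assumed bit position for ACS-4 */

/* Return the highest major version bit set, or 0 if not reported. */
static unsigned int sketch_major_version(const uint16_t *id)
{
	uint16_t word = id[SKETCH_ID_MAJOR_VER];
	unsigned int bit;

	/* 0x0000 and 0xFFFF mean the device does not report a version. */
	if (word == 0x0000 || word == 0xFFFF)
		return 0;

	for (bit = 14; bit >= 1; bit--)
		if (word & (1u << bit))
			break;

	return bit;
}

/* Enable FUA by default only for NCQ-capable devices at ACS-4 or later. */
static bool sketch_fua_default_ok(const uint16_t *id, bool ncq_supported)
{
	return ncq_supported &&
	       sketch_major_version(id) >= SKETCH_MAJOR_ACS4;
}
```

Raising the threshold to a later standard would only require changing the
constant compared against.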
Tejun Heo Jan. 5, 2023, 6:15 p.m. UTC | #3
Hello,

On Thu, Jan 05, 2023 at 12:43:06PM +0900, Damien Le Moal wrote:
> > These optional features tend to be broken in various and subtle ways,
> 
> FUA is not optional for any drive that supports NCQ. The FUA bit is a
> mandatory part of the FPDMA READ/WRITE commands. The optional part is
> support for the non-ncq WRITE FUA EXT command.

Optional in the sense that it isn't essential in achieving the main function
of the device, which means that most don't end up using it.

> > especially the ones which don't show clear and notable advantages and thus
> > don't get used by everybody. I'm not necessarily against enabling it by
> > default but we should have better justifications as we might unnecessarily
> > cause a bunch of painful and subtle failures which can take a while to sort
> > out.
> 
> Avoiding regressions is always my highest priority. I know that there
> are a lot of cheap ATA devices out there that have questionable ACS spec
> compliance.

A lot of historical devices too which don't get much scrutiny or testing but
can still cause significant grief for the users.

> > * Can the advantages of using FUA be demonstrated in a realistic way? IOW,
> >   are there workloads which clearly benefit from FUA? My memory is hazy but
> >   we only really use FUA from flush sequence to turn flush, write, flush
> >   sequence into flush, FUA-write. As all the heavy lifting is done in the
> >   first flush anyway, I couldn't find a case where that optimization made a
> >   meaningful difference but I didn't look very hard.
> 
> The main users in kernel are file systems, when committing
> transactions/metadata journaling. Given that this is generally not
> generating a lot of traffic, I do not think we can measure any
> difference for HDDs. The devices are too slow to start with, so saving
> one command will not matter much, unless the application is fsync()
> crazy (and even then, not sure we'll see any difference). Even for SATA
> SSDs it likely will be hard to see a difference I think.

On a quick glance, there are some uses of REQ_FUA w/o REQ_PREFLUSH which
indicates that there can be actual gains to be had. However, ext4 AFAICS
always pairs PREFLUSH w/ FUA, so a lot of use cases won't see any gain while
taking on the possible risk of being exposed to FUA commands.

> Then we have applications using the drive block device file directly.
> For these, it is hard to tell how much it matters. Enabling it by
> default with a drive correctly supporting it will very much likely not
> hurt though.
> 
> Maciej,
> 
> Maybe you did some experiments before asking for enabling FUA by
> default ? Any interesting performance data you can share ?
> 
> > * Do we know how widely FUA is used now? IOW, is windows using FUA by
> >   default now? If so, do we know whether they have a blocklist?
> 
> You mean "blacklist" ? I do not have any information about Windows, but

The PC thing to say now seems to be allowlist / blocklist instead of
whitelist / blacklist, not that I mind either way.

> I can try to find out, at least for my employer's devices. But that will
> not be very useful as I know these drives behave correctly.

So, AFAIK, windows doesn't issue FUA for SATA devices, only SAS, but I could
be wrong. It'd be really useful to find out.

> More than Windows or the kernel, I think that looking at SAS HBAs is
> more important here. SATA HDDs are the most widely used type of devices
> with these, by far. These may have a SAT translating FUA scsi writes to
> FUA NCQ FPDMA writes, resulting in FUA being extensively used. Modulo a
> blacklist that results in the same as the kernel with a
> flush/write/flush sequence. Hard to know as HBA's FW are not open. A bus
> analyzer could tell us that though, but again I can look at that only
> with the drives I have, which I know are working well with FUA.
> 
> I am OK with attempting to enable FUA by default for the following reasons:
> 1) The vast majority of drives in libata blacklist (all features) are
> old models that are not sold anymore.

The context here is that we promptly found all of these devices struggle
with FUA (like locking up and dropping off the bus) shortly after we enabled
FUA by default, so the list is by no means exhaustive and is more an
indication that there at least were a whole lot of devices which choke on
FUA. On top, devices not sold anymore are even harder to debug and pay
attention to while being able to cause a lot of pain to configurations which
have been stable and happy for a long time.

> 2) We are restricting FUA support to drives that also support NCQ, that
> is, modern-ish ones that are supposed to process the FUA NCQ read/write
> commands correctly, per specs.

NCQ is really old now and our previous attempt at FUA was after NCQ was
widely available, so I'm not sure this holds.

> 3) For HDDs, which is the vast majority of ATA devices out there these
> days, all recent drives I have tested are OK. Even older ones with NCQ
> support that I have access to are fine.
> 4) We are at rc2, which gives us time to revert patch 7 if we see too
> many bug reports.

This sort of problem, especially if it affects mostly old devices, can be very
difficult to suss out and will definitely take way longer than a single
release cycle.

> One thing we could add to the patch series is an additional restriction
> to enabling FUA by default to drives that support a recent standard. Say
> ACS-4 and above. That will restrict this to recent devices, thus
> reducing the risk of hitting bad apples. Thoughts ?

Yeah, that'd help. It would also help if SAS HBA SATs have been issuing
FUAs, which would be a meaningful verification of the feature, at least
for rotating hard disks.

I feel rather uneasy about enabling FUA by default given history. We can
improve its chances by restricting it to newer devices and maybe even just
hard disks, but it kinda comes back to the root question of why. Why would
we want to do this? What are the benefits? Right now, there are a bunch of
really tricky cons and not a whole lot in the pro column.

Thanks.
Damien Le Moal Jan. 6, 2023, 6:51 a.m. UTC | #4
On 1/6/23 03:15, Tejun Heo wrote:
> Hello,
> 
> On Thu, Jan 05, 2023 at 12:43:06PM +0900, Damien Le Moal wrote:
>>> These optional features tend to be broken in various and subtle ways,
>>
>> FUA is not optional for any drive that supports NCQ. The FUA bit is a
>> mandatory part of the FPDMA READ/WRITE commands. The optional part is
>> support for the non-ncq WRITE FUA EXT command.
> 
> Optional in the sense that it isn't essential in achieving the main function
> of the device, which means that most don't end up using it.

OK. Granted. But for this particular case, scsi & nvme subsystem do not
treat FUA as "optional". If it is supported, it will be used even though
you are correct that we can work without it. I do not think we should
treat ATA devices any differently.

>>> especially the ones which don't show clear and notable advantages and thus
>>> don't get used by everybody. I'm not necessarily against enabling it by
>>> default but we should have better justifications as we might unnecessarily
>>> cause a bunch of painful and subtle failures which can take a while to sort
>>> out.
>>
>> Avoiding regressions is always my highest priority. I know that there
>> are a lot of cheap ATA devices out there that have questionable ACS spec
>> compliance.
> 
> A lot of historical devices too which don't get much scrutiny or testing but
> can still cause significant grief for the users.

Yes. There are a lot of s****y old devices that do not correctly handle
synchronize cache, and likely fua too. Hence my proposal to limit
enabling FUA support to newer devices based on the standards version
supported. Note that this patch set excludes all ide/pata devices. These
will still operate with fua off by default since they do not support NCQ.

>>> * Can the advantages of using FUA be demonstrated in a realistic way? IOW,
>>>   are there workloads which clearly benefit from FUA? My memory is hazy but
>>>   we only really use FUA from flush sequence to turn flush, write, flush
>>>   sequence into flush, FUA-write. As all the heavy lifting is done in the
>>>   first flush anyway, I couldn't find a case where that optimization made a
>>>   meaningful difference but I didn't look very hard.
>>
>> The main users in kernel are file systems, when committing
>> transactions/metadata journaling. Given that this is generally not
>> generating a lot of traffic, I do not think we can measure any
>> difference for HDDs. The devices are too slow to start with, so saving
>> one command will not matter much, unless the application is fsync()
>> crazy (and even then, not sure we'll see any difference). Even for SATA
>> SSDs it likely will be hard to see a difference I think.
> 
> On a quick glance, there are some uses of REQ_FUA w/o REQ_PREFLUSH which
> indicates that there can be actual gains to be had. However, ext4 AFAICS
> always pairs PREFLUSH w/ FUA, so a lot of use cases won't see any gain while
> taking on the possible risk of being exposed to FUA commands.

Yes. Most FSes will do PREFLUSH | FUA. For the risk, see above.

>> Then we have applications using the drive block device file directly.
>> For these, it is hard to tell how much it matters. Enabling it by
>> default with a drive correctly supporting it will very much likely not
>> hurt though.
>>
>> Maciej,
>>
>> Maybe you did some experiments before asking for enabling FUA by
>> default ? Any interesting performance data you can share ?
>>
>>> * Do we know how widely FUA is used now? IOW, is windows using FUA by
>>>   default now? If so, do we know whether they have a blocklist?
>>
>> You mean "blacklist" ? I do not have any information about Windows, but
> 
> The PC thing to say now seems to be allowlist / blocklist instead of
> whitelist / blacklist, not that I mind either way.

I was thinking "block == sector" :) yes, could patch the code to rename
blacklist to something like badlist. I find "block" confusing here given
that we are talking about block devices :)

>> I can try to find out, at least for my employer's devices. But that will
>> not be very useful as I know these drives behave correctly.
> 
> So, AFAIK, windows doesn't issue FUA for SATA devices, only SAS, but I could
> be wrong. It'd be really useful to find out.

Need to ping some people to see if I can find out.

>> More than Windows or the kernel, I think that looking at SAS HBAs is
>> more important here. SATA HDDs are the most widely used type of devices
>> with these, by far. These may have a SAT translating FUA scsi writes to
>> FUA NCQ FPDMA writes, resulting in FUA being extensively used. Modulo a
>> blacklist that results in the same as the kernel with a
>> flush/write/flush sequence. Hard to know as HBA's FW are not open. A bus
>> analyzer could tell us that though, but again I can look at that only
>> with the drives I have, which I know are working well with FUA.
>>
>> I am OK with attempting to enable FUA by default for the following reasons:
>> 1) The vast majority of drives in libata blacklist (all features) are
>> old models that are not sold anymore.
> 
> The context here is that we promptly found all of these devices struggle
> with FUA (like locking up and dropping off the bus) shortly after we enabled
> FUA by default, so the list is by no means exhaustive and is more an
> indication that there at least were a whole lot of devices which choke on
> FUA. On top, devices not sold anymore are even harder to debug and pay
> attention to while being able to cause a lot of pain to configurations which
> have been stable and happy for a long time.

Yes. Hence, again, the idea to limit this to recent drives. E.g. ACS-4
(or 5) and above.

>> 2) We are restricting FUA support to drives that also support NCQ, that
>> is, modern-ish ones that are supposed to process the FUA NCQ read/write
>> commands correctly, per specs.
> 
> NCQ is really old now and our previous attempt at FUA was after NCQ was
> widely available, so I'm not sure this holds.
> 
>> 3) For HDDs, which is the vast majority of ATA devices out there these
>> days, all recent drives I have tested are OK. Even older ones with NCQ
>> support that I have access to are fine.
>> 4) We are at rc2, which gives us time to revert patch 7 if we see too
>> many bug reports.
> 
> This sort of problem, especially if it affects mostly old devices, can be very
> difficult to suss out and will definitely take way longer than a single
> release cycle.
> 
>> One thing we could add to the patch series is an additional restriction
>> to enabling FUA by default to drives that support a recent standard. Say
>> ACS-4 and above. That will restrict this to recent devices, thus
>> reducing the risk of hitting bad apples. Thoughts ?
> 
> Yeah, that'd help. It would also help if SAS HBA SATs have been issuing
> FUAs, which would be a meaningful verification of the feature, at least
> for rotating hard disks.
> 
> I feel rather uneasy about enabling FUA by default given history. We can
> improve its chances by restricting it to newer devices and maybe even just
> hard disks, but it kinda comes back to the root question of why. Why would
> we want to do this? What are the benefits? Right now, there are a bunch of
> really tricky cons and not a whole lot in the pro column.

OK. But again, why treat ATA devices differently from scsi/nvme/ufs ?
These have FUA used by default if it is supported.

We can take a big hammer here and start with enabling only ACS-5 and
above for now. That will represent the set of devices that are in
development right now, and only a few already released (I have some in
my test boxes and they are not even a few months old...).

Or simply remove patch 7 and let users choose to enable FUA themselves if
they are confident their devices are OK. That is the safest, but I am
not keen on keeping ATA subsystem in the 20th century...

> 
> Thanks.
>
Tejun Heo Jan. 6, 2023, 6:03 p.m. UTC | #5
Hello,

On Fri, Jan 06, 2023 at 03:51:48PM +0900, Damien Le Moal wrote:
> OK. Granted. But for this particular case, scsi & nvme subsystem do not
> treat FUA as "optional". If it is supported, it will be used even though
> you are correct that we can work without it. I do not think we should
> treat ATA devices any differently.

What matters isn't that they have a feature with the same name but the
actual circumstances. e.g. for nvme, FUA has been there from the beginning
and we used it from the beginning so we know that they're safe. For ATA,
it's something added later on and we know that there are a bunch of devices
which choke on it and we don't know whether anyone else is using it at any
scale.

> > On a quick glance, there are some uses of REQ_FUA w/o REQ_PREFLUSH which
> > indicates that there can be actual gains to be had. However, ext4 AFAICS
> > always pairs PREFLUSH w/ FUA, so a lot of use cases won't see any gain while
> > taking on the possible risk of being exposed to FUA commands.
> 
> Yes. Most FSes will do PREFLUSH | FUA. For the risk, see above.

Someone should benchmark it but it's likely that PREFLUSH | FUA vs.
PREFLUSH | WRITE | POSTFLUSH isn't gonna show any meaningful difference in
any realistic scenario. The main gain of NCQ'd FUA is that we don't have to
drain the in-flight commands but PREFLUSH needs that anyway.
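
For reference, the two sequences being compared can be sketched as
follows. This is a simplified model and not the block layer flush
machinery: the `sketch_*` and `SKETCH_STEP_*` names are hypothetical, and
the device is assumed to have a volatile write cache enabled.

```c
#include <stdbool.h>

/* Hypothetical step flags for the commands sent to the device. */
#define SKETCH_STEP_PREFLUSH	(1u << 0)	/* cache flush before the data */
#define SKETCH_STEP_DATA	(1u << 1)	/* regular write */
#define SKETCH_STEP_DATA_FUA	(1u << 2)	/* write with the FUA bit set */
#define SKETCH_STEP_POSTFLUSH	(1u << 3)	/* cache flush after the data */

/*
 * Return the set of steps needed for a write request, given what the
 * submitter asked for and whether the device handles FUA natively.
 */
static unsigned int sketch_flush_steps(bool want_preflush, bool want_fua,
				       bool dev_has_fua)
{
	unsigned int steps = 0;

	if (want_preflush)
		steps |= SKETCH_STEP_PREFLUSH;

	if (want_fua && dev_has_fua) {
		/* The device persists the data itself: no post-flush. */
		steps |= SKETCH_STEP_DATA_FUA;
	} else {
		steps |= SKETCH_STEP_DATA;
		/* Emulate FUA with a cache flush after the write. */
		if (want_fua)
			steps |= SKETCH_STEP_POSTFLUSH;
	}

	return steps;
}
```

For a PREFLUSH | FUA request, native FUA support only removes the trailing
flush; the leading flush is issued in both cases.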

> > I feel rather uneasy about enabling FUA by default given history. We can
> > improve its chances by restricting it to newer devices and maybe even just
> > hard disks, but it kinda comes back to the root question of why. Why would
> > we want to do this? What are the benefits? Right now, there are a bunch of
> > really tricky cons and not a whole lot in the pro column.
> 
> OK. But again, why treat ATA devices differently from scsi/nvme/ufs ?
> These have FUA used by default if it is supported.

This part hopefully is answered above.

> We can take a big hammer here and start with enabling only ACS-5 and
> above for now. That will represent the set of devices that are in
> development right now, and only a few already released (I have some in
> my test boxes and they are not even a few months old...).

All that said, yeah, if we restrict it to only the newest devices, they're
more likely to be well behaved and a lot more visible when they misbehave.
That sounds reasonable to me.

Thanks.
Damien Le Moal Jan. 10, 2023, 1:23 p.m. UTC | #6
On 1/7/23 03:03, Tejun Heo wrote:
>> We can take a big hammer here and start with enabling only ACS-5 and
>> above for now. That will represent the set of devices that are in
>> development right now, and only a few already released (I have some in
>> my test boxes and they are not even a few months old...).
> 
> All that said, yeah, if we restrict it to only the newest devices, they're
> more likely to be well behaved and a lot more visible when they misbehave.
> That sounds reasonable to me.

I re-posted the series without patch 7 enabling FUA by default. This
maintains the current state of libata while still cleaning up nicely all
the code around FUA.

I will send 1 or 2 patches later after thinking a little more about how to
safely enable FUA by default only for recent drives or drives of interest.
E.g. SMR drives as the lack of FUA support for them forces the use of the
block layer flush machinery, which itself causes write reordering... That
needs to be addressed too, and I will look at that.

Thanks for the feedback.