[v2,0/3] Improve libata support for FUA

Message ID 20221024072609.346502-1-damien.lemoal@opensource.wdc.com

Message

Damien Le Moal Oct. 24, 2022, 7:26 a.m. UTC
These patches clean up and improve libata support for the FUA device
feature. Patch 3 enables FUA support by default for any drive that
reports supporting the feature.

Changes from v1:
 - Removed Maciej's patch 2. Instead, blacklist drives which are known
   to have a buggy FUA support.

Damien Le Moal (3):
  ata: libata: cleanup fua handling
  ata: libata: blacklist FUA support for known buggy drives
  ata: libata: Enable fua support by default

 .../admin-guide/kernel-parameters.txt         |  3 ++
 drivers/ata/libata-core.c                     | 36 +++++++++++++++++--
 drivers/ata/libata-scsi.c                     | 30 ++--------------
 include/linux/libata.h                        |  8 +++--
 4 files changed, 43 insertions(+), 34 deletions(-)

Comments

Maciej S. Szmigiero Oct. 24, 2022, 6:48 p.m. UTC | #1
On 24.10.2022 09:26, Damien Le Moal wrote:
> These patches cleanup and improve libata support for the FUA device
> feature. Patch 3 enables FUA support by default for any drive that
> reports supporting the feature.
> 
> Changes from v1:
>   - Removed Maciej's patch 2. Instead, blacklist drives which are known
>     to have a buggy FUA support.
> 
> Damien Le Moal (3):
>    ata: libata: cleanup fua handling
>    ata: libata: blacklist FUA support for known buggy drives
>    ata: libata: Enable fua support by default
> 

Thanks for the updated series.

In general (besides the small commit message thing that Sergey had
already mentioned) it looks good to me, so:
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>

Thanks,
Maciej
Damien Le Moal Oct. 24, 2022, 10:09 p.m. UTC | #2
On 10/25/22 03:48, Maciej S. Szmigiero wrote:
> On 24.10.2022 09:26, Damien Le Moal wrote:
>> These patches cleanup and improve libata support for the FUA device
>> feature. Patch 3 enables FUA support by default for any drive that
>> reports supporting the feature.
>>
>> Changes from v1:
>>   - Removed Maciej's patch 2. Instead, blacklist drives which are known
>>     to have a buggy FUA support.
>>
>> Damien Le Moal (3):
>>    ata: libata: cleanup fua handling
>>    ata: libata: blacklist FUA support for known buggy drives
>>    ata: libata: Enable fua support by default
>>
> 
> Thanks for the updated series.
> 
> In general (besides the small commit message thing that Sergey had
> already mentioned) it looks good to me, so:
> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>

Thanks. I need to do some more testing using some very old drives I found.
So far, no issues: detection works, some drives have FUA, others do not.
For the ones that have FUA, I am running fstests (ext4 and xfs) to check
for weird behavior with REQ_FUA writes. Once I complete all tests I will
queue this.

> 
> Thanks,
> Maciej
>
Damien Le Moal Oct. 24, 2022, 11:26 p.m. UTC | #3
On 10/25/22 07:09, Damien Le Moal wrote:
> On 10/25/22 03:48, Maciej S. Szmigiero wrote:
>> On 24.10.2022 09:26, Damien Le Moal wrote:
>>> These patches cleanup and improve libata support for the FUA device
>>> feature. Patch 3 enables FUA support by default for any drive that
>>> reports supporting the feature.
>>>
>>> Changes from v1:
>>>   - Removed Maciej's patch 2. Instead, blacklist drives which are known
>>>     to have a buggy FUA support.
>>>
>>> Damien Le Moal (3):
>>>    ata: libata: cleanup fua handling
>>>    ata: libata: blacklist FUA support for known buggy drives
>>>    ata: libata: Enable fua support by default
>>>
>>
>> Thanks for the updated series.
>>
>> In general (besides the small commit message thing that Sergey had
>> already mentioned) it looks good to me, so:
>> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> 
> Thanks. I need to do some more testing using some very old drives I found.
> So far, no issues: detection works, some drives have FUA, other not. For
> the ones that have FUA, I am running fstests (ext4 and xfs) to check for
> weird behavior with REQ_FUA writes. Once I complete all tests I will queue
> this.

Actually, I need to take this back. Checking the code again, I found an
issue with this entire FUA support: for a drive that does not support NCQ,
or one that has NCQ but has its queue depth set to one, a REQ_FUA write
request will use ATA_CMD_WRITE_MULTI_FUA_EXT or ATA_CMD_WRITE_FUA_EXT.
All good, BUT! sd.c may also send read requests with the FUA bit set if
the read request has REQ_FUA set. For read commands, the regular, non-FUA
commands ATA_CMD_READ_MULTI, ATA_CMD_READ_MULTI_EXT, ATA_CMD_READ or
ATA_CMD_READ_EXT will be used, since ATA does not define a FUA version of
these. This means that the REQ_FUA flag will be ignored: this entire code
is broken, as it assumes that the read command processing on the drive is
consistent with executions of ATA_CMD_WRITE_MULTI_FUA_EXT or
ATA_CMD_WRITE_FUA_EXT. I do not want to bet on that, especially with old
drives.
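To make the gap concrete, here is a minimal sketch of the non-NCQ LBA48
command choice. The opcode values mirror include/linux/ata.h, but
pick_lba48_cmd() is an illustrative simplification of the selection logic,
not the actual ata_rwcmd_protocol() code:

```c
#include <assert.h>
#include <stdbool.h>

/* Opcode values as in include/linux/ata.h. */
enum {
	ATA_CMD_READ_EXT      = 0x25,	/* READ DMA EXT */
	ATA_CMD_WRITE_EXT     = 0x35,	/* WRITE DMA EXT */
	ATA_CMD_WRITE_FUA_EXT = 0x3D,	/* WRITE DMA FUA EXT */
};

/*
 * Return the taskfile command for a non-NCQ LBA48 DMA transfer, or 0
 * when the (direction, FUA) combination has no defined ATA command.
 * Writes have a FUA variant; ATA defines no FUA variant of the read
 * commands, so a FUA read cannot be expressed outside of NCQ.
 */
static unsigned char pick_lba48_cmd(bool is_write, bool fua)
{
	if (is_write)
		return fua ? ATA_CMD_WRITE_FUA_EXT : ATA_CMD_WRITE_EXT;
	return fua ? 0 : ATA_CMD_READ_EXT;
}
```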

I would be tempted to restrict FUA support to drives that support NCQ,
given that with NCQ, both READ FPDMA QUEUED and WRITE FPDMA QUEUED have
the FUA bit. But then, the problem is that if the user changes the queue
depth of the drive to 1 through sysfs, NCQ is turned off and we are back
to using the EXT read & write commands, that is, only write has FUA.

So if we want solid ATA FUA support, we would need to always use NCQ
regardless of the drive's maximum queue depth setting...

Thoughts?

> 
>>
>> Thanks,
>> Maciej
>>
>
Damien Le Moal Oct. 25, 2022, 12:22 a.m. UTC | #4
On 10/25/22 08:26, Damien Le Moal wrote:
> On 10/25/22 07:09, Damien Le Moal wrote:
>> On 10/25/22 03:48, Maciej S. Szmigiero wrote:
>>> On 24.10.2022 09:26, Damien Le Moal wrote:
>>>> These patches cleanup and improve libata support for the FUA device
>>>> feature. Patch 3 enables FUA support by default for any drive that
>>>> reports supporting the feature.
>>>>
>>>> Changes from v1:
>>>>   - Removed Maciej's patch 2. Instead, blacklist drives which are known
>>>>     to have a buggy FUA support.
>>>>
>>>> Damien Le Moal (3):
>>>>    ata: libata: cleanup fua handling
>>>>    ata: libata: blacklist FUA support for known buggy drives
>>>>    ata: libata: Enable fua support by default
>>>>
>>>
>>> Thanks for the updated series.
>>>
>>> In general (besides the small commit message thing that Sergey had
>>> already mentioned) it looks good to me, so:
>>> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>
>> Thanks. I need to do some more testing using some very old drives I found.
>> So far, no issues: detection works, some drives have FUA, other not. For
>> the ones that have FUA, I am running fstests (ext4 and xfs) to check for
>> weird behavior with REQ_FUA writes. Once I complete all tests I will queue
>> this.
> 
> Actually, I need to take this back. Checking again the code, I found an
> issue with this entire FUA support: for a drive that does not support NCQ,
> or one that has NCQ but has its queue depth set to one, then for a REQ_FUA
> write request, ATA_CMD_WRITE_MULTI_FUA_EXT or ATA_CMD_WRITE_FUA_EXT will
> be used. All good, BUT ! sd.c may also send read requests with the FUA bit
> set if the read request has REQ_FUA set. For read commands, the regular,
> non FUA commands ATA_CMD_READ_MULTI, ATA_CMD_READ_MULTI_EXT, ATA_CMD_READ
> or ATA_CMD_READ_EXT will be used since ATA does not define a FUA version
> of these. This means that the REQ_FUA flag will be ignored: this entire
> code is broken as it is assuming that the read command processing on the
> drive is consistent with executions of ATA_CMD_WRITE_MULTI_FUA_EXT or
> ATA_CMD_WRITE_FUA_EXT. I do not want to bet on that, especially with old
> drives.

Correction here: a REQ_FUA read would end up being mapped to the "0"
command as returned by ata_rwcmd_protocol(), resulting in the user getting
back EIO for any FUA read if libata FUA is enabled. With libata FUA
disabled, that was not happening. Digging further though, I do not see any
in-kernel code using REQ_FUA for reads. But it does not look like that is
forbidden either. Applications using SG_IO may use it anyway (e.g. to do
write-verify type operations).

> 
> I would be tempted to restrict FUA support to drives that support NCQ,
> given that with NCQ, both READ FPDMA QUEUED and READ FPDMA WRITE have the
> FUA bit. But then, the problem is that if the user changes the queue depth
> of the drive to 1 through sysfs, ncq is turned off and we are back to
> using the EXT read & write commands, that is, only write has FUA.
> 
> So if we want a solid ata FUA support, we would need to always use NCQ
> regardless of the drive max queue depth setting...
> 
> Thoughts ?
> 
>>
>>>
>>> Thanks,
>>> Maciej
>>>
>>
>
Hannes Reinecke Oct. 25, 2022, 7:05 a.m. UTC | #5
On 10/25/22 01:26, Damien Le Moal wrote:
> On 10/25/22 07:09, Damien Le Moal wrote:
>> On 10/25/22 03:48, Maciej S. Szmigiero wrote:
>>> On 24.10.2022 09:26, Damien Le Moal wrote:
>>>> These patches cleanup and improve libata support for the FUA device
>>>> feature. Patch 3 enables FUA support by default for any drive that
>>>> reports supporting the feature.
>>>>
>>>> Changes from v1:
>>>>    - Removed Maciej's patch 2. Instead, blacklist drives which are known
>>>>      to have a buggy FUA support.
>>>>
>>>> Damien Le Moal (3):
>>>>     ata: libata: cleanup fua handling
>>>>     ata: libata: blacklist FUA support for known buggy drives
>>>>     ata: libata: Enable fua support by default
>>>>
>>>
>>> Thanks for the updated series.
>>>
>>> In general (besides the small commit message thing that Sergey had
>>> already mentioned) it looks good to me, so:
>>> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>
>> Thanks. I need to do some more testing using some very old drives I found.
>> So far, no issues: detection works, some drives have FUA, other not. For
>> the ones that have FUA, I am running fstests (ext4 and xfs) to check for
>> weird behavior with REQ_FUA writes. Once I complete all tests I will queue
>> this.
> 
> Actually, I need to take this back. Checking again the code, I found an
> issue with this entire FUA support: for a drive that does not support NCQ,
> or one that has NCQ but has its queue depth set to one, then for a REQ_FUA
> write request, ATA_CMD_WRITE_MULTI_FUA_EXT or ATA_CMD_WRITE_FUA_EXT will
> be used. All good, BUT ! sd.c may also send read requests with the FUA bit
> set if the read request has REQ_FUA set. For read commands, the regular,
> non FUA commands ATA_CMD_READ_MULTI, ATA_CMD_READ_MULTI_EXT, ATA_CMD_READ
> or ATA_CMD_READ_EXT will be used since ATA does not define a FUA version
> of these. This means that the REQ_FUA flag will be ignored: this entire
> code is broken as it is assuming that the read command processing on the
> drive is consistent with executions of ATA_CMD_WRITE_MULTI_FUA_EXT or
> ATA_CMD_WRITE_FUA_EXT. I do not want to bet on that, especially with old
> drives.
> 
Now you got me confused.
What exactly would be the semantics of a READ request with the FUA bit 
set? Ignore the cache and read from disk?
That would only make sense if the cache went out of sync with the drive, 
which really shouldn't happen, no?

> I would be tempted to restrict FUA support to drives that support NCQ,
> given that with NCQ, both READ FPDMA QUEUED and READ FPDMA WRITE have the
> FUA bit. But then, the problem is that if the user changes the queue depth
> of the drive to 1 through sysfs, ncq is turned off and we are back to
> using the EXT read & write commands, that is, only write has FUA.
> 
Hmm. Is this a requirement? We _could_ use the NCQ variants even with a 
queue depth of 1, no?

> So if we want a solid ata FUA support, we would need to always use NCQ
> regardless of the drive max queue depth setting...
> 
Sure, that would be the way I would be going.
If the drive supports NCQ we should always be using the FPDMA variants, 
irrespective of the queue depth.
Additionally we _might_ make FUA dependent on NCQ, and disallow FUA for 
non-NCQ drives.
(Where it's questionable anyway; if you only have a single command 
outstanding, the pressure on any internal cache is far less than with NCQ.)

Cheers,

Hannes
Damien Le Moal Oct. 25, 2022, 8:59 a.m. UTC | #6
On 10/25/22 16:05, Hannes Reinecke wrote:
> On 10/25/22 01:26, Damien Le Moal wrote:
>> On 10/25/22 07:09, Damien Le Moal wrote:
>>> On 10/25/22 03:48, Maciej S. Szmigiero wrote:
>>>> On 24.10.2022 09:26, Damien Le Moal wrote:
>>>>> These patches cleanup and improve libata support for the FUA device
>>>>> feature. Patch 3 enables FUA support by default for any drive that
>>>>> reports supporting the feature.
>>>>>
>>>>> Changes from v1:
>>>>>    - Removed Maciej's patch 2. Instead, blacklist drives which are known
>>>>>      to have a buggy FUA support.
>>>>>
>>>>> Damien Le Moal (3):
>>>>>     ata: libata: cleanup fua handling
>>>>>     ata: libata: blacklist FUA support for known buggy drives
>>>>>     ata: libata: Enable fua support by default
>>>>>
>>>>
>>>> Thanks for the updated series.
>>>>
>>>> In general (besides the small commit message thing that Sergey had
>>>> already mentioned) it looks good to me, so:
>>>> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>>
>>> Thanks. I need to do some more testing using some very old drives I found.
>>> So far, no issues: detection works, some drives have FUA, other not. For
>>> the ones that have FUA, I am running fstests (ext4 and xfs) to check for
>>> weird behavior with REQ_FUA writes. Once I complete all tests I will queue
>>> this.
>>
>> Actually, I need to take this back. Checking again the code, I found an
>> issue with this entire FUA support: for a drive that does not support NCQ,
>> or one that has NCQ but has its queue depth set to one, then for a REQ_FUA
>> write request, ATA_CMD_WRITE_MULTI_FUA_EXT or ATA_CMD_WRITE_FUA_EXT will
>> be used. All good, BUT ! sd.c may also send read requests with the FUA bit
>> set if the read request has REQ_FUA set. For read commands, the regular,
>> non FUA commands ATA_CMD_READ_MULTI, ATA_CMD_READ_MULTI_EXT, ATA_CMD_READ
>> or ATA_CMD_READ_EXT will be used since ATA does not define a FUA version
>> of these. This means that the REQ_FUA flag will be ignored: this entire
>> code is broken as it is assuming that the read command processing on the
>> drive is consistent with executions of ATA_CMD_WRITE_MULTI_FUA_EXT or
>> ATA_CMD_WRITE_FUA_EXT. I do not want to bet on that, especially with old
>> drives.
>>
> Now you got me confused.
> What exactly would be the semantics of a READ request with the FUA bit 
> set? Ignore the cache and read from disk?
> That would only make sense if the cache went out of sync with the drive, 
> which really shouldn't happen, no?

For the NCQ read command, the FUA text of the spec says:

If the Forced Unit Access (FUA) bit is set to one, the device shall
retrieve the data from the non-volatile media regardless of whether the
device holds the requested information in the volatile cache. If the
device holds a modified copy of the requested data as a result of having
volatile cached writes, the modified data shall be written to the
non-volatile media before being retrieved from the non-volatile media as
part of this operation. If the FUA bit is cleared to zero, the data shall
be retrieved either from the device's non-volatile media or cache.

So all is well for NCQ: everything as expected, no surprises at all here.

But there is no equivalent behavior defined for non-NCQ read commands, and
with libata.fua enabled, ata_rwcmd_protocol() will cause any REQ_FUA read
to fail. That is the safe thing to do, but it is not what one gets when
libata.fua is disabled. So enabling libata FUA by default could
potentially break some use cases out there of people using REQ_FUA reads
without realizing that it does nothing in the non-NCQ case.

I have not checked the block layer though. Does the preflush/sync
machinery also trigger for reads? I do not think so. I think REQ_FUA is
ignored for reads.

> 
>> I would be tempted to restrict FUA support to drives that support NCQ,
>> given that with NCQ, both READ FPDMA QUEUED and READ FPDMA WRITE have the
>> FUA bit. But then, the problem is that if the user changes the queue depth
>> of the drive to 1 through sysfs, ncq is turned off and we are back to
>> using the EXT read & write commands, that is, only write has FUA.
>>
> Hmm. Is this a requirement? We _could_ use the NCQ variants even with a 
> queue depth of 1, no?

Absolutely, yes. Running NCQ commands at QD = 1 is perfectly fine.
I think this "do not use NCQ when QD == 1" rule is there mostly to deal
with drives that have a buggy NCQ implementation. Not sure; it has been
like this forever. I would need to do some git archeology about that one.

>> So if we want a solid ata FUA support, we would need to always use NCQ
>> regardless of the drive max queue depth setting...
>>
> Sure, that would be the way I would be going.
> If the drive supports NCQ we should always be using the FPDMA variants, 
> irrespective of the queue depth.
> Additionally we _might_ make FUA dependent on NCQ, and disallow FUA for 
> non-NCQ drives.
> (Where it's questionable anyway; if you only have a single command 
> outstanding the pressure on any internal cache is far less as with NCQ.)

Yes, that is my thinking too. Will try this and see how it looks.

> 
> Cheers,
> 
> Hannes
Niklas Cassel Oct. 25, 2022, 9:01 a.m. UTC | #7
On Tue, Oct 25, 2022 at 09:05:02AM +0200, Hannes Reinecke wrote:
> On 10/25/22 01:26, Damien Le Moal wrote:
> > On 10/25/22 07:09, Damien Le Moal wrote:
> > > On 10/25/22 03:48, Maciej S. Szmigiero wrote:
> > > > On 24.10.2022 09:26, Damien Le Moal wrote:
> > > > > These patches cleanup and improve libata support for the FUA device
> > > > > feature. Patch 3 enables FUA support by default for any drive that
> > > > > reports supporting the feature.
> > > > > 
> > > > > Changes from v1:
> > > > >    - Removed Maciej's patch 2. Instead, blacklist drives which are known
> > > > >      to have a buggy FUA support.
> > > > > 
> > > > > Damien Le Moal (3):
> > > > >     ata: libata: cleanup fua handling
> > > > >     ata: libata: blacklist FUA support for known buggy drives
> > > > >     ata: libata: Enable fua support by default
> > > > > 
> > > > 
> > > > Thanks for the updated series.
> > > > 
> > > > In general (besides the small commit message thing that Sergey had
> > > > already mentioned) it looks good to me, so:
> > > > Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> > > 
> > > Thanks. I need to do some more testing using some very old drives I found.
> > > So far, no issues: detection works, some drives have FUA, other not. For
> > > the ones that have FUA, I am running fstests (ext4 and xfs) to check for
> > > weird behavior with REQ_FUA writes. Once I complete all tests I will queue
> > > this.
> > 
> > Actually, I need to take this back. Checking again the code, I found an
> > issue with this entire FUA support: for a drive that does not support NCQ,
> > or one that has NCQ but has its queue depth set to one, then for a REQ_FUA
> > write request, ATA_CMD_WRITE_MULTI_FUA_EXT or ATA_CMD_WRITE_FUA_EXT will
> > be used. All good, BUT ! sd.c may also send read requests with the FUA bit
> > set if the read request has REQ_FUA set. For read commands, the regular,
> > non FUA commands ATA_CMD_READ_MULTI, ATA_CMD_READ_MULTI_EXT, ATA_CMD_READ
> > or ATA_CMD_READ_EXT will be used since ATA does not define a FUA version
> > of these. This means that the REQ_FUA flag will be ignored: this entire
> > code is broken as it is assuming that the read command processing on the
> > drive is consistent with executions of ATA_CMD_WRITE_MULTI_FUA_EXT or
> > ATA_CMD_WRITE_FUA_EXT. I do not want to bet on that, especially with old
> > drives.
> > 
> Now you got me confused.
> What exactly would be the semantics of a READ request with the FUA bit set?
> Ignore the cache and read from disk?

The opposite:
"If the device holds a modified copy of the requested data as a result of
having volatile cached writes, the modified data shall be written to the
non-volatile media before being retrieved from the non-volatile media as
part of this operation."


Kind regards,
Niklas
Niklas Cassel Oct. 25, 2022, 9:41 a.m. UTC | #8
On Tue, Oct 25, 2022 at 05:59:07PM +0900, Damien Le Moal wrote:
> > Sure, that would be the way I would be going.
> > If the drive supports NCQ we should always be using the FPDMA variants, 
> > irrespective of the queue depth.
> > Additionally we _might_ make FUA dependent on NCQ, and disallow FUA for 
> > non-NCQ drives.
> > (Where it's questionable anyway; if you only have a single command 
> > outstanding the pressure on any internal cache is far less as with NCQ.)
> 
> Yes, that is my thinking too. Will try this and see how it looks.

When there are too many NCQ errors, the ATA_DFLAG_NCQ_OFF flag will be set
(but the queue depth will be left unchanged).

Currently, one way to re-enable NCQ is to change queue_depth to 1 in
sysfs, and then set it back to 32.

It would be nice if this method, or some other method to re-enable NCQ
after encountering too many NCQ errors, would still be available after
your suggested changes.



And if we change to always use NCQ commands regardless of queue depth
(for devices supporting it), because they provide us with additional
fields not available in the non-NCQ versions, then I think we should
also consider always using WRITE DMA EXT (on devices supporting 48-bit
addresses) when NCQ is not supported (or disabled because of too many
errors), since it also has additional fields not available in the
regular WRITE DMA command (especially considering that they should have
the exact same performance).


Kind regards,
Niklas
Maciej S. Szmigiero Oct. 25, 2022, 6:13 p.m. UTC | #9
On 25.10.2022 10:59, Damien Le Moal wrote:
> On 10/25/22 16:05, Hannes Reinecke wrote:
>> On 10/25/22 01:26, Damien Le Moal wrote:
>>> On 10/25/22 07:09, Damien Le Moal wrote:
>>>> On 10/25/22 03:48, Maciej S. Szmigiero wrote:
>>>>> On 24.10.2022 09:26, Damien Le Moal wrote:
>>>>>> These patches cleanup and improve libata support for the FUA device
>>>>>> feature. Patch 3 enables FUA support by default for any drive that
>>>>>> reports supporting the feature.
>>>>>>
>>>>>> Changes from v1:
>>>>>>     - Removed Maciej's patch 2. Instead, blacklist drives which are known
>>>>>>       to have a buggy FUA support.
>>>>>>
>>>>>> Damien Le Moal (3):
>>>>>>      ata: libata: cleanup fua handling
>>>>>>      ata: libata: blacklist FUA support for known buggy drives
>>>>>>      ata: libata: Enable fua support by default
>>>>>>
>>>>>
>>>>> Thanks for the updated series.
>>>>>
>>>>> In general (besides the small commit message thing that Sergey had
>>>>> already mentioned) it looks good to me, so:
>>>>> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>>>
>>>> Thanks. I need to do some more testing using some very old drives I found.
>>>> So far, no issues: detection works, some drives have FUA, other not. For
>>>> the ones that have FUA, I am running fstests (ext4 and xfs) to check for
>>>> weird behavior with REQ_FUA writes. Once I complete all tests I will queue
>>>> this.
>>>
>>> Actually, I need to take this back. Checking again the code, I found an
>>> issue with this entire FUA support: for a drive that does not support NCQ,
>>> or one that has NCQ but has its queue depth set to one, then for a REQ_FUA
>>> write request, ATA_CMD_WRITE_MULTI_FUA_EXT or ATA_CMD_WRITE_FUA_EXT will
>>> be used. All good, BUT ! sd.c may also send read requests with the FUA bit
>>> set if the read request has REQ_FUA set. For read commands, the regular,
>>> non FUA commands ATA_CMD_READ_MULTI, ATA_CMD_READ_MULTI_EXT, ATA_CMD_READ
>>> or ATA_CMD_READ_EXT will be used since ATA does not define a FUA version
>>> of these. This means that the REQ_FUA flag will be ignored: this entire
>>> code is broken as it is assuming that the read command processing on the
>>> drive is consistent with executions of ATA_CMD_WRITE_MULTI_FUA_EXT or
>>> ATA_CMD_WRITE_FUA_EXT. I do not want to bet on that, especially with old
>>> drives.
>>>
>> Now you got me confused.
>> What exactly would be the semantics of a READ request with the FUA bit
>> set? Ignore the cache and read from disk?
>> That would only make sense if the cache went out of sync with the drive,
>> which really shouldn't happen, no?
> 
> For the NCQ read command, FUA text says:
> 
> If the Forced Unit Access (FUA) bit is set to one, the device shall
> retrieve the data from the non-volatile media regardless of whether the
> device holds the requested information in the volatile cache. If the
> device holds a modified copy of the requested data as a result of having
> volatile cached writes, the modified data shall be written to the
> non-volatile media before being retrieved from the non-volatile media as
> part of this operation. If the FUA bit is cleared to zero, the data shall
> be retrieved either from the device's non-volatile media or cache.
> 
> So all well for NCQ, everything as expected, no surprises at all here.
> 
> But no equivalent behavior defined for non-NCQ read commands, and with
> libata.fua enabled, ata_rwcmd_protocol() will cause any REQ_FUA read to
> fail, which is the safe thing to do, but that is not what one gets when
> libata.fua is disabled. So enabling libata fua by default could
> potentially break some use cases out there of people using REQ_FUA reads
> without realizing that it is doing nothing in non-NCQ case.
> 
> I have not checked the block layer though. Does the preflush/sync
> machinery also triggers for reads ? I do not think so. I think REQ_FUA is
> ignored for reads.

It seems the block layer is indeed capable of doing a {pre,post}-flush
on reads: neither op_is_flush() nor blk_flush_policy() nor
blk_insert_flush() has any check on the request direction.

But, as you mentioned, no higher-level kernel code seems to actually issue
such FUA reads.
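The direction-agnostic behavior boils down to a pure flag test. A
simplified mirror of that check (the flag bit positions below are
illustrative, not copied from include/linux/blk_types.h):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative request flag encoding: low bit selects the direction,
 * higher bits carry the flush/FUA modifiers. */
#define REQ_OP_READ	0u
#define REQ_OP_WRITE	1u
#define REQ_FUA		(1u << 8)
#define REQ_PREFLUSH	(1u << 9)

/*
 * Mirror of the block layer's op_is_flush(): it tests only the
 * FUA/preflush bits, never the data direction, so a READ | REQ_FUA
 * request enters the flush machinery exactly like a write would.
 */
static bool op_is_flush(unsigned int op)
{
	return op & (REQ_FUA | REQ_PREFLUSH);
}
```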

>>
>>> I would be tempted to restrict FUA support to drives that support NCQ,
>>> given that with NCQ, both READ FPDMA QUEUED and READ FPDMA WRITE have the
>>> FUA bit. But then, the problem is that if the user changes the queue depth
>>> of the drive to 1 through sysfs, ncq is turned off and we are back to
>>> using the EXT read & write commands, that is, only write has FUA.
>>>
>> Hmm. Is this a requirement? We _could_ use the NCQ variants even with a
>> queue depth of 1, no?
> 
> Absolutely, yes. Running NCQ commands at QD = 1 is perfectly fine.
> I think that this "do not use NCQ when QD == 1" is there mostly to deal
> with drives that have a buggy NCQ implementation. Not sure. It has been
> like this forever. Would need to do some git archeology about that one.

As Niklas mentioned, NCQ will be disabled if there are too many errors
(for buggy implementations), so it should be safe to always make use of
it when it is supported by the drive.

>>> So if we want a solid ata FUA support, we would need to always use NCQ
>>> regardless of the drive max queue depth setting...
>>>
>> Sure, that would be the way I would be going.
>> If the drive supports NCQ we should always be using the FPDMA variants,
>> irrespective of the queue depth.
>> Additionally we _might_ make FUA dependent on NCQ, and disallow FUA for
>> non-NCQ drives.
>> (Where it's questionable anyway; if you only have a single command
>> outstanding the pressure on any internal cache is far less as with NCQ.)
> 
> Yes, that is my thinking too. Will try this and see how it looks.

In my opinion, making FUA dependent on NCQ totally makes sense
considering that there are no ATA_CMD_READ_FUA{,_EXT} commands.

In this case, the NCQ disabling code in ata_eh_speed_down() will probably
need to clear the queue's FUA flag too.

Thanks,
Maciej
Damien Le Moal Oct. 25, 2022, 11:21 p.m. UTC | #10
On 10/26/22 03:13, Maciej S. Szmigiero wrote:
> On 25.10.2022 10:59, Damien Le Moal wrote:
>> On 10/25/22 16:05, Hannes Reinecke wrote:
>>> On 10/25/22 01:26, Damien Le Moal wrote:
>>>> On 10/25/22 07:09, Damien Le Moal wrote:
>>>>> On 10/25/22 03:48, Maciej S. Szmigiero wrote:
>>>>>> On 24.10.2022 09:26, Damien Le Moal wrote:
>>>>>>> These patches cleanup and improve libata support for the FUA device
>>>>>>> feature. Patch 3 enables FUA support by default for any drive that
>>>>>>> reports supporting the feature.
>>>>>>>
>>>>>>> Changes from v1:
>>>>>>>     - Removed Maciej's patch 2. Instead, blacklist drives which are known
>>>>>>>       to have a buggy FUA support.
>>>>>>>
>>>>>>> Damien Le Moal (3):
>>>>>>>      ata: libata: cleanup fua handling
>>>>>>>      ata: libata: blacklist FUA support for known buggy drives
>>>>>>>      ata: libata: Enable fua support by default
>>>>>>>
>>>>>>
>>>>>> Thanks for the updated series.
>>>>>>
>>>>>> In general (besides the small commit message thing that Sergey had
>>>>>> already mentioned) it looks good to me, so:
>>>>>> Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>>>>>
>>>>> Thanks. I need to do some more testing using some very old drives I found.
>>>>> So far, no issues: detection works, some drives have FUA, other not. For
>>>>> the ones that have FUA, I am running fstests (ext4 and xfs) to check for
>>>>> weird behavior with REQ_FUA writes. Once I complete all tests I will queue
>>>>> this.
>>>>
>>>> Actually, I need to take this back. Checking again the code, I found an
>>>> issue with this entire FUA support: for a drive that does not support NCQ,
>>>> or one that has NCQ but has its queue depth set to one, then for a REQ_FUA
>>>> write request, ATA_CMD_WRITE_MULTI_FUA_EXT or ATA_CMD_WRITE_FUA_EXT will
>>>> be used. All good, BUT ! sd.c may also send read requests with the FUA bit
>>>> set if the read request has REQ_FUA set. For read commands, the regular,
>>>> non FUA commands ATA_CMD_READ_MULTI, ATA_CMD_READ_MULTI_EXT, ATA_CMD_READ
>>>> or ATA_CMD_READ_EXT will be used since ATA does not define a FUA version
>>>> of these. This means that the REQ_FUA flag will be ignored: this entire
>>>> code is broken as it is assuming that the read command processing on the
>>>> drive is consistent with executions of ATA_CMD_WRITE_MULTI_FUA_EXT or
>>>> ATA_CMD_WRITE_FUA_EXT. I do not want to bet on that, especially with old
>>>> drives.
>>>>
>>> Now you got me confused.
>>> What exactly would be the semantics of a READ request with the FUA bit
>>> set? Ignore the cache and read from disk?
>>> That would only make sense if the cache went out of sync with the drive,
>>> which really shouldn't happen, no?
>>
>> For the NCQ read command, FUA text says:
>>
>> If the Forced Unit Access (FUA) bit is set to one, the device shall
>> retrieve the data from the non-volatile media regardless of whether the
>> device holds the requested information in the volatile cache. If the
>> device holds a modified copy of the requested data as a result of having
>> volatile cached writes, the modified data shall be written to the
>> non-volatile media before being retrieved from the non-volatile media as
>> part of this operation. If the FUA bit is cleared to zero, the data shall
>> be retrieved either from the device's non-volatile media or cache.
>>
>> So all well for NCQ, everything as expected, no surprises at all here.
>>
>> But no equivalent behavior defined for non-NCQ read commands, and with
>> libata.fua enabled, ata_rwcmd_protocol() will cause any REQ_FUA read to
>> fail, which is the safe thing to do, but that is not what one gets when
>> libata.fua is disabled. So enabling libata fua by default could
>> potentially break some use cases out there of people using REQ_FUA reads
>> without realizing that it is doing nothing in non-NCQ case.
>>
>> I have not checked the block layer though. Does the preflush/sync
>> machinery also triggers for reads ? I do not think so. I think REQ_FUA is
>> ignored for reads.
> 
> It seems like the block layer is indeed capable of doing a {pre,post}-flush
> on reads - neither op_is_flush() nor blk_flush_policy() or
> blk_insert_flush() have any checks for request direction.

OK. But the problem remains that if libata FUA is enabled, this machinery
will not trigger and we end up with a read error in ATA.

> But, as you mentioned, no higher level kernel code seems to actually issue
> such FUA reads.

Which leaves SG_IO applications: translating a read with the FUA bit set
needs to always use an NCQ read. But since that was not working before
(the user would get a read error) and nobody complained, I guess this is OK.

> 
>>>
>>>> I would be tempted to restrict FUA support to drives that support NCQ,
>>>> given that with NCQ, both READ FPDMA QUEUED and READ FPDMA WRITE have the
>>>> FUA bit. But then, the problem is that if the user changes the queue depth
>>>> of the drive to 1 through sysfs, ncq is turned off and we are back to
>>>> using the EXT read & write commands, that is, only write has FUA.
>>>>
>>> Hmm. Is this a requirement? We _could_ use the NCQ variants even with a
>>> queue depth of 1, no?
>>
>> Absolutely, yes. Running NCQ commands at QD = 1 is perfectly fine.
>> I think that this "do not use NCQ when QD == 1" is there mostly to deal
>> with drives that have a buggy NCQ implementation. Not sure. It has been
>> like this forever. Would need to do some git archeology about that one.
> 
> As Niklas has mentioned the NCQ will be disabled if there are too many
> errors (for buggy implementations), so it should be safe to always make
> use of it when it is supported by the drive.

I checked with the ATA specs gurus, and NCQ FUA is always supported (if
NCQ is supported), regardless of support for WRITE DMA FUA EXT. The
problem is that switching off NCQ has to go hand in hand with switching
off FUA. See below.

> 
>>>> So if we want a solid ata FUA support, we would need to always use NCQ
>>>> regardless of the drive max queue depth setting...
>>>>
>>> Sure, that would be the way I would be going.
>>> If the drive supports NCQ we should always be using the FPDMA variants,
>>> irrespective of the queue depth.
>>> Additionally we _might_ make FUA dependent on NCQ, and disallow FUA for
>>> non-NCQ drives.
>>> (Where it's questionable anyway; if you only have a single command
>>> outstanding the pressure on any internal cache is far less as with NCQ.)
>>
>> Yes, that is my thinking too. Will try this and see how it looks.
> 
> In my opinion, having FUA dependent on NCQ totally makes sense
> considering that there are no ATA_CMD_READ_FUA{,_EXT}.
> 
> In this case the NCQ disabling code in ata_eh_speed_down() will probably
> need to clear the queue FUA flag too.

Yes, that is the difficulty, because this also needs a revalidation at the
SCSI layer. The flag is set there, not in libata. And if there is a
mounted file system on the drive, switching randomly from "have FUA" to
"sorry, no more FUA" might cause problems... Not 100% sure though.

Given that FUA reads do not seem to be used at all, if we restrict FUA
support to drives that support both NCQ *and* WRITE DMA FUA EXT, then we
should be OK.

Let me rewrite the first patch for that and see how it looks.
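The resulting policy can be sketched as a single predicate; the function
name and parameters below are hypothetical, not what the rewritten patch
will actually use:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical support check for the policy discussed above: advertise
 * FUA only when the drive supports both NCQ (so READ/WRITE FPDMA QUEUED
 * with the FUA bit are available) and WRITE DMA FUA EXT (the non-NCQ
 * write fallback if NCQ is later disabled after too many errors).
 */
static bool ata_dev_can_fua(bool has_ncq, bool has_write_dma_fua_ext)
{
	return has_ncq && has_write_dma_fua_ext;
}
```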

> 
> Thanks,
> Maciej
>