Message ID | 1336447443-4685-1-git-send-email-wenqing.lz@taobao.com |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On 05/07/2012 09:24 PM, Zheng Liu wrote: > From: Zheng Liu<wenqing.lz@taobao.com> > > Currently, SATA disk fua detection is disabled on default because most of > devices don't support this feature at that time. With the development of > technology, more and more SATA disks support this feature. So now we can enable > this detection on default. > > Although fua detection is defined as a kernel module parameter, it is too hard > to set its value because it must be loaded and set before system starts up. > That needs to modify initrd file. So it is inconvenient for administrator who > needs to manage a huge number of servers. > > CC: Jeff Garzik<jeff@garzik.org> > Signed-off-by: Zheng Liu<wenqing.lz@taobao.com> > --- > drivers/ata/libata-core.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > index 23763a1..3627251 100644 > --- a/drivers/ata/libata-core.c > +++ b/drivers/ata/libata-core.c > @@ -133,9 +133,9 @@ int atapi_passthru16 = 1; > module_param(atapi_passthru16, int, 0444); > MODULE_PARM_DESC(atapi_passthru16, "Enable ATA_16 passthru for ATAPI devices (0=off, 1=on [default])"); > > -int libata_fua = 0; > +int libata_fua = 1; > module_param_named(fua, libata_fua, int, 0444); > -MODULE_PARM_DESC(fua, "FUA support (0=off [default], 1=on)"); > +MODULE_PARM_DESC(fua, "FUA support (0=off, 1=on [default])"); > > static int ata_ignore_hpa; > module_param_named(ignore_hpa, ata_ignore_hpa, int, 0644); I don't see that this is very likely to cause issues, but the FUA implementation also seems a bit incomplete. FUA will only get enabled if the drive claims support for the FUA-specific commands like WRITE_DMA_FUA_EXT. But theoretically any drive that supports NCQ should be able to do FUA since there's a FUA bit provided in those commands. Of course this makes things a bit more complicated (for example, if NCQ gets disabled by user request or due to too many errors, and the drive doesn't support the separate FUA commands, then FUA needs to be disabled too.) -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, May 08, 2012 at 11:38:34PM -0600, Robert Hancock wrote: > On 05/07/2012 09:24 PM, Zheng Liu wrote: > >From: Zheng Liu<wenqing.lz@taobao.com> > > > >Currently, SATA disk fua detection is disabled on default because most of > >devices don't support this feature at that time. With the development of > >technology, more and more SATA disks support this feature. So now we can enable > >this detection on default. > > > >Although fua detection is defined as a kernel module parameter, it is too hard > >to set its value because it must be loaded and set before system starts up. > >That needs to modify initrd file. So it is inconvenient for administrator who > >needs to manage a huge number of servers. > > > >CC: Jeff Garzik<jeff@garzik.org> > >Signed-off-by: Zheng Liu<wenqing.lz@taobao.com> > >--- > > drivers/ata/libata-core.c | 4 ++-- > > 1 files changed, 2 insertions(+), 2 deletions(-) > > > >diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > >index 23763a1..3627251 100644 > >--- a/drivers/ata/libata-core.c > >+++ b/drivers/ata/libata-core.c > >@@ -133,9 +133,9 @@ int atapi_passthru16 = 1; > > module_param(atapi_passthru16, int, 0444); > > MODULE_PARM_DESC(atapi_passthru16, "Enable ATA_16 passthru for ATAPI devices (0=off, 1=on [default])"); > > > >-int libata_fua = 0; > >+int libata_fua = 1; > > module_param_named(fua, libata_fua, int, 0444); > >-MODULE_PARM_DESC(fua, "FUA support (0=off [default], 1=on)"); > >+MODULE_PARM_DESC(fua, "FUA support (0=off, 1=on [default])"); > > > > static int ata_ignore_hpa; > > module_param_named(ignore_hpa, ata_ignore_hpa, int, 0644); > > I don't see that this is very likely to cause issues, but the FUA > implementation also seems a bit incomplete. FUA will only get > enabled if the drive claims support for the FUA-specific commands > like WRITE_DMA_FUA_EXT. But theoretically any drive that supports > NCQ should be able to do FUA since there's a FUA bit provided in > those commands. Of course this makes things a bit more complicated > (for example, if NCQ gets disabled by user request or due to too > many errors, and the drive doesn't support the separate FUA > commands, then FUA needs to be disabled too.) Hi Reboert, Thanks for your reply. Actually, we met a problem in our product system. We have thousands of database servers that run on Linux. These servers have a lot of SATA disks that support FUA feature. However, it displays that the disk doesn't support FUA in dmesg. I digged this problem and found that libata_fua variable is set to 0 to disable FUA detection for SATA disk. If FUA feature can be used, we could disable io_barrier to improve the performance. Now one solution is that we set libata_fua to 1 when this kernel module is loaded. But it needs to modify initrd file, and it is too complicated for administrator who needs to modify every server manually. So that enabling FUA detection is another solution. Have you any other idea to solve this problem? Thank you. Regards, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 09, 2012 at 02:19:56PM +0800, Zheng Liu wrote: > disable FUA detection for SATA disk. If FUA feature can be used, we > could disable io_barrier to improve the performance. Now one solution You can't run with -o nobarrier just because the driver/hardware support FUA. However the implementation of -o barrier (aka REQ_FUA) is a lot more efficient if it is implemented. If your workload has a high percentage of REQ_FUA requests it might be better to switch the disks to write through mode entirely. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 09, 2012 at 04:25:27AM -0400, Christoph Hellwig wrote: > On Wed, May 09, 2012 at 02:19:56PM +0800, Zheng Liu wrote: > > disable FUA detection for SATA disk. If FUA feature can be used, we > > could disable io_barrier to improve the performance. Now one solution > > You can't run with -o nobarrier just because the driver/hardware support > FUA. However the implementation of -o barrier (aka REQ_FUA) is a lot > more efficient if it is implemented. If your workload has a high > percentage of REQ_FUA requests it might be better to switch the disks to > write through mode entirely. Thanks for your advice. Indeed, it is more efficient when data is flushed with REQ_FUA, and this is the reason why I submit this patch. IMHO, FUA provides a solution that we can get better performance in write cache mode when we do many flush operations. If I set the disk to write through mode, I won't get this benefit. Am I missing something? Currently, the key issue is that we disable FUA detection for SATA disk. We almost have no chance to change it because it is too complicated to set libata_fua variable when this module is loaded. So why not give SATA disk an opportunity to enable this feature? After all, there is a lot of SATA disks that support this feature. Regards, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 09, 2012 at 05:30:16PM +0800, Zheng Liu wrote: > IMHO, FUA provides a solution that we can get better performance in write > cache mode when we do many flush operations. If I set the disk to write > through mode, I won't get this benefit. Am I missing something? If you set the disk to write through mode you never have to flush the cache. So as soonas your number of flushes gets close to the number of writes it tends to be a clear win - for ATA the tradeoff is even more in favour of write through because the flush command can't be queued yetin commonly available standards versions. So if you have a workload that basically needs to flush out every write you win - if you have workloads where you have a lot more writes than cache flushes write back mode wins. > Currently, the key issue is that we disable FUA detection for SATA disk. > We almost have no chance to change it because it is too complicated to > set libata_fua variable when this module is loaded. So why not give > SATA disk an opportunity to enable this feature? After all, there is a > lot of SATA disks that support this feature. I'm all in favour of your patch, I just wanted to point out that the argument in the description wasn't quite correct. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 09, 2012 at 07:12:43AM -0400, Christoph Hellwig wrote: > On Wed, May 09, 2012 at 05:30:16PM +0800, Zheng Liu wrote: > > IMHO, FUA provides a solution that we can get better performance in write > > cache mode when we do many flush operations. If I set the disk to write > > through mode, I won't get this benefit. Am I missing something? > > If you set the disk to write through mode you never have to flush the > cache. So as soonas your number of flushes gets close to the number of > writes it tends to be a clear win - for ATA the tradeoff is even more in > favour of write through because the flush command can't be queued yetin > commonly available standards versions. So if you have a workload that > basically needs to flush out every write you win - if you have workloads > where you have a lot more writes than cache flushes write back mode > wins. Thanks for your explanation. It seems that there still has a problem. If I set the disk to write through mode, I need to modify my application to remove all of flush/sync operations. It is unacceptable for us. > > Currently, the key issue is that we disable FUA detection for SATA disk. > > We almost have no chance to change it because it is too complicated to > > set libata_fua variable when this module is loaded. So why not give > > SATA disk an opportunity to enable this feature? After all, there is a > > lot of SATA disks that support this feature. > > I'm all in favour of your patch, I just wanted to point out that the > argument in the description wasn't quite correct. Thank you. I will fix it. :-) Regards, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/09/2012 02:48 PM, Zheng Liu wrote: > On Wed, May 09, 2012 at 07:12:43AM -0400, Christoph Hellwig wrote: >> On Wed, May 09, 2012 at 05:30:16PM +0800, Zheng Liu wrote: >>> IMHO, FUA provides a solution that we can get better performance in write >>> cache mode when we do many flush operations. If I set the disk to write >>> through mode, I won't get this benefit. Am I missing something? >> >> If you set the disk to write through mode you never have to flush the >> cache. So as soonas your number of flushes gets close to the number of >> writes it tends to be a clear win - for ATA the tradeoff is even more in >> favour of write through because the flush command can't be queued yetin >> commonly available standards versions. So if you have a workload that >> basically needs to flush out every write you win - if you have workloads >> where you have a lot more writes than cache flushes write back mode >> wins. > > Thanks for your explanation. It seems that there still has a problem. > If I set the disk to write through mode, I need to modify my application > to remove all of flush/sync operations. It is unacceptable for us. Why? You still need to flush the linux page cache. Just the disk cache does not need to be flushed anymore. So changing the application would be the wrong thing to do. Cheers, Bernd -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, May 09, 2012 at 08:48:04PM +0800, Zheng Liu wrote: > Thanks for your explanation. It seems that there still has a problem. > If I set the disk to write through mode, I need to modify my application > to remove all of flush/sync operations. It is unacceptable for us. You don't have to - the kernel will not send a SYNCHRONIZE CACHE command unless the disk advertises a write back cache. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Jeff, Sorry for bothering you. Could you please review this patch, or give me some feedbacks. If this patch can be applied, I can rebase it to the lastest version of mainline kernel. Thank you. Regards, Zheng On Tue, May 08, 2012 at 11:24:03AM +0800, Zheng Liu wrote: > From: Zheng Liu <wenqing.lz@taobao.com> > > Currently, SATA disk fua detection is disabled on default because most of > devices don't support this feature at that time. With the development of > technology, more and more SATA disks support this feature. So now we can enable > this detection on default. > > Although fua detection is defined as a kernel module parameter, it is too hard > to set its value because it must be loaded and set before system starts up. > That needs to modify initrd file. So it is inconvenient for administrator who > needs to manage a huge number of servers. > > CC: Jeff Garzik <jeff@garzik.org> > Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> > --- > drivers/ata/libata-core.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > index 23763a1..3627251 100644 > --- a/drivers/ata/libata-core.c > +++ b/drivers/ata/libata-core.c > @@ -133,9 +133,9 @@ int atapi_passthru16 = 1; > module_param(atapi_passthru16, int, 0444); > MODULE_PARM_DESC(atapi_passthru16, "Enable ATA_16 passthru for ATAPI devices (0=off, 1=on [default])"); > > -int libata_fua = 0; > +int libata_fua = 1; > module_param_named(fua, libata_fua, int, 0444); > -MODULE_PARM_DESC(fua, "FUA support (0=off [default], 1=on)"); > +MODULE_PARM_DESC(fua, "FUA support (0=off, 1=on [default])"); > > static int ata_ignore_hpa; > module_param_named(ignore_hpa, ata_ignore_hpa, int, 0644); > -- > 1.7.4.1 > -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03.07.2012 13:04, Zheng Liu wrote: > Hi Jeff, > > Sorry for bothering you. Could you please review this patch, or give me > some feedbacks. If this patch can be applied, I can rebase it to the > lastest version of mainline kernel. Thank you. [] >> -int libata_fua = 0; >> +int libata_fua = 1; I guess it can be enabled after LOTS of testing of various drives out there, which shows that no current drives have issues with FUA anymore, and after building a black-list of devices which misbehave. It is quite a big project, I think. But what it gives us in return? (I've no idea if this is the right answer, speaking of myself only) Thanks, /mjt -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Michael, On Wed, Jul 04, 2012 at 12:11:36AM +0400, Michael Tokarev wrote: > On 03.07.2012 13:04, Zheng Liu wrote: > > Hi Jeff, > > > > Sorry for bothering you. Could you please review this patch, or give me > > some feedbacks. If this patch can be applied, I can rebase it to the > > lastest version of mainline kernel. Thank you. > [] > >> -int libata_fua = 0; > >> +int libata_fua = 1; > > I guess it can be enabled after LOTS of testing of various drives out > there, which shows that no current drives have issues with FUA anymore, > and after building a black-list of devices which misbehave. It is quite > a big project, I think. But what it gives us in return? > > (I've no idea if this is the right answer, speaking of myself only) Thanks for the reply. Indeed it is quite a big project but we enable FUA feature for SAS disk. Is there any differences? Regards, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04.07.2012 06:47, Zheng Liu wrote: [] >> I guess it can be enabled after LOTS of testing of various drives out >> there, which shows that no current drives have issues with FUA anymore, >> and after building a black-list of devices which misbehave. It is quite >> a big project, I think. But what it gives us in return? >> >> (I've no idea if this is the right answer, speaking of myself only) > > Thanks for the reply. Indeed it is quite a big project but we enable > FUA feature for SAS disk. Is there any differences? Yes, there's a very big difference with SAS disks. Even in parallel SCSI world DPO/FUA has been enabled since the day it has been implemented IIRC, because, apparently, hardware raid controllers enabled it too. In other words, it has been tested and proved to be working even before linux implementation. When first SAS disks started appearing, DPO/FUA were enabled for them in linux already -- at that time any breakage were solely due to the particular disk model, and were easy to blacklist, if necessary, since only a few disk models were in production. With SATA disks, initial hardware implementation proved to be more non-functional than functional, ie, initially there were more drives with non-working FUA. I have a few not-so-old SATA drives here which behaves strangely when FUA is enabled (I don't remember exact details, but I had to disable FA again after I tried to enable it once, the system started behaving not as good as before). So, for SATA drives, we've exactly the opposite picture: we've some proof that "generally, drives dislikes FUA", and now when fua has been disabled for a lot of drives and users, turning it on by default needs lots of testing. But I ask again: what is the benefit of turning FUA on to start with? Thanks, /mjt -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 04, 2012 at 10:36:40AM +0400, Michael Tokarev wrote: > > Thanks for the reply. Indeed it is quite a big project but we enable > > FUA feature for SAS disk. Is there any differences? > > Yes, there's a very big difference with SAS disks. Even in parallel SCSI > world DPO/FUA has been enabled since the day it has been implemented IIRC, > because, apparently, hardware raid controllers enabled it too. In other > words, it has been tested and proved to be working even before linux > implementation. When first SAS disks started appearing, DPO/FUA were > enabled for them in linux already -- at that time any breakage were > solely due to the particular disk model, and were easy to blacklist, > if necessary, since only a few disk models were in production. > > With SATA disks, initial hardware implementation proved to be more > non-functional than functional, ie, initially there were more drives > with non-working FUA. I have a few not-so-old SATA drives here which > behaves strangely when FUA is enabled (I don't remember exact details, > but I had to disable FA again after I tried to enable it once, the > system started behaving not as good as before). So, for SATA drives, > we've exactly the opposite picture: we've some proof that "generally, > drives dislikes FUA", and now when fua has been disabled for a lot > of drives and users, turning it on by default needs lots of testing. > > But I ask again: what is the benefit of turning FUA on to start with? Thanks for your clarification. :-) Turning FUA on can reduce the overhead of flushes AFAIK. In our product system we have a lot of SATA disks with FUA, but we must add a boot parameter 'libata.fua=1' to enable it. Meanwhile there already has a number of SATA disks that have supported this feature. So I think maybe we can enable it. Regards, Zheng -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/07/2012 11:24 PM, Zheng Liu wrote: > From: Zheng Liu <wenqing.lz@taobao.com> > > Currently, SATA disk fua detection is disabled on default because most of > devices don't support this feature at that time. With the development of > technology, more and more SATA disks support this feature. So now we can enable > this detection on default. > > Although fua detection is defined as a kernel module parameter, it is too hard > to set its value because it must be loaded and set before system starts up. > That needs to modify initrd file. So it is inconvenient for administrator who > needs to manage a huge number of servers. > > CC: Jeff Garzik <jeff@garzik.org> > Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> > --- > drivers/ata/libata-core.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > index 23763a1..3627251 100644 > --- a/drivers/ata/libata-core.c > +++ b/drivers/ata/libata-core.c > @@ -133,9 +133,9 @@ int atapi_passthru16 = 1; > module_param(atapi_passthru16, int, 0444); > MODULE_PARM_DESC(atapi_passthru16, "Enable ATA_16 passthru for ATAPI devices (0=off, 1=on [default])"); > > -int libata_fua = 0; > +int libata_fua = 1; > module_param_named(fua, libata_fua, int, 0444); > -MODULE_PARM_DESC(fua, "FUA support (0=off [default], 1=on)"); > +MODULE_PARM_DESC(fua, "FUA support (0=off, 1=on [default])"); Applied. Let's see how far down the rabbit hole we go ;-) The FUA decision, as previously indicated, was based on early SATA drives, and perhaps better ones are available now. Only testing will tell, at this point. The larger questions, raised by Christoph and others remain unaddressed (though perhaps we can start addressing them now, with this patch): * what is smart flushing policy for ATA devices with FUA? * ATA NCQ's flush is not queued * ATA NCQ always had the FUA bit... * ...but mixing ATA NCQ FUA and !FUA in a queue is not fully supported by the existing code and probably a few other details I forgot :) Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Meanwhile there already has a number of SATA disks that have supported > this feature. So I think maybe we can enable it. > > Regards, > Zheng Blindly enabling FUA by default in my opinion is not a good idea at all. I believe the focus should be on detecting FUA support on the device instead, which the kernel at this point cannot reliably do (is it possible?). This patch introduced regression: https://lkml.org/lkml/2012/8/27/66 Arvydas -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09/09/2012 04:34 PM, Arvydas Sidorenko wrote: >> Meanwhile there already has a number of SATA disks that have supported >> this feature. So I think maybe we can enable it. >> >> Regards, >> Zheng > > Blindly enabling FUA by default in my opinion is not a good idea at all. > I believe the focus should be on detecting FUA support on the device > instead, which the kernel at this point cannot reliably do (is it possible?). Premature assumptions. It is entirely possible that FUA is detected accurately, but the software is missing a flush somewhere that FUA requires, if disks are to be used in FUA mode. The filesystem and the block layer must properly generate and order their I/Os based on the FUA enablement bits that appear in the block layer after libata discovers the SATA FUA feature. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 23763a1..3627251 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -133,9 +133,9 @@ int atapi_passthru16 = 1; module_param(atapi_passthru16, int, 0444); MODULE_PARM_DESC(atapi_passthru16, "Enable ATA_16 passthru for ATAPI devices (0=off, 1=on [default])"); -int libata_fua = 0; +int libata_fua = 1; module_param_named(fua, libata_fua, int, 0444); -MODULE_PARM_DESC(fua, "FUA support (0=off [default], 1=on)"); +MODULE_PARM_DESC(fua, "FUA support (0=off, 1=on [default])"); static int ata_ignore_hpa; module_param_named(ignore_hpa, ata_ignore_hpa, int, 0644);