| Message ID | 1282751267-3530-27-git-send-email-tj@kernel.org |
|---|---|
| State | Not Applicable |
| Delegated to | David Miller |
On 08/25/2010 06:00 PM, Christoph Hellwig wrote:
> On Wed, Aug 25, 2010 at 05:58:42PM +0200, Christoph Hellwig wrote:
>> On Wed, Aug 25, 2010 at 05:47:43PM +0200, Tejun Heo wrote:
>>> From: Christoph Hellwig <hch@infradead.org>
>>>
>>> ext4 already uses synchronous discards, no need to add I/O barriers.
>>
>> This needs the patch that Jan sent in reply to my initial version merged
>> into it.
>
> Actually the jbd2 patch needs it merged, but the point still stands.

Yeah, wasn't sure about that one. Has anyone tested it? I'll be happy to merge it but I have no idea whether it's correct or not and Jan didn't seem to have tested it so... Jan, shall I merge the patch?

Thanks.
On Wed, Aug 25, 2010 at 05:47:43PM +0200, Tejun Heo wrote:
> From: Christoph Hellwig <hch@infradead.org>
>
> ext4 already uses synchronous discards, no need to add I/O barriers.

This needs the patch that Jan sent in reply to my initial version merged into it.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 25, 2010 at 05:58:42PM +0200, Christoph Hellwig wrote:
> On Wed, Aug 25, 2010 at 05:47:43PM +0200, Tejun Heo wrote:
> > From: Christoph Hellwig <hch@infradead.org>
> >
> > ext4 already uses synchronous discards, no need to add I/O barriers.
>
> This needs the patch that Jan sent in reply to my initial version merged
> into it.

Actually the jbd2 patch needs it merged, but the point still stands.
On Wed 25-08-10 17:57:41, Tejun Heo wrote:
> On 08/25/2010 06:00 PM, Christoph Hellwig wrote:
> > On Wed, Aug 25, 2010 at 05:58:42PM +0200, Christoph Hellwig wrote:
> >> On Wed, Aug 25, 2010 at 05:47:43PM +0200, Tejun Heo wrote:
> >>> From: Christoph Hellwig <hch@infradead.org>
> >>>
> >>> ext4 already uses synchronous discards, no need to add I/O barriers.
> >>
> >> This needs the patch that Jan sent in reply to my initial version merged
> >> into it.
> >
> > Actually the jbd2 patch needs it merged, but the point still stands.
>
> Yeah, wasn't sure about that one. Has anyone tested it? I'll be
> happy to merge it but I have no idea whether it's correct or not and
> Jan didn't seem to have tested it so... Jan, shall I merge the patch?

I'm quite confident the patch is correct, so I think you can merge it, but tomorrow I'll give it some crash testing together with the rest of your patch set in KVM to be sure.

Honza
On 08/25/2010 10:02 PM, Jan Kara wrote:
> On Wed 25-08-10 17:57:41, Tejun Heo wrote:
>> On 08/25/2010 06:00 PM, Christoph Hellwig wrote:
>>> On Wed, Aug 25, 2010 at 05:58:42PM +0200, Christoph Hellwig wrote:
>>>> On Wed, Aug 25, 2010 at 05:47:43PM +0200, Tejun Heo wrote:
>>>>> From: Christoph Hellwig <hch@infradead.org>
>>>>>
>>>>> ext4 already uses synchronous discards, no need to add I/O barriers.
>>>>
>>>> This needs the patch that Jan sent in reply to my initial version merged
>>>> into it.
>>>
>>> Actually the jbd2 patch needs it merged, but the point still stands.
>>
>> Yeah, wasn't sure about that one. Has anyone tested it? I'll be
>> happy to merge it but I have no idea whether it's correct or not and
>> Jan didn't seem to have tested it so... Jan, shall I merge the patch?
>
> I'm quite confident the patch is correct so you can merge it I think but
> tomorrow I'll give it some crash testing together with the rest of your
> patch set in KVM to be sure.

The patch is now included in the series, before the jbd2 conversion patch.

Thanks.
On Thu 26-08-10 10:25:47, Tejun Heo wrote:
> On 08/25/2010 10:02 PM, Jan Kara wrote:
> > On Wed 25-08-10 17:57:41, Tejun Heo wrote:
> >> On 08/25/2010 06:00 PM, Christoph Hellwig wrote:
> >>> On Wed, Aug 25, 2010 at 05:58:42PM +0200, Christoph Hellwig wrote:
> >>>> On Wed, Aug 25, 2010 at 05:47:43PM +0200, Tejun Heo wrote:
> >>>>> From: Christoph Hellwig <hch@infradead.org>
> >>>>>
> >>>>> ext4 already uses synchronous discards, no need to add I/O barriers.
> >>>>
> >>>> This needs the patch that Jan sent in reply to my initial version merged
> >>>> into it.
> >>>
> >>> Actually the jbd2 patch needs it merged, but the point still stands.
> >>
> >> Yeah, wasn't sure about that one. Has anyone tested it? I'll be
> >> happy to merge it but I have no idea whether it's correct or not and
> >> Jan didn't seem to have tested it so... Jan, shall I merge the patch?
> > I'm quite confident the patch is correct so you can merge it I think but
> > tomorrow I'll give it some crash testing together with the rest of your
> > patch set in KVM to be sure.
>
> Patch included in the series before jbd2 conversion patch.

An update: I've set up ext4 barrier testing in KVM - run fsstress, kill KVM at some random moment, and check that the filesystem is consistent (KVM is run in cache=writeback mode to simulate a disk cache). About 70 runs without journal_async_commit passed fine; now I'm running some tests with the option enabled, and the first few rounds have passed OK as well.

Honza
Jan Kara <jack@suse.cz> writes:

> An update: I've set up an ext4 barrier testing in KVM - run fsstress,
> kill KVM at some random moment and check that the filesystem is consistent
> (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs

But doesn't your "disk cache" survive the "power cycle" of your guest? It's tough to tell exactly what you're testing with so few details; care to elaborate?

Cheers,
Jeff
On Mon 30-08-10 15:56:43, Jeff Moyer wrote:
> Jan Kara <jack@suse.cz> writes:
>
> > An update: I've set up an ext4 barrier testing in KVM - run fsstress,
> > kill KVM at some random moment and check that the filesystem is consistent
> > (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs
>
> But doesn't your "disk cache" survive the "power cycle" of your guest?

Yes, you're right. Thinking about it now, the test setup was wrong because it didn't refuse writes to the VM's data partition after the moment I killed KVM. Thanks for catching this. I will probably have to use fault injection on the host to disallow writing to the device at a certain moment. Or does somebody have a better option?

My setup is that I have a dedicated partition / drive for a filesystem which is written to from a guest kernel running under KVM. I have set it up using the virtio driver with cache=writeback so that the host caches the writes in a similar way a disk caches them. At some point I just kill the qemu-kvm process, and at that point I'd like to also throw away the data cached by the host...

Honza
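As a rough sketch, the crash-test cycle Jan describes might look like the script below. Everything here is a hypothetical illustration, not Jan's actual harness: the device name, memory size, and timings are placeholders, and by default the script runs in dry-run mode and only prints the steps it would take.

```shell
#!/bin/bash
# Sketch of the KVM crash-test cycle (hypothetical; adjust TEST_DEV).
# DRY_RUN=1 (the default) prints each step instead of executing it.
DRY_RUN=${DRY_RUN:-1}
TEST_DEV=${TEST_DEV:-/dev/sdb}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

crash_cycle() {
    # 1. Boot the guest with the test disk attached via virtio and
    #    cache=writeback, so the host page cache plays the role of a
    #    volatile disk write cache. (In a real run, background the
    #    guest and start fsstress inside it.)
    run qemu-kvm -m 1024 -drive "file=$TEST_DEV,if=virtio,cache=writeback"
    # 2. Let the workload run for a random interval.
    run sleep $((RANDOM % 60 + 10))
    # 3. Kill the hypervisor at that random moment.
    run pkill -9 qemu-kvm
    # 4. Check filesystem consistency on the host (read-only check).
    run fsck.ext4 -f -n "$TEST_DEV"
}

crash_cycle
```

As the thread goes on to point out, killing qemu-kvm alone is not enough: the host page cache survives, so an extra step is needed to discard it.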
On 08/30/2010 05:20 PM, Jan Kara wrote:
> On Mon 30-08-10 15:56:43, Jeff Moyer wrote:
>> Jan Kara <jack@suse.cz> writes:
>>
>>> An update: I've set up an ext4 barrier testing in KVM - run fsstress,
>>> kill KVM at some random moment and check that the filesystem is consistent
>>> (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs
>>
>> But doesn't your "disk cache" survive the "power cycle" of your guest?
> Yes, you're right. Thinking about it now the test setup was wrong because
> it didn't refuse writes to the VM's data partition after the moment I
> killed KVM. Thanks for catching this. I will probably have to use the fault
> injection on the host to disallow writing the device at a certain moment.
> Or does somebody have a better option?
>
> Honza

Hi Jan,

Not sure if this is relevant, but part of our testing has used an external e-SATA enclosure that you can stick pretty much any S-ATA disk into. It is important to drop power to the external disk - do not just pull the S-ATA cable, because the firmware of some/many disks will destage the write cache if it still has power and sees link loss :). Once you turn the drive back on, the test is: can you mount without error, then unmount and run fsck -f to verify there is no metadata corruption?

Ric
Jan Kara, on 08/31/2010 12:20 AM wrote:
> On Mon 30-08-10 15:56:43, Jeff Moyer wrote:
>> Jan Kara <jack@suse.cz> writes:
>>
>>> An update: I've set up an ext4 barrier testing in KVM - run fsstress,
>>> kill KVM at some random moment and check that the filesystem is consistent
>>> (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs
>>
>> But doesn't your "disk cache" survive the "power cycle" of your guest?
> Yes, you're right. Thinking about it now the test setup was wrong because
> it didn't refuse writes to the VM's data partition after the moment I
> killed KVM. Thanks for catching this. I will probably have to use the fault
> injection on the host to disallow writing the device at a certain moment.
> Or does somebody have a better option?

Have you considered setting up a second box as an iSCSI target (e.g. with iSCSI-SCST)? With it, killing the connectivity is just a matter of a single iptables command - plus you get a lot more options.

Vlad
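A minimal sketch of the iSCSI-based approach suggested above: cut all outgoing traffic to the target at a random moment, which makes in-flight writes disappear much like a powered-off disk. The target address here is a hypothetical placeholder, and the function only prints the iptables command rather than applying it (doing so for real requires root on the initiator).

```shell
#!/bin/bash
# Sketch of severing the iSCSI connection with a single iptables rule.
# TARGET_IP is a placeholder for the second box exporting the test disk.
cut_target() {
    target_ip=$1
    # DROP outgoing packets to the target: the initiator can no longer
    # complete in-flight writes. Printed here instead of executed.
    echo "iptables -A OUTPUT -d $target_ip -j DROP"
}

cut_target 192.0.2.10
```

As Boaz notes later in the thread, this still leaves the question of the target-side cache open; the rule only simulates losing the link, not the target losing power.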
Jan Kara <jack@suse.cz> writes:

> On Mon 30-08-10 15:56:43, Jeff Moyer wrote:
>> Jan Kara <jack@suse.cz> writes:
>>
>> > An update: I've set up an ext4 barrier testing in KVM - run fsstress,
>> > kill KVM at some random moment and check that the filesystem is consistent
>> > (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs
>>
>> But doesn't your "disk cache" survive the "power cycle" of your guest?
> Yes, you're right. Thinking about it now the test setup was wrong because
> it didn't refuse writes to the VM's data partition after the moment I
> killed KVM. Thanks for catching this. I will probably have to use the fault
> injection on the host to disallow writing the device at a certain moment.
> Or does somebody have a better option?
> My setup is that I have a dedicated partition / drive for a filesystem
> which is written to from a guest kernel running under KVM. I have set it up
> using virtio driver with cache=writeback so that the host caches the writes
> in a similar way disk caches them. At some point I just kill the qemu-kvm
> process and at that point I'd like to also throw away data cached by the
> host...

I've used iLO to power off the system under test remotely. I have a tool to automate the testing. It works as follows:

There's a client and a server. The server listens on an ip/port for connections. A client connects, tells the server its configuration (including what disk it's writing to, what block size it's using, and the total amount of I/O to be done), and then starts doing I/O. The I/O is done using the AIO API, and the data written includes a block number, a generation number, fill, and a CRC. As each completion comes in, the completed sectors are communicated to the server program. Upon completion of an entire series of writes (writing the entire data set once), the server waits some amount of time and then power cycles the client. The client comes back up and is run in check mode to verify that all of the data it reported as completed to the server is actually intact.

I recently updated the code to run against a file on a file system (previously it would only work on a block device). It makes use of stonith modules to do the power cycling. It works, but it isn't the most elegant bit of engineering I've ever done. ;-)

Anyway, that code is available here: http://people.redhat.com/jmoyer/dainto-0.99.4.tar.bz2

Cheers,
Jeff
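Jeff's on-disk record format isn't published in this thread, so the following is a greatly simplified, hypothetical illustration of the write-and-verify idea behind such a tool - each record carries its block number, a generation counter, and a checksum of its payload, so a checker run after the power cycle can prove that everything reported complete is intact. It is not the actual dainto format.

```shell
#!/bin/bash
# Toy write-and-verify sketch (hypothetical record format):
# each line is "<block> <generation> <crc> <payload>".
img=$(mktemp)

write_block() {
    blk=$1; gen=$2; payload="fill-$blk-$gen"
    crc=$(printf '%s' "$payload" | cksum | cut -d' ' -f1)
    echo "$blk $gen $crc $payload" >> "$img"
}

check_blocks() {
    ok=1
    while read -r blk gen crc payload; do
        want=$(printf '%s' "$payload" | cksum | cut -d' ' -f1)
        [ "$crc" = "$want" ] || { ok=0; echo "block $blk corrupt"; }
    done < "$img"
    [ "$ok" = 1 ] && echo "all blocks intact"
}

for b in 0 1 2 3; do write_block "$b" 1; done
check_blocks
rm -f "$img"
```

A real harness would of course write fixed-size records with O_DIRECT/AIO to the device under test and report completions to a second machine; this only shows the self-describing-record trick.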
On Tue 31-08-10 00:39:41, Vladislav Bolkhovitin wrote:
> Jan Kara, on 08/31/2010 12:20 AM wrote:
>> On Mon 30-08-10 15:56:43, Jeff Moyer wrote:
>>> Jan Kara <jack@suse.cz> writes:
>>>
>>>> An update: I've set up an ext4 barrier testing in KVM - run fsstress,
>>>> kill KVM at some random moment and check that the filesystem is consistent
>>>> (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs
>>>
>>> But doesn't your "disk cache" survive the "power cycle" of your guest?
>> Yes, you're right. Thinking about it now the test setup was wrong because
>> it didn't refuse writes to the VM's data partition after the moment I
>> killed KVM. Thanks for catching this. I will probably have to use the fault
>> injection on the host to disallow writing the device at a certain moment.
>> Or does somebody have a better option?
>
> Have you considered to setup a second box as an iSCSI target (e.g.
> with iSCSI-SCST)? With it killing the connectivity is just a matter
> of a single iptables command + a lot more options.

Hmm, this might be an interesting option. I'll try that. Thanks for the suggestion.

Honza
On 08/30/2010 10:20 PM, Jan Kara wrote:
> My setup is that I have a dedicated partition / drive for a filesystem
> which is written to from a guest kernel running under KVM. I have set it up
> using virtio driver with cache=writeback so that the host caches the writes
> in a similar way disk caches them. At some point I just kill the qemu-kvm
> process and at that point I'd like to also throw away data cached by the
> host...

$ echo 1 > /sys/block/sdX/device/delete
$ echo - - - > /sys/class/scsi_host/hostX/scan

should do the trick.

Thanks.
On 08/31/2010 12:02 AM, Jan Kara wrote:
> On Tue 31-08-10 00:39:41, Vladislav Bolkhovitin wrote:
>> Jan Kara, on 08/31/2010 12:20 AM wrote:
>>> On Mon 30-08-10 15:56:43, Jeff Moyer wrote:
>>>> Jan Kara <jack@suse.cz> writes:
>>>>
>>>>> An update: I've set up an ext4 barrier testing in KVM - run fsstress,
>>>>> kill KVM at some random moment and check that the filesystem is consistent
>>>>> (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs
>>>>
>>>> But doesn't your "disk cache" survive the "power cycle" of your guest?
>>> Yes, you're right. Thinking about it now the test setup was wrong because
>>> it didn't refuse writes to the VM's data partition after the moment I
>>> killed KVM. Thanks for catching this. I will probably have to use the fault
>>> injection on the host to disallow writing the device at a certain moment.
>>> Or does somebody have a better option?
>>
>> Have you considered to setup a second box as an iSCSI target (e.g.
>> with iSCSI-SCST)? With it killing the connectivity is just a matter
>> of a single iptables command + a lot more options.

Still the same problem, no? The data is still cached on the backing-store device; how do you trash the cached data?

> Hmm, this might be an interesting option. Will try that. Thanks for
> suggestion.
>
> Honza

With stgt it's very simple as well - it's a user-mode application. All on the same machine:

- run the stgt application
- log in + mount a filesystem
- run the test
- kill -9 stgt mid-flight

But how do you throw away the data in the backing store's cache?

Boaz
On 08/31/2010 11:11 AM, Tejun Heo wrote:
> On 08/30/2010 10:20 PM, Jan Kara wrote:
>> My setup is that I have a dedicated partition / drive for a filesystem
>> which is written to from a guest kernel running under KVM. I have set it up
>> using virtio driver with cache=writeback so that the host caches the writes
>> in a similar way disk caches them. At some point I just kill the qemu-kvm
>> process and at that point I'd like to also throw away data cached by the
>> host...
>
> $ echo 1 > /sys/block/sdX/device/delete
> $ echo - - - > /sys/class/scsi_host/hostX/scan
>
> should do the trick.

I don't know all the specifics of the virtio driver and the KVM backend, but isn't the KVM target I/O eventually directed to a local file or device? If so, the SCSI device has disappeared, but the bulk of the data is in the host cache at the backing store (file or bdev). Once all files are closed, the data is synced to disk.

Isn't it the same as Ric's problem of disconnecting the SATA cable but not dropping power to the drive? The main part of the cache is still intact.

Thanks
Boaz
Hello,

On 08/31/2010 12:07 PM, Boaz Harrosh wrote:
> I don't know all the specifics of the virtio driver and the KVM backend but
> don't the KVM target io is eventually directed to a local file or device?
> If so the scsi device has disappeard but the bulk of the data is in host cache
> at the backstore (file or bdev). Once all files are closed the data is synced
> to disk.
>
> Is it not the same as Ric's problem of disconnecting the sata cable but
> not dropping power to the drive. The main of the cache is still intact.

There are two layers of caching there:

drive cache - host page cache - guest

When the guest issues a FLUSH, qemu translates it into fdatasync, which flushes the host page cache, followed by a FLUSH to the drive, which flushes the drive cache to the media. If you delete the host disk device, it will be detached without the host page cache being flushed. So, although it's not complete, it will lose a good part of the cache. With the writeout timeout increased and/or laptop mode enabled, it will probably lose most of the cache.

Thanks.
On 08/31/2010 01:13 PM, Tejun Heo wrote:
> Hello,
>
> On 08/31/2010 12:07 PM, Boaz Harrosh wrote:
>> I don't know all the specifics of the virtio driver and the KVM backend but
>> don't the KVM target io is eventually directed to a local file or device?
>> If so the scsi device has disappeard but the bulk of the data is in host cache
>> at the backstore (file or bdev). Once all files are closed the data is synced
>> to disk.
>>
>> Is it not the same as Ric's problem of disconnecting the sata cable but
>> not dropping power to the drive. The main of the cache is still intact.
>
> There are two layers of caching there.
>
> drive cache - host page cache - guest
>
> When guest issues FLUSH, qemu will translate it into fdatasync which
> will flush the host page cache followed by FLUSH to the drive which
> will flush the drive cache to the media. If you delete the host disk
> device, it will be detached w/o host page cache flushed. So, although
> it's not complete, it will lose good part of cache. With out write
> out timeout increased and/or with laptop mode enabled, it will
> probably lose most of cache.

Ah, OK, you meant that device. So if you have a dedicated physical device for the backing store, that would be a very nice scriptable way.

Thanks, that's a much better automated test than pulling drives out of sockets.

Boaz
Boaz Harrosh, on 08/31/2010 01:55 PM wrote:
>>>>>> An update: I've set up an ext4 barrier testing in KVM - run fsstress,
>>>>>> kill KVM at some random moment and check that the filesystem is consistent
>>>>>> (kvm is run in cache=writeback mode to simulate disk cache). About 70 runs
>>>>>
>>>>> But doesn't your "disk cache" survive the "power cycle" of your guest?
>>>> Yes, you're right. Thinking about it now the test setup was wrong because
>>>> it didn't refuse writes to the VM's data partition after the moment I
>>>> killed KVM. Thanks for catching this. I will probably have to use the fault
>>>> injection on the host to disallow writing the device at a certain moment.
>>>> Or does somebody have a better option?
>>>
>>> Have you considered to setup a second box as an iSCSI target (e.g.
>>> with iSCSI-SCST)? With it killing the connectivity is just a matter
>>> of a single iptables command + a lot more options.
>
> Still same problem no? the data is still cached on the backing store device
> how do you trash the cached data?

If you need to kill the device's cache, you can crash/panic/power off the target. That is also easily scriptable.

Vlad
On Tue 31-08-10 10:11:34, Tejun Heo wrote:
> On 08/30/2010 10:20 PM, Jan Kara wrote:
> > My setup is that I have a dedicated partition / drive for a filesystem
> > which is written to from a guest kernel running under KVM. I have set it up
> > using virtio driver with cache=writeback so that the host caches the writes
> > in a similar way disk caches them. At some point I just kill the qemu-kvm
> > process and at that point I'd like to also throw away data cached by the
> > host...
>
> $ echo 1 > /sys/block/sdX/device/delete
> $ echo - - - > /sys/class/scsi_host/hostX/scan
>
> should do the trick.

I've tested that when mounting with the barrier=0 option inside KVM, this indeed destroys the filesystem rather badly. With the barrier option, ext4 has already survived several crash cycles while running fsstress with the journal_async_commit option. So the patch seems to work as expected.

Honza
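Putting the pieces of this thread together, the full crash cycle - kill the guest, drop the host's SCSI device so the host page cache for it is discarded, rescan, then check - might be scripted roughly as below. The device and host names (sdb, host0) are hypothetical placeholders, and the default dry-run mode only prints each step.

```shell
#!/bin/bash
# Sketch of the verified test cycle (hypothetical device names).
# DRY_RUN=1 (default) prints the steps instead of running them.
DRY_RUN=${DRY_RUN:-1}

step() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "step: $*"
    else
        eval "$*"
    fi
}

step "pkill -9 qemu-kvm"                             # crash the guest mid-workload
step "echo 1 > /sys/block/sdb/device/delete"         # detach device, dropping host-cached data
step "echo - - - > /sys/class/scsi_host/host0/scan"  # rescan to bring the device back
step "fsck.ext4 -f -n /dev/sdb"                      # verify filesystem consistency
```

Deleting the SCSI device before rescanning is what makes the host behave like a disk that lost power with dirty cache, which is the failure mode the barrier/FLUSH path is supposed to survive.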
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index df44b34..a22bfef 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2567,7 +2567,7 @@ static inline void ext4_issue_discard(struct super_block *sb,
 	trace_ext4_discard_blocks(sb,
 			(unsigned long long) discard_block, count);
 	ret = sb_issue_discard(sb, discard_block, count, GFP_NOFS,
-			       BLKDEV_IFL_WAIT | BLKDEV_IFL_BARRIER);
+			       BLKDEV_IFL_WAIT);
 	if (ret == EOPNOTSUPP) {
 		ext4_warning(sb, "discard not supported, disabling");
 		clear_opt(EXT4_SB(sb)->s_mount_opt, DISCARD);