Message ID | 20130403102204.GA15383@gmail.com |
---|---|
State | Superseded, archived |
Headers | show |
On Wed, Apr 03, 2013 at 06:22:04PM +0800, Zheng Liu wrote: > Subject: [PATCH] ext4: fix a big-endian bug when an extent is zeroed out > > From: Zheng Liu <wenqing.lz@taobao.com> > > When an extent was zeroed out, we forgot to do convert from cpu to le16. > It could make us hit a BUG_ON when we try to write dirty pages out. So > fix it. > > Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Thanks for finding this! I think we should push this to Linus right away, and not wait for the next merge window. The bug has been here for a long time, but it was unmasked by the fact that we unbroke extent zeroing in 3.9-rcX. I have two big questions. (1) Shouldn't Eric Whitney have picked this up with his ARM pandaboard testing, since IIRC it's big-endian as well? If not, is there something we can do to improve our testing wrt to big-endian systems? And (2) does it make sense to have an inline function ext4_ext_set_len(len)? It might save some lines of code, but more importantly, it might make it less likely that we will overlook this sort of bug in the future. - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 3 Apr 2013 08:20:58 -0400, Theodore Ts'o <tytso@mit.edu> wrote: > On Wed, Apr 03, 2013 at 06:22:04PM +0800, Zheng Liu wrote: > > Subject: [PATCH] ext4: fix a big-endian bug when an extent is zeroed out > > > > From: Zheng Liu <wenqing.lz@taobao.com> > > > > When an extent was zeroed out, we forgot to do convert from cpu to le16. > > It could make us hit a BUG_ON when we try to write dirty pages out. So > > fix it. > > > > Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> > > Thanks for finding this! I think we should push this to Linus right > away, and not wait for the next merge window. The bug has been here > for a long time, but it was unmasked by the fact that we unbroke > extent zeroing in 3.9-rcX. IMHO you have to pick this one http://patchwork.ozlabs.org/patch/233397 because it also fix ext_to_indirect_migration and inode's csum > > I have two big questions. (1) Shouldn't Eric Whitney have picked this > up with his ARM pandaboard testing, since IIRC it's big-endian as > well? If not, is there something we can do to improve our testing wrt > to big-endian systems? > > And (2) does it make sense to have an inline function > ext4_ext_set_len(len)? It might save some lines of code, but more > importantly, it might make it less likely that we will overlook this > sort of bug in the future. Let's live it for now, later I'll cleanup/optimize this 'zero_ex' at least from from ext4_split_extent_at because at the end we are shure that 'ex' is fully mapped and initialized > > - Ted > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* Theodore Ts'o <tytso@mit.edu>: > On Wed, Apr 03, 2013 at 06:22:04PM +0800, Zheng Liu wrote: > > Subject: [PATCH] ext4: fix a big-endian bug when an extent is zeroed out > > > > From: Zheng Liu <wenqing.lz@taobao.com> > > > > When an extent was zeroed out, we forgot to do convert from cpu to le16. > > It could make us hit a BUG_ON when we try to write dirty pages out. So > > fix it. > > > > Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> > > Thanks for finding this! I think we should push this to Linus right > away, and not wait for the next merge window. The bug has been here > for a long time, but it was unmasked by the fact that we unbroke > extent zeroing in 3.9-rcX. > > I have two big questions. (1) Shouldn't Eric Whitney have picked this > up with his ARM pandaboard testing, since IIRC it's big-endian as > well? If not, is there something we can do to improve our testing wrt > to big-endian systems? The TI OMAP4 processor on my Pandaboard test system is little endian. Eric > > And (2) does it make sense to have an inline function > ext4_ext_set_len(len)? It might save some lines of code, but more > importantly, it might make it less likely that we will overlook this > sort of bug in the future. > > - Ted > -- > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Apr 03, 2013 at 10:34:06AM -0400, Eric Whitney wrote: > > The TI OMAP4 processor on my Pandaboard test system is little endian. Ah... so basically, we need to find a test platform which allows us to boot arbitrary kernels and allows us to have root access (which means it's unlikely we'll be able to do this via remote access) and which doesn't have exotic power requirements (which as far as I know rules out pSeries and zSeries systems....) It would also be nice if we could run tests in finite time, which probably rules out the Hercules emulator (it runs at one-tenth zSeries processor speeds, which doesn't win speed competitions by default, and I suspect their storage speeds are even worse). Anyone else have any suggestions? Or anyone willing to help us run ext4 regression tests on the ext4 dev tree, so we can find these problems before we merge into mainline? Thanks, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Le 04/03/13 16:41, Theodore Ts'o a écrit : > On Wed, Apr 03, 2013 at 10:34:06AM -0400, Eric Whitney wrote: >> >> The TI OMAP4 processor on my Pandaboard test system is little endian. > > Ah... so basically, we need to find a test platform which allows us to > boot arbitrary kernels and allows us to have root access (which means > it's unlikely we'll be able to do this via remote access) and which > doesn't have exotic power requirements (which as far as I know rules > out pSeries and zSeries systems....) > > It would also be nice if we could run tests in finite time, which > probably rules out the Hercules emulator (it runs at one-tenth zSeries > processor speeds, which doesn't win speed competitions by default, and > I suspect their storage speeds are even worse). > > Anyone else have any suggestions? Or anyone willing to help us run > ext4 regression tests on the ext4 dev tree, so we can find these > problems before we merge into mainline? Qemu emulates various mainline PowerPC, MIPS and SPARC big-endian systems pretty efficiently and it should not be too hard neither to script nor to get a recent kernel up and running on these platforms. My 2 cents. -- Florian -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello Ted, ----- Original Message ----- > From: "Theodore Ts'o" <tytso@mit.edu> > To: "Eric Whitney" <enwlinux@gmail.com> > Cc: "Dmitry Monakhov" <dmonakhov@openvz.org>, "Christian Kujau" <lists@nerdbynature.de>, "CAI Qian" > <caiqian@redhat.com>, "LKML" <linux-kernel@vger.kernel.org>, "linux-s390" <linux-s390@vger.kernel.org>, "Steve Best" > <sbest@redhat.com>, linux-ext4@vger.kernel.org > Sent: Wednesday, April 3, 2013 10:41:14 PM > Subject: Re: [PATCH] ext4: fix a big-endian bug when an extent is zeroed out > > On Wed, Apr 03, 2013 at 10:34:06AM -0400, Eric Whitney wrote: > > > > The TI OMAP4 processor on my Pandaboard test system is little endian. > > Ah... so basically, we need to find a test platform which allows us to > boot arbitrary kernels and allows us to have root access (which means > it's unlikely we'll be able to do this via remote access) and which > doesn't have exotic power requirements (which as far as I know rules > out pSeries and zSeries systems....) > > It would also be nice if we could run tests in finite time, which > probably rules out the Hercules emulator (it runs at one-tenth zSeries > processor speeds, which doesn't win speed competitions by default, and > I suspect their storage speeds are even worse). > > Anyone else have any suggestions? Or anyone willing to help us run > ext4 regression tests on the ext4 dev tree, so we can find these > problems before we merge into mainline? I can help run xfstests for ext4 dev tree on x64, Power7, Z10 and KVM platforms with back-storage like SAN/multipath, iSCSI and FCoE. I plan to run this weekly and setup a wiki page to update the testing status by every Friday. CAI Qian > > Thanks, > > - Ted > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Apr 08, 2013 at 11:05:11PM -0400, CAI Qian wrote: > I can help run xfstests for ext4 dev tree on x64, Power7, Z10 and > KVM platforms with back-storage like SAN/multipath, iSCSI and FCoE. > I plan to run this weekly and setup a wiki page to update the testing > status by every Friday. Hi CAI, Sorry for not getting back to you sooner; I was at Collaboration Summit and LSF/MM last week. It would be great if you could help run xfstests on the ext4 dev tree on various platforms. We don't have any coverage on Power7 or s390/Z10 at the moment, so that would be especially welcome. Coverage on alternate storage backends can be interesting in finding timing problems so they would be valuable as well. If you have any Itanium platforms, that would be great too, since we don't have that today. The various ext4 configurations which I test can be found in the kvm-autorun/conf directory in my xfstests-bld git repository: git://git.kernel.org/pub/scm/fs/ext2/xfstests-bld.git (This repository is a convenient setup to do build xfstests in a hermetic environment, and convenience scripts to run xfstests under kvm, and scripts on the host OS kick off the kvm test run and parse the test output afterwards.) Thanks for offering to test the dev branch! - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Ted, ----- Original Message ----- > From: "Theodore Ts'o" <tytso@mit.edu> > To: "CAI Qian" <caiqian@redhat.com> > Cc: "Eric Whitney" <enwlinux@gmail.com>, "Dmitry Monakhov" <dmonakhov@openvz.org>, "Christian Kujau" > nerdbynature.de>, "LKML" <linux-kernel@vger.kernel.org>, "linux-s390" <linux-s390@vger.kernel.org>, "Steve > Best" <sbest@redhat.com>, linux-ext4@vger.kernel.org > Sent: Saturday, April 20, 2013 11:19:45 PM > Subject: Re: [PATCH] ext4: fix a big-endian bug when an extent is zeroed out > > On Mon, Apr 08, 2013 at 11:05:11PM -0400, CAI Qian wrote: > > I can help run xfstests for ext4 dev tree on x64, Power7, Z10 and > > KVM platforms with back-storage like SAN/multipath, iSCSI and FCoE. > > I plan to run this weekly and setup a wiki page to update the testing > > status by every Friday. > > Hi CAI, > > Sorry for not getting back to you sooner; I was at Collaboration > Summit and LSF/MM last week. > > It would be great if you could help run xfstests on the ext4 dev tree > on various platforms. We don't have any coverage on Power7 or > s390/Z10 at the moment, so that would be especially welcome. Coverage > on alternate storage backends can be interesting in finding timing > problems so they would be valuable as well. > > If you have any Itanium platforms, that would be great too, since we > don't have that today. Unfortunately, to get those ia64 up and running with the upstream kernel required some significant efforts. I'd leave that for now until it is something very important. > > The various ext4 configurations which I test can be found in the > kvm-autorun/conf directory in my xfstests-bld git repository: > > git://git.kernel.org/pub/scm/fs/ext2/xfstests-bld.git > > (This repository is a convenient setup to do build xfstests in a > hermetic environment, and convenience scripts to run xfstests under > kvm, and scripts on the host OS kick off the kvm test run and parse > the test output afterwards.) > > Thanks for offering to test the dev branch! OK, will check with that. BTW, git.kernel.org is kind of broken for me very often those days, as almost all my tests got this, + git clone http://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git -b dev Cloning into 'ext4'... fatal: The remote end hung up unexpectedly fatal: recursion detected in die handler Therefore, it is going to take a while to re-try later as the test systems here always do testing from a clean environment, i.e., re-install OS, and then re-clone the tree etc. :\ CAI Qian > > - Ted > -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 20/04/13 17:19, Theodore Ts'o wrote: > On Mon, Apr 08, 2013 at 11:05:11PM -0400, CAI Qian wrote: >> I can help run xfstests for ext4 dev tree on x64, Power7, Z10 and >> KVM platforms with back-storage like SAN/multipath, iSCSI and FCoE. >> I plan to run this weekly and setup a wiki page to update the testing >> status by every Friday. > > Hi CAI, > > Sorry for not getting back to you sooner; I was at Collaboration > Summit and LSF/MM last week. > > It would be great if you could help run xfstests on the ext4 dev tree > on various platforms. We don't have any coverage on Power7 or > s390/Z10 at the moment, so that would be especially welcome. Coverage > on alternate storage backends can be interesting in finding timing > problems so they would be valuable as well. It is probably easier to get access to a Power system (gcc compile farm?). Having said that, there is a service available to get access to a System z for open source development: http://www-03.ibm.com/systems/z/os/linux/support/community.html When we did the valgrind port we used that a lot to gave access to non-IBMers. Dont know if it is ok to install a private kernel, though. Will double check. Christian -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index e4a6844..2352467 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -2999,20 +2999,23 @@ static int ext4_split_extent_at(handle_t *handle, if (split_flag & EXT4_EXT_DATA_VALID1) { err = ext4_ext_zeroout(inode, ex2); zero_ex.ee_block = ex2->ee_block; - zero_ex.ee_len = ext4_ext_get_actual_len(ex2); + zero_ex.ee_len = cpu_to_le16( + ext4_ext_get_actual_len(ex2)); ext4_ext_store_pblock(&zero_ex, ext4_ext_pblock(ex2)); } else { err = ext4_ext_zeroout(inode, ex); zero_ex.ee_block = ex->ee_block; - zero_ex.ee_len = ext4_ext_get_actual_len(ex); + zero_ex.ee_len = cpu_to_le16( + ext4_ext_get_actual_len(ex)); ext4_ext_store_pblock(&zero_ex, ext4_ext_pblock(ex)); } } else { err = ext4_ext_zeroout(inode, &orig_ex); zero_ex.ee_block = orig_ex.ee_block; - zero_ex.ee_len = ext4_ext_get_actual_len(&orig_ex); + zero_ex.ee_len = cpu_to_le16( + ext4_ext_get_actual_len(&orig_ex)); ext4_ext_store_pblock(&zero_ex, ext4_ext_pblock(&orig_ex)); } @@ -3272,7 +3275,7 @@ static int ext4_ext_convert_to_initialized(handle_t *handle, if (err) goto out; zero_ex.ee_block = ex->ee_block; - zero_ex.ee_len = ext4_ext_get_actual_len(ex); + zero_ex.ee_len = cpu_to_le16(ext4_ext_get_actual_len(ex)); ext4_ext_store_pblock(&zero_ex, ext4_ext_pblock(ex)); err = ext4_ext_get_access(handle, inode, path + depth);