Patchwork [06/31] e2p: Fix f[gs]etflags argument size mismatch

login
register
mail settings
Submitter Darrick J. Wong
Date Oct. 1, 2013, 1:27 a.m.
Message ID <20131001012721.28415.97544.stgit@birch.djwong.org>
Download mbox | patch
Permalink /patch/279285/
State Rejected
Headers show

Comments

Darrick J. Wong - Oct. 1, 2013, 1:27 a.m.
The EXT2_IOC_[GS]ETFLAGS ioctls take longs as arguments, however this code only
reserves enough storage for an int.  The kernel drivers (so far) don't transfer
more than an int but FUSE sees the long and assumes that it's ok to write the
full size of the long, which crashes if sizeof(long) > sizeof(int).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 lib/e2p/fgetflags.c |    3 ++-
 lib/e2p/fsetflags.c |    3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Oct. 7, 2013, 1:33 p.m.
On Mon, Sep 30, 2013 at 06:27:21PM -0700, Darrick J. Wong wrote:

> The EXT2_IOC_[GS]ETFLAGS ioctls take longs as arguments, however
> this code only reserves enough storage for an int.  The kernel
> drivers (so far) don't transfer more than an int but FUSE sees the
> long and assumes that it's ok to write the full size of the long,
> which crashes if sizeof(long) > sizeof(int).

All of the kernel code I was able to audit is treating the
EXT2_IOC_[SG]ETFLAGS ioctls as taking an int, not a long.  So the
defacto definition of [SG]ETFLAGS is that that they take ints, not
longs.  If we make this change which you are proposing, it will cause
problems on big-endian systems where sizeof(long) > sizeof(int) ---
for example, as would be the case on the ppc64 architecture.

I'm not sure what the FUSE problem is?  Can you say more?  Is there
some other way we can work around the problem you are trying to solve?

     	       	      	   	      	      - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong - Oct. 7, 2013, 8:40 p.m.
On Mon, Oct 07, 2013 at 09:33:04AM -0400, Theodore Ts'o wrote:
> On Mon, Sep 30, 2013 at 06:27:21PM -0700, Darrick J. Wong wrote:
> 
> > The EXT2_IOC_[GS]ETFLAGS ioctls take longs as arguments, however
> > this code only reserves enough storage for an int.  The kernel
> > drivers (so far) don't transfer more than an int but FUSE sees the
> > long and assumes that it's ok to write the full size of the long,
> > which crashes if sizeof(long) > sizeof(int).
> 
> All of the kernel code I was able to audit is treating the
> EXT2_IOC_[SG]ETFLAGS ioctls as taking an int, not a long.  So the
> defacto definition of [SG]ETFLAGS is that that they take ints, not
> longs.  If we make this change which you are proposing, it will cause
> problems on big-endian systems where sizeof(long) > sizeof(int) ---
> for example, as would be the case on the ppc64 architecture.
> 
> I'm not sure what the FUSE problem is?  Can you say more?  Is there
> some other way we can work around the problem you are trying to solve?

If I mount a FUSE fs (specifically the fuse2fs thing in the last patch) and try
to run 'chattr +i /mnt/foo', chattr segfaults.  valgrind had this[1] to offer.
It seems that FUSE's ioctl implementation depends on the 'size' argument to
_IOR (in the EXT2_IOC_[SG]ETFLAGS definition) to figure out how much data to
transfer in or out of userspace.  Unfortunately for fgetflags, it passes in a
pointer to an int, and fuse seems to smash up the stack trying to write back a
long.  Valgrind and gdb show that the lower 32-bits of name get overwritten,
which causes the segfault.

Unfortunately, this is a bit of a heisenbug; if I build with -O0 (gcc 4.6.3,
Ubuntu 12.04, libfuse 2.9.2 backport) it goes away.  The stack corruption also
seems to go away if I print the address of f, though save_errno gets
overwritten with some suspicious looking 32695 value.

I'll keep poking at what's going on here, though I'll try to come up with a
clever solution for BE machines.

--D

[1] Valgrind messsage:
==3512== Memcheck, a memory error detector
==3512== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==3512== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==3512== Command: chattr +i /mnt/foo
==3512== 
==3512== Syscall param lstat(file_name) points to unaddressable byte(s)
==3512==    at 0x53232B5: _lxstat (lxstat.c:38)
==3512==    by 0x4E34610: fsetflags (in /lib/x86_64-linux-gnu/libe2p.so.2.3)
==3512==    by 0x401350: ??? (in /usr/bin/chattr)
==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
==3512==    by 0x525E76C: (below main) (libc-start.c:226)
==3512==  Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
==3512== 
==3512== Syscall param open(filename) points to unaddressable byte(s)
==3512==    at 0x53236C0: __open_nocancel (syscall-template.S:82)
==3512==    by 0x4E3463E: fsetflags (in /lib/x86_64-linux-gnu/libe2p.so.2.3)
==3512==    by 0x401350: ??? (in /usr/bin/chattr)
==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
==3512==    by 0x525E76C: (below main) (libc-start.c:226)
==3512==  Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
==3512== 
chattr: Bad address ==3512== Invalid read of size 1
==3512==    at 0x52883B1: vfprintf (vfprintf.c:1630)
==3512==    by 0x528B1A3: buffered_vfprintf (vfprintf.c:2313)
==3512==    by 0x5285BDD: vfprintf (vfprintf.c:1316)
==3512==    by 0x53463D7: __vfprintf_chk (vfprintf_chk.c:35)
==3512==    by 0x503ABA8: ??? (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
==3512==    by 0x503AD1E: com_err (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
==3512==    by 0x401435: ??? (in /usr/bin/chattr)
==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
==3512==    by 0x525E76C: (below main) (libc-start.c:226)
==3512==  Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
==3512== 
==3512== 
==3512== Process terminating with default action of signal 11 (SIGSEGV)
==3512==  Access not within mapped region at address 0x700007FB7
==3512==    at 0x52883B1: vfprintf (vfprintf.c:1630)
==3512==    by 0x528B1A3: buffered_vfprintf (vfprintf.c:2313)
==3512==    by 0x5285BDD: vfprintf (vfprintf.c:1316)
==3512==    by 0x53463D7: __vfprintf_chk (vfprintf_chk.c:35)
==3512==    by 0x503ABA8: ??? (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
==3512==    by 0x503AD1E: com_err (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
==3512==    by 0x401435: ??? (in /usr/bin/chattr)
==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
==3512==    by 0x525E76C: (below main) (libc-start.c:226)
==3512==  If you believe this happened as a result of a stack
==3512==  overflow in your program's main thread (unlikely but
==3512==  possible), you can try to increase the size of the
==3512==  main thread stack using the --main-stacksize= flag.
==3512==  The main thread stack size used in this run was 8388608.
==3512== 
==3512== HEAP SUMMARY:
==3512==     in use at exit: 0 bytes in 0 blocks
==3512==   total heap usage: 247 allocs, 247 frees, 22,481 bytes allocated
==3512== 
==3512== All heap blocks were freed -- no leaks are possible
==3512== 
==3512== For counts of detected and suppressed errors, rerun with: -v
==3512== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 2 from 2)
Segmentation fault
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong - Oct. 7, 2013, 11:23 p.m.
Well, it's nastier than this.  For an ioctl that only writes to userspace,
libfuse allocates an internal buffer with whatever junk is in memory at the
time.  If the fuse server's ioctl handler doesn't zero the buffer before
returning, the remnants of this garbage are passed from the server back to the
kernel, which dutifully copies the garbage into the user program's memory.
Even if the result is some error code!

fuse2fs' getflags handler would write an int to the lower end of this buffer,
so garbage + flags would get sent back to userspace.  In my particular case,
gcc was applying some kind of optimization to the chattr binary that places the
flags variable next to the 'name' variable in change_attributes(), and so the
return from the ioctl (through fs/fuse/file.c) would write a 64-bit value over
flags and the lower 32 bits of name.  The next time anyone tried to use name,
it would get a garbage pointer and (probably) crash.

Yuck.  FUSE assumes an interface contract (the data size encoded in the ioctl
number) that neither userspace nor kernel actually abide.  This has gone on for
years with no problems, since both components made the same implicit assumption
about data size in the same way.  Unfortunately, userspace breaks only on FUSE,
so I don't know what to do.

It's possible to adapt both e2p and FUSE servers to allocate an unsigned long
and shift the values around accordingly, but that won't help avoid a crash with
existing e2p binaries.  Even if I comment out the .ioctl function pointer in
fuse2fs.c, running chattr will still crash--FUSE copies the buffer even for an
error result.

Long term I guess we could define a new pair of ioctls that work with pointers
to 64-bit values and deprecate the old ones.  Or perhaps there's a better
suggestion than "don't run chattr/lsattr on a FUSE"?

--D

On Mon, Oct 07, 2013 at 01:40:55PM -0700, Darrick J. Wong wrote:
> On Mon, Oct 07, 2013 at 09:33:04AM -0400, Theodore Ts'o wrote:
> > On Mon, Sep 30, 2013 at 06:27:21PM -0700, Darrick J. Wong wrote:
> > 
> > > The EXT2_IOC_[GS]ETFLAGS ioctls take longs as arguments, however
> > > this code only reserves enough storage for an int.  The kernel
> > > drivers (so far) don't transfer more than an int but FUSE sees the
> > > long and assumes that it's ok to write the full size of the long,
> > > which crashes if sizeof(long) > sizeof(int).
> > 
> > All of the kernel code I was able to audit is treating the
> > EXT2_IOC_[SG]ETFLAGS ioctls as taking an int, not a long.  So the
> > defacto definition of [SG]ETFLAGS is that that they take ints, not
> > longs.  If we make this change which you are proposing, it will cause
> > problems on big-endian systems where sizeof(long) > sizeof(int) ---
> > for example, as would be the case on the ppc64 architecture.
> > 
> > I'm not sure what the FUSE problem is?  Can you say more?  Is there
> > some other way we can work around the problem you are trying to solve?
> 
> If I mount a FUSE fs (specifically the fuse2fs thing in the last patch) and try
> to run 'chattr +i /mnt/foo', chattr segfaults.  valgrind had this[1] to offer.
> It seems that FUSE's ioctl implementation depends on the 'size' argument to
> _IOR (in the EXT2_IOC_[SG]ETFLAGS definition) to figure out how much data to
> transfer in or out of userspace.  Unfortunately for fgetflags, it passes in a
> pointer to an int, and fuse seems to smash up the stack trying to write back a
> long.  Valgrind and gdb show that the lower 32-bits of name get overwritten,
> which causes the segfault.
> 
> Unfortunately, this is a bit of a heisenbug; if I build with -O0 (gcc 4.6.3,
> Ubuntu 12.04, libfuse 2.9.2 backport) it goes away.  The stack corruption also
> seems to go away if I print the address of f, though save_errno gets
> overwritten with some suspicious looking 32695 value.
> 
> I'll keep poking at what's going on here, though I'll try to come up with a
> clever solution for BE machines.
> 
> --D
> 
> [1] Valgrind messsage:
> ==3512== Memcheck, a memory error detector
> ==3512== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
> ==3512== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
> ==3512== Command: chattr +i /mnt/foo
> ==3512== 
> ==3512== Syscall param lstat(file_name) points to unaddressable byte(s)
> ==3512==    at 0x53232B5: _lxstat (lxstat.c:38)
> ==3512==    by 0x4E34610: fsetflags (in /lib/x86_64-linux-gnu/libe2p.so.2.3)
> ==3512==    by 0x401350: ??? (in /usr/bin/chattr)
> ==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512==    by 0x525E76C: (below main) (libc-start.c:226)
> ==3512==  Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
> ==3512== 
> ==3512== Syscall param open(filename) points to unaddressable byte(s)
> ==3512==    at 0x53236C0: __open_nocancel (syscall-template.S:82)
> ==3512==    by 0x4E3463E: fsetflags (in /lib/x86_64-linux-gnu/libe2p.so.2.3)
> ==3512==    by 0x401350: ??? (in /usr/bin/chattr)
> ==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512==    by 0x525E76C: (below main) (libc-start.c:226)
> ==3512==  Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
> ==3512== 
> chattr: Bad address ==3512== Invalid read of size 1
> ==3512==    at 0x52883B1: vfprintf (vfprintf.c:1630)
> ==3512==    by 0x528B1A3: buffered_vfprintf (vfprintf.c:2313)
> ==3512==    by 0x5285BDD: vfprintf (vfprintf.c:1316)
> ==3512==    by 0x53463D7: __vfprintf_chk (vfprintf_chk.c:35)
> ==3512==    by 0x503ABA8: ??? (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512==    by 0x503AD1E: com_err (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512==    by 0x401435: ??? (in /usr/bin/chattr)
> ==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512==    by 0x525E76C: (below main) (libc-start.c:226)
> ==3512==  Address 0x700007fb7 is not stack'd, malloc'd or (recently) free'd
> ==3512== 
> ==3512== 
> ==3512== Process terminating with default action of signal 11 (SIGSEGV)
> ==3512==  Access not within mapped region at address 0x700007FB7
> ==3512==    at 0x52883B1: vfprintf (vfprintf.c:1630)
> ==3512==    by 0x528B1A3: buffered_vfprintf (vfprintf.c:2313)
> ==3512==    by 0x5285BDD: vfprintf (vfprintf.c:1316)
> ==3512==    by 0x53463D7: __vfprintf_chk (vfprintf_chk.c:35)
> ==3512==    by 0x503ABA8: ??? (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512==    by 0x503AD1E: com_err (in /lib/x86_64-linux-gnu/libcom_err.so.2.1)
> ==3512==    by 0x401435: ??? (in /usr/bin/chattr)
> ==3512==    by 0x4010A0: ??? (in /usr/bin/chattr)
> ==3512==    by 0x525E76C: (below main) (libc-start.c:226)
> ==3512==  If you believe this happened as a result of a stack
> ==3512==  overflow in your program's main thread (unlikely but
> ==3512==  possible), you can try to increase the size of the
> ==3512==  main thread stack using the --main-stacksize= flag.
> ==3512==  The main thread stack size used in this run was 8388608.
> ==3512== 
> ==3512== HEAP SUMMARY:
> ==3512==     in use at exit: 0 bytes in 0 blocks
> ==3512==   total heap usage: 247 allocs, 247 frees, 22,481 bytes allocated
> ==3512== 
> ==3512== All heap blocks were freed -- no leaks are possible
> ==3512== 
> ==3512== For counts of detected and suppressed errors, rerun with: -v
> ==3512== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 2 from 2)
> Segmentation fault
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o - Oct. 8, 2013, 12:06 a.m.
On Mon, Oct 07, 2013 at 04:23:58PM -0700, Darrick J. Wong wrote:
> Yuck.  FUSE assumes an interface contract (the data size encoded in the ioctl
> number) that neither userspace nor kernel actually abide.  This has gone on for
> years with no problems, since both components made the same implicit assumption
> about data size in the same way.  Unfortunately, userspace breaks only on FUSE,
> so I don't know what to do.

I suspect we've never noticed because traditionally, FUSE has never
been used to front-end a file system that supports chattr/lsattr ---
most of thsoe file systems are available as native Linux file systems,
so it's probably not a common use case for FUSE.

Can we make the FUSE ioctl handler in fs/fuse/ioctl.c special case
handle the EXT2_IOC_[SG]ETFLAG ioctls.  That would it be consistent
with the other file systms.

> Long term I guess we could define a new pair of ioctls that work with pointers
> to 64-bit values and deprecate the old ones.  Or perhaps there's a better
> suggestion than "don't run chattr/lsattr on a FUSE"?

Well we can create a new pair of ioctls, and then have the userspace
code try the new ioctl, and if the kernel doesn't support it, try the
new ioctl.  But then we would have to fix up all of the file systems
in Linux, and it would take a while before users have a new kernel and
a new userspace which supports the new ioctl.

If we put the hack in fs/fuse/file.c's ioctl handler, then it only
requires a kernel upgrade....

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Darrick J. Wong - Oct. 8, 2013, 12:28 a.m.
On Mon, Oct 07, 2013 at 08:06:02PM -0400, Theodore Ts'o wrote:
> On Mon, Oct 07, 2013 at 04:23:58PM -0700, Darrick J. Wong wrote:
> > Yuck.  FUSE assumes an interface contract (the data size encoded in the ioctl
> > number) that neither userspace nor kernel actually abide.  This has gone on for
> > years with no problems, since both components made the same implicit assumption
> > about data size in the same way.  Unfortunately, userspace breaks only on FUSE,
> > so I don't know what to do.
> 
> I suspect we've never noticed because traditionally, FUSE has never
> been used to front-end a file system that supports chattr/lsattr ---
> most of thsoe file systems are available as native Linux file systems,
> so it's probably not a common use case for FUSE.
> 
> Can we make the FUSE ioctl handler in fs/fuse/ioctl.c special case
> handle the EXT2_IOC_[SG]ETFLAG ioctls.  That would it be consistent
> with the other file systms.

I can try changing fuse_do_ioctl() to add in the special case.  There don't
seem to be any definitions of _IOR('f', 1, long) that aren't *_IOC_GETFLAGS
(same for _IOW_('f', 2, long) for *_IOC_SETFLAGS) so a hack to bring FUSE in
line with xfs/btrfs/ext*/hfs+/reiser3/nilfs/gfs2/logfs/f2fs/ubifs/cifs might be
doable.

If not I suppose new ioctls are a slow-moving plan B.

--D
> 
> > Long term I guess we could define a new pair of ioctls that work with pointers
> > to 64-bit values and deprecate the old ones.  Or perhaps there's a better
> > suggestion than "don't run chattr/lsattr on a FUSE"?
> 
> Well we can create a new pair of ioctls, and then have the userspace
> code try the new ioctl, and if the kernel doesn't support it, try the
> new ioctl.  But then we would have to fix up all of the file systems
> in Linux, and it would take a while before users have a new kernel and
> a new userspace which supports the new ioctl.
> 
> If we put the hack in fs/fuse/file.c's ioctl handler, then it only
> requires a kernel upgrade....
> 
> 						- Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/lib/e2p/fgetflags.c b/lib/e2p/fgetflags.c
index 2af8462..bfa87f2 100644
--- a/lib/e2p/fgetflags.c
+++ b/lib/e2p/fgetflags.c
@@ -66,7 +66,8 @@  int fgetflags (const char * name, unsigned long * flags)
 	return 0;
 #else /* !HAVE_STAT_FLAGS || (APPLE_DARWIN && HAVE_EXT2_IOCTLS) */
 #if HAVE_EXT2_IOCTLS
-	int fd, r, f, save_errno = 0;
+	int fd, r, save_errno = 0;
+	unsigned long f;
 
 	if (!lstat(name, &buf) &&
 	    !S_ISREG(buf.st_mode) && !S_ISDIR(buf.st_mode)) {
diff --git a/lib/e2p/fsetflags.c b/lib/e2p/fsetflags.c
index 167d16e..050cb4a 100644
--- a/lib/e2p/fsetflags.c
+++ b/lib/e2p/fsetflags.c
@@ -71,7 +71,8 @@  int fsetflags (const char * name, unsigned long flags)
 	return chflags (name, bsd_flags);
 #else /* !HAVE_CHFLAGS || (APPLE_DARWIN && HAVE_EXT2_IOCTLS) */
 #if HAVE_EXT2_IOCTLS
-	int fd, r, f, save_errno = 0;
+	int fd, r, save_errno = 0;
+	unsigned long f;
 	struct stat buf;
 
 	if (!lstat(name, &buf) &&