Patchwork times(2) sys call bug?

login
register
mail settings
Submitter Paul Mackerras
Date Nov. 20, 2008, 11:52 p.m.
Message ID <18725.63534.3316.841949@cargo.ozlabs.ibm.com>
Download mbox | patch
Permalink /patch/9909/
State Superseded
Delegated to: Paul Mackerras
Headers show

Comments

Paul Mackerras - Nov. 20, 2008, 11:52 p.m.
Joakim Tjernlund writes:

> This little hack changes the kernel sys call handling in an crude
> way and then it works. Apperently the kernel thinks is an error if the
> syscall returns a value between -_LAST_ERRNO and -1.

Try this patch and let me if it fixes it.  If it does I'll push it
upstream.

Paul.
Joakim Tjernlund - Nov. 21, 2008, 8:31 a.m.
On Fri, 2008-11-21 at 10:52 +1100, Paul Mackerras wrote:
> Joakim Tjernlund writes:
> 
> > This little hack changes the kernel sys call handling in an crude
> > way and then it works. Apperently the kernel thinks is an error if the
> > syscall returns a value between -_LAST_ERRNO and -1.
> 
> Try this patch and let me if it fixes it.  If it does I'll push it
> upstream.
> 

It does fix the problem, thanks. You might want to do the same
to time(2)?
This workaround lets you get around the times(2) problem. Perhaps
you want to mention it in the commit msg:

static clock_t
our_times(void)        /* Make times(2) behave rationally on Linux */
{
	clock_t ret;

	errno   = 0;
	ret     = times(NULL);
	if (errno != 0)
		ret = (clock_t) (-errno);
	return ret;
}
Gabriel Paubert - Nov. 21, 2008, 8:41 a.m.
On Fri, Nov 21, 2008 at 10:52:14AM +1100, Paul Mackerras wrote:
> Joakim Tjernlund writes:
> 
> > This little hack changes the kernel sys call handling in an crude
> > way and then it works. Apperently the kernel thinks is an error if the
> > syscall returns a value between -_LAST_ERRNO and -1.
> 
> Try this patch and let me if it fixes it.  If it does I'll push it
> upstream.

With your patch, you won't get EFAULT if you pass a bad
address, but a constant, time independent value, unless
I miss something. 

Of course there are peoaple who claim that EFAULT is a
bad idea to start with and that you should send a SIGSEGV
instead, and I can see their point. 

But with the current implementation, it is a game that 
you can't win: any syscall that wants to return an
arbitrary integer multiplexed with an error value is
broken beyond repair, by design.

Oh, well. 

	Gabriel
Joakim Tjernlund - Nov. 21, 2008, 8:47 a.m.
On Fri, 2008-11-21 at 09:41 +0100, Gabriel Paubert wrote:
> On Fri, Nov 21, 2008 at 10:52:14AM +1100, Paul Mackerras wrote:
> > Joakim Tjernlund writes:
> > 
> > > This little hack changes the kernel sys call handling in an crude
> > > way and then it works. Apperently the kernel thinks is an error if the
> > > syscall returns a value between -_LAST_ERRNO and -1.
> > 
> > Try this patch and let me if it fixes it.  If it does I'll push it
> > upstream.
> 
> With your patch, you won't get EFAULT if you pass a bad
> address, but a constant, time independent value, unless
> I miss something. 

Not so, look again:

asmlinkage long sys_times(struct tms __user * tbuf)
{
	...
	if (tbuf) {
		...
		if (copy_to_user(tbuf, &tmp, sizeof(struct tms)))
			return -EFAULT;
	}
	force_successful_syscall_return();
	return (long) jiffies_64_to_clock_t(get_jiffies_64());
}
Paul Mackerras - Nov. 21, 2008, 9:03 a.m.
Gabriel Paubert writes:

> On Fri, Nov 21, 2008 at 10:52:14AM +1100, Paul Mackerras wrote:
> > Joakim Tjernlund writes:
> > 
> > > This little hack changes the kernel sys call handling in an crude
> > > way and then it works. Apperently the kernel thinks is an error if the
> > > syscall returns a value between -_LAST_ERRNO and -1.
> > 
> > Try this patch and let me if it fixes it.  If it does I'll push it
> > upstream.
> 
> With your patch, you won't get EFAULT if you pass a bad
> address, but a constant, time independent value, unless
> I miss something. 

I think you are missing something, namely that I put the call to
force_successful_syscall_return() AFTER the return -EFAULT.

You should get an EFAULT error if the address is bad, i.e. on return
to userspace with cr0.SO = 1 and r3 = EFAULT (note, not -EFAULT).  On
a non-error return you should get cr0.SO = 0 and r3 containing the
return value (even if it's -EFAULT).  It's possible that glibc will
stuff it up again after that but I hope not.

Paul.
Joakim Tjernlund - Nov. 21, 2008, 9:15 a.m.
On Fri, 2008-11-21 at 20:03 +1100, Paul Mackerras wrote:
> Gabriel Paubert writes:
> 
> > On Fri, Nov 21, 2008 at 10:52:14AM +1100, Paul Mackerras wrote:
> > > Joakim Tjernlund writes:
> > > 
> > > > This little hack changes the kernel sys call handling in an crude
> > > > way and then it works. Apperently the kernel thinks is an error if the
> > > > syscall returns a value between -_LAST_ERRNO and -1.
> > > 
> > > Try this patch and let me if it fixes it.  If it does I'll push it
> > > upstream.
> > 
> > With your patch, you won't get EFAULT if you pass a bad
> > address, but a constant, time independent value, unless
> > I miss something. 
> 
> I think you are missing something, namely that I put the call to
> force_successful_syscall_return() AFTER the return -EFAULT.
> 
> You should get an EFAULT error if the address is bad, i.e. on return
> to userspace with cr0.SO = 1 and r3 = EFAULT (note, not -EFAULT).  On
> a non-error return you should get cr0.SO = 0 and r3 containing the
> return value (even if it's -EFAULT).  It's possible that glibc will
> stuff it up again after that but I hope not.

With your patch:

	t1 = times((void*) 1);
	if (t1 == -1) {
		my_err = errno;
 		printf("Errno:%d, %s\n", my_err, strerror(my_err));
	}

prints:
	Errno:14, Bad address
Gabriel Paubert - Nov. 21, 2008, 9:50 a.m.
On Fri, Nov 21, 2008 at 08:03:06PM +1100, Paul Mackerras wrote:
> Gabriel Paubert writes:
> 
> > On Fri, Nov 21, 2008 at 10:52:14AM +1100, Paul Mackerras wrote:
> > > Joakim Tjernlund writes:
> > > 
> > > > This little hack changes the kernel sys call handling in an crude
> > > > way and then it works. Apperently the kernel thinks is an error if the
> > > > syscall returns a value between -_LAST_ERRNO and -1.
> > > 
> > > Try this patch and let me if it fixes it.  If it does I'll push it
> > > upstream.
> > 
> > With your patch, you won't get EFAULT if you pass a bad
> > address, but a constant, time independent value, unless
> > I miss something. 
> 
> I think you are missing something, namely that I put the call to
> force_successful_syscall_return() AFTER the return -EFAULT.
> 

Indeed, it may be time to update the syscall documentation, saying
that you need to clear errno before the syscall and check errno
and not the return value since -1 is valid. 

Who does this? I have spotted some errors in other places on 
my man pages too, especially in the networking area (they were
correct once upon a time, but have not been updated).

	Regards,
	Gabriel
Paul Mackerras - Nov. 21, 2008, 9:55 a.m.
Gabriel Paubert writes:

> Who does this? I have spotted some errors in other places on 
> my man pages too, especially in the networking area (they were
> correct once upon a time, but have not been updated).

Michael Kerrisk <mtk.manpages@googlemail.com>, I believe.

Paul.
Joakim Tjernlund - Nov. 21, 2008, 10:13 a.m.
On Fri, 2008-11-21 at 10:50 +0100, Gabriel Paubert wrote:
> On Fri, Nov 21, 2008 at 08:03:06PM +1100, Paul Mackerras wrote:
> > Gabriel Paubert writes:
> > 
> > > On Fri, Nov 21, 2008 at 10:52:14AM +1100, Paul Mackerras wrote:
> > > > Joakim Tjernlund writes:
> > > > 
> > > > > This little hack changes the kernel sys call handling in an crude
> > > > > way and then it works. Apperently the kernel thinks is an error if the
> > > > > syscall returns a value between -_LAST_ERRNO and -1.
> > > > 
> > > > Try this patch and let me if it fixes it.  If it does I'll push it
> > > > upstream.
> > > 
> > > With your patch, you won't get EFAULT if you pass a bad
> > > address, but a constant, time independent value, unless
> > > I miss something. 
> > 
> > I think you are missing something, namely that I put the call to
> > force_successful_syscall_return() AFTER the return -EFAULT.
> > 
> 
> Indeed, it may be time to update the syscall documentation, saying
> that you need to clear errno before the syscall and check errno
> and not the return value since -1 is valid. 

And perhaps mention that times(NULL) never returns an error.
And that times() is broken in 2.6.27 and earlier and needs the
workaround posted earlier.

 Jocke

Patch

diff --git a/kernel/sys.c b/kernel/sys.c
index 31deba8..1bf8c5c 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -33,6 +33,7 @@ 
 #include <linux/task_io_accounting_ops.h>
 #include <linux/seccomp.h>
 #include <linux/cpu.h>
+#include <linux/ptrace.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -878,6 +879,7 @@  asmlinkage long sys_times(struct tms __user * tbuf)
 		if (copy_to_user(tbuf, &tmp, sizeof(struct tms)))
 			return -EFAULT;
 	}
+	force_successful_syscall_return();
 	return (long) jiffies_64_to_clock_t(get_jiffies_64());
 }