diff mbox

[v2,1/3] powerpc: Don't force ENOSYS as error on syscall fail

Message ID 1423643778-32525-2-git-send-email-bogdan.purcareata@freescale.com (mailing list archive)
State Superseded
Delegated to: Michael Ellerman
Headers show

Commit Message

Bogdan Purcareata Feb. 11, 2015, 8:36 a.m. UTC
In certain scenarios - e.g. seccomp filtering with ERRNO as default action -
the system call fails for other reasons than the syscall not being available.
The seccomp filter can be configured to store a user-defined error code on
return from a blacklisted syscall. Don't always set ENOSYS on
do_syscall_trace_enter failure.

v2:
- move setting ENOSYS as errno from the syscall entry assembly to
  do_syscall_trace_enter, only in the specific case

Signed-off-by: Bogdan Purcareata <bogdan.purcareata@freescale.com>
---
 arch/powerpc/kernel/entry_32.S | 2 +-
 arch/powerpc/kernel/entry_64.S | 1 -
 arch/powerpc/kernel/ptrace.c   | 4 +++-
 3 files changed, 4 insertions(+), 3 deletions(-)

Comments

Michael Ellerman Feb. 12, 2015, 5:24 a.m. UTC | #1
On Wed, 2015-02-11 at 08:36 +0000, Bogdan Purcareata wrote:
> In certain scenarios - e.g. seccomp filtering with ERRNO as default action -
> the system call fails for other reasons than the syscall not being available.
> The seccomp filter can be configured to store a user-defined error code on
> return from a blacklisted syscall. Don't always set ENOSYS on
> do_syscall_trace_enter failure.
> 
> v2:
> - move setting ENOSYS as errno from the syscall entry assembly to
>   do_syscall_trace_enter, only in the specific case

> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 194e46d..0111e04 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -269,7 +269,6 @@ syscall_dotrace:
>  	b	.Lsyscall_dotrace_cont
>  
>  syscall_enosys:
> -	li	r3,-ENOSYS
>  	b	syscall_exit


This still looks wrong to me.

On 64 bit we do:

	CURRENT_THREAD_INFO(r11, r1)
	ld	r10,TI_FLAGS(r11)
	andi.	r11,r10,_TIF_SYSCALL_DOTRACE
	bne	syscall_dotrace
.Lsyscall_dotrace_cont:
	cmpldi	0,r0,NR_syscalls
	bge-	syscall_enosys
...

syscall_enosys:
	li	r3,-ENOSYS
	b	.Lsyscall_exit


Your patch removes the load of ENOSYS.

Which means if we're not doing syscall tracing, and we get an out-of-bounds
syscall number, we'll return with something random on r3. Won't we?

The 32-bit code looks more or less similar, although the label has a different
name.

cheers
Purcareata Bogdan Feb. 12, 2015, 8:38 a.m. UTC | #2
On 12.02.2015 07:24, Michael Ellerman wrote:
> On Wed, 2015-02-11 at 08:36 +0000, Bogdan Purcareata wrote:
>> In certain scenarios - e.g. seccomp filtering with ERRNO as default action -
>> the system call fails for other reasons than the syscall not being available.
>> The seccomp filter can be configured to store a user-defined error code on
>> return from a blacklisted syscall. Don't always set ENOSYS on
>> do_syscall_trace_enter failure.
>>
>> v2:
>> - move setting ENOSYS as errno from the syscall entry assembly to
>>    do_syscall_trace_enter, only in the specific case
>
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 194e46d..0111e04 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -269,7 +269,6 @@ syscall_dotrace:
>>   	b	.Lsyscall_dotrace_cont
>>
>>   syscall_enosys:
>> -	li	r3,-ENOSYS
>>   	b	syscall_exit
>
>
> This still looks wrong to me.
>
> On 64 bit we do:
>
> 	CURRENT_THREAD_INFO(r11, r1)
> 	ld	r10,TI_FLAGS(r11)
> 	andi.	r11,r10,_TIF_SYSCALL_DOTRACE
> 	bne	syscall_dotrace
> .Lsyscall_dotrace_cont:
> 	cmpldi	0,r0,NR_syscalls
> 	bge-	syscall_enosys
> ...
>
> syscall_enosys:
> 	li	r3,-ENOSYS
> 	b	.Lsyscall_exit
>
>
> Your patch removes the load of ENOSYS.
>
> Which means if we're not doing syscall tracing, and we get an out-of-bounds
> syscall number, we'll return with something random on r3. Won't we?

Thanks for pointing this out, you are absolutely right. Perhaps this is 
a fix for the issue - on 64 bit:

         ld      r10,TI_FLAGS(r11)
         andi.   r11,r10,_TIF_SYSCALL_T_OR_A
         bne     syscall_dotrace
-.Lsyscall_dotrace_cont:
         cmpldi  0,r0,NR_syscalls
         bge-    syscall_enosys

system_call:
...

syscall_enosys:
	li	r3,-ENOSYS
	b	.Lsyscall_exit
...

syscall_dotrace:
...
         addi    r9,r1,STACK_FRAME_OVERHEAD
         CURRENT_THREAD_INFO(r10, r1)
         ld      r10,TI_FLAGS(r10)
-       b       .Lsyscall_dotrace_cont
+       cmpldi  0,r0,NR_syscalls
+       bge     syscall_exit
+       b       system_call

So basically I leave the code for syscall_enosys unchanged, but I keep 
using it only when not doing syscall tracing. When doing syscall 
tracing, I'm assuming do_syscall_trace_enter will take care of setting 
the errno, and should it return an invalid syscall number, go directly 
to syscall_exit.

> The 32-bit code looks more or less similar, although the label has a different
> name.

Same thing for 32-bit:

_GLOBAL(DoSyscall)
         lwz     r11,TI_FLAGS(r10)
         andi.   r11,r11,_TIF_SYSCALL_T_OR_A
         bne-    syscall_dotrace
-syscall_dotrace_cont:
         cmplwi  0,r0,NR_syscalls
         lis     r10,sys_call_table@h
         ori     r10,r10,sys_call_table@l
         slwi    r0,r0,2
         bge     66f
+syscall_dotrace_cont:
         lwzx    r10,r10,r0      /* Fetch system call handler [ptr] */
         mtlr    r10
         addi    r9,r1,STACK_FRAME_OVERHEAD
...

66:	li	r3,-ENOSYS
	b	ret_from_syscall
...

syscall_dotrace:
         lwz     r7,GPR7(r1)
         lwz     r8,GPR8(r1)
         REST_NVGPRS(r1)
+       cmplwi  0,r0,NR_syscalls
+       lis     r10,sys_call_table@h
+       ori     r10,r10,sys_call_table@l
+       slwi    r0,r0,2
+       bge-    ret_from_syscall
         b       syscall_dotrace_cont

However I must admit that I don't like duplicating those 4 lines of code 
associated with verifying the syscall number. I can't think of any 
better way to do this. I also thought about leaving this check in one 
place, and then branch differently according to _TIF_SYSCALL_T_OR_A. Do 
you think that would be a better approach?

Thank you,
Bogdan P.
diff mbox

Patch

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 10a0935..d2c58a3 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -425,7 +425,7 @@  END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRED_STWCX)
 	b	1b
 #endif  /* CONFIG_44x */
 
-66:	li	r3,-ENOSYS
+66:
 	b	ret_from_syscall
 
 	.globl	ret_from_fork
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 194e46d..0111e04 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -269,7 +269,6 @@  syscall_dotrace:
 	b	.Lsyscall_dotrace_cont
 
 syscall_enosys:
-	li	r3,-ENOSYS
 	b	syscall_exit
 	
 syscall_exit_work:
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index f21897b..d82fd0b 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -1775,13 +1775,15 @@  long do_syscall_trace_enter(struct pt_regs *regs)
 	secure_computing_strict(regs->gpr[0]);
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(regs))
+	    tracehook_report_syscall_entry(regs)) {
 		/*
 		 * Tracing decided this syscall should not happen.
 		 * We'll return a bogus call number to get an ENOSYS
 		 * error, but leave the original number in regs->gpr[0].
 		 */
 		ret = -1L;
+		syscall_set_return_value(current, regs, -ENOSYS, 0);
+	}
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
 		trace_sys_enter(regs, regs->gpr[0]);