Message ID | 20231110110348.1815612-10-benjamin@sipsolutions.net |
---|---|
State | Accepted |
Headers | show |
Series | General cleanups and fixes from SECCOMP patchset | expand |
On Fri, Nov 10, 2023 at 12:03 PM <benjamin@sipsolutions.net> wrote: > > From: Benjamin Berg <benjamin@sipsolutions.net> > > These registers are saved/restored together with the other general > registers using ptrace. In arch_set_tls we then just need to set the > register and it will be synced back normally. > > Most of this logic was introduced in commit f355559cf7845 ("[PATCH] uml: > x86_64 thread fixes"). However, at least today we can rely on ptrace to Do you know since when exactly? I don't want to break UML in subtle ways on old kernels.
On Fri, Jan 5, 2024 at 12:05 AM Richard Weinberger <richard.weinberger@gmail.com> wrote: > > Most of this logic was introduced in commit f355559cf7845 ("[PATCH] uml: > > x86_64 thread fixes"). However, at least today we can rely on ptrace to > > Do you know since when exactly? I don't want to break UML in subtle ways > on old kernels. BTW: I have applied the series up to here. Thanks a lot for cleaning up all this. :-)
Hi, On Fri, 2024-01-05 at 00:05 +0100, Richard Weinberger wrote: > On Fri, Nov 10, 2023 at 12:03 PM <benjamin@sipsolutions.net> wrote: > > > > From: Benjamin Berg <benjamin@sipsolutions.net> > > > > These registers are saved/restored together with the other general > > registers using ptrace. In arch_set_tls we then just need to set > > the > > register and it will be synced back normally. > > > > Most of this logic was introduced in commit f355559cf7845 ("[PATCH] > > uml: > > x86_64 thread fixes"). However, at least today we can rely on > > ptrace to > > Do you know since when exactly? I don't want to break UML in subtle ways > on old kernels. To be honest, I don't remember, and I doubt I really understood what I was doing. Anyway, I now found this commit now, which is contained in v2.6.25: commit df5d438e33d7fc914ba9b6e0d6b019a8966c5fcc Author: Roland McGrath <roland@redhat.com> Date: Wed Jan 30 13:30:45 2008 +0100 x86: ptrace fs/gs_base The fs_base and gs_base fields are available in user_regs_struct. But reading these via ptrace (PTRACE_GETREGS or PTRACE_PEEKUSR) does not give a reliably useful value. The thread_struct fields are 0 when do_arch_prctl decided to use a GDT slot instead of MSR_FS_BASE, which it does for a value under 1<<32. This changes ptrace access to fs_base and gs_base to work like PTRACE_ARCH_PRCTL does. That is, it reads the base address that user-mode memory access using the fs/gs instruction prefixes will use, regardless of how it's being implemented in the kernel. The MSR vs GDT is an implementation detail that is pretty much hidden from userland in the actual using, and there is no reason that ptrace should give the internal implementation picture rather than the user-mode semantic picture. In the case of setting the value, this can implicitly change the fsindex/gsindex value (also separately in user_regs_struct), which is what happens when the thread calls arch_prctl itself. In a PTRACE_SETREGS, the fs_base change will come after the fsindex change due to the order of the struct, and so a change the debugger made to fs_base will have the effect intended, another part of the user_regs_struct will now differ when read back from what the debugger wrote. This makes PTRACE_ARCH_PRCTL obsolete. We could consider declaring it deprecated and removing it one day, though there is no hurry. For the foreseeable future, debuggers have to assume an old kernel that does not report reliable fs_base/gs_base values in user_regs_struct and stick to PTRACE_ARCH_PRCTL anyway. I think the last paragraph is quite straight forward in saying that we do not need anything special anymore. Note that the original commit adding the code pre-dates this commit by about a year, i.e.: commit f355559cf78455ed6be103b020e4b800230c64eb Author: Jeff Dike <jdike@addtoit.com> Date: Sat Feb 10 01:44:29 2007 -0800 [PATCH] uml: x86_64 thread fixes x86_64 needs some TLS fixes. What was missing was remembering the child thread id during clone and stuffing it into the child during each context switch. The %fs value is stored separately in the thread structure since the host controls what effect it has on the actual register file. The host also needs to store it in its own thread struct, so we need the value kept outside the register file. arch_prctl_skas was fixed to call PTRACE_ARCH_PRCTL appropriately. There is some saving and restoring of registers in the ARCH_SET_* cases so that the correct set of registers are changed on the host and restored to the process when it runs again. There was some shuffling around in the ptrace code at some point, eventually switching from doing an arch_prctl call to setting things more directly. But I don't think that matters to us. Benjamin
----- Ursprüngliche Mail ----- > Von: "Benjamin Berg" <benjamin@sipsolutions.net> > On Fri, 2024-01-05 at 00:05 +0100, Richard Weinberger wrote: >> On Fri, Nov 10, 2023 at 12:03 PM <benjamin@sipsolutions.net> wrote: >> > >> > From: Benjamin Berg <benjamin@sipsolutions.net> >> > >> > These registers are saved/restored together with the other general >> > registers using ptrace. In arch_set_tls we then just need to set >> > the >> > register and it will be synced back normally. >> > >> > Most of this logic was introduced in commit f355559cf7845 ("[PATCH] >> > uml: >> > x86_64 thread fixes"). However, at least today we can rely on >> > ptrace to >> >> Do you know since when exactly? I don't want to break UML in subtle ways >> on old kernels. > > To be honest, I don't remember, and I doubt I really understood what I > was doing. > > Anyway, I now found this commit now, which is contained in v2.6.25: Okay, 2.6.26 is way older than anything supported. So I'll apply our patch. :-) Thanks, //richard
diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h index 0df646c6651e..aff8906304ea 100644 --- a/arch/um/include/shared/os.h +++ b/arch/um/include/shared/os.h @@ -323,9 +323,6 @@ extern void sigio_broken(int fd); extern int __add_sigio_fd(int fd); extern int __ignore_sigio_fd(int fd); -/* prctl.c */ -extern int os_arch_prctl(int pid, int option, unsigned long *arg2); - /* tty.c */ extern int get_pty(void); diff --git a/arch/x86/um/asm/elf.h b/arch/x86/um/asm/elf.h index 6523eb7c3bd1..6052200fe925 100644 --- a/arch/x86/um/asm/elf.h +++ b/arch/x86/um/asm/elf.h @@ -168,8 +168,8 @@ do { \ (pr_reg)[18] = (_regs)->regs.gp[18]; \ (pr_reg)[19] = (_regs)->regs.gp[19]; \ (pr_reg)[20] = (_regs)->regs.gp[20]; \ - (pr_reg)[21] = current->thread.arch.fs; \ - (pr_reg)[22] = 0; \ + (pr_reg)[21] = (_regs)->regs.gp[21]; \ + (pr_reg)[22] = (_regs)->regs.gp[22]; \ (pr_reg)[23] = 0; \ (pr_reg)[24] = 0; \ (pr_reg)[25] = 0; \ diff --git a/arch/x86/um/asm/processor_64.h b/arch/x86/um/asm/processor_64.h index 1ef9c21877bc..f90159508936 100644 --- a/arch/x86/um/asm/processor_64.h +++ b/arch/x86/um/asm/processor_64.h @@ -10,13 +10,11 @@ struct arch_thread { unsigned long debugregs[8]; int debugregs_seq; - unsigned long fs; struct faultinfo faultinfo; }; #define INIT_ARCH_THREAD { .debugregs = { [ 0 ... 7 ] = 0 }, \ .debugregs_seq = 0, \ - .fs = 0, \ .faultinfo = { 0, 0, 0 } } #define STACKSLOTS_PER_LINE 4 @@ -28,7 +26,6 @@ static inline void arch_flush_thread(struct arch_thread *thread) static inline void arch_copy_thread(struct arch_thread *from, struct arch_thread *to) { - to->fs = from->fs; } #define current_sp() ({ void *sp; __asm__("movq %%rsp, %0" : "=r" (sp) : ); sp; }) diff --git a/arch/x86/um/os-Linux/Makefile b/arch/x86/um/os-Linux/Makefile index ae169125d03f..5249bbc30dcd 100644 --- a/arch/x86/um/os-Linux/Makefile +++ b/arch/x86/um/os-Linux/Makefile @@ -6,7 +6,6 @@ obj-y = registers.o task_size.o mcontext.o obj-$(CONFIG_X86_32) += tls.o -obj-$(CONFIG_64BIT) += prctl.o USER_OBJS := $(obj-y) diff --git a/arch/x86/um/os-Linux/prctl.c b/arch/x86/um/os-Linux/prctl.c deleted file mode 100644 index 8431e87ac333..000000000000 --- a/arch/x86/um/os-Linux/prctl.c +++ /dev/null @@ -1,12 +0,0 @@ -/* - * Copyright (C) 2007 Jeff Dike (jdike@{addtoit.com,linux.intel.com}) - * Licensed under the GPL - */ - -#include <sys/ptrace.h> -#include <asm/ptrace.h> - -int os_arch_prctl(int pid, int option, unsigned long *arg2) -{ - return ptrace(PTRACE_ARCH_PRCTL, pid, (unsigned long) arg2, option); -} diff --git a/arch/x86/um/syscalls_64.c b/arch/x86/um/syscalls_64.c index 27b29ae6c471..6a00a28c9cca 100644 --- a/arch/x86/um/syscalls_64.c +++ b/arch/x86/um/syscalls_64.c @@ -16,60 +16,24 @@ long arch_prctl(struct task_struct *task, int option, unsigned long __user *arg2) { - unsigned long *ptr = arg2, tmp; - long ret; - int pid = task->mm->context.id.u.pid; - - /* - * With ARCH_SET_FS (and ARCH_SET_GS is treated similarly to - * be safe), we need to call arch_prctl on the host because - * setting %fs may result in something else happening (like a - * GDT or thread.fs being set instead). So, we let the host - * fiddle the registers and thread struct and restore the - * registers afterwards. - * - * So, the saved registers are stored to the process (this - * needed because a stub may have been the last thing to run), - * arch_prctl is run on the host, then the registers are read - * back. - */ - switch (option) { - case ARCH_SET_FS: - case ARCH_SET_GS: - ret = restore_pid_registers(pid, ¤t->thread.regs.regs); - if (ret) - return ret; - break; - case ARCH_GET_FS: - case ARCH_GET_GS: - /* - * With these two, we read to a local pointer and - * put_user it to the userspace pointer that we were - * given. If addr isn't valid (because it hasn't been - * faulted in or is just bogus), we want put_user to - * fault it in (or return -EFAULT) instead of having - * the host return -EFAULT. - */ - ptr = &tmp; - } - - ret = os_arch_prctl(pid, option, ptr); - if (ret) - return ret; + long ret = -EINVAL; switch (option) { case ARCH_SET_FS: - current->thread.arch.fs = (unsigned long) ptr; - ret = save_registers(pid, ¤t->thread.regs.regs); + current->thread.regs.regs.gp[FS_BASE / sizeof(unsigned long)] = + (unsigned long) arg2; + ret = 0; break; case ARCH_SET_GS: - ret = save_registers(pid, ¤t->thread.regs.regs); + current->thread.regs.regs.gp[GS_BASE / sizeof(unsigned long)] = + (unsigned long) arg2; + ret = 0; break; case ARCH_GET_FS: - ret = put_user(tmp, arg2); + ret = put_user(current->thread.regs.regs.gp[FS_BASE / sizeof(unsigned long)], arg2); break; case ARCH_GET_GS: - ret = put_user(tmp, arg2); + ret = put_user(current->thread.regs.regs.gp[GS_BASE / sizeof(unsigned long)], arg2); break; } @@ -83,10 +47,10 @@ SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2) void arch_switch_to(struct task_struct *to) { - if ((to->thread.arch.fs == 0) || (to->mm == NULL)) - return; - - arch_prctl(to, ARCH_SET_FS, (void __user *) to->thread.arch.fs); + /* + * Nothing needs to be done on x86_64. + * The FS_BASE/GS_BASE registers are saved in the ptrace register set. + */ } SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len, diff --git a/arch/x86/um/tls_64.c b/arch/x86/um/tls_64.c index ebd3855d9b13..c51a613f6f5c 100644 --- a/arch/x86/um/tls_64.c +++ b/arch/x86/um/tls_64.c @@ -12,7 +12,7 @@ int arch_set_tls(struct task_struct *t, unsigned long tls) * If CLONE_SETTLS is set, we need to save the thread id * so it can be set during context switches. */ - t->thread.arch.fs = tls; + t->thread.regs.regs.gp[FS_BASE / sizeof(unsigned long)] = tls; return 0; }