Message ID | 20210504134250.890401-1-mpe@ellerman.id.au |
---|---|
State | Changes Requested |
Series | [1/2] powerpc/64s: Fix crashes when toggling stf barrier |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (134b5c8a49b594ff6cfb4ea1a92400bb382b46d2) |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 34 lines checked |
snowpatch_ozlabs/needsstable | success | Patch is tagged for stable |
Michael Ellerman <mpe@ellerman.id.au> writes:
> -void do_stf_barrier_fixups(enum stf_barrier_type types)
> +static int __do_stf_barrier_fixups(void *data)
>  {
> +	enum stf_barrier_type types = (enum stf_barrier_type)data;
> +
>  	do_stf_entry_barrier_fixups(types);
>  	do_stf_exit_barrier_fixups(types);
> +
> +	return 0;
> +}
> +
> +void do_stf_barrier_fixups(enum stf_barrier_type types)
> +{
> +	/*
> +	 * The call to the fallback entry flush, and the fallback/sync-ori exit
> +	 * flush can not be safely patched in/out while other CPUs are executing
> +	 * them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
> +	 * spin in the stop machine core with interrupts hard disabled.
> +	 */
> +	stop_machine_cpuslocked(__do_stf_barrier_fixups, (void *)types, NULL);

Would it be preferable to avoid the explicit casts:

	stop_machine_cpuslocked(__do_stf_barrier_fixups, &types, NULL);

	...

static int __do_stf_barrier_fixups(void *data)
{
	enum stf_barrier_type *types = data;

	do_stf_entry_barrier_fixups(*types);
	do_stf_exit_barrier_fixups(*types);

?

post_mobility_fixup() does cpus_read_unlock() before calling
pseries_setup_security_mitigations(), I think that will need to be
changed?

Nathan Lynch <nathanl@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> -void do_stf_barrier_fixups(enum stf_barrier_type types)
>> +static int __do_stf_barrier_fixups(void *data)
>>  {
>> +	enum stf_barrier_type types = (enum stf_barrier_type)data;
>> +
>>  	do_stf_entry_barrier_fixups(types);
>>  	do_stf_exit_barrier_fixups(types);
>> +
>> +	return 0;
>> +}
>> +
>> +void do_stf_barrier_fixups(enum stf_barrier_type types)
>> +{
>> +	/*
>> +	 * The call to the fallback entry flush, and the fallback/sync-ori exit
>> +	 * flush can not be safely patched in/out while other CPUs are executing
>> +	 * them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
>> +	 * spin in the stop machine core with interrupts hard disabled.
>> +	 */
>> +	stop_machine_cpuslocked(__do_stf_barrier_fixups, (void *)types, NULL);
>
> Would it be preferable to avoid the explicit casts:
>
> 	stop_machine_cpuslocked(__do_stf_barrier_fixups, &types, NULL);
>
> 	...
>
> static int __do_stf_barrier_fixups(void *data)
> {
> 	enum stf_barrier_type *types = data;
>
> 	do_stf_entry_barrier_fixups(*types);
> 	do_stf_exit_barrier_fixups(*types);
>
> ?

Yes.

That will also avoid the pesky issue of undefined behaviour :facepalm:

> post_mobility_fixup() does cpus_read_unlock() before calling
> pseries_setup_security_mitigations(), I think that will need to be
> changed?

I don't think so.

I'm using stop_machine_cpuslocked() but that's because I'm a goose and
forgot to switch to stop_machine() after I reworked the code to not take
cpus_read_lock() by hand. I really shouldn't send patches after 11pm.

I don't think it's important to keep the cpus lock held from where we
take it in post_mobility_fixup(). If some CPUs come or go between there
and here that's fine.

I'll send a v2.

cheers

Michael Ellerman <mpe@ellerman.id.au> writes:
> Nathan Lynch <nathanl@linux.ibm.com> writes:
>> post_mobility_fixup() does cpus_read_unlock() before calling
>> pseries_setup_security_mitigations(), I think that will need to be
>> changed?
>
> I don't think so.
>
> I'm using stop_machine_cpuslocked() but that's because I'm a goose and
> forgot to switch to stop_machine() after I reworked the code to not take
> cpus_read_lock() by hand. I really shouldn't send patches after 11pm.
>
> I don't think it's important to keep the cpus lock held from where we
> take it in post_mobility_fixup(). If some CPUs come or go between there
> and here that's fine.

Yes, agreed.

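The pointer-passing pattern suggested in the thread can be tried outside the kernel. What follows is a minimal userspace sketch, not the v2 patch: `enum barrier_type`, `apply_fixups()`, `run_stopped()`, `do_fixups()` and `last_applied` are all hypothetical names invented for illustration, and `run_stopped()` only mimics the `int (*)(void *)` callback shape of stop_machine() by calling the function directly. It assumes the call is synchronous, so the address of the local `types` stays valid for as long as the callback runs.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's enum stf_barrier_type. */
enum barrier_type {
	BARRIER_NONE     = 0x1,
	BARRIER_FALLBACK = 0x2,
};

/* Records what the callback saw, so the result can be inspected. */
static enum barrier_type last_applied;

/* Callback with the int (*)(void *) shape used by stop_machine(). */
static int apply_fixups(void *data)
{
	/* void * converts to an object pointer without an explicit cast. */
	enum barrier_type *types = data;

	last_applied = *types;
	return 0;
}

/*
 * Hypothetical userspace stub (assumption): runs fn(data) on the calling
 * thread and returns its result. It does not stop any other CPUs.
 */
static int run_stopped(int (*fn)(void *), void *data)
{
	return fn(data);
}

int do_fixups(enum barrier_type types)
{
	/* 'types' lives in this stack frame until run_stopped() returns. */
	return run_stopped(apply_fixups, &types);
}
```

Calling `do_fixups(BARRIER_FALLBACK)` hands the callback a pointer to the enum rather than an enum value squeezed into a pointer, so neither side needs a cast.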
diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 1fd31b4b0e13..8f8c8c98a6ac 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -14,6 +14,7 @@
 #include <linux/string.h>
 #include <linux/init.h>
 #include <linux/sched/mm.h>
+#include <linux/stop_machine.h>
 #include <asm/cputable.h>
 #include <asm/code-patching.h>
 #include <asm/page.h>
@@ -227,11 +228,25 @@ static void do_stf_exit_barrier_fixups(enum stf_barrier_type types)
 		                                           : "unknown");
 }
 
-
-void do_stf_barrier_fixups(enum stf_barrier_type types)
+static int __do_stf_barrier_fixups(void *data)
 {
+	enum stf_barrier_type types = (enum stf_barrier_type)data;
+
 	do_stf_entry_barrier_fixups(types);
 	do_stf_exit_barrier_fixups(types);
+
+	return 0;
+}
+
+void do_stf_barrier_fixups(enum stf_barrier_type types)
+{
+	/*
+	 * The call to the fallback entry flush, and the fallback/sync-ori exit
+	 * flush can not be safely patched in/out while other CPUs are executing
+	 * them. So call __do_stf_barrier_fixups() on one CPU while all other CPUs
+	 * spin in the stop machine core with interrupts hard disabled.
+	 */
+	stop_machine_cpuslocked(__do_stf_barrier_fixups, (void *)types, NULL);
 }
 
 void do_uaccess_flush_fixups(enum l1d_flush_type types)

The STF (store-to-load forwarding) barrier mitigation can be
enabled/disabled at runtime via a debugfs file (stf_barrier), which
causes the kernel to patch itself to enable/disable the relevant
mitigations.

However depending on which mitigation we're using, it may not be safe to
do that patching while other CPUs are active. For example the following
crash:

  User access of kernel address (c00000003fff5af0) - exploit attempt? (uid: 0)
  segfault (11) at c00000003fff5af0 nip 7fff8ad12198 lr 7fff8ad121f8 code 1
  code: 40820128 e93c00d0 e9290058 7c292840 40810058 38600000 4bfd9a81 e8410018
  code: 2c030006 41810154 3860ffb6 e9210098 <e94d8ff0> 7d295279 39400000 40820a3c

Shows that we returned to userspace without restoring the user r13
value, due to executing the partially patched STF exit code.

Fix it by doing the patching under stop machine. The CPUs that aren't
doing the patching will be spinning in the core of the stop machine
logic. That is currently sufficient for our purposes, because none of
the patching we do is to that code or anywhere in the vicinity.

Fixes: a048a07d7f45 ("powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/lib/feature-fixups.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
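
The hazard described in the commit message, a CPU executing a sequence that is only partly rewritten, can be illustrated with a toy userspace model. This is a sketch under loud assumptions: the "instruction" words are placeholders rather than real opcodes, the three-word length is arbitrary, and `patch_partial()` and `site_is_consistent()` are hypothetical helpers that do not correspond to anything in feature-fixups.c. It only shows that a multi-word sequence patched one word at a time passes through states that match neither the old nor the new sequence.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEQ_LEN 3

/* Placeholder words (assumption): not real powerpc instructions. */
static const uint32_t seq_off[SEQ_LEN] = { 0x1, 0x1, 0x1 };
static const uint32_t seq_on[SEQ_LEN]  = { 0x2, 0x3, 0x4 };

/*
 * Overwrite only the first 'done' words of the site, modelling a patcher
 * that another CPU observes partway through its work.
 */
void patch_partial(uint32_t *site, const uint32_t *new_seq, size_t done)
{
	for (size_t i = 0; i < done && i < SEQ_LEN; i++)
		site[i] = new_seq[i];
}

/* Returns 1 if the site holds exactly one of the two complete sequences. */
int site_is_consistent(const uint32_t *site)
{
	return memcmp(site, seq_off, sizeof(seq_off)) == 0 ||
	       memcmp(site, seq_on, sizeof(seq_on)) == 0;
}
```

Starting from `seq_off`, a call such as `patch_partial(site, seq_on, 1)` leaves a site that `site_is_consistent()` rejects, while patching all `SEQ_LEN` words makes it consistent again. Holding the other CPUs in the stop machine loop is one way to ensure none of them executes the site while it is in such an intermediate state.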