Message ID: 151571798296.27429.7166552848688034184.stgit@dwillia2-desk3.amr.corp.intel.com
Series: prevent bounds-check bypass via speculative execution
On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>
> This series incorporates Mark Rutland's latest ARM changes and adds
> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> based approach is provided as an opt-in fallback, but the default
> mitigation, '__array_ptr', uses a 'mask' approach that removes
> conditional branches instructions, and otherwise aims to redirect
> speculation to use a NULL pointer rather than a user controlled value.

Do you have any performance numbers and perhaps example code
generation? Is this noticeable? Are there any microbenchmarks showing
the difference between lfence use and the masking model?

Having both seems good for testing, but wouldn't we want to pick one
in the end?

Also, I do think that there is one particular array load that would
seem to be pretty obvious: the system call function pointer array.

Yes, yes, the actual call is now behind a retpoline, but that protects
against a speculative BTB access, it's not obvious that it protects
against the mispredict of the __NR_syscall_max comparison in
arch/x86/entry/entry_64.S.

The act of fetching code is a kind of read too. And retpoline protects
against BTB stuffing etc, but what if the _actual_ system call
function address is wrong (due to mis-prediction of the system call
index check)?

Should the array access in entry_SYSCALL_64_fastpath be made to use
the masking approach?

              Linus
On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>>
>> This series incorporates Mark Rutland's latest ARM changes and adds
>> the x86 specific implementation of 'ifence_array_ptr'. That ifence
>> based approach is provided as an opt-in fallback, but the default
>> mitigation, '__array_ptr', uses a 'mask' approach that removes
>> conditional branches instructions, and otherwise aims to redirect
>> speculation to use a NULL pointer rather than a user controlled value.
>
> Do you have any performance numbers and perhaps example code
> generation? Is this noticeable? Are there any microbenchmarks showing
> the difference between lfence use and the masking model?

I don't have performance numbers, but here's a sample code generation
from __fcheck_files, where the 'and; lea; and' sequence is the portion
of array_ptr() after the mask generation with 'sbb'.

        fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
     8e7:   8b 02                   mov    (%rdx),%eax
     8e9:   48 39 c7                cmp    %rax,%rdi
     8ec:   48 19 c9                sbb    %rcx,%rcx
     8ef:   48 8b 42 08             mov    0x8(%rdx),%rax
     8f3:   48 89 fe                mov    %rdi,%rsi
     8f6:   48 21 ce                and    %rcx,%rsi
     8f9:   48 8d 04 f0             lea    (%rax,%rsi,8),%rax
     8fd:   48 21 c8                and    %rcx,%rax

> Having both seems good for testing, but wouldn't we want to pick one
> in the end?

I was thinking we'd keep it as a 'just in case' sort of thing, at
least until the 'probably safe' assumption of the 'mask' approach has
more time to settle out.

> Also, I do think that there is one particular array load that would
> seem to be pretty obvious: the system call function pointer array.
>
> Yes, yes, the actual call is now behind a retpoline, but that protects
> against a speculative BTB access, it's not obvious that it protects
> against the mispredict of the __NR_syscall_max comparison in
> arch/x86/entry/entry_64.S.
>
> The act of fetching code is a kind of read too. And retpoline protects
> against BTB stuffing etc, but what if the _actual_ system call
> function address is wrong (due to mis-prediction of the system call
> index check)?
>
> Should the array access in entry_SYSCALL_64_fastpath be made to use
> the masking approach?

I'll take a look. I'm firmly in the 'patch first / worry later' stance
on these investigations.
Do you think that the appropriate patches could be copied to the
appropriate people please?

On Thu, Jan 11, 2018 at 04:46:24PM -0800, Dan Williams wrote:
> Changes since v1 [1]:
> * fixup the ifence definition to use alternative_2 per recent AMD
>   changes in tip/x86/pti (Tom)
>
> * drop 'nospec_ptr' (Linus, Mark)
>
> * rename 'nospec_array_ptr' to 'array_ptr' (Alexei)
>
> * rename 'nospec_barrier' to 'ifence' (Peter, Ingo)
>
> * clean up occasions of 'variable assignment in if()' (Sergei, Stephen)
>
> * make 'array_ptr' use a mask instead of an architectural ifence by
>   default (Linus, Alexei)
>
> * provide a command line and compile-time opt-in to the ifence
>   mechanism, if an architecture provides 'ifence_array_ptr'.
>
> * provide an optimized mask generation helper, 'array_ptr_mask', for
>   x86 (Linus)
>
> * move 'get_user' hardening from '__range_not_ok' to '__uaccess_begin'
>   (Linus)
>
> * drop "Thermal/int340x: prevent bounds-check..." since userspace does
>   not have arbitrary control over the 'trip' index (Srinivas)
>
> * update the changelog of "net: mpls: prevent bounds-check..." and keep
>   it in the series to continue the debate about Spectre hygiene patches.
>   (Eric)
>
> * record a reviewed-by from Laurent on "[media] uvcvideo: prevent
>   bounds-check..."
>
> * update the cover letter
>
> [1]: https://lwn.net/Articles/743376/
>
> ---
>
> Quoting Mark's original RFC:
>
> "Recently, Google Project Zero discovered several classes of attack
> against speculative execution. One of these, known as variant-1, allows
> explicit bounds checks to be bypassed under speculation, providing an
> arbitrary read gadget. Further details can be found on the GPZ blog [2]
> and the Documentation patch in this series."
>
> This series incorporates Mark Rutland's latest ARM changes and adds
> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> based approach is provided as an opt-in fallback, but the default
> mitigation, '__array_ptr', uses a 'mask' approach that removes
> conditional branch instructions, and otherwise aims to redirect
> speculation to use a NULL pointer rather than a user controlled value.
>
> The mask is generated by the following from Alexei, and Linus:
>
>     mask = ~(long)(_i | (_s - 1 - _i)) >> (BITS_PER_LONG - 1);
>
> ...and Linus provided an optimized mask generation helper for x86:
>
>     asm ("cmpq %1,%2; sbbq %0,%0;"
>             :"=r" (mask)
>             :"r"(sz),"r" (idx)
>             :"cc");
>
> The 'array_ptr' mechanism can be switched between 'mask' and 'ifence'
> via the spectre_v1={mask,ifence} command line option, and the
> compile-time default is set by selecting either CONFIG_SPECTRE1_MASK or
> CONFIG_SPECTRE1_IFENCE.
>
> The 'array_ptr' infrastructure is the primary focus of this patch set.
> The individual patches that perform 'array_ptr' conversions are a
> point-in-time (i.e. earlier kernel, early analysis tooling, x86 only,
> etc...) start at finding some of these gadgets.
>
> Another consideration for reviewing these patches is the 'hygiene'
> argument. When a patch refers to hygiene it is concerned with stopping
> speculation on an unconstrained or insufficiently constrained pointer
> value under userspace control. That by itself is not sufficient for an
> attack (per current understanding) [3], but it is a necessary
> pre-condition. So 'hygiene' refers to cleaning up those suspect
> pointers regardless of whether they are usable as a gadget.
>
> These patches are also available via the 'nospec-v2' git branch here:
>
>     git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v2
>
> Note that the BPF fix for Spectre variant1 is merged in the bpf.git
> tree [4], and is not included in this branch.
>
> [2]: https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-with-side.html
> [3]: https://spectreattack.com/spectre.pdf
> [4]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=b2157399cc98
>
> ---
>
> Dan Williams (16):
>       x86: implement ifence()
>       x86: implement ifence_array_ptr() and array_ptr_mask()
>       asm-generic/barrier: mask speculative execution flows
>       x86: introduce __uaccess_begin_nospec and ASM_IFENCE
>       x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths
>       ipv6: prevent bounds-check bypass via speculative execution
>       ipv4: prevent bounds-check bypass via speculative execution
>       vfs, fdtable: prevent bounds-check bypass via speculative execution
>       userns: prevent bounds-check bypass via speculative execution
>       udf: prevent bounds-check bypass via speculative execution
>       [media] uvcvideo: prevent bounds-check bypass via speculative execution
>       carl9170: prevent bounds-check bypass via speculative execution
>       p54: prevent bounds-check bypass via speculative execution
>       qla2xxx: prevent bounds-check bypass via speculative execution
>       cw1200: prevent bounds-check bypass via speculative execution
>       net: mpls: prevent bounds-check bypass via speculative execution
>
> Mark Rutland (3):
>       Documentation: document array_ptr
>       arm64: implement ifence_array_ptr()
>       arm: implement ifence_array_ptr()
>
>  Documentation/speculation.txt            | 142 ++++++++++++++++++++++++++++++
>  arch/arm/Kconfig                         |   1
>  arch/arm/include/asm/barrier.h           |  24 +++++
>  arch/arm64/Kconfig                       |   1
>  arch/arm64/include/asm/barrier.h         |  24 +++++
>  arch/x86/Kconfig                         |   3 +
>  arch/x86/include/asm/barrier.h           |  46 ++++++++++
>  arch/x86/include/asm/msr.h               |   3 -
>  arch/x86/include/asm/smap.h              |   4 +
>  arch/x86/include/asm/uaccess.h           |  16 +++
>  arch/x86/include/asm/uaccess_32.h        |   6 +
>  arch/x86/include/asm/uaccess_64.h        |  12 +--
>  arch/x86/lib/copy_user_64.S              |   3 +
>  arch/x86/lib/usercopy_32.c               |   8 +-
>  drivers/media/usb/uvc/uvc_v4l2.c         |   9 +-
>  drivers/net/wireless/ath/carl9170/main.c |   7 +
>  drivers/net/wireless/intersil/p54/main.c |   9 +-
>  drivers/net/wireless/st/cw1200/sta.c     |  11 +-
>  drivers/net/wireless/st/cw1200/wsm.h     |   4 -
>  drivers/scsi/qla2xxx/qla_mr.c            |  17 ++--
>  fs/udf/misc.c                            |  40 +++++---
>  include/linux/fdtable.h                  |   7 +
>  include/linux/nospec.h                   |  71 +++++++++++++++
>  kernel/Kconfig.nospec                    |  31 +++++++
>  kernel/Makefile                          |   1
>  kernel/nospec.c                          |  52 +++++++++++
>  kernel/user_namespace.c                  |  11 +-
>  lib/Kconfig                              |   3 +
>  net/ipv4/raw.c                           |  10 +-
>  net/ipv6/raw.c                           |  10 +-
>  net/mpls/af_mpls.c                       |  12 +-
>  31 files changed, 521 insertions(+), 77 deletions(-)
>  create mode 100644 Documentation/speculation.txt
>  create mode 100644 include/linux/nospec.h
>  create mode 100644 kernel/Kconfig.nospec
>  create mode 100644 kernel/nospec.c
On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Should the array access in entry_SYSCALL_64_fastpath be made to use
> the masking approach?

That one has a bounds check for an inline constant.

	cmpq	$__NR_syscall_max, %rax

so should be safe.

The classic Spectre variant #1 code sequence is:

	int array_size;

	if (x < array_size) {
		something with array[x]
	}

which runs into problems because the array_size variable may not
be in cache, and while the CPU core is waiting for the value it
speculates inside the "if" body.

The syscall entry is more like:

	#define ARRAY_SIZE 10

	if (x < ARRAY_SIZE) {
		something with array[x]
	}

Here there isn't any reason for speculation. The core has the
value of 'x' in a register and the upper bound encoded into the
"cmp" instruction. Both are right there, no waiting, no speculation.

-Tony
On Fri, Jan 12, 2018 at 4:15 PM, Tony Luck <tony.luck@gmail.com> wrote:
>
> Here there isn't any reason for speculation. The core has the
> value of 'x' in a register and the upper bound encoded into the
> "cmp" instruction. Both are right there, no waiting, no speculation.

So this is an argument I haven't seen before (although it was brought
up in private long ago), but that is very relevant: the actual scope
and depth of speculation.

Your argument basically depends on just what gets speculated, and on
the _actual_ order of execution.

So your argument depends on "the uarch will actually run the code in
order if there are no events that block the pipeline".

Or at least it depends on a certain latency of the killing of any OoO
execution being low enough that the cache access doesn't even begin.

I realize that that is very much a particular microarchitectural
detail, but it's actually a *big* deal. Do we have a set of rules for
what is not a worry, simply because the speculated accesses get killed
early enough?

Apparently "test a register value against a constant" is good enough,
assuming that register is also needed for the address of the access.

              Linus
On Sat, Jan 13, 2018 at 10:51 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Jan 12, 2018 at 4:15 PM, Tony Luck <tony.luck@gmail.com> wrote:
>
> So your argument depends on "the uarch will actually run the code in
> order if there are no events that block the pipeline".

And might be bogus ... I'm a software person not a u-arch expert.
That sounded good in my head, but the level of parallelism may be
greater than I can imagine.

> Or at least it depends on a certain latency of the killing of any OoO
> execution being low enough that the cache access doesn't even begin.
>
> I realize that that is very much a particular microarchitectural
> detail, but it's actually a *big* deal. Do we have a set of rules for
> what is not a worry, simply because the speculated accesses get killed
> early enough?
>
> Apparently "test a register value against a constant" is good enough,
> assuming that register is also needed for the address of the access.

People who do understand this are working on what can be guaranteed.
For now don't make big plans based on my ramblings.

-Tony
Hi Dan, Linus,

On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> >>
> >> This series incorporates Mark Rutland's latest ARM changes and adds
> >> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> >> based approach is provided as an opt-in fallback, but the default
> >> mitigation, '__array_ptr', uses a 'mask' approach that removes
> >> conditional branches instructions, and otherwise aims to redirect
> >> speculation to use a NULL pointer rather than a user controlled value.
> >
> > Do you have any performance numbers and perhaps example code
> > generation? Is this noticeable? Are there any microbenchmarks showing
> > the difference between lfence use and the masking model?
>
> I don't have performance numbers, but here's a sample code generation
> from __fcheck_files, where the 'and; lea; and' sequence is portion of
> array_ptr() after the mask generation with 'sbb'.
>
>     fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
>      8e7:   8b 02                   mov    (%rdx),%eax
>      8e9:   48 39 c7                cmp    %rax,%rdi
>      8ec:   48 19 c9                sbb    %rcx,%rcx
>      8ef:   48 8b 42 08             mov    0x8(%rdx),%rax
>      8f3:   48 89 fe                mov    %rdi,%rsi
>      8f6:   48 21 ce                and    %rcx,%rsi
>      8f9:   48 8d 04 f0             lea    (%rax,%rsi,8),%rax
>      8fd:   48 21 c8                and    %rcx,%rax
>
> > Having both seems good for testing, but wouldn't we want to pick one
> > in the end?
>
> I was thinking we'd keep it as a 'just in case' sort of thing, at
> least until the 'probably safe' assumption of the 'mask' approach has
> more time to settle out.

From the arm64 side, the only concern I have (and this actually applies
to our CSDB sequence as well) is the calculation of the array size by
the caller. As Linus mentioned at the end of [1], if the determination
of the size argument is based on a conditional branch, then masking
doesn't help because you bound within the wrong range under speculation.

We ran into this when trying to use masking to protect our uaccess
routines where the conditional bound is either KERNEL_DS or USER_DS.
It's possible that a prior conditional set_fs(KERNEL_DS) could defeat
the masking and so we'd need to throw some heavy barriers in set_fs to
make it robust.

The question then is whether or not we're likely to run into this
elsewhere, and I don't have a good feel for that.

Will

[1] http://lkml.kernel.org/r/CA+55aFz0tsreoa=5Ud2noFCpng-dizLBhT9WU9asyhpLfjdcYA@mail.gmail.com
On Thu, Jan 18, 2018 at 5:18 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Dan, Linus,
>
> On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
>> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> >>
>> >> This series incorporates Mark Rutland's latest ARM changes and adds
>> >> the x86 specific implementation of 'ifence_array_ptr'. That ifence
>> >> based approach is provided as an opt-in fallback, but the default
>> >> mitigation, '__array_ptr', uses a 'mask' approach that removes
>> >> conditional branches instructions, and otherwise aims to redirect
>> >> speculation to use a NULL pointer rather than a user controlled value.
>> >
>> > Do you have any performance numbers and perhaps example code
>> > generation? Is this noticeable? Are there any microbenchmarks showing
>> > the difference between lfence use and the masking model?
>>
>> I don't have performance numbers, but here's a sample code generation
>> from __fcheck_files, where the 'and; lea; and' sequence is portion of
>> array_ptr() after the mask generation with 'sbb'.
>>
>>     fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
>>      8e7:   8b 02                   mov    (%rdx),%eax
>>      8e9:   48 39 c7                cmp    %rax,%rdi
>>      8ec:   48 19 c9                sbb    %rcx,%rcx
>>      8ef:   48 8b 42 08             mov    0x8(%rdx),%rax
>>      8f3:   48 89 fe                mov    %rdi,%rsi
>>      8f6:   48 21 ce                and    %rcx,%rsi
>>      8f9:   48 8d 04 f0             lea    (%rax,%rsi,8),%rax
>>      8fd:   48 21 c8                and    %rcx,%rax
>>
>> > Having both seems good for testing, but wouldn't we want to pick one
>> > in the end?
>>
>> I was thinking we'd keep it as a 'just in case' sort of thing, at
>> least until the 'probably safe' assumption of the 'mask' approach has
>> more time to settle out.
>
> From the arm64 side, the only concern I have (and this actually applies to
> our CSDB sequence as well) is the calculation of the array size by the
> caller. As Linus mentioned at the end of [1], if the determination of the
> size argument is based on a conditional branch, then masking doesn't help
> because you bound within the wrong range under speculation.
>
> We ran into this when trying to use masking to protect our uaccess routines
> where the conditional bound is either KERNEL_DS or USER_DS. It's possible
> that a prior conditional set_fs(KERNEL_DS) could defeat the masking and so
> we'd need to throw some heavy barriers in set_fs to make it robust.

At least in the conditional mask case near set_fs() usage the approach
we are taking is to use a barrier. I.e. the following guidance from
Linus:

    "Basically, the rule is trivial: find all 'stac' users, and use
    address masking if those users already integrate the limit check,
    and lfence if they don't."

...which translates to narrow the pointer for get_user() and use a
barrier for __get_user().
On Thu, Jan 18, 2018 at 08:58:08AM -0800, Dan Williams wrote:
> On Thu, Jan 18, 2018 at 5:18 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
> >> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
> >> <torvalds@linux-foundation.org> wrote:
> >> > On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> >> >>
> >> >> This series incorporates Mark Rutland's latest ARM changes and adds
> >> >> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> >> >> based approach is provided as an opt-in fallback, but the default
> >> >> mitigation, '__array_ptr', uses a 'mask' approach that removes
> >> >> conditional branches instructions, and otherwise aims to redirect
> >> >> speculation to use a NULL pointer rather than a user controlled value.
> >> >
> >> > Do you have any performance numbers and perhaps example code
> >> > generation? Is this noticeable? Are there any microbenchmarks showing
> >> > the difference between lfence use and the masking model?
> >>
> >> I don't have performance numbers, but here's a sample code generation
> >> from __fcheck_files, where the 'and; lea; and' sequence is portion of
> >> array_ptr() after the mask generation with 'sbb'.
> >>
> >>     fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
> >>      8e7:   8b 02                   mov    (%rdx),%eax
> >>      8e9:   48 39 c7                cmp    %rax,%rdi
> >>      8ec:   48 19 c9                sbb    %rcx,%rcx
> >>      8ef:   48 8b 42 08             mov    0x8(%rdx),%rax
> >>      8f3:   48 89 fe                mov    %rdi,%rsi
> >>      8f6:   48 21 ce                and    %rcx,%rsi
> >>      8f9:   48 8d 04 f0             lea    (%rax,%rsi,8),%rax
> >>      8fd:   48 21 c8                and    %rcx,%rax
> >>
> >> > Having both seems good for testing, but wouldn't we want to pick one
> >> > in the end?
> >>
> >> I was thinking we'd keep it as a 'just in case' sort of thing, at
> >> least until the 'probably safe' assumption of the 'mask' approach has
> >> more time to settle out.
> >
> > From the arm64 side, the only concern I have (and this actually applies to
> > our CSDB sequence as well) is the calculation of the array size by the
> > caller. As Linus mentioned at the end of [1], if the determination of the
> > size argument is based on a conditional branch, then masking doesn't help
> > because you bound within the wrong range under speculation.
> >
> > We ran into this when trying to use masking to protect our uaccess routines
> > where the conditional bound is either KERNEL_DS or USER_DS. It's possible
> > that a prior conditional set_fs(KERNEL_DS) could defeat the masking and so
> > we'd need to throw some heavy barriers in set_fs to make it robust.
>
> At least in the conditional mask case near set_fs() usage the approach
> we are taking is to use a barrier. I.e. the following guidance from
> Linus:
>
> "Basically, the rule is trivial: find all 'stac' users, and use address
> masking if those users already integrate the limit check, and lfence
> they don't."
>
> ...which translates to narrow the pointer for get_user() and use a
> barrier for __get_user().

Great, that matches my thinking re set_fs but I'm still worried about
finding all the places where the bound is conditional for other users
of the macro. Then again, finding the places that need this macro in
the first place is tough enough so perhaps analysing the bound
calculation doesn't make it much worse.

Will
Hi Will,

On Thursday, 18 January 2018 19:05:47 EET Will Deacon wrote:
> On Thu, Jan 18, 2018 at 08:58:08AM -0800, Dan Williams wrote:
> > On Thu, Jan 18, 2018 at 5:18 AM, Will Deacon wrote:
> >> On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
> >>> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds wrote:
> >>>> On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams wrote:
> >>>>> This series incorporates Mark Rutland's latest ARM changes and adds
> >>>>> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> >>>>> based approach is provided as an opt-in fallback, but the default
> >>>>> mitigation, '__array_ptr', uses a 'mask' approach that removes
> >>>>> conditional branches instructions, and otherwise aims to redirect
> >>>>> speculation to use a NULL pointer rather than a user controlled
> >>>>> value.
> >>>>
> >>>> Do you have any performance numbers and perhaps example code
> >>>> generation? Is this noticeable? Are there any microbenchmarks showing
> >>>> the difference between lfence use and the masking model?
> >>>
> >>> I don't have performance numbers, but here's a sample code generation
> >>> from __fcheck_files, where the 'and; lea; and' sequence is portion of
> >>> array_ptr() after the mask generation with 'sbb'.
> >>>
> >>>     fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
> >>>
> >>>      8e7:   8b 02                   mov    (%rdx),%eax
> >>>      8e9:   48 39 c7                cmp    %rax,%rdi
> >>>      8ec:   48 19 c9                sbb    %rcx,%rcx
> >>>      8ef:   48 8b 42 08             mov    0x8(%rdx),%rax
> >>>      8f3:   48 89 fe                mov    %rdi,%rsi
> >>>      8f6:   48 21 ce                and    %rcx,%rsi
> >>>      8f9:   48 8d 04 f0             lea    (%rax,%rsi,8),%rax
> >>>      8fd:   48 21 c8                and    %rcx,%rax
> >>>
> >>>> Having both seems good for testing, but wouldn't we want to pick one
> >>>> in the end?
> >>>
> >>> I was thinking we'd keep it as a 'just in case' sort of thing, at
> >>> least until the 'probably safe' assumption of the 'mask' approach has
> >>> more time to settle out.
> >>
> >> From the arm64 side, the only concern I have (and this actually applies
> >> to our CSDB sequence as well) is the calculation of the array size by
> >> the caller. As Linus mentioned at the end of [1], if the determination
> >> of the size argument is based on a conditional branch, then masking
> >> doesn't help because you bound within the wrong range under speculation.
> >>
> >> We ran into this when trying to use masking to protect our uaccess
> >> routines where the conditional bound is either KERNEL_DS or USER_DS.
> >> It's possible that a prior conditional set_fs(KERNEL_DS) could defeat
> >> the masking and so we'd need to throw some heavy barriers in set_fs to
> >> make it robust.
> >
> > At least in the conditional mask case near set_fs() usage the approach
> > we are taking is to use a barrier. I.e. the following guidance from
> > Linus:
> >
> > "Basically, the rule is trivial: find all 'stac' users, and use address
> > masking if those users already integrate the limit check, and lfence
> > they don't."
> >
> > ...which translates to narrow the pointer for get_user() and use a
> > barrier for __get_user().
>
> Great, that matches my thinking re set_fs but I'm still worried about
> finding all the places where the bound is conditional for other users
> of the macro. Then again, finding the places that need this macro in the
> first place is tough enough so perhaps analysing the bound calculation
> doesn't make it much worse.

It might not now, but if the bound calculation changes later, I'm pretty
sure we'll forget to update the speculation barrier macro at least in
some cases. Without the help of static (or possibly dynamic) code
analysis I think we're bound to reintroduce problems over time, but
that's true for finding places where the barrier is needed, not just
for barrier selection based on bound calculation.