Message ID | 20200206030900.147032-2-leonardo@linux.ibm.com |
---|---|
State | Not Applicable |
Headers | show |
Series | Introduces new functions for tracking lockless pagetable walks | expand |
Le 06/02/2020 à 04:08, Leonardo Bras a écrit : > It's necessary to track lockless pagetable walks, in order to avoid doing > THP splitting/collapsing during them. > > The default solution is to disable irq before lockless pagetable walks and > enable it after it's finished. > > On code, this means you can find local_irq_disable() and local_irq_enable() > around some pieces of code, usually without comments on why it is needed. > > This patch proposes a set of generic functions to be called before starting > and after finishing a lockless pagetable walk. It is supposed to make clear > that a lockless pagetable walk happens there, and also carries information > on why the irq disable/enable is needed. > > begin_lockless_pgtbl_walk() > Insert before starting any lockless pgtable walk > end_lockless_pgtbl_walk() > Insert after the end of any lockless pgtable walk > (Mostly after the ptep is last used) > > A memory barrier was also added just to make sure there is no speculative > read outside the interrupt disabled area. Other than that, it is not > supposed to have any change of behavior from current code. Is that speculative barrier necessary for all architectures ? Does it impact performance ? Shouldn't this be another patch ? > > It is planned to allow arch-specific versions, so that additional steps can > be added while keeping the code clean. > > Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com> > --- > include/asm-generic/pgtable.h | 51 +++++++++++++++++++++++++++++++++++ > 1 file changed, 51 insertions(+) > > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h > index e2e2bef07dd2..8d368d3c0974 100644 > --- a/include/asm-generic/pgtable.h > +++ b/include/asm-generic/pgtable.h > @@ -1222,6 +1222,57 @@ static inline bool arch_has_pfn_modify_check(void) > #endif > #endif > > +#ifndef __HAVE_ARCH_LOCKLESS_PGTBL_WALK_CONTROL > +/* > + * begin_lockless_pgtbl_walk: Must be inserted before a function call that does > + * lockless pagetable walks, such as __find_linux_pte() > + */ > +static inline > +unsigned long begin_lockless_pgtbl_walk(void) What about keeping the same syntax as local_irq_save(), something like: #define begin_lockless_pgtbl_walk(flags) \ do { local_irq_save(flags); smp_mb(); } while (0) > +{ > + unsigned long irq_mask; > + > + /* > + * Interrupts must be disabled during the lockless page table walk. > + * That's because the deleting or splitting involves flushing TLBs, > + * which in turn issues interrupts, that will block when disabled. > + */ > + local_irq_save(irq_mask); > + > + /* > + * This memory barrier pairs with any code that is either trying to > + * delete page tables, or split huge pages. Without this barrier, > + * the page tables could be read speculatively outside of interrupt > + * disabling. > + */ > + smp_mb(); > + > + return irq_mask; > +} > + > +/* > + * end_lockless_pgtbl_walk: Must be inserted after the last use of a pointer > + * returned by a lockless pagetable walk, such as __find_linux_pte() > + */ > +static inline void end_lockless_pgtbl_walk(unsigned long irq_mask) Same #define end_lockless_pgtbl_walk(flags) \ do { smp_mb(); local_irq_restore(flags); } while (0); > +{ > + /* > + * This memory barrier pairs with any code that is either trying to > + * delete page tables, or split huge pages. Without this barrier, > + * the page tables could be read speculatively outside of interrupt > + * disabling. > + */ > + smp_mb(); > + > + /* > + * Interrupts must be disabled during the lockless page table walk. > + * That's because the deleting or splitting involves flushing TLBs, > + * which in turn issues interrupts, that will block when disabled. > + */ > + local_irq_restore(irq_mask); > +} > +#endif > + > /* > * On some architectures it depends on the mm if the p4d/pud or pmd > * layer of the page table hierarchy is folded or not. > Christophe
Hello Christophe, thanks for the feedback! On Thu, 2020-02-06 at 06:54 +0100, Christophe Leroy wrote: > > A memory barrier was also added just to make sure there is no speculative > > read outside the interrupt disabled area. Other than that, it is not > > supposed to have any change of behavior from current code. > > Is that speculative barrier necessary for all architectures ? Does it > impact performance ? Shouldn't this be another patch ? It makes sense, better keep the code as much as possible as it was. If any arch finds this barrier needed, it can implement it's own version of this function (or another patch to add this to generic, if proved to be needed in every arch). > > +#ifndef __HAVE_ARCH_LOCKLESS_PGTBL_WALK_CONTROL > > +/* > > + * begin_lockless_pgtbl_walk: Must be inserted before a function call that does > > + * lockless pagetable walks, such as __find_linux_pte() > > + */ > > +static inline > > +unsigned long begin_lockless_pgtbl_walk(void) > > What about keeping the same syntax as local_irq_save(), something like: > > #define begin_lockless_pgtbl_walk(flags) \ > do { > local_irq_save(flags); > smp_mb(); > } while (0) > Makes sense. But wouldn't inlining have the same code output? Best regards, Leonardo Bras
Hi Leonardo, Thank you for the patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on asm-generic/master paulus-powerpc/kvm-ppc-next linus/master v5.5 next-20200207] [cannot apply to kvm-ppc/kvm-ppc-next] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Leonardo-Bras/Introduces-new-functions-for-tracking-lockless-pagetable-walks/20200207-071035 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: xtensa-common_defconfig (attached as .config) compiler: xtensa-linux-gcc (GCC) 7.5.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.5.0 make.cross ARCH=xtensa If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> All errors (new ones prefixed by >>): include/asm-generic/pgtable.h: Assembler messages: include/asm-generic/pgtable.h:1230: Error: unknown opcode or format name 'static' include/asm-generic/pgtable.h:1231: Error: unknown opcode or format name 'unsigned' include/asm-generic/pgtable.h:1233: Error: unknown opcode or format name 'unsigned' include/asm-generic/pgtable.h:1240: Error: unknown opcode or format name 'local_irq_save' include/asm-generic/pgtable.h:1248: Error: unknown opcode or format name 'smp_mb' include/asm-generic/pgtable.h:1250: Error: unknown opcode or format name 'return' >> include/asm-generic/pgtable.h:1251: Error: couldn't find a valid instruction format ops were: include/asm-generic/pgtable.h:1257: Error: unknown opcode or format name 'static' include/asm-generic/pgtable.h:1265: Error: unknown opcode or format name 'smp_mb' include/asm-generic/pgtable.h:1272: Error: unknown opcode or format name 'local_irq_restore' include/asm-generic/pgtable.h:1273: Error: couldn't find a valid instruction format ops were: vim +1251 include/asm-generic/pgtable.h 1224 1225 #ifndef __HAVE_ARCH_LOCKLESS_PGTBL_WALK_CONTROL 1226 /* 1227 * begin_lockless_pgtbl_walk: Must be inserted before a function call that does 1228 * lockless pagetable walks, such as __find_linux_pte() 1229 */ 1230 static inline 1231 unsigned long begin_lockless_pgtbl_walk(void) 1232 { 1233 unsigned long irq_mask; 1234 1235 /* 1236 * Interrupts must be disabled during the lockless page table walk. 1237 * That's because the deleting or splitting involves flushing TLBs, 1238 * which in turn issues interrupts, that will block when disabled. 1239 */ 1240 local_irq_save(irq_mask); 1241 1242 /* 1243 * This memory barrier pairs with any code that is either trying to 1244 * delete page tables, or split huge pages. Without this barrier, 1245 * the page tables could be read speculatively outside of interrupt 1246 * disabling. 1247 */ 1248 smp_mb(); 1249 1250 return irq_mask; > 1251 } 1252 --- 0-DAY kernel test infrastructure Open Source Technology Center https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index e2e2bef07dd2..8d368d3c0974 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1222,6 +1222,57 @@ static inline bool arch_has_pfn_modify_check(void) #endif #endif +#ifndef __HAVE_ARCH_LOCKLESS_PGTBL_WALK_CONTROL +/* + * begin_lockless_pgtbl_walk: Must be inserted before a function call that does + * lockless pagetable walks, such as __find_linux_pte() + */ +static inline +unsigned long begin_lockless_pgtbl_walk(void) +{ + unsigned long irq_mask; + + /* + * Interrupts must be disabled during the lockless page table walk. + * That's because the deleting or splitting involves flushing TLBs, + * which in turn issues interrupts, that will block when disabled. + */ + local_irq_save(irq_mask); + + /* + * This memory barrier pairs with any code that is either trying to + * delete page tables, or split huge pages. Without this barrier, + * the page tables could be read speculatively outside of interrupt + * disabling. + */ + smp_mb(); + + return irq_mask; +} + +/* + * end_lockless_pgtbl_walk: Must be inserted after the last use of a pointer + * returned by a lockless pagetable walk, such as __find_linux_pte() + */ +static inline void end_lockless_pgtbl_walk(unsigned long irq_mask) +{ + /* + * This memory barrier pairs with any code that is either trying to + * delete page tables, or split huge pages. Without this barrier, + * the page tables could be read speculatively outside of interrupt + * disabling. + */ + smp_mb(); + + /* + * Interrupts must be disabled during the lockless page table walk. + * That's because the deleting or splitting involves flushing TLBs, + * which in turn issues interrupts, that will block when disabled. + */ + local_irq_restore(irq_mask); +} +#endif + /* * On some architectures it depends on the mm if the p4d/pud or pmd * layer of the page table hierarchy is folded or not.
It's necessary to track lockless pagetable walks, in order to avoid doing THP splitting/collapsing during them. The default solution is to disable irq before lockless pagetable walks and enable it after it's finished. On code, this means you can find local_irq_disable() and local_irq_enable() around some pieces of code, usually without comments on why it is needed. This patch proposes a set of generic functions to be called before starting and after finishing a lockless pagetable walk. It is supposed to make clear that a lockless pagetable walk happens there, and also carries information on why the irq disable/enable is needed. begin_lockless_pgtbl_walk() Insert before starting any lockless pgtable walk end_lockless_pgtbl_walk() Insert after the end of any lockless pgtable walk (Mostly after the ptep is last used) A memory barrier was also added just to make sure there is no speculative read outside the interrupt disabled area. Other than that, it is not supposed to have any change of behavior from current code. It is planned to allow arch-specific versions, so that additional steps can be added while keeping the code clean. Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com> --- include/asm-generic/pgtable.h | 51 +++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+)