Message ID | 20180914153056.3644-4-npiggin@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 8b92887ced2e3fce223412487f99d4ef3f07b490 |
Headers | show |
Series | SLB miss conversion to C, and SLB optimisations | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | next/apply_patch Successfully applied |
snowpatch_ozlabs/checkpatch | success | Test checkpatch on branch next |
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.19-rc3 next-20180913]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/SLB-miss-conversion-to-C-and-SLB-optimisations/20180917-015458
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc
Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
Note: the linux-review/Nicholas-Piggin/SLB-miss-conversion-to-C-and-SLB-optimisations/20180917-015458 HEAD b26bd44c74488169a0fd19eef43ea3db189a207d builds fine.
It only hurts bisectibility.
All errors (new ones prefixed by >>):
arch/powerpc/mm/slb.c: In function 'switch_slb':
>> arch/powerpc/mm/slb.c:258:4: error: 'slbie_data' may be used uninitialized in this function [-Werror=maybe-uninitialized]
asm volatile("slbie %0" : : "r" (slbie_data));
^~~
cc1: all warnings being treated as errors
vim +/slbie_data +258 arch/powerpc/mm/slb.c
465ccab9 arch/powerpc/mm/slb.c will schmidt 2007-10-31 224
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 225 /* Flush all user entries from the segment table of the current processor. */
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 226 void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 227 {
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 228 unsigned long offset;
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 229 unsigned long pc = KSTK_EIP(tsk);
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 230 unsigned long stack = KSTK_ESP(tsk);
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 231 unsigned long exec_base;
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 232
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 233 /*
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 234 * We need interrupts hard-disabled here, not just soft-disabled,
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 235 * so that a PMU interrupt can't occur, which might try to access
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 236 * user memory (to get a stack trace) and possible cause an SLB miss
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 237 * which would update the slb_cache/slb_cache_ptr fields in the PACA.
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 238 */
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 239 hard_irq_disable();
9c1e1052 arch/powerpc/mm/slb.c Paul Mackerras 2009-08-17 240 offset = get_paca()->slb_cache_ptr;
44ae3ab3 arch/powerpc/mm/slb.c Matt Evans 2011-04-06 241 if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) &&
f66bce5e arch/powerpc/mm/slb.c Olof Johansson 2007-10-16 242 offset <= SLB_CACHE_ENTRIES) {
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 243 unsigned long slbie_data;
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 244 int i;
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 245
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 246 asm volatile("isync" : : : "memory");
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 247 for (i = 0; i < offset; i++) {
1189be65 arch/powerpc/mm/slb.c Paul Mackerras 2007-10-11 248 slbie_data = (unsigned long)get_paca()->slb_cache[i]
1189be65 arch/powerpc/mm/slb.c Paul Mackerras 2007-10-11 249 << SID_SHIFT; /* EA */
1189be65 arch/powerpc/mm/slb.c Paul Mackerras 2007-10-11 250 slbie_data |= user_segment_size(slbie_data)
1189be65 arch/powerpc/mm/slb.c Paul Mackerras 2007-10-11 251 << SLBIE_SSIZE_SHIFT;
1189be65 arch/powerpc/mm/slb.c Paul Mackerras 2007-10-11 252 slbie_data |= SLBIE_C; /* C set for user addresses */
1189be65 arch/powerpc/mm/slb.c Paul Mackerras 2007-10-11 253 asm volatile("slbie %0" : : "r" (slbie_data));
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 254 }
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 255
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 256 /* Workaround POWER5 < DD2.1 issue */
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 257 if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1)
1189be65 arch/powerpc/mm/slb.c Paul Mackerras 2007-10-11 @258 asm volatile("slbie %0" : : "r" (slbie_data));
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 259
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 260 asm volatile("isync" : : : "memory");
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 261 } else {
6697d605 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 262 __slb_flush_and_rebolt();
c15c9670 arch/powerpc/mm/slb.c Nicholas Piggin 2018-09-15 263 }
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 264
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 265 get_paca()->slb_cache_ptr = 0;
52b1e665 arch/powerpc/mm/slb.c Aneesh Kumar K.V 2017-03-22 266 copy_mm_to_paca(mm);
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 267
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 268 /*
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 269 * preload some userspace segments into the SLB.
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 270 * Almost all 32 and 64bit PowerPC executables are linked at
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 271 * 0x10000000 so it makes sense to preload this segment.
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 272 */
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 273 exec_base = 0x10000000;
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 274
5eb9bac0 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 275 if (is_kernel_addr(pc) || is_kernel_addr(stack) ||
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 276 is_kernel_addr(exec_base))
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 277 return;
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 278
5eb9bac0 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 279 slb_allocate(pc);
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 280
5eb9bac0 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 281 if (!esids_match(pc, stack))
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 282 slb_allocate(stack);
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 283
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 284 if (!esids_match(pc, exec_base) &&
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 285 !esids_match(stack, exec_base))
de4376c2 arch/powerpc/mm/slb.c Anton Blanchard 2009-07-13 286 slb_allocate(exec_base);
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 287 }
^1da177e arch/ppc64/mm/slb.c Linus Torvalds 2005-04-16 288
:::::: The code at line 258 was first introduced by commit
:::::: 1189be6508d45183013ddb82b18f4934193de274 [POWERPC] Use 1TB segments
:::::: TO: Paul Mackerras <paulus@samba.org>
:::::: CC: Paul Mackerras <paulus@samba.org>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
Nicholas Piggin <npiggin@gmail.com> writes: > The POWER5 < DD2.1 issue is that slbie needs to be issued more than > once. It came in with this change: > > ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, david@gibson.dropbear.id.au > [PATCH] POWER5 erratum workaround > > Early POWER5 revisions (<DD2.1) have a problem requiring slbie > instructions to be repeated under some circumstances. The patch below > adds a workaround (patch made by Anton Blanchard). Thanks for extracting this. Can we add this to the code? Also I am not sure what is repeated here? Is it that we just need one slb extra(hence only applicable to offset == 1) or is it that we need to make sure there is always one slb extra? The code does the former. Do you a have link for that email patch? > > The extra slbie in switch_slb is done even for the case where slbia is > called (slb_flush_and_rebolt). I don't believe that is required > because there are other slb_flush_and_rebolt callers which do not > issue the workaround slbie, which would be broken if it was required. > > It also seems to be fine inside the isync with the first slbie, as it > is in the kernel stack switch code. > > So move this workaround to where it is required. This is not much of > an optimisation because this is the fast path, but it makes the code > more understandable and neater. > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > --- > arch/powerpc/mm/slb.c | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c > index 1c7128c63a4b..d952ece3abf7 100644 > --- a/arch/powerpc/mm/slb.c > +++ b/arch/powerpc/mm/slb.c > @@ -226,7 +226,6 @@ static inline int esids_match(unsigned long addr1, unsigned long addr2) > void switch_slb(struct task_struct *tsk, struct mm_struct *mm) > { > unsigned long offset; > - unsigned long slbie_data = 0; > unsigned long pc = KSTK_EIP(tsk); > unsigned long stack = KSTK_ESP(tsk); > unsigned long exec_base; > @@ -241,7 +240,9 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm) > offset = get_paca()->slb_cache_ptr; > if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) && > offset <= SLB_CACHE_ENTRIES) { > + unsigned long slbie_data; > int i; > + > asm volatile("isync" : : : "memory"); > for (i = 0; i < offset; i++) { > slbie_data = (unsigned long)get_paca()->slb_cache[i] > @@ -251,15 +252,14 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm) > slbie_data |= SLBIE_C; /* C set for user addresses */ > asm volatile("slbie %0" : : "r" (slbie_data)); > } > - asm volatile("isync" : : : "memory"); > - } else { > - __slb_flush_and_rebolt(); > - } > > - if (!cpu_has_feature(CPU_FTR_ARCH_207S)) { > /* Workaround POWER5 < DD2.1 issue */ > - if (offset == 1 || offset > SLB_CACHE_ENTRIES) > + if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1) > asm volatile("slbie %0" : : "r" (slbie_data)); > + > + asm volatile("isync" : : : "memory"); > + } else { > + __slb_flush_and_rebolt(); > } > > get_paca()->slb_cache_ptr = 0; > -- > 2.18.0
On Mon, 17 Sep 2018 11:30:16 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote: > Nicholas Piggin <npiggin@gmail.com> writes: > > > The POWER5 < DD2.1 issue is that slbie needs to be issued more than > > once. It came in with this change: > > > > ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, david@gibson.dropbear.id.au > > [PATCH] POWER5 erratum workaround > > > > Early POWER5 revisions (<DD2.1) have a problem requiring slbie > > instructions to be repeated under some circumstances. The patch below > > adds a workaround (patch made by Anton Blanchard). > > Thanks for extracting this. Can we add this to the code? The comment? Sure. > Also I am not > sure what is repeated here? Is it that we just need one slb extra(hence > only applicable to offset == 1) or is it that we need to make sure there > is always one slb extra? The code does the former. Yeah it has always done the former, so my assumption is that you just need more than one slbie. I don't think we need to bother revisiting that assumption unless someone can pull up something definitive. What I did change is that slbia no longer has the additional slbie, but I think there are strong reasons not to need that. > Do you a have link for > that email patch? I tried looking through the archives around that date but could not find it. That came from a bitkeeper log. Thanks, Nick
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c index 1c7128c63a4b..d952ece3abf7 100644 --- a/arch/powerpc/mm/slb.c +++ b/arch/powerpc/mm/slb.c @@ -226,7 +226,6 @@ static inline int esids_match(unsigned long addr1, unsigned long addr2) void switch_slb(struct task_struct *tsk, struct mm_struct *mm) { unsigned long offset; - unsigned long slbie_data = 0; unsigned long pc = KSTK_EIP(tsk); unsigned long stack = KSTK_ESP(tsk); unsigned long exec_base; @@ -241,7 +240,9 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm) offset = get_paca()->slb_cache_ptr; if (!mmu_has_feature(MMU_FTR_NO_SLBIE_B) && offset <= SLB_CACHE_ENTRIES) { + unsigned long slbie_data; int i; + asm volatile("isync" : : : "memory"); for (i = 0; i < offset; i++) { slbie_data = (unsigned long)get_paca()->slb_cache[i] @@ -251,15 +252,14 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm) slbie_data |= SLBIE_C; /* C set for user addresses */ asm volatile("slbie %0" : : "r" (slbie_data)); } - asm volatile("isync" : : : "memory"); - } else { - __slb_flush_and_rebolt(); - } - if (!cpu_has_feature(CPU_FTR_ARCH_207S)) { /* Workaround POWER5 < DD2.1 issue */ - if (offset == 1 || offset > SLB_CACHE_ENTRIES) + if (!cpu_has_feature(CPU_FTR_ARCH_207S) && offset == 1) asm volatile("slbie %0" : : "r" (slbie_data)); + + asm volatile("isync" : : : "memory"); + } else { + __slb_flush_and_rebolt(); } get_paca()->slb_cache_ptr = 0;
The POWER5 < DD2.1 issue is that slbie needs to be issued more than once. It came in with this change: ChangeSet@1.1608, 2004-04-29 07:12:31-07:00, david@gibson.dropbear.id.au [PATCH] POWER5 erratum workaround Early POWER5 revisions (<DD2.1) have a problem requiring slbie instructions to be repeated under some circumstances. The patch below adds a workaround (patch made by Anton Blanchard). The extra slbie in switch_slb is done even for the case where slbia is called (slb_flush_and_rebolt). I don't believe that is required because there are other slb_flush_and_rebolt callers which do not issue the workaround slbie, which would be broken if it was required. It also seems to be fine inside the isync with the first slbie, as it is in the kernel stack switch code. So move this workaround to where it is required. This is not much of an optimisation because this is the fast path, but it makes the code more understandable and neater. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/mm/slb.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)