| Message ID | 20190117121328.13395-2-mpe@ellerman.id.au (mailing list archive) |
|---|---|
| State | Superseded |
| Series | [1/4] powerpc/64s: Always set mmu_slb_size using slb_set_size() |
| Context | Check | Description |
|---|---|---|
| snowpatch_ozlabs/apply_patch | success | next/apply_patch Successfully applied |
| snowpatch_ozlabs/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 24 lines checked |
On Thu, Jan 17, 2019 at 11:13:26PM +1100, Michael Ellerman wrote:
> The recent rewrite of the SLB code into C included the assumption that
> all CPUs we run on have at least 32 SLB entries. This is currently
> true but a bit fragile as the SLB size is actually defined by the
> device tree and so could theoretically change at any time.

It also is guaranteed by the architecture, since at least 2.02, FWIW.


Segher
Segher Boessenkool <segher@kernel.crashing.org> writes:

> On Thu, Jan 17, 2019 at 11:13:26PM +1100, Michael Ellerman wrote:
>> The recent rewrite of the SLB code into C included the assumption that
>> all CPUs we run on have at least 32 SLB entries. This is currently
>> true but a bit fragile as the SLB size is actually defined by the
>> device tree and so could theoretically change at any time.
>
> It also is guaranteed by the architecture, since at least 2.02, FWIW.

True. Actually 2.00 says at least 32.

Unfortunately we don't live in a world where "the architecture
guarantees it" has any bearing on reality :)

But given it *should* always be at least 32 maybe I should optimise for
that case. We could use a static key to skip the U32_MAX comparison and
go down the else path.

cheers
On Fri, Jan 18, 2019 at 11:28:24PM +1100, Michael Ellerman wrote:
> Segher Boessenkool <segher@kernel.crashing.org> writes:
>
> > On Thu, Jan 17, 2019 at 11:13:26PM +1100, Michael Ellerman wrote:
> >> The recent rewrite of the SLB code into C included the assumption that
> >> all CPUs we run on have at least 32 SLB entries. This is currently
> >> true but a bit fragile as the SLB size is actually defined by the
> >> device tree and so could theoretically change at any time.
> >
> > It also is guaranteed by the architecture, since at least 2.02, FWIW.
>
> True. Actually 2.00 says at least 32.
>
> Unfortunately we don't live in a world where "the architecture
> guarantees it" has any bearing on reality :)

It's a pretty strong hint. I don't remember any hardware where it is
not true, either. (That might be selective memory ;-) )

> But given it *should* always be at least 32 maybe I should optimise for
> that case. We could use a static key to skip the U32_MAX comparison and
> go down the else path.

Ah that sounds like a good idea :-)


Segher
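The static-key idea discussed above could look something like the following userspace sketch. This is an approximation, not kernel code: a plain bool stands in for what the kernel would express with a `DEFINE_STATIC_KEY_TRUE` / `static_branch_likely()` pair, `ffz()` is modeled with `__builtin_ctz` on the complement, and the names `slb_has_32_entries` and `alloc_index` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a static key: flipped once at boot if the device tree
 * reports fewer than 32 SLB entries (hypothetical name). */
static bool slb_has_32_entries = true;
static uint32_t slb_full_bitmap = UINT32_MAX;

/* Return the first free index, or -1 when the bitmap is full. */
static int alloc_index(uint32_t *used)
{
    if (slb_has_32_entries) {
        /* Common case: "full" can only mean all 32 bits set, so the
         * load of slb_full_bitmap is skipped entirely. */
        if (*used == UINT32_MAX)
            return -1;
    } else if (*used == slb_full_bitmap) {
        return -1;
    }
    int index = __builtin_ctz(~*used);  /* models ffz(): first zero bit */
    *used |= 1u << index;
    return index;
}
```

With a real static key the `slb_has_32_entries` test compiles down to a runtime-patched branch, so the fast path pays neither the flag load nor the `slb_full_bitmap` load.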
Michael Ellerman <mpe@ellerman.id.au> writes:

> The recent rewrite of the SLB code into C included the assumption that
> all CPUs we run on have at least 32 SLB entries. This is currently
> true but a bit fragile as the SLB size is actually defined by the
> device tree and so could theoretically change at any time.
>
> The assumption is encoded in the fact that we use U32_MAX as the value
> for a full SLB bitmap. Instead, calculate what the full bitmap would
> be based on the SLB size we're given and store it. This still requires
> the SLB size to be a power of 2.

So if it is less than 32 we want to make sure we don't allocate an
index at or above the SLB size. Is this the reason for that random
assert_slb_presence?

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

>
> Fixes: 126b11b294d1 ("powerpc/64s/hash: Add SLB allocation status bitmaps")
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/mm/slb.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index bc3914d54e26..61450a9cf30d 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -506,9 +506,16 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
>  	asm volatile("isync" : : : "memory");
>  }
>  
> +static u32 slb_full_bitmap;
> +
>  void slb_set_size(u16 size)
>  {
>  	mmu_slb_size = size;
> +
> +	if (size >= 32)
> +		slb_full_bitmap = U32_MAX;
> +	else
> +		slb_full_bitmap = (1ul << size) - 1;
>  }
>  
>  void slb_initialize(void)
> @@ -611,7 +618,7 @@ static enum slb_index alloc_slb_index(bool kernel)
>  	 * POWER7/8/9 have 32 SLB entries, this could be expanded if a
>  	 * future CPU has more.
>  	 */
> -	if (local_paca->slb_used_bitmap != U32_MAX) {
> +	if (local_paca->slb_used_bitmap != slb_full_bitmap) {
>  		index = ffz(local_paca->slb_used_bitmap);
>  		local_paca->slb_used_bitmap |= 1U << index;
>  		if (kernel)
> -- 
> 2.20.1
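The bound Aneesh asks about can be checked with a small userspace model of the allocation path. The names `model_alloc` and `SLB_FULL` are illustrative, not kernel code, and `ffz()` is approximated with `__builtin_ctz` on the complement: because bits at or above the SLB size are never set, and the full-bitmap comparison fires before the bitmap saturates, the first zero bit is always below the SLB size.

```c
#include <assert.h>
#include <stdint.h>

/* Full-bitmap value for a power-of-2 SLB size, as in the patch. */
#define SLB_FULL(size) ((size) >= 32 ? UINT32_MAX : (1u << (size)) - 1)

/* Model of alloc_slb_index(): hand out the first free index, or -1
 * when the bitmap is full (the kernel then falls back to round-robin
 * replacement). */
static int model_alloc(uint32_t *used, uint32_t full)
{
    if (*used == full)
        return -1;
    int index = __builtin_ctz(~*used);  /* models ffz(): first zero bit */
    *used |= 1u << index;
    return index;
}
```

For a hypothetical size of 8, the model hands out indices 0 through 7 and then reports full, which is exactly the bound the full-bitmap comparison provides.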
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index bc3914d54e26..61450a9cf30d 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -506,9 +506,16 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm)
 	asm volatile("isync" : : : "memory");
 }
 
+static u32 slb_full_bitmap;
+
 void slb_set_size(u16 size)
 {
 	mmu_slb_size = size;
+
+	if (size >= 32)
+		slb_full_bitmap = U32_MAX;
+	else
+		slb_full_bitmap = (1ul << size) - 1;
 }
 
 void slb_initialize(void)
@@ -611,7 +618,7 @@ static enum slb_index alloc_slb_index(bool kernel)
 	 * POWER7/8/9 have 32 SLB entries, this could be expanded if a
 	 * future CPU has more.
 	 */
-	if (local_paca->slb_used_bitmap != U32_MAX) {
+	if (local_paca->slb_used_bitmap != slb_full_bitmap) {
 		index = ffz(local_paca->slb_used_bitmap);
 		local_paca->slb_used_bitmap |= 1U << index;
 		if (kernel)
The recent rewrite of the SLB code into C included the assumption that
all CPUs we run on have at least 32 SLB entries. This is currently
true but a bit fragile as the SLB size is actually defined by the
device tree and so could theoretically change at any time.

The assumption is encoded in the fact that we use U32_MAX as the value
for a full SLB bitmap. Instead, calculate what the full bitmap would
be based on the SLB size we're given and store it. This still requires
the SLB size to be a power of 2.

Fixes: 126b11b294d1 ("powerpc/64s/hash: Add SLB allocation status bitmaps")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/slb.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
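The bitmap calculation described above can be checked in isolation. The helper name below is hypothetical; the patch does the equivalent inline in slb_set_size(). Note the patch shifts `1ul` (64-bit on ppc64), so its shift is well-defined even at size 32; on a plain 32-bit value a shift by 32 or more would be undefined in C, which is one more reason the `size >= 32` case must be handled explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative counterpart of the slb_full_bitmap calculation. */
static uint32_t full_bitmap_for(uint16_t size)
{
    /* For 32 or more entries the 32-bit bitmap saturates at all-ones;
     * shifting a 32-bit value by >= 32 would be undefined behaviour. */
    if (size >= 32)
        return UINT32_MAX;
    return (1u << size) - 1;
}
```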