
[v2] powerpc32: memcpy/memset: only use dcbz once cache is enabled

Message ID 1441934656.23806.3.camel@ellerman.id.au (mailing list archive)
State Not Applicable

Commit Message

Michael Ellerman Sept. 11, 2015, 1:24 a.m. UTC
On Thu, 2015-09-10 at 17:05 -0500, Scott Wood wrote:
> On Thu, 2015-09-10 at 08:41 +0200, Christophe Leroy wrote:
> >  
> > +/* Cache related sections */
> > +#define BEGIN_CACHE_SECTION_NESTED(label)    START_FTR_SECTION(label)
> > +#define BEGIN_CACHE_SECTION                  START_FTR_SECTION(97)
> > +
> > +#define END_CACHE_SECTION_NESTED(msk, val, label)            \
> > +     FTR_SECTION_ELSE_NESTED(label)                          \
> > +     MAKE_FTR_SECTION_ENTRY(msk, val, label, __cache_fixup)
> > +
> > +#define END_CACHE_SECTION(msk, val)          \
> > +     END_CACHE_SECTION_NESTED(msk, val, 97)
> > +
> > +#define END_CACHE_SECTION_IFSET(msk) END_CACHE_SECTION((msk), (msk))
> > +#define END_CACHE_SECTION_IFCLR(msk) END_CACHE_SECTION((msk), 0)
> > +
> > +/* CACHE feature sections with alternatives, use BEGIN_FTR_SECTION to start */
> > +#define CACHE_SECTION_ELSE_NESTED(label)     FTR_SECTION_ELSE_NESTED(label)
> > +#define CACHE_SECTION_ELSE   CACHE_SECTION_ELSE_NESTED(97)
> > +#define ALT_CACHE_SECTION_END_NESTED(msk, val, label)        \
> > +     MAKE_FTR_SECTION_ENTRY(msk, val, label, __cache_fixup)
> > +#define ALT_CACHE_SECTION_END_NESTED_IFSET(msk, label)       \
> > +     ALT_CACHE_SECTION_END_NESTED(msk, msk, label)
> > +#define ALT_CACHE_SECTION_END_NESTED_IFCLR(msk, label)       \
> > +     ALT_CACHE_SECTION_END_NESTED(msk, 0, label)
> > +#define ALT_CACHE_SECTION_END(msk, val)      \
> > +     ALT_CACHE_SECTION_END_NESTED(msk, val, 97)
> > +#define ALT_CACHE_SECTION_END_IFSET(msk)     \
> > +     ALT_CACHE_SECTION_END_NESTED_IFSET(msk, 97)
> > +#define ALT_CACHE_SECTION_END_IFCLR(msk)     \
> > +     ALT_CACHE_SECTION_END_NESTED_IFCLR(msk, 97)
> 
> I don't think this duplication is what Michael meant by "the normal cpu 
> feature sections".  What else is going to use this very specific 
> infrastructure?

Yeah, sorry, I was hoping you could do it with the existing cpu feature
mechanism.

It looks like the timing doesn't work, ie. you need to patch this stuff in
machine_init(), which is later than the regular patching which gets done in
early_init().

This is one of the festering differences we have between the 32 and 64-bit
initialisation code, ie. on 64-bit we do the patching much later.


So I think the cleanest solution is to have memcpy branch to generic_memcpy by
default, and then patch that to a nop once you're up and running. Something
like:


cheers

Comments

Christophe Leroy Sept. 12, 2015, 9:57 a.m. UTC | #1
On 11/09/2015 03:24, Michael Ellerman wrote:
> On Thu, 2015-09-10 at 17:05 -0500, Scott Wood wrote:
>>
>> I don't think this duplication is what Michael meant by "the normal cpu
>> feature sections".  What else is going to use this very specific
>> infrastructure?
> Yeah, sorry, I was hoping you could do it with the existing cpu feature
> mechanism.
>
> It looks like the timing doesn't work, ie. you need to patch this stuff in
> machine_init(), which is later than the regular patching which gets done in
> early_init().
>
> This is one of the festering differences we have between the 32 and 64-bit
> initialisation code, ie. on 64-bit we do the patching much later.
>
>

I've just thought about maybe another alternative.
Is there any issue with calling do_feature_fixups() twice for the same 
features?
If not, we could define a MMU_CACHE_NOW_ON dummy MMU feature, then
call again do_feature_fixups() in machine_init() to patch memcpy/memset 
stuff, something like:

In arch/powerpc/include/asm/mmu.h:
+#define MMU_CACHE_NOW_ON                ASM_CONST(0x00008000)

In arch/powerpc/kernel/setup_32.c: @machine_init()

         udbg_early_init();

+        spec = identify_cpu(0, mfspr(SPRN_PVR));
+        do_feature_fixups(spec->mmu_features | MMU_CACHE_NOW_ON,
+                          &__start___mmu_ftr_fixup,
+                          &__stop___mmu_ftr_fixup);

         /* Do some early initialization based on the flat device tree */
         early_init_devtree(__va(dt_ptr));


Christophe

Michael Ellerman Sept. 14, 2015, 9:59 a.m. UTC | #2
On Sat, 2015-09-12 at 11:57 +0200, christophe leroy wrote:
> 
> On 11/09/2015 03:24, Michael Ellerman wrote:
> > On Thu, 2015-09-10 at 17:05 -0500, Scott Wood wrote:
> >>
> >> I don't think this duplication is what Michael meant by "the normal cpu
> >> feature sections".  What else is going to use this very specific
> >> infrastructure?
> > Yeah, sorry, I was hoping you could do it with the existing cpu feature
> > mechanism.
> >
> > It looks like the timing doesn't work, ie. you need to patch this stuff in
> > machine_init(), which is later than the regular patching which gets done in
> > early_init().
> >
> > This is one of the festering differences we have between the 32 and 64-bit
> > initialisation code, ie. on 64-bit we do the patching much later.
>
> I've just thought about maybe another alternative.
> Is there any issue with calling do_feature_fixups() twice for the same
> features?

Not that I can think of, but you never know.

> If not, we could define a MMU_CACHE_NOW_ON dummy MMU feature, then
> call again do_feature_fixups() in machine_init() to patch memcpy/memset 
> stuff, something like:
> 
> In arch/powerpc/include/asm/mmu.h:
> +#define MMU_CACHE_NOW_ON                ASM_CONST(0x00008000)
> 
> In arch/powerpc/kernel/setup_32.c: @machine_init()
> 
>          udbg_early_init();
> 
> +        spec = identify_cpu(0, mfspr(SPRN_PVR));
> +        do_feature_fixups(spec->mmu_features | MMU_CACHE_NOW_ON,
> +                          &__start___mmu_ftr_fixup,
> +                          &__stop___mmu_ftr_fixup);


Did you try that? It would be cleaner, especially now that you have to do memset as well.

cheers
Scott Wood Sept. 14, 2015, 4:32 p.m. UTC | #3
On Sat, 2015-09-12 at 11:57 +0200, christophe leroy wrote:
> On 11/09/2015 03:24, Michael Ellerman wrote:
> > On Thu, 2015-09-10 at 17:05 -0500, Scott Wood wrote:
> > > 
> > > I don't think this duplication is what Michael meant by "the normal cpu
> > > feature sections".  What else is going to use this very specific
> > > infrastructure?
> > Yeah, sorry, I was hoping you could do it with the existing cpu feature
> > mechanism.
> > 
> > It looks like the timing doesn't work, ie. you need to patch this stuff in
> > machine_init(), which is later than the regular patching which gets done in
> > early_init().
> > 
> > This is one of the festering differences we have between the 32 and 64-bit
> > initialisation code, ie. on 64-bit we do the patching much later.
> > 
> > 
> 
> I've just thought about maybe another alternative.
> Is there any issue with calling do_feature_fixups() twice for the same 
> features?
> If not, we could define a MMU_CACHE_NOW_ON dummy MMU feature, then
> call again do_feature_fixups() in machine_init() to patch memcpy/memset 
> stuff, something like:
> 
> In arch/powerpc/include/asm/mmu.h:
> +#define MMU_CACHE_NOW_ON                ASM_CONST(0x00008000)
> 
> In arch/powerpc/kernel/setup_32.c: @machine_init()
> 
>          udbg_early_init();
> 
> +        spec = identify_cpu(0, mfspr(SPRN_PVR));
> +        do_feature_fixups(spec->mmu_features | MMU_CACHE_NOW_ON,
> +                          &__start___mmu_ftr_fixup,
> +                          &__stop___mmu_ftr_fixup);

This will cause cpu_setup() to be called twice on booke.  I'm not sure if 
that will cause any harm with the current cpu_setup() implementation, but 
it's complexity that is better avoided.  Why not just use cur_cpu_spec?

How much code is between the enabling of caches and the application of fixups 
(quite a lot on booke where cache is enabled by the bootloader...)?  Perhaps 
it's better to label it something that indicates that cache block operations 
are safe to use, so nobody gets the idea that it's OK to use it to protect 
things that can only be done before caches are enabled.

What happens if someone sees MMU_CACHE_NOW_ON (or whatever it ends up being 
called) and decides to call mmu_has_feature()?  At least set the bit in
spec->mmu_features rather than just for the do_feature_fixups() argument, and 
hope that nobody implements MMU_FTRS_POSSIBLE/ALWAYS, or checks the feature 
on 64-bit...  I'm not 100% convinced that abusing cpu feature mechanisms for 
boot sequence control is a good idea.  The direct patching alternative is 
quite simple, and if we were to accumulate enough instances of that (or more 
complicated instances) then patching infrastructure that is explicitly 
relating to the current state of the system rather than permanent hardware 
description could be justified.

-Scott

Patch

diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index bb02e9f6944e..1c1a4e8866ad 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -38,6 +38,7 @@ 
 #include <asm/udbg.h>
 #include <asm/mmu_context.h>
 #include <asm/epapr_hcalls.h>
+#include <asm/code-patching.h>
 
 #define DBG(fmt...)
 
@@ -119,6 +120,8 @@  notrace void __init machine_init(u64 dt_ptr)
        /* Do some early initialization based on the flat device tree */
        early_init_devtree(__va(dt_ptr));
 
+       patch_instruction((unsigned int *)&memcpy, 0x60000000);
+
        epapr_paravirt_early_init();
 
        early_init_mmu();
diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
index 2ef50c629470..6446d2915e41 100644
--- a/arch/powerpc/lib/copy_32.S
+++ b/arch/powerpc/lib/copy_32.S
@@ -135,6 +135,7 @@  _GLOBAL(memmove)
        /* fall through */
 
 _GLOBAL(memcpy)
+       b       generic_memcpy
        add     r7,r3,r5                /* test if the src & dst overlap */
        add     r8,r4,r5
        cmplw   0,r4,r7