Message ID | 20130819115029.GD20146@pek-khao-d1.corp.ad.wrs.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Mon, 2013-08-19 at 19:50 +0800, Kevin Hao wrote: > On Mon, Aug 19, 2013 at 01:37:17PM +1000, Benjamin Herrenschmidt wrote: > > On Mon, 2013-08-19 at 13:36 +1000, Benjamin Herrenschmidt wrote: > > > > > The semantic of this function is to make data executable. Even if the > > > implementation has a snooping icache, it *still* needs to make sure > > > prefetched code is tossed out of the pipeline which is what isync > > > should provide. > > > > > > The architecture actually specifies that. > > > > In fact, on P5 and later, I think we are supposed to do a single dummy > > icbi followed by sync and isync. > > Do you mean something like this? But why? It has to do with making sure prefetched and/or speculated instructions are properly tossed. My understanding is that icbi basically tells the ifetch buffers to drop it, sync orders the icbi and isync ensures its execution has been synchronized. At least I *think* that's the required sequence, I have to dbl check the arch, maybe tomorrow. I wouldn't be surprise if we also need a sync before the icbi to order the actual stores to memory that might have modified instructions with the icbi. Cheers, Ben. > diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S > index d74fefb..8fcdec7 100644 > --- a/arch/powerpc/kernel/misc_64.S > +++ b/arch/powerpc/kernel/misc_64.S > @@ -69,6 +69,8 @@ PPC64_CACHES: > > _KPROBE(flush_icache_range) > BEGIN_FTR_SECTION > + icbi 0,r3 > + sync > isync > blr > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > @@ -209,6 +211,8 @@ _GLOBAL(flush_inval_dcache_range) > */ > _GLOBAL(__flush_dcache_icache) > BEGIN_FTR_SECTION > + icbi 0,r3 > + sync > isync > blr > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > Thanks, > Kevin > > > > > Cheers, > > Ben. > > > > > Cheers, > > > Ben. > > > > > > > > blr > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > rlwinm r3,r3,0,0,31-PAGE_SHIFT /* Get page base address > > > > > */ > > > > > @@ -474,6 +475,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x) > > > > > */ > > > > > _GLOBAL(__flush_dcache_icache_phys) > > > > > BEGIN_FTR_SECTION > > > > > + isync > > > > > blr /* for 601, do nothing */ > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > mfmsr r10 > > > > > diff --git a/arch/powerpc/kernel/misc_64.S > > > > > b/arch/powerpc/kernel/misc_64.S > > > > > index 992a78e..d74fefb 100644 > > > > > --- a/arch/powerpc/kernel/misc_64.S > > > > > +++ b/arch/powerpc/kernel/misc_64.S > > > > > @@ -69,6 +69,7 @@ PPC64_CACHES: > > > > > > > > > > _KPROBE(flush_icache_range) > > > > > BEGIN_FTR_SECTION > > > > > + isync > > > > > blr > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > /* > > > > > -- > > > > > 1.8.3.1 > > > > > > > > > > _______________________________________________ > > > > > Linuxppc-dev mailing list > > > > > Linuxppc-dev@lists.ozlabs.org > > > > > https://lists.ozlabs.org/listinfo/linuxppc-dev > > > > > > > > > > >
On Mon, Aug 19, 2013 at 10:45:35PM +1000, Benjamin Herrenschmidt wrote: > On Mon, 2013-08-19 at 19:50 +0800, Kevin Hao wrote: > > On Mon, Aug 19, 2013 at 01:37:17PM +1000, Benjamin Herrenschmidt wrote: > > > On Mon, 2013-08-19 at 13:36 +1000, Benjamin Herrenschmidt wrote: > > > > > > > The semantic of this function is to make data executable. Even if the > > > > implementation has a snooping icache, it *still* needs to make sure > > > > prefetched code is tossed out of the pipeline which is what isync > > > > should provide. > > > > > > > > The architecture actually specifies that. > > > > > > In fact, on P5 and later, I think we are supposed to do a single dummy > > > icbi followed by sync and isync. > > > > Do you mean something like this? But why? > > It has to do with making sure prefetched and/or speculated instructions > are properly tossed. > > My understanding is that icbi basically tells the ifetch buffers to drop > it Dummy question: What does the ifetch buffers mean? The instruction fetch pipeline or instruction dispatch pipeline? Shouldn't all the prefetched instructions in these buffers be discarded by isync? >, sync orders the icbi and isync ensures its execution has been > synchronized. At least I *think* that's the required sequence, I have to > dbl check the arch, maybe tomorrow. I wouldn't be surprise if we also > need a sync before the icbi to order the actual stores to memory that > might have modified instructions with the icbi. Doesn't the coherence between icache and dcache be maintained by the snooping? Thanks, Kevin > > Cheers, > Ben. > > > diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S > > index d74fefb..8fcdec7 100644 > > --- a/arch/powerpc/kernel/misc_64.S > > +++ b/arch/powerpc/kernel/misc_64.S > > @@ -69,6 +69,8 @@ PPC64_CACHES: > > > > _KPROBE(flush_icache_range) > > BEGIN_FTR_SECTION > > + icbi 0,r3 > > + sync > > isync > > blr > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > @@ -209,6 +211,8 @@ _GLOBAL(flush_inval_dcache_range) > > */ > > _GLOBAL(__flush_dcache_icache) > > BEGIN_FTR_SECTION > > + icbi 0,r3 > > + sync > > isync > > blr > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > Thanks, > > Kevin > > > > > > > > Cheers, > > > Ben. > > > > > > > Cheers, > > > > Ben. > > > > > > > > > > blr > > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > > rlwinm r3,r3,0,0,31-PAGE_SHIFT /* Get page base address > > > > > > */ > > > > > > @@ -474,6 +475,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x) > > > > > > */ > > > > > > _GLOBAL(__flush_dcache_icache_phys) > > > > > > BEGIN_FTR_SECTION > > > > > > + isync > > > > > > blr /* for 601, do nothing */ > > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > > mfmsr r10 > > > > > > diff --git a/arch/powerpc/kernel/misc_64.S > > > > > > b/arch/powerpc/kernel/misc_64.S > > > > > > index 992a78e..d74fefb 100644 > > > > > > --- a/arch/powerpc/kernel/misc_64.S > > > > > > +++ b/arch/powerpc/kernel/misc_64.S > > > > > > @@ -69,6 +69,7 @@ PPC64_CACHES: > > > > > > > > > > > > _KPROBE(flush_icache_range) > > > > > > BEGIN_FTR_SECTION > > > > > > + isync > > > > > > blr > > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > > /* > > > > > > -- > > > > > > 1.8.3.1 > > > > > > > > > > > > _______________________________________________ > > > > > > Linuxppc-dev mailing list > > > > > > Linuxppc-dev@lists.ozlabs.org > > > > > > https://lists.ozlabs.org/listinfo/linuxppc-dev > > > > > > > > > > > > > > > > >
On Tue, 2013-08-20 at 20:16 +0800, Kevin Hao wrote: > Dummy question: What does the ifetch buffers mean? The instruction fetch > pipeline or instruction dispatch pipeline? Shouldn't all the prefetched > instructions in these buffers be discarded by isync? Architecturally isync doesn't have to toss prefetch completely. It only needs to make sense that context changes performed by previous instructions (and interrupts/traps) happen at the point of the isync, for example, it will ensure that a trap conditional is fully evaluated before subsequent instruction execution, etc.... So in this case, it makes sure the icbi has been executed and it's the icbi that invalidates the prefetched instructions. > > >, sync orders the icbi and isync ensures its execution has been > > synchronized. At least I *think* that's the required sequence, I have to > > dbl check the arch, maybe tomorrow. I wouldn't be surprise if we also > > need a sync before the icbi to order the actual stores to memory that > > might have modified instructions with the icbi. > > Doesn't the coherence between icache and dcache be maintained by the snooping? The icache yes, but not the ifetch buffers (basically think of them as buffering stages between icache and dispatch). At least that's my understanding of the implementation. Cheers, Ben. > Thanks, > Kevin > > > > > Cheers, > > Ben. > > > > > diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S > > > index d74fefb..8fcdec7 100644 > > > --- a/arch/powerpc/kernel/misc_64.S > > > +++ b/arch/powerpc/kernel/misc_64.S > > > @@ -69,6 +69,8 @@ PPC64_CACHES: > > > > > > _KPROBE(flush_icache_range) > > > BEGIN_FTR_SECTION > > > + icbi 0,r3 > > > + sync > > > isync > > > blr > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > @@ -209,6 +211,8 @@ _GLOBAL(flush_inval_dcache_range) > > > */ > > > _GLOBAL(__flush_dcache_icache) > > > BEGIN_FTR_SECTION > > > + icbi 0,r3 > > > + sync > > > isync > > > blr > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > > Thanks, > > > Kevin > > > > > > > > > > > Cheers, > > > > Ben. > > > > > > > > > Cheers, > > > > > Ben. > > > > > > > > > > > > blr > > > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > > > rlwinm r3,r3,0,0,31-PAGE_SHIFT /* Get page base address > > > > > > > */ > > > > > > > @@ -474,6 +475,7 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_TYPE_44x) > > > > > > > */ > > > > > > > _GLOBAL(__flush_dcache_icache_phys) > > > > > > > BEGIN_FTR_SECTION > > > > > > > + isync > > > > > > > blr /* for 601, do nothing */ > > > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > > > mfmsr r10 > > > > > > > diff --git a/arch/powerpc/kernel/misc_64.S > > > > > > > b/arch/powerpc/kernel/misc_64.S > > > > > > > index 992a78e..d74fefb 100644 > > > > > > > --- a/arch/powerpc/kernel/misc_64.S > > > > > > > +++ b/arch/powerpc/kernel/misc_64.S > > > > > > > @@ -69,6 +69,7 @@ PPC64_CACHES: > > > > > > > > > > > > > > _KPROBE(flush_icache_range) > > > > > > > BEGIN_FTR_SECTION > > > > > > > + isync > > > > > > > blr > > > > > > > END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) > > > > > > > /* > > > > > > > -- > > > > > > > 1.8.3.1 > > > > > > > > > > > > > > _______________________________________________ > > > > > > > Linuxppc-dev mailing list > > > > > > > Linuxppc-dev@lists.ozlabs.org > > > > > > > https://lists.ozlabs.org/listinfo/linuxppc-dev > > > > > > > > > > > > > > > > > > > > > > >
On Wed, Aug 21, 2013 at 07:28:54AM +1000, Benjamin Herrenschmidt wrote: > On Tue, 2013-08-20 at 20:16 +0800, Kevin Hao wrote: > > > Dummy question: What does the ifetch buffers mean? The instruction fetch > > pipeline or instruction dispatch pipeline? Shouldn't all the prefetched > > instructions in these buffers be discarded by isync? > > Architecturally isync doesn't have to toss prefetch completely. It only > needs to make sense that context changes performed by previous > instructions (and interrupts/traps) happen at the point of the isync, > for example, it will ensure that a trap conditional is fully evaluated > before subsequent instruction execution, etc.... > > So in this case, it makes sure the icbi has been executed and it's the > icbi that invalidates the prefetched instructions. > > > > > >, sync orders the icbi and isync ensures its execution has been > > > synchronized. At least I *think* that's the required sequence, I have to > > > dbl check the arch, maybe tomorrow. I wouldn't be surprise if we also > > > need a sync before the icbi to order the actual stores to memory that > > > might have modified instructions with the icbi. > > > > Doesn't the coherence between icache and dcache be maintained by the snooping? > > The icache yes, but not the ifetch buffers (basically think of them as > buffering stages between icache and dispatch). At least that's my > understanding of the implementation. OK. Thanks for the explanation. I will update the patch according to your suggestions. Thanks, Kevin
diff --git a/arch/powerpc/kernel/misc_64.S b/arch/powerpc/kernel/misc_64.S index d74fefb..8fcdec7 100644 --- a/arch/powerpc/kernel/misc_64.S +++ b/arch/powerpc/kernel/misc_64.S @@ -69,6 +69,8 @@ PPC64_CACHES: _KPROBE(flush_icache_range) BEGIN_FTR_SECTION + icbi 0,r3 + sync isync blr END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE) @@ -209,6 +211,8 @@ _GLOBAL(flush_inval_dcache_range) */ _GLOBAL(__flush_dcache_icache) BEGIN_FTR_SECTION + icbi 0,r3 + sync isync blr END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)