[GIT,PULL] prefetch support for 3.13

Message ID 20131009171312.GJ8378@mudshark.cambridge.arm.com
State New

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-rmk/prefetch

Message

Will Deacon Oct. 9, 2013, 5:13 p.m. UTC
Hi Russell,

Please pull the following patches for 3.13. They add support for the pldw
instruction (prefetch with intent to modify) in ARMv7 SMP cores, which is
then used to gain a measurable performance boost for particular atomic
sequences.

Cheers,

Will

--->8

The following changes since commit 15c03dd4859ab16f9212238f29dd315654aa94f6:

  Linux 3.12-rc3 (2013-09-29 15:02:38 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-rmk/prefetch

for you to fetch changes up to d779c07dd72098a7416d907494f958213b7726f3:

  ARM: bitops: prefetch the destination word for write prior to strex (2013-09-30 16:42:56 +0100)

----------------------------------------------------------------
Will Deacon (6):
      ARM: prefetch: remove redundant "cc" clobber
      ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.h
      ARM: prefetch: add support for prefetchw using pldw on SMP ARMv7+ CPUs
      ARM: locks: prefetch the destination word for write prior to strex
      ARM: atomics: prefetch the destination word for write prior to strex
      ARM: bitops: prefetch the destination word for write prior to strex

 arch/arm/include/asm/atomic.h         |  7 +++++++
 arch/arm/include/asm/processor.h      | 33 +++++++++++++++++++++++++--------
 arch/arm/include/asm/spinlock.h       | 28 ++++++++++++++--------------
 arch/arm/include/asm/spinlock_types.h |  2 +-
 arch/arm/include/asm/unified.h        |  4 ++++
 arch/arm/lib/bitops.h                 |  5 +++++
 6 files changed, 56 insertions(+), 23 deletions(-)
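
The idea behind the series (hint that a word is about to be written before entering the load-exclusive/store-exclusive retry loop) can be sketched in portable C. This is only an illustration under stated assumptions, not the kernel code: atomic_add_return_prefetchw() is a hypothetical helper name, the GCC/Clang builtin __builtin_prefetch() with a second argument of 1 stands in for pldw, and a C11 compare-exchange loop stands in for the ldrex/strex sequence. The compiler may ignore the hint or choose whatever prefetch instruction the target offers.

```c
#include <stdatomic.h>

/*
 * Hypothetical helper for illustration only; this is NOT the kernel's
 * atomic_add_return(). It mimics the pattern "prefetch for write, then
 * run the exclusive load/store retry loop".
 */
static inline int atomic_add_return_prefetchw(atomic_int *v, int delta)
{
    int old;

    /* Second argument 1 means "prefetch with intent to write". */
    __builtin_prefetch((const void *)v, 1);

    old = atomic_load_explicit(v, memory_order_relaxed);

    /*
     * Retry until the update succeeds. On failure, 'old' is refreshed
     * with the current value, so 'old + delta' is recomputed each time.
     */
    while (!atomic_compare_exchange_weak_explicit(v, &old, old + delta,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;

    return old + delta;
}
```

The hint is issued once, outside the loop, so the retry path does not pay for it again.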

Comments

Paul Walmsley Oct. 30, 2013, 3:24 p.m. UTC | #1
Hi Will et al.,

On 10/09/2013 10:13 AM, Will Deacon wrote:
> Hi Russell,
>
> Please pull the following patches for 3.13. They add support for the pldw
> instruction (prefetch with intent to modify) in ARMv7 SMP cores, which is
> then used to gain a measurable performance boost for particular atomic
> sequences.

Looks like the pldw changes require binutils >= 2.21.  Might be worth 
considering a patch to update Documentation/Changes?


- Paul

>
> Cheers,
>
> Will
>
> --->8
>
> The following changes since commit 15c03dd4859ab16f9212238f29dd315654aa94f6:
>
>    Linux 3.12-rc3 (2013-09-29 15:02:38 -0700)
>
> are available in the git repository at:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-rmk/prefetch
>
> for you to fetch changes up to d779c07dd72098a7416d907494f958213b7726f3:
>
>    ARM: bitops: prefetch the destination word for write prior to strex (2013-09-30 16:42:56 +0100)
>
> ----------------------------------------------------------------
> Will Deacon (6):
>        ARM: prefetch: remove redundant "cc" clobber
>        ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.h
>        ARM: prefetch: add support for prefetchw using pldw on SMP ARMv7+ CPUs
>        ARM: locks: prefetch the destination word for write prior to strex
>        ARM: atomics: prefetch the destination word for write prior to strex
>        ARM: bitops: prefetch the destination word for write prior to strex
>
>   arch/arm/include/asm/atomic.h         |  7 +++++++
>   arch/arm/include/asm/processor.h      | 33 +++++++++++++++++++++++++--------
>   arch/arm/include/asm/spinlock.h       | 28 ++++++++++++++--------------
>   arch/arm/include/asm/spinlock_types.h |  2 +-
>   arch/arm/include/asm/unified.h        |  4 ++++
>   arch/arm/lib/bitops.h                 |  5 +++++
>   6 files changed, 56 insertions(+), 23 deletions(-)
>

Russell King - ARM Linux Oct. 30, 2013, 3:25 p.m. UTC | #2
On Wed, Oct 30, 2013 at 08:24:35AM -0700, Paul Walmsley wrote:
> Hi Will et al.,
>
> On 10/09/2013 10:13 AM, Will Deacon wrote:
>> Hi Russell,
>>
>> Please pull the following patches for 3.13. They add support for the pldw
>> instruction (prefetch with intent to modify) in ARMv7 SMP cores, which is
>> then used to gain a measurable performance boost for particular atomic
>> sequences.
>
> Looks like the pldw changes require binutils >= 2.21.  Might be worth  
> considering a patch to update Documentation/Changes?

Not really - because that says "for all architectures the minimum
requirement is now 2.21 or later" and that's certainly not the case.

It's only ARMv7 which requires this.

Stephen Warren Oct. 30, 2013, 3:32 p.m. UTC | #3
On 10/30/2013 09:25 AM, Russell King - ARM Linux wrote:
> On Wed, Oct 30, 2013 at 08:24:35AM -0700, Paul Walmsley wrote:
>> Hi Will et al.,
>>
>> On 10/09/2013 10:13 AM, Will Deacon wrote:
>>> Hi Russell,
>>>
>>> Please pull the following patches for 3.13. They add support for the pldw
>>> instruction (prefetch with intent to modify) in ARMv7 SMP cores, which is
>>> then used to gain a measurable performance boost for particular atomic
>>> sequences.
>>
>> Looks like the pldw changes require binutils >= 2.21.  Might be worth  
>> considering a patch to update Documentation/Changes?
> 
> Not really - because that says "for all architectures the minimum
> requirement is now 2.21 or later" and that's certainly not the case.

By "that", do you mean the text Paul wrote? I don't think he was
suggesting that as a patch.

> It's only ARMv7 which requires this.

Wouldn't it make sense to document this still? Can't we just put a list
of minimum requirements into Documentation/Changes based on architecture,
e.g.:

o  binutils               2.21                    # ld -v (ARMv7)
o  binutils               2.12                    # ld -v (other)

At the very least, the current documentation is wrong, because there are
clearly cases where binutils-2.12 isn't sufficient.
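
A per-architecture minimum like the two lines proposed above could also be checked mechanically. The following is a rough sketch, not an existing kernel script: the struct and function names are hypothetical, the architecture labels "ARMv7" and "other" and the minimums 2.21 and 2.12 are taken from the proposal, and the parser assumes the version is the last space-separated token of the `ld -v` banner.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical type: only major.minor are compared. */
struct binutils_version {
    int major;
    int minor;
};

/* Parse the trailing "X.Y" of a banner such as "GNU ld (GNU Binutils) 2.21". */
static int parse_ld_version(const char *banner, struct binutils_version *out)
{
    const char *p = strrchr(banner, ' ');

    p = p ? p + 1 : banner;
    return sscanf(p, "%d.%d", &out->major, &out->minor) == 2 ? 0 : -1;
}

/* Minimums as proposed in the message: 2.21 for ARMv7, 2.12 otherwise. */
static struct binutils_version min_binutils(const char *arch)
{
    struct binutils_version v = { 2, 12 };

    if (strcmp(arch, "ARMv7") == 0)
        v.minor = 21;
    return v;
}

/* Returns 1 if the banner's version meets the minimum for arch, else 0. */
static int binutils_sufficient(const char *arch, const char *banner)
{
    struct binutils_version have;
    struct binutils_version need = min_binutils(arch);

    if (parse_ld_version(banner, &have) != 0)
        return 0;
    if (have.major != need.major)
        return have.major > need.major;
    return have.minor >= need.minor;
}
```

For example, under these assumptions binutils_sufficient("ARMv7", "GNU ld (GNU Binutils) 2.20.1") yields 0, while the same banner passes for "other".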

Russell King - ARM Linux Oct. 30, 2013, 4:01 p.m. UTC | #4
On Wed, Oct 30, 2013 at 08:53:08AM -0700, Paul Walmsley wrote:
> Hi Russell,
>
> On Wed, 30 Oct 2013, Russell King - ARM Linux wrote:
>
>> On Wed, Oct 30, 2013 at 08:24:35AM -0700, Paul Walmsley wrote:
>>> Hi Will et al.,
>>>
>>> On 10/09/2013 10:13 AM, Will Deacon wrote:
>>>> Hi Russell,
>>>>
>>>> Please pull the following patches for 3.13. They add support for the pldw
>>>> instruction (prefetch with intent to modify) in ARMv7 SMP cores, which is
>>>> then used to gain a measurable performance boost for particular atomic
>>>> sequences.
>>>
>>> Looks like the pldw changes require binutils >= 2.21.  Might be worth
>>> considering a patch to update Documentation/Changes?
>>
>> Not really - because that says "for all architectures the minimum
>> requirement is now 2.21 or later" and that's certainly not the case.
>>
>> It's only ARMv7 which requires this.
>
> Would you consider something like the following?

I think it's up to others whether we want to start adding this level of
detail to this file.  Given that we already have x86 there, I don't see
a problem with this.

Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>

Uwe Kleine-König Nov. 11, 2013, 9:08 a.m. UTC | #5
Hello Will,

On Wed, Oct 09, 2013 at 06:13:13PM +0100, Will Deacon wrote:
> The following changes since commit 15c03dd4859ab16f9212238f29dd315654aa94f6:
> 
>   Linux 3.12-rc3 (2013-09-29 15:02:38 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-rmk/prefetch
> 
> for you to fetch changes up to d779c07dd72098a7416d907494f958213b7726f3:
> 
>   ARM: bitops: prefetch the destination word for write prior to strex (2013-09-30 16:42:56 +0100)
> 
> ----------------------------------------------------------------
> Will Deacon (6):
>       ARM: prefetch: remove redundant "cc" clobber
>       ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.h
>       ARM: prefetch: add support for prefetchw using pldw on SMP ARMv7+ CPUs
>       ARM: locks: prefetch the destination word for write prior to strex
>       ARM: atomics: prefetch the destination word for write prior to strex
>       ARM: bitops: prefetch the destination word for write prior to strex

it seems the last patch breaks on efm32:

arch/arm/lib/changebit.S: Assembler messages:
arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is not allowed for the current base architecture

Best regards
Uwe

Russell King - ARM Linux Nov. 11, 2013, 10:31 a.m. UTC | #6
On Mon, Nov 11, 2013 at 10:08:10AM +0100, Uwe Kleine-König wrote:
> Hello Will,
> 
> it seems the last patch breaks on efm32:
> 
> arch/arm/lib/changebit.S: Assembler messages:
> arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is not allowed for the current base architecture

EFM32 support is in neither Will's tree nor mine, therefore this is
new breakage which we're going to have to fix during the -rc
series.

My view is that we've tried to do too much this merge window: we've
had rather a large number of conflicts of all kinds between patch
series (such as the BE series and others), as well as a large
number of build problems caused by changes via my tree interfering
with changes from the arm-soc tree.

In other words, I think that both arm-soc and I need to push back
and slow things down a bit, and merge less during each cycle, especially
if it is submitted later than about -rc3.

Uwe Kleine-König Nov. 19, 2013, 9:27 a.m. UTC | #7
On Mon, Nov 11, 2013 at 11:25:54AM +0000, Will Deacon wrote:
> On Mon, Nov 11, 2013 at 09:08:10AM +0000, Uwe Kleine-König wrote:
> > Hello Will,
> > 
> > On Wed, Oct 09, 2013 at 06:13:13PM +0100, Will Deacon wrote:
> > > The following changes since commit 15c03dd4859ab16f9212238f29dd315654aa94f6:
> > > 
> > >   Linux 3.12-rc3 (2013-09-29 15:02:38 -0700)
> > > 
> > > are available in the git repository at:
> > > 
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-rmk/prefetch
> > > 
> > > for you to fetch changes up to d779c07dd72098a7416d907494f958213b7726f3:
> > > 
> > >   ARM: bitops: prefetch the destination word for write prior to strex (2013-09-30 16:42:56 +0100)
> > > 
> > > ----------------------------------------------------------------
> > > Will Deacon (6):
> > >       ARM: prefetch: remove redundant "cc" clobber
> > >       ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.h
> > >       ARM: prefetch: add support for prefetchw using pldw on SMP ARMv7+ CPUs
> > >       ARM: locks: prefetch the destination word for write prior to strex
> > >       ARM: atomics: prefetch the destination word for write prior to strex
> > >       ARM: bitops: prefetch the destination word for write prior to strex
> > Hello Will,
> > 
> > it seems the last patch breaks on efm32:
> > 
> > arch/arm/lib/changebit.S: Assembler messages:
> > arch/arm/lib/changebit.S:15: Error: architectural extension `mp' is not allowed for the current base architecture
> 
> I see gas is being as helpful as ever. Something like the (untested) patch
> below should fix the issue.
> 
> Will
> 
> --->8
> 
> diff --git a/arch/arm/lib/bitops.h b/arch/arm/lib/bitops.h
> index e0c68d5bb7dc..52886b89706c 100644
> --- a/arch/arm/lib/bitops.h
> +++ b/arch/arm/lib/bitops.h
> @@ -10,7 +10,7 @@ UNWIND(       .fnstart        )
>         and     r3, r0, #31             @ Get bit offset
>         mov     r0, r0, lsr #5
>         add     r1, r1, r0, lsl #2      @ Get word offset
> -#if __LINUX_ARM_ARCH__ >= 7
> +#if __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP)
>         .arch_extension mp
>         ALT_SMP(W(pldw) [r1])
>         ALT_UP(W(nop))
> 
It does fix compilation and I booted successfully on my efm32-tree +
next-20131119 + this patch + the patch for the be signal stuff.

Best regards
Uwe
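
The effect of the one-line change in the quoted diff can be reproduced with the C preprocessor alone. In this sketch DEMO_LINUX_ARM_ARCH and DEMO_CONFIG_SMP are hypothetical stand-ins for __LINUX_ARM_ARCH__ and CONFIG_SMP, and the configuration shown (architecture level 7, SMP not defined) is an assumption meant to resemble a uniprocessor ARMv7 build such as the efm32 case reported in this thread.

```c
/* Hypothetical stand-ins for the kernel's __LINUX_ARM_ARCH__ and CONFIG_SMP. */
#define DEMO_LINUX_ARM_ARCH 7
/* DEMO_CONFIG_SMP is deliberately left undefined: a uniprocessor build. */

/* Old condition: the pldw block is assembled on every ARMv7 build. */
static int demo_old_guard(void)
{
#if DEMO_LINUX_ARM_ARCH >= 7
    return 1;
#else
    return 0;
#endif
}

/* New condition: the pldw block is assembled only when SMP is configured. */
static int demo_new_guard(void)
{
#if DEMO_LINUX_ARM_ARCH >= 7 && defined(DEMO_CONFIG_SMP)
    return 1;
#else
    return 0;
#endif
}
```

With this configuration the old guard still selects the block containing ".arch_extension mp", while the new guard leaves it out, so the assembler never sees the directive.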

Will Deacon Nov. 19, 2013, 12:01 p.m. UTC | #8
On Tue, Nov 19, 2013 at 09:27:41AM +0000, Uwe Kleine-König wrote:
> On Mon, Nov 11, 2013 at 11:25:54AM +0000, Will Deacon wrote:
> > I see gas is being as helpful as ever. Something like the (untested) patch
> > below should fix the issue.
> > 
> > Will
> > 
> > --->8
> > 
> > diff --git a/arch/arm/lib/bitops.h b/arch/arm/lib/bitops.h
> > index e0c68d5bb7dc..52886b89706c 100644
> > --- a/arch/arm/lib/bitops.h
> > +++ b/arch/arm/lib/bitops.h
> > @@ -10,7 +10,7 @@ UNWIND(       .fnstart        )
> >         and     r3, r0, #31             @ Get bit offset
> >         mov     r0, r0, lsr #5
> >         add     r1, r1, r0, lsl #2      @ Get word offset
> > -#if __LINUX_ARM_ARCH__ >= 7
> > +#if __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP)
> >         .arch_extension mp
> >         ALT_SMP(W(pldw) [r1])
> >         ALT_UP(W(nop))
> > 
> It does fix compilation and I booted successfully on my efm32-tree +
> next-20131119 + this patch + the patch for the be signal stuff.

Ok, great. Mind if I add your tested-by before I put it in the patch system?

Cheers,

Will

Dirk Behme Nov. 24, 2013, 9:54 a.m. UTC | #9
Am 09.10.2013 19:13, schrieb Will Deacon:
> Hi Russell,
>
> Please pull the following patches for 3.13. They add support for the pldw
> instruction (prefetch with intent to modify) in ARMv7 SMP cores, which is
> then used to gain a measurable performance boost for particular atomic
> sequences.
>
> Cheers,
>
> Will
>
> --->8
>
> The following changes since commit 15c03dd4859ab16f9212238f29dd315654aa94f6:
>
>    Linux 3.12-rc3 (2013-09-29 15:02:38 -0700)
>
> are available in the git repository at:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git for-rmk/prefetch
>
> for you to fetch changes up to d779c07dd72098a7416d907494f958213b7726f3:
>
>    ARM: bitops: prefetch the destination word for write prior to strex (2013-09-30 16:42:56 +0100)
>
> ----------------------------------------------------------------
> Will Deacon (6):
>        ARM: prefetch: remove redundant "cc" clobber
>        ARM: smp_on_up: move inline asm ALT_SMP patching macro out of spinlock.h
>        ARM: prefetch: add support for prefetchw using pldw on SMP ARMv7+ CPUs
>        ARM: locks: prefetch the destination word for write prior to strex
>        ARM: atomics: prefetch the destination word for write prior to strex
>        ARM: bitops: prefetch the destination word for write prior to strex
>
>   arch/arm/include/asm/atomic.h         |  7 +++++++
>   arch/arm/include/asm/processor.h      | 33 +++++++++++++++++++++++++--------
>   arch/arm/include/asm/spinlock.h       | 28 ++++++++++++++--------------
>   arch/arm/include/asm/spinlock_types.h |  2 +-
>   arch/arm/include/asm/unified.h        |  4 ++++
>   arch/arm/lib/bitops.h                 |  5 +++++
>   6 files changed, 56 insertions(+), 23 deletions(-)

Would this patch series be a candidate for the -stable kernels as 
discussed recently in the "Patches from ARM folks solicited for the 
-stable tree" thread?

Best regards

Dirk

Russell King - ARM Linux Nov. 24, 2013, 10:11 a.m. UTC | #10
On Sun, Nov 24, 2013 at 10:54:07AM +0100, Dirk Behme wrote:
> Would this patch series be a candidate for the -stable kernels as  
> discussed recently in the "Patches from ARM folks solicited for the  
> -stable tree" thread?

I don't think so.

Firstly, patches suitable for the stable tree are also suitable to be
pushed in during -rc time.  In other words, they're bug fixes and
regression fixes.  This is neither.

Secondly, it has the effect that it raises the bar on the binutils to
build the kernel, and if we push that into -stable kernels, those kernels
will fail to build with older binutils - which is itself a regression
as far as -stable is concerned.

So no, I don't think it's appropriate.