[GIT,PULL] ARM: kernel mode NEON support

Message ID CAKv+Gu_0XXdCw4-V6LLHCiaH4Wc+=4DTzP6gwgV_h-Qb90gYTA@mail.gmail.com
State New
Pull-request

git://git.linaro.org/people/ardbiesheuvel/linux-arm.git for-rmk

Message

Ard Biesheuvel July 8, 2013, 10:23 p.m. UTC
The following changes since commit 8bb495e3f02401ee6f76d1b1d77f3ac9f079e376:

  Linux 3.10 (2013-06-30 15:13:29 -0700)

are available in the git repository at:

  git://git.linaro.org/people/ardbiesheuvel/linux-arm.git for-rmk

for you to fetch changes up to 7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b:

  lib/raid6: add ARM-NEON accelerated syndrome calculation (2013-07-08 22:09:18 +0100)

----------------------------------------------------------------
Ard Biesheuvel (5):
      ARM: move VFP init to an earlier boot stage
      ARM: be strict about FP exceptions in kernel mode
      ARM: add support for kernel mode NEON
      ARM: crypto: add NEON accelerated XOR implementation
      lib/raid6: add ARM-NEON accelerated syndrome calculation

 arch/arm/Kconfig            |  7 +++++++
 arch/arm/include/asm/neon.h | 36 ++++++++++++++++++++++++++++++++++++
 arch/arm/include/asm/xor.h  | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 arch/arm/lib/Makefile       |  6 ++++++
 arch/arm/lib/xor-neon.c     | 42 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/vfp/vfphw.S        |  5 +++++
 arch/arm/vfp/vfpmodule.c    | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/raid/pq.h     |  5 +++++
 lib/raid6/.gitignore        |  1 +
 lib/raid6/Makefile          | 40 ++++++++++++++++++++++++++++++++++++++++
 lib/raid6/algos.c           |  6 ++++++
 lib/raid6/neon.c            | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/raid6/neon.uc           | 80 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 lib/raid6/test/Makefile     | 26 +++++++++++++++++++++++++-
 14 files changed, 452 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm/include/asm/neon.h
 create mode 100644 arch/arm/lib/xor-neon.c
 create mode 100644 lib/raid6/neon.c
 create mode 100644 lib/raid6/neon.uc

Comments

Russell King - ARM Linux July 22, 2013, 4:31 p.m. UTC | #1
On Mon, Jul 08, 2013 at 11:23:11PM +0100, Ard Biesheuvel wrote:
> The following changes since commit 8bb495e3f02401ee6f76d1b1d77f3ac9f079e376:
> 
>   Linux 3.10 (2013-06-30 15:13:29 -0700)
> 
> are available in the git repository at:
> 
>   git://git.linaro.org/people/ardbiesheuvel/linux-arm.git for-rmk
> 
> for you to fetch changes up to 7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b:
> 
>   lib/raid6: add ARM-NEON accelerated syndrome calculation (2013-07-08 22:09:18 +0100)

I'm assuming that the comments in your previous postings are valid as I've
included those in the merge commit:

    I have included two use cases that I have been using, XOR and RAID-6
    checksumming. The former gets a 60% performance boost on the NEON, the
    latter over 400%.

    ARM: add support for kernel mode NEON

    Adds kernel_neon_begin/end (renamed from kernel_vfp_begin/end in the
    previous version to de-emphasize the VFP part as VFP code that needs
    software assistance is not supported currently.)

    Introduces <asm/neon.h> and the Kconfig symbol KERNEL_MODE_NEON. This
    has been aligned with Catalin for arm64, so any NEON code that does
    not use assembly but intrinsics or the GCC vectorizer (such as my
    examples) can potentially be shared between arm and arm64 archs.

    ARM: move VFP init to an earlier boot stage

    This is needed so the NEON is enabled when the XOR and RAID-6 algo
    boot time benchmarks are run.

    ARM: be strict about FP exceptions in kernel mode

    This adds a check to vfp_support_entry() to flag unsupported uses of
    the NEON/VFP in kernel mode. FP exceptions (bounces) are flagged as
    a BUG() because of their potentially intermittent nature.
    Exceptions caused by the fact that kernel_neon_begin has not been
    called are just routed through the undef handler.

    ARM: crypto: add NEON accelerated XOR implementation

    This is the xor_blocks() implementation built with -ftree-vectorize,
    60% faster than optimized ARM code. It calls in_interrupt() to check
    whether the NEON flavor can be used: this should really not be
    necessary, but due to xor_blocks's quite generic nature, there is no
    telling how exactly people may be using it in the real world.
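
The context check described above can be sketched in userspace C. This is only an illustration under stated assumptions: in_interrupt(), kernel_neon_begin() and kernel_neon_end() are stubbed here so the file compiles outside the kernel, and fake_in_interrupt, neon_sections, xor_scalar_2, xor_vector_2 and xor_2 are hypothetical names, not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins (assumptions) for the kernel facilities named above. */
static bool fake_in_interrupt = false;
static int neon_sections = 0;  /* number of kernel_neon_begin() calls so far */

static bool in_interrupt(void) { return fake_in_interrupt; }
static void kernel_neon_begin(void) { neon_sections++; }
static void kernel_neon_end(void) { }

/* Scalar fallback: XOR the second buffer into the first, word by word. */
static void xor_scalar_2(size_t bytes, unsigned long *p1,
                         const unsigned long *p2)
{
    for (size_t i = 0; i < bytes / sizeof(unsigned long); i++)
        p1[i] ^= p2[i];
}

/*
 * Stand-in for the NEON flavor. In a real build this would be the same
 * loop compiled with -ftree-vectorize for a NEON target; here it is
 * ordinary C so the sketch runs anywhere.
 */
static void xor_vector_2(size_t bytes, unsigned long *p1,
                         const unsigned long *p2)
{
    for (size_t i = 0; i < bytes / sizeof(unsigned long); i++)
        p1[i] ^= p2[i];
}

/* Pick the NEON flavor only when not running in interrupt context. */
static void xor_2(size_t bytes, unsigned long *p1, const unsigned long *p2)
{
    assert(bytes % sizeof(unsigned long) == 0);

    if (in_interrupt()) {
        xor_scalar_2(bytes, p1, p2);
    } else {
        kernel_neon_begin();
        xor_vector_2(bytes, p1, p2);
        kernel_neon_end();
    }
}
```

Both paths produce the same result; the only difference is whether the NEON unit is claimed around the loop.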

    lib/raid6: add ARM-NEON accelerated syndrome calculation

    This is a port of the RAID-6 checksumming code in altivec.uc to
    NEON intrinsics. It is about 4x faster than the sequential code.
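
For readers unfamiliar with the syndrome calculation being accelerated, here is a byte-at-a-time reference sketch in plain C. It assumes the usual RAID-6 definition (P is the XOR of the data blocks, Q is a weighted sum in GF(2^8) with generator 2 and polynomial 0x11d) and is not the NEON code from the series; gf_mul2 and gen_syndrome are hypothetical names. A vectorized version would apply the same steps to 16 bytes per register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8), reducing by the polynomial 0x11d. */
static uint8_t gf_mul2(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1d : 0));
}

/*
 * Compute P and Q over ndata data blocks of `bytes` bytes each.
 * Q is evaluated with Horner's rule, starting from the highest-numbered
 * block, so each step is one multiply-by-2 and one XOR.
 */
static void gen_syndrome(int ndata, size_t bytes, uint8_t **data,
                         uint8_t *p, uint8_t *q)
{
    assert(ndata >= 1);

    for (size_t off = 0; off < bytes; off++) {
        uint8_t wp = data[ndata - 1][off];
        uint8_t wq = wp;

        for (int d = ndata - 2; d >= 0; d--) {
            wp ^= data[d][off];
            wq = gf_mul2(wq) ^ data[d][off];
        }
        p[off] = wp;
        q[off] = wq;
    }
}
```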

Ard Biesheuvel July 22, 2013, 4:45 p.m. UTC | #2
On 22 July 2013 18:31, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> On Mon, Jul 08, 2013 at 11:23:11PM +0100, Ard Biesheuvel wrote:
>> The following changes since commit 8bb495e3f02401ee6f76d1b1d77f3ac9f079e376:
>>
>>   Linux 3.10 (2013-06-30 15:13:29 -0700)
>>
>> are available in the git repository at:
>>
>>   git://git.linaro.org/people/ardbiesheuvel/linux-arm.git for-rmk
>>
>> for you to fetch changes up to 7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b:
>>
>>   lib/raid6: add ARM-NEON accelerated syndrome calculation (2013-07-08 22:09:18 +0100)
>
> I'm assuming that the comments in your previous postings are valid as I've
> included those in the merge commit:
>

I think they're close enough. I did remove the BUG() call in the
kernel mode FP exception handler, as just returning from that function
will cause an oops to be triggered anyway.

Cheers,