Message ID | DB5PR08MB1030A888EFC9735023F2021D83BE0@DB5PR08MB1030.eurprd08.prod.outlook.com |
---|---|
State | New |
Series | [AArch64] Add ifunc support for Ares |
On 19/12/18 11:58 PM, Wilco Dijkstra wrote:
> Add Ares to the midr_el0 list and support ifunc dispatch. Since Ares
> supports 2 128-bit loads/stores, use Neon registers for memcpy by
> selecting __memcpy_falkor by default (we should rename this to
> __memcpy_simd or similar).

The falkor memcpy has a very specific quirk that reuses register names to optimise prefetcher usage so it may not necessarily work well with other implementations. Perhaps a new implementation similar to the stock memcpy but with vector registers would be more suitable for a __memcpy_simd.

Siddhesh
Hi Siddhesh,

> The falkor memcpy has a very specific quirk that reuses register names
> to optimise prefetcher usage so it may not necessarily work well with
> other implementations. Perhaps a new implementation similar to the
> stock memcpy but with vector registers would be more suitable for a
> __memcpy_simd.

Reusing registers does not matter on an out-of-order core since they are renamed. But you're right that a generic SIMD memcpy could do better. For example using LDP/STP of Q registers will be smaller and faster, maybe even on Falkor.

Cheers,
Wilco
diff --git a/manual/tunables.texi b/manual/tunables.texi
index 09a25655aeebbbd5489c1680e00ba4444d21dcc0..af820820e044f718b125d39491b35e7e273da20c 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -360,7 +360,7 @@ This tunable is specific to powerpc, powerpc64 and powerpc64le.
 The @code{glibc.cpu.name=xxx} tunable allows the user to tell @theglibc{} to
 assume that the CPU is @code{xxx} where xxx may have one of these values:
 @code{generic}, @code{falkor}, @code{thunderxt88}, @code{thunderx2t99},
-@code{thunderx2t99p1}.
+@code{thunderx2t99p1}, @code{ares}.
 
 This tunable is specific to aarch64.
 @end deftp
diff --git a/sysdeps/aarch64/multiarch/memcpy.c b/sysdeps/aarch64/multiarch/memcpy.c
index 4a04a63b0fe0c84b9225286c4aaf1386d01a736a..8f5d4e7df51af09c88b3c4c4d2a0a0f477ce405e 100644
--- a/sysdeps/aarch64/multiarch/memcpy.c
+++ b/sysdeps/aarch64/multiarch/memcpy.c
@@ -36,7 +36,7 @@ extern __typeof (__redirect_memcpy) __memcpy_falkor attribute_hidden;
 libc_ifunc (__libc_memcpy,
             (IS_THUNDERX (midr)
              ? __memcpy_thunderx
-             : (IS_FALKOR (midr) || IS_PHECDA (midr)
+             : (IS_FALKOR (midr) || IS_PHECDA (midr) || IS_ARES (midr)
                 ? __memcpy_falkor
                 : (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr)
                   ? __memcpy_thunderx2
diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
index eb35adfbe9d429d5622a712738fa75bafe8e7322..153d258afe975ab463580cba4248a6950126c89a 100644
--- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
+++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
@@ -51,6 +51,8 @@
 
 #define IS_PHECDA(midr) (MIDR_IMPLEMENTOR(midr) == 'h' \
                         && MIDR_PARTNUM(midr) == 0x000)
+#define IS_ARES(midr) (MIDR_IMPLEMENTOR(midr) == 'A' \
+                        && MIDR_PARTNUM(midr) == 0xd0c)
 
 struct cpu_features
 {
diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
index b4f348509eb1c6b319add6eb8ed8a198c00df149..69be36869ebcc2105ec161b24d52be9bbf00b627 100644
--- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
+++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
@@ -36,6 +36,7 @@ static struct cpu_list cpu_list[] = {
       {"thunderx2t99", 0x431F0AF0},
       {"thunderx2t99p1", 0x420F5160},
       {"phecda", 0x680F0000},
+      {"ares", 0x411FD0C0},
       {"generic", 0x0}
 };
 
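
As a cross-check of the new cpu_list value against the IS_ARES macro, here is a minimal standalone sketch of the MIDR decoding. It assumes the architectural MIDR layout (implementer in bits [31:24], part number in bits [15:4]). The helper names midr_implementer, midr_partnum, is_ares and is_phecda are hypothetical and local to this sketch; they are not the glibc macros themselves.

```c
#include <stdint.h>

/* Assumed MIDR field layout: implementer in bits [31:24],
   part number in bits [15:4].  Names are local to this sketch.  */
static unsigned int
midr_implementer (uint64_t midr)
{
  return (unsigned int) ((midr >> 24) & 0xff);
}

static unsigned int
midr_partnum (uint64_t midr)
{
  return (unsigned int) ((midr >> 4) & 0xfff);
}

/* Mirrors the condition in the IS_ARES macro added by the patch.  */
static int
is_ares (uint64_t midr)
{
  return midr_implementer (midr) == 'A' && midr_partnum (midr) == 0xd0c;
}

/* Mirrors the existing IS_PHECDA condition shown in the diff context.  */
static int
is_phecda (uint64_t midr)
{
  return midr_implementer (midr) == 'h' && midr_partnum (midr) == 0x000;
}
```

With the value from the new table entry, is_ares (0x411FD0C0) holds: the top byte 0x41 is ASCII 'A' and bits [15:4] are 0xd0c. So glibc.cpu.name=ares should select the same branch as a real Ares MIDR would.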