Message ID | 0162d4b1-fe2a-a670-7a39-6fcef4dae19a@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | [nvptx] libgomp: Split testcase in order to XFAIL __sync_val_compare_and_swap_16 (was: [PATCH] nvptx: Add support for subword compare-and-swap) | expand |
On 9/1/20 1:41 PM, Tobias Burnus wrote: > Hi Tom, hello all, > > it turned out that the testcase fails on PowerPC (but not x86_64) > as the nvptx lto complains: unresolved symbol > __sync_val_compare_and_swap_16 > > The testcase uses int128 – and that's the culprit, but I have no idea > why it only fails with PowerPC and not with x86-64. > Well, I'm guessing the explanation is here in omp-expand.c: ... /* Expand an GIMPLE_OMP_ATOMIC statement. We try to expand using expand_omp_atomic_fetch_op. If it failed, we try to call expand_omp_atomic_pipeline, and if it fails too, the ultimate fallback is wrapping the operation in a mutex (expand_omp_atomic_mutex). REGION is the atomic region built by build_omp_regions_1(). */ static void expand_omp_atomic (struct omp_region *region) ... In the x86_64 case, when doing: ... $ gcc-11 src/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c -fdump-tree-all-details -fopenmp ... we get: ... <bb 33> : D.3382 = .omp_data_i->res; GOMP_atomic_start (); D.3383 = MEM[(__int128 * {ref-all})D.3382]; <bb 34> : D.3384 = (_Bool) D.3383; if (D.3384 != 0) goto <bb 35>; [INV] else goto <bb 37>; [INV] <bb 38> : MEM[(__int128 * {ref-all})D.3382] = iftmp.80; GOMP_atomic_end (); ... which means we're triggering the "expand_omp_atomic_mutex" case for x86_64. Apparently we're triggering the "expand_omp_atomic_pipeline" for powerpc. > Unless someone sees a good way to implement __sync_val_compare_and_swap_16, Hmm, one could implement it in the compiler using calls to GOMP_atomic_start/GOMP_atomic_end, but it feels somewhat hacky. Thanks, - Tom
On 9/1/20 2:58 PM, Tom de Vries wrote: > On 9/1/20 1:41 PM, Tobias Burnus wrote: >> Hi Tom, hello all, >> >> it turned out that the testcase fails on PowerPC (but not x86_64) >> as the nvptx lto complains: unresolved symbol >> __sync_val_compare_and_swap_16 >> >> The testcase uses int128 – and that's the culprit, but I have no idea >> why it only fails with PowerPC and not with x86-64. >> > Reproduced on x86_64 using trigger patch: ... $ git diff diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md index ed17bb00205..eccedac192f 100644 --- a/gcc/config/i386/sync.md +++ b/gcc/config/i386/sync.md @@ -153,9 +153,15 @@ (DI "TARGET_64BIT || (TARGET_CMPXCHG8B && (TARGET_80387 || TARGET_SSE))") ]) + (define_mode_iterator ATOMIC2 + [QI HI SI + (DI "TARGET_64BIT || (TARGET_CMPXCHG8B && (TARGET_80387 || TARGET_SSE))") + TI + ]) + (define_expand "atomic_load<mode>" - [(set (match_operand:ATOMIC 0 "nonimmediate_operand") - (unspec:ATOMIC [(match_operand:ATOMIC 1 "memory_operand") + [(set (match_operand:ATOMIC2 0 "nonimmediate_operand") + (unspec:ATOMIC2 [(match_operand:ATOMIC2 1 "memory_operand") (match_operand:SI 2 "const_int_operand")] UNSPEC_LDA))] "" diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c index d0e82b04790..62b0e032c33 100644 --- a/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c +++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c @@ -1,4 +1,5 @@ /* { dg-do run } */ +/* { dg-additional-options "-mcx16" } */ #include <stdlib.h> ... Thanks, - Tom
libgomp: Split testcase in order to XFAIL __sync_val_compare_and_swap_16 On PowerPC, only, __sync_val_compare_and_swap_16 is generated for the testcase – which nvptx currently does not support. libgomp/ChangeLog: * testsuite/libgomp.c-c++-common/reduction-16.c: Move int128 test to ... * testsuite/libgomp.c-c++-common/reduction-17.c: ... this new test. * testsuite/libgomp.c-c++-common/reduction-17a.c: New test. .../testsuite/libgomp.c-c++-common/reduction-16.c | 8 +---- .../testsuite/libgomp.c-c++-common/reduction-17.c | 41 ++++++++++++++++++++++ .../testsuite/libgomp.c-c++-common/reduction-17a.c | 11 ++++++ 3 files changed, 53 insertions(+), 7 deletions(-) diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c index d0e82b04790..02ec4cd36b8 100644 --- a/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c +++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c @@ -1,4 +1,5 @@ /* { dg-do run } */ +/* See also reduction-17.c for an int128 testcase. */ #include <stdlib.h> @@ -32,9 +33,6 @@ GENERATE_TEST(char) GENERATE_TEST(short) GENERATE_TEST(int) GENERATE_TEST(long) -#ifdef __SIZEOF_INT128__ -GENERATE_TEST(__int128) -#endif int main(void) { @@ -46,8 +44,4 @@ int main(void) abort (); if (test_long ()) abort (); -#ifdef __SIZEOF_INT128__ - if (test___int128 ()) - abort (); -#endif } diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-17.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-17.c new file mode 100644 index 00000000000..44dabf6e139 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-17.c @@ -0,0 +1,41 @@ +/* { dg-do run { target { ! powerpc*-*-linux* } || { ! offload_target_nvptx } } } */ +/* { dg-require-effective-target int128 } */ + +/* See also reduction-17a.c for powerpc*-*-linux* + offload_target_nvptx. */ +/* See also reduction-16.c for a char/short/int/long testcase. */ + +#include <stdlib.h> + +#define N 512 + +#define GENERATE_TEST(T) \ +int test_##T (void) \ +{ \ + T a[N], res = 0; \ + \ + for (int i = 0; i < N; ++i) \ + a[i] = i & 1; \ + \ +_Pragma("omp target teams distribute reduction(||:res) defaultmap(tofrom:scalar)") \ + for (int i = 0; i < N; ++i) \ + res = res || a[i]; \ + \ + /* res should be non-zero. */\ + if (!res) \ + return 1; \ + \ +_Pragma("omp target teams distribute reduction(&&:res) defaultmap(tofrom:scalar)") \ + for (int i = 0; i < N; ++i) \ + res = res && a[i]; \ + \ + /* res should be zero. */ \ + return res; \ +} + +GENERATE_TEST(__int128) + +int main(void) +{ + if (test___int128 ()) + abort (); +} diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-17a.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-17a.c new file mode 100644 index 00000000000..1e6a7bc990f --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-17a.c @@ -0,0 +1,11 @@ +/* Duplicates reduction-17.c as it fails on PowerPC with nvptx offload + due to 'unresolved symbol __sync_val_compare_and_swap_16' on the nvptx side. */ + +/* { dg-do link { target { powerpc*-*-linux* } && { offload_target_nvptx } } } */ +/* { dg-require-effective-target int128 } */ +/* { dg-xfail-if "__sync_val_compare_and_swap_16" { *-*-* } } */ + +/* See also reduction-17a.c for either not powerpc*-*-linux* or not offload_target_nvptx. */ +/* See also reduction-16.c for a char/short/int/long testcase. */ + +#include "reduction-17.c"