diff mbox series

[committed,nvptx] Update default ptx isa to 6.3

Message ID 20220201183057.GA4227@delia.home
State New
Headers show
Series [committed,nvptx] Update default ptx isa to 6.3 | expand

Commit Message

Tom de Vries Feb. 1, 2022, 6:30 p.m. UTC
Hi,

With the following example, minimized from parallel-dims.c:
...
int
main (void)
{
  int vectors_max = -1;
  #pragma acc parallel num_gangs (1) num_workers (1) copy (vectors_max)
  {
    for (int i = 0; i < 2; i++)
      for (int j = 0; j < 2; j++)
        #pragma acc loop vector reduction (max: vectors_max)
        for (int k = 0; k < 32; k++)
          vectors_max = k;
  }

  if (vectors_max != 31)
    __builtin_abort ();

  return 0;
}
...
I run into (T400, driver version 470.94):
...
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/parallel-dims.c \
  -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 \
  execution test
...
The FAIL does not happen with GOMP_NVPTX_JIT=-O0.

The problem seems to be that the shfl insns for the vector reduction are not
executed uniformly by the warp.  Enforcing this by using shfl.sync fixes the
problem.

Fix this by setting the ptx isa to 6.3 by default, which allows the use of
shfl.sync.

Tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[nvptx] Update default ptx isa to 6.3

gcc/ChangeLog:

2022-01-27  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.opt (mptx): Set to PTX_VERSION_6_3 by default.

---
 gcc/config/nvptx/nvptx.opt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff mbox series

Patch

diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index 6514dd326fb..6e12b1f7296 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -89,5 +89,5 @@  EnumValue
 Enum(ptx_version) String(7.0) Value(PTX_VERSION_7_0)
 
 mptx=
-Target RejectNegative ToLower Joined Enum(ptx_version) Var(ptx_version_option) Init(PTX_VERSION_3_1)
+Target RejectNegative ToLower Joined Enum(ptx_version) Var(ptx_version_option) Init(PTX_VERSION_6_3)
 Specify the version of the ptx version to use.