[gomp4] default to runtime gang size

Message ID bf23f225-b056-a3f3-a0f4-74bc42f9211c@acm.org
State New

Commit Message

Nathan Sidwell Aug. 30, 2016, 6:30 p.m. UTC
This patch changes the default gang-partitioned size so that it is determined
at runtime (and thus by interrogating the hardware).
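
To illustrate the convention only: a minimal sketch of "zero means defer to
runtime".  The names resolve_gang_dim, DIM_DEFER_TO_RUNTIME and runtime_dim
are hypothetical and are not the nvptx backend or libgomp code; the fallback
to a single gang is likewise an assumption of this sketch.

```c
/* Hypothetical sentinel: a compile-time dimension of 0 means the size
   was not fixed by the compiler and should be chosen at runtime.  */
#define DIM_DEFER_TO_RUNTIME 0

/* Hypothetical helper, not part of GCC or libgomp.
   COMPILED_DIM is the dimension recorded at compile time.
   RUNTIME_DIM is whatever the runtime derived from the hardware.  */
static int
resolve_gang_dim (int compiled_dim, int runtime_dim)
{
  /* An explicit compile-time value always wins.  */
  if (compiled_dim != DIM_DEFER_TO_RUNTIME)
    return compiled_dim;

  /* Otherwise use the runtime's answer; assume a single gang if the
     runtime could not supply a positive value.  */
  return runtime_dim > 0 ? runtime_dim : 1;
}
```

With the old default of 32 the first branch was always taken; with a default
of 0 the choice falls through to the runtime value unless the user (or a
test) fixes the dimension explicitly.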

The auto-loop test is designed for num_gangs 32, so it now sets the gang
dimension explicitly.  The Fortran nested-function test also appears to
require that, but may be hiding another defect.  Its use of gang(static:1)
seems a little strange and is not justified by the comments discussing it.

nathan
Patch

2016-08-28  Nathan Sidwell  <nathan@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Set to zero.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Set gang
	dimension.
	* testsuite/libgomp.oacc-fortran/nested-function-1.f90: Likewise.

Index: gcc/config/nvptx/nvptx.c
===================================================================
--- gcc/config/nvptx/nvptx.c	(revision 239868)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -4157,7 +4157,7 @@  nvptx_expand_builtin (tree exp, rtx targ
 /* Define dimension sizes for known hardware.  */
 #define PTX_VECTOR_LENGTH 32
 #define PTX_WORKER_LENGTH 32
-#define PTX_GANG_DEFAULT  32
+#define PTX_GANG_DEFAULT  0 /* Defer to runtime.  */
 
 /* Validate compute dimensions of an OpenACC offload or routine, fill
    in non-unity defaults.  FN_LEVEL indicates the level at which a
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c	(revision 239868)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c	(working copy)
@@ -2,6 +2,8 @@ 
    not optimized away at -O0, and then confuses the target assembler.
    { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
 
+/* { dg-additional-options "-fopenacc-dim=32" } */
+
 #include <stdio.h>
 #include <openacc.h>
 
@@ -151,8 +153,7 @@  int gang_1 (int *ary, int size)
 {
   clear (ary, size);
   
-#pragma acc parallel num_gangs (32) num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)
-  /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 154 } */
+#pragma acc parallel num_gangs (32) num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } } */
   {
 #pragma acc loop auto
     for (int jx = 0; jx <  size  / 64; jx++)
Index: libgomp/testsuite/libgomp.oacc-fortran/nested-function-1.f90
===================================================================
--- libgomp/testsuite/libgomp.oacc-fortran/nested-function-1.f90	(revision 239868)
+++ libgomp/testsuite/libgomp.oacc-fortran/nested-function-1.f90	(working copy)
@@ -1,6 +1,7 @@ 
 ! Exercise nested function decomposition, gcc/tree-nested.c.
 
 ! { dg-do run }
+! { dg-additional-options "-fopenacc-dim=32" }
 
 program collapse2
   call test1