diff mbox series

[middle-end,&,nvptx] gcc/tracer.c: Don't split BB with SIMT LANE [PR95654]

Message ID b85c13ea-c225-1c9f-3305-606007e101a1@codesourcery.com
State New
Headers show
Series [middle-end,&,nvptx] gcc/tracer.c: Don't split BB with SIMT LANE [PR95654] | expand

Commit Message

Tobias Burnus Sept. 16, 2020, 10:39 a.m. UTC
Hi Tom, hi Richard, hello all,

@Richard: does it look okay from the ME side?
@Tom: Can you check which IFN_GOMP_SIMT should be
excluded with -ftracer?

Pre-remark, I do not know much about SIMT – except that they
only appear with nvptx and somehow relate to lanes on the
GPU.

In any case, as the testcase libgomp.fortran/pr66199-5.f90 shows,
if a basic block with  GOMP_SIMT_VOTE_ANY in it is duplicated,
which happens via -ftracer for this testcase, the result is wrong.

The testcase ignores all the loop work but via "lastprivate" takes
the upper loop bound (as assigned to the loop indices); instead of
the expected 32*32 = 1024, either some number (like 4 or very large
or negative) is returned.

While GOMP_SIMT_VOTE_ANY fixes the issue for this testcase, I
have the feeling that at least GOMP_SIMT_LAST_LANE should be
not copied - but I might be wrong.

Tom: Do you think GOMP_SIMT_LAST_LANE should be removed from
that list – or any of the following added as well?
GOMP_USE_SIMT, GOMP_SIMT_ENTER, GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT,
GOMP_SIMT_VF, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_XCHG_BFLY,
GOMP_SIMT_XCHG_IDX

OK for mainline?

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

Comments

Tom de Vries Sept. 16, 2020, 11:08 a.m. UTC | #1
[ cc-ing author omp support for nvptx. ]

On 9/16/20 12:39 PM, Tobias Burnus wrote:
> Hi Tom, hi Richard, hello all,
> 
> @Richard: does it look okay from the ME side?
> @Tom: Can you check which IFN_GOMP_SIMT should be
> excluded with -ftracer?
> 
> Pre-remark, I do not know much about SIMT – except that they
> only appear with nvptx and somehow relate to lanes on the
> GPU.
> 
> In any case, as the testcase libgomp.fortran/pr66199-5.f90 shows,
> if a basic block with  GOMP_SIMT_VOTE_ANY in it is duplicated,
> which happens via -ftracer for this testcase, the result is wrong.
> 
> The testcase ignores all the loop work but via "lastprivate" takes
> the upper loop bound (as assigned to the loop indices); instead of
> the expected 32*32 = 1024, either some number (like 4 or very large
> or negative) is returned.
> 
> While GOMP_SIMT_VOTE_ANY fixes the issue for this testcase, I
> have the feeling that at least GOMP_SIMT_LAST_LANE should be
> not copied - but I might be wrong.
> 
> Tom: Do you think GOMP_SIMT_LAST_LANE should be removed from
> that list – or any of the following added as well?
> GOMP_USE_SIMT, GOMP_SIMT_ENTER, GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT,
> GOMP_SIMT_VF, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_XCHG_BFLY,
> GOMP_SIMT_XCHG_IDX
> 
> OK for mainline?
> 
> Tobias
> 
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
> Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
> Alexander Walter
Alexander Monakov Sept. 16, 2020, 11:24 a.m. UTC | #2
Can you supply a tar with tree dumps for me to look at please?

Also, if you can check if the problem can be triggered without a
collapsed loop (e.g. try removing collapse(2), remove mentions of
d2) and if so supply dumps from that instead, I'd appreciate that too.

Alexander

On Wed, 16 Sep 2020, Tom de Vries wrote:

> [ cc-ing author omp support for nvptx. ]
> 
> On 9/16/20 12:39 PM, Tobias Burnus wrote:
> > Hi Tom, hi Richard, hello all,
> > 
> > @Richard: does it look okay from the ME side?
> > @Tom: Can you check which IFN_GOMP_SIMT should be
> > excluded with -ftracer?
> > 
> > Pre-remark, I do not know much about SIMT – except that they
> > only appear with nvptx and somehow relate to lanes on the
> > GPU.
> > 
> > In any case, as the testcase libgomp.fortran/pr66199-5.f90 shows,
> > if a basic block with  GOMP_SIMT_VOTE_ANY in it is duplicated,
> > which happens via -ftracer for this testcase, the result is wrong.
> > 
> > The testcase ignores all the loop work but via "lastprivate" takes
> > the upper loop bound (as assigned to the loop indices); instead of
> > the expected 32*32 = 1024, either some number (like 4 or very large
> > or negative) is returned.
> > 
> > While GOMP_SIMT_VOTE_ANY fixes the issue for this testcase, I
> > have the feeling that at least GOMP_SIMT_LAST_LANE should be
> > not copied - but I might be wrong.
> > 
> > Tom: Do you think GOMP_SIMT_LAST_LANE should be removed from
> > that list – or any of the following added as well?
> > GOMP_USE_SIMT, GOMP_SIMT_ENTER, GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT,
> > GOMP_SIMT_VF, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_XCHG_BFLY,
> > GOMP_SIMT_XCHG_IDX
> > 
> > OK for mainline?
> > 
> > Tobias
> > 
> > -----------------
> > Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
> > Germany
> > Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
> > Alexander Walter
>
Tom de Vries Sept. 16, 2020, 3:58 p.m. UTC | #3
On 9/16/20 1:24 PM, Alexander Monakov wrote:
> Can you supply a tar with tree dumps for me to look at please?
> 

Hi Alexander,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95654#c6 .

> Also, if you can check if the problem can be triggered without a
> collapsed loop (e.g. try removing collapse(2), remove mentions of
> d2) and if so supply dumps from that instead, I'd appreciate that too.
> 

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95654#c8

Thanks,
- Tom

> Alexander
> 
> On Wed, 16 Sep 2020, Tom de Vries wrote:
> 
>> [ cc-ing author omp support for nvptx. ]
>>
>> On 9/16/20 12:39 PM, Tobias Burnus wrote:
>>> Hi Tom, hi Richard, hello all,
>>>
>>> @Richard: does it look okay from the ME side?
>>> @Tom: Can you check which IFN_GOMP_SIMT should be
>>> excluded with -ftracer?
>>>
>>> Pre-remark, I do not know much about SIMT – except that they
>>> only appear with nvptx and somehow relate to lanes on the
>>> GPU.
>>>
>>> In any case, as the testcase libgomp.fortran/pr66199-5.f90 shows,
>>> if a basic block with  GOMP_SIMT_VOTE_ANY in it is duplicated,
>>> which happens via -ftracer for this testcase, the result is wrong.
>>>
>>> The testcase ignores all the loop work but via "lastprivate" takes
>>> the upper loop bound (as assigned to the loop indices); instead of
>>> the expected 32*32 = 1024, either some number (like 4 or very large
>>> or negative) is returned.
>>>
>>> While GOMP_SIMT_VOTE_ANY fixes the issue for this testcase, I
>>> have the feeling that at least GOMP_SIMT_LAST_LANE should be
>>> not copied - but I might be wrong.
>>>
>>> Tom: Do you think GOMP_SIMT_LAST_LANE should be removed from
>>> that list – or any of the following added as well?
>>> GOMP_USE_SIMT, GOMP_SIMT_ENTER, GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT,
>>> GOMP_SIMT_VF, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_XCHG_BFLY,
>>> GOMP_SIMT_XCHG_IDX
>>>
>>> OK for mainline?
>>>
>>> Tobias
>>>
>>> -----------------
>>> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
>>> Germany
>>> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
>>> Alexander Walter
Alexander Monakov Sept. 16, 2020, 6:20 p.m. UTC | #4
On Wed, 16 Sep 2020, Tom de Vries wrote:

> [ cc-ing author omp support for nvptx. ]

The issue looks familiar. I recognized it back in 2017 (and LLVM people
recognized it too for their GPU targets). In an attempt to get agreement
to fix the issue "properly" for GCC I found a similar issue that affects
all targets, not just offloading, and filed it as PR 80053.

(yes, there are no addressable labels involved in offloading, but nevertheless
the nature of the middle-end issue is related)

Alexander
Tom de Vries Sept. 22, 2020, 4:31 p.m. UTC | #5
On 9/16/20 12:39 PM, Tobias Burnus wrote:
> Hi Tom, hi Richard, hello all,
> 
> @Richard: does it look okay from the ME side?
> @Tom: Can you check which IFN_GOMP_SIMT should be
> excluded with -ftracer?
> 
> Pre-remark, I do not know much about SIMT – except that they
> only appear with nvptx and somehow relate to lanes on the
> GPU.
> 
> In any case, as the testcase libgomp.fortran/pr66199-5.f90 shows,
> if a basic block with  GOMP_SIMT_VOTE_ANY in it is duplicated,
> which happens via -ftracer for this testcase, the result is wrong.
> 
> The testcase ignores all the loop work but via "lastprivate" takes
> the upper loop bound (as assigned to the loop indices); instead of
> the expected 32*32 = 1024, either some number (like 4 or very large
> or negative) is returned.
> 
> While GOMP_SIMT_VOTE_ANY fixes the issue for this testcase, I
> have the feeling that at least GOMP_SIMT_LAST_LANE should be
> not copied - but I might be wrong.
> 
> Tom: Do you think GOMP_SIMT_LAST_LANE should be removed from
> that list – or any of the following added as well?
> GOMP_USE_SIMT, GOMP_SIMT_ENTER, GOMP_SIMT_ENTER_ALLOC, GOMP_SIMT_EXIT,
> GOMP_SIMT_VF, GOMP_SIMT_ORDERED_PRED, GOMP_SIMT_XCHG_BFLY,
> GOMP_SIMT_XCHG_IDX
> 
> OK for mainline?
> 

You can commit the test-case bit as obvious.

Thanks,
- Tom

> libgomp/ChangeLog:
> 
> 	* testsuite/libgomp.fortran/pr66199-5.f90: Make stop codes unique.

> diff --git a/libgomp/testsuite/libgomp.fortran/pr66199-5.f90 b/libgomp/testsuite/libgomp.fortran/pr66199-5.f90
> index 9482f08..2627a81 100644
> --- a/libgomp/testsuite/libgomp.fortran/pr66199-5.f90
> +++ b/libgomp/testsuite/libgomp.fortran/pr66199-5.f90
> @@ -67,5 +67,5 @@ program main
>    if (f1 (0, 1024) /= 1024) stop 1
>    if (f2 (0, 1024, 17) /= 1024 + (17 + 5 * 1023)) stop 2
>    if (f3 (0, 32, 0, 32) /= 64) stop 3
> -  if (f4 (0, 32, 0, 32) /= 64) stop 3
> +  if (f4 (0, 32, 0, 32) /= 64) stop 4
>  end
diff mbox series

Patch

gcc/tracer.c: Don't split BB with SIMT LANE [PR95654]

gcc/ChangeLog:

	PR fortran/95654
	* gimple.h (gimple_call_internal_gomp_simt): Add; return true
	if call is IFN_GOMP_SIMT_LAST_LANE or IFN_GOMP_SIMT_VOTE_ANY.
	* tracer.c (ignore_bb_p): Return true if the BB contains those simt ifn.

libgomp/ChangeLog:

	PR fortran/95654
	* testsuite/libgomp.fortran/pr66199-5.f90: Make stop codes unique.

 gcc/gimple.h                                    | 17 +++++++++++++++++
 gcc/tracer.c                                    | 12 ++++++++++++
 libgomp/testsuite/libgomp.fortran/pr66199-5.f90 |  2 +-
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/gcc/gimple.h b/gcc/gimple.h
index 6cc7e66..e0cefc9 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -3021,6 +3021,23 @@  gimple_call_internal_unique_p (const gimple *gs)
   return gimple_call_internal_unique_p (gc);
 }
 
+/* Return true, if this internal gimple call is
+   either gomp_simt_last_lane or gomp_simt_vote_any.  */
+
+static inline bool
+gimple_call_internal_gomp_simt (const gcall *gs)
+{
+  return gimple_call_internal_fn (gs) == IFN_GOMP_SIMT_LAST_LANE
+  || gimple_call_internal_fn (gs) == IFN_GOMP_SIMT_VOTE_ANY;
+}
+
+static inline bool
+gimple_call_internal_gomp_simt (const gimple *gs)
+{
+  const gcall *gc = GIMPLE_CHECK2<const gcall *> (gs);
+  return gimple_call_internal_gomp_simt (gc);
+}
+
 /* Return true if GS is an internal function FN.  */
 
 static inline bool
diff --git a/gcc/tracer.c b/gcc/tracer.c
index 82ede72..6c02e79 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -37,6 +37,7 @@ 
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
+#include "target.h"
 #include "rtl.h"
 #include "tree.h"
 #include "gimple.h"
@@ -107,6 +108,17 @@  ignore_bb_p (const_basic_block bb)
 	  && gimple_call_internal_unique_p (g))
 	return true;
     }
+  if (targetm.have_omp_simt_lane ())
+    for (gimple_stmt_iterator gsi = gsi_start_bb (CONST_CAST_BB (bb));
+	 !gsi_end_p (gsi);
+	 gsi_next (&gsi))
+      {
+	gimple *g = gsi_stmt (gsi);
+	if (is_gimple_call (g)
+	    && gimple_call_internal_p (g)
+	    && gimple_call_internal_gomp_simt (g))
+	  return true;
+      }
 
   return false;
 }
diff --git a/libgomp/testsuite/libgomp.fortran/pr66199-5.f90 b/libgomp/testsuite/libgomp.fortran/pr66199-5.f90
index 9482f08..2627a81 100644
--- a/libgomp/testsuite/libgomp.fortran/pr66199-5.f90
+++ b/libgomp/testsuite/libgomp.fortran/pr66199-5.f90
@@ -67,5 +67,5 @@  program main
   if (f1 (0, 1024) /= 1024) stop 1
   if (f2 (0, 1024, 17) /= 1024 + (17 + 5 * 1023)) stop 2
   if (f3 (0, 32, 0, 32) /= 64) stop 3
-  if (f4 (0, 32, 0, 32) /= 64) stop 3
+  if (f4 (0, 32, 0, 32) /= 64) stop 4
 end