diff mbox

[gomp4] expand acc_on_device earlier

Message ID 874mkgob6j.fsf@kepler.schwinge.homeip.net
State New
Headers show

Commit Message

Thomas Schwinge Aug. 3, 2015, 11:37 a.m. UTC
Hi!

On Sun, 2 Aug 2015 21:23:30 -0400, Nathan Sidwell <nathan@acm.org> wrote:
> I've committed this to gomp4 branch.  It expands the acc_on_device builtin 
> earlier in the new oacc_xform pass.  This will allow more optimization earlier on.

Thanks!

> The existing expansion point now only needs to deal with the host-side case.

Actually, no; committed to gomp-4_0-branch in r226498:

commit 5d1009bd4b959da699cf439200ffa571bf547026
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon Aug 3 11:36:01 2015 +0000

    Fix offload compiler build, and de-duplicate acc_on_device logic
    
    When building an x86_64-intelmicemul-linux-gnu offloading compiler with
    r226484, the __builtin_acc_on_device usage in libgomp/oacc-init.c:acc_on_device
    makes it blow up:
    
        [...]/source-gcc/libgomp/oacc-init.c: In function 'acc_on_device':
        [...]/source-gcc/libgomp/oacc-init.c:643:10: internal compiler error: in expand_builtin_acc_on_device, at builtins.c:5891
           return __builtin_acc_on_device (dev);
                  ^
        0x6d9632 expand_builtin_acc_on_device
                [...]/source-gcc/gcc/builtins.c:5891
        0x6d9632 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
                [...]/source-gcc/gcc/builtins.c:7127
        0x8021ad expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
                [...]/source-gcc/gcc/expr.c:10361
        0x80e8b5 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, tree_node*)
                [...]/source-gcc/gcc/expr.c:5400
        0x810eb4 expand_assignment(tree_node*, tree_node*, bool)
                [...]/source-gcc/gcc/expr.c:5169
        0x6f927b expand_call_stmt
                [...]/source-gcc/gcc/cfgexpand.c:2349
        0x6f927b expand_gimple_stmt_1
                [...]/source-gcc/gcc/cfgexpand.c:3238
        0x6fa77c expand_gimple_stmt
                [...]/source-gcc/gcc/cfgexpand.c:3399
        0x6fa842 expand_gimple_tailcall
                [...]/source-gcc/gcc/cfgexpand.c:3446
        0x6fc615 expand_gimple_basic_block
                [...]/source-gcc/gcc/cfgexpand.c:5388
        0x701456 execute
                [...]/source-gcc/gcc/cfgexpand.c:6023
    
    (This is not a problem for a nvptx-none offloading compiler, because of its
    open-coded libgcc/config/nvptx/gomp-acc_on_device.c:acc_on_device
    implementation.)
    
    	gcc/
    	* builtins.c (expand_builtin_on_device): Don't expect to be
    	expanded on host compiler.
    	libgcc/
    	* config/nvptx/gomp-acc_on_device.c: Don't include
    	"gomp-constants.h".
    	(acc_on_device): Don't code directly here.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@226498 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp                       |    3 +++
 gcc/builtins.c                           |   30 ++++++++++++++++++------------
 gcc/omp-low.c                            |    7 ++-----
 libgcc/ChangeLog.gomp                    |   10 ++++++++--
 libgcc/config/nvptx/gomp-acc_on_device.c |   16 +++++++---------
 libgomp/oacc-init.c                      |    2 +-
 6 files changed, 39 insertions(+), 29 deletions(-)



Grüße,
 Thomas

Comments

Nathan Sidwell Aug. 3, 2015, noon UTC | #1
On 08/03/15 07:37, Thomas Schwinge wrote:
> Hi!
>
> On Sun, 2 Aug 2015 21:23:30 -0400, Nathan Sidwell <nathan@acm.org> wrote:
>> I've committed this to gomp4 branch.  It expands the acc_on_device builtin
>> earlier in the new oacc_xform pass.  This will allow more optimization earlier on.
>
> Thanks!
>
>> The existing expansion point now only needs to deal with the host-side case.
>
> Actually, no; committed to gomp-4_0-branch in r226498:

Please clarify.  This suggests a logic fault elsewhere.

nathan
Thomas Schwinge Aug. 3, 2015, 12:40 p.m. UTC | #2
Hi!

On Mon, 3 Aug 2015 08:00:36 -0400, Nathan Sidwell <nathan@codesourcery.com> wrote:
> On 08/03/15 07:37, Thomas Schwinge wrote:
> > On Sun, 2 Aug 2015 21:23:30 -0400, Nathan Sidwell <nathan@acm.org> wrote:
> >> I've committed this to gomp4 branch.  It expands the acc_on_device builtin
> >> earlier in the new oacc_xform pass.  This will allow more optimization earlier on.
> >
> > Thanks!
> >
> >> The existing expansion point now only needs to deal with the host-side case.
> >
> > Actually, no; committed to gomp-4_0-branch in r226498:
> 
> Please clarify.  This suggests a logic fault elsewhere.

Hmm, I had the following in the commit log:

    When building an x86_64-intelmicemul-linux-gnu offloading compiler with
    r226484, the __builtin_acc_on_device usage in libgomp/oacc-init.c:acc_on_device
    makes it blow up:

	[...]/source-gcc/libgomp/oacc-init.c: In function 'acc_on_device':
	[...]/source-gcc/libgomp/oacc-init.c:643:10: internal compiler error: in expand_builtin_acc_on_device, at builtins.c:5891
	   return __builtin_acc_on_device (dev);
		  ^
	0x6d9632 expand_builtin_acc_on_device
		[...]/source-gcc/gcc/builtins.c:5891

In case that has been too terse (sorry), here's a bit more detail.

GCC can be configured to use nvptx-none as well as
x86_64-intelmicemul-linux-gnu offloading compilers.  In offloading
compilers' configurations, ACCEL_COMPILER is defined.  The nvptx-none
offloading compilers' libgcc/libgomp build will use the specific
libgcc/config/nvptx/gomp-acc_on_device.c:acc_on_device implementation,
but the x86_64-intelmicemul-linux-gnu one will use the default
libgomp/oacc-init.c:acc_on_device implementation (which is implemented in
terms of __builtin_acc_on_device).  Thus, in the latter case, we'd run
into this gcc_unreachable during the x86_64-intelmicemul-linux-gnu
offloading compiler's libgomp build:

    /* Expand OpenACC acc_on_device.  This is expanded in the openacc
       transform pass, but if the user has this outside of an offloaded
       region, we'll find it here.  In that case we must be host or none.  */
    
    static rtx
    expand_builtin_acc_on_device (tree exp, rtx target)
    {
    #ifdef ACCEL_COMPILER
      gcc_unreachable ();
    #else
      gcc_assert (!get_oacc_fn_attrib (current_function_decl));

In r226498, I then restored the earlier logic of
gcc/builtins.c:expand_builtin_acc_on_device, and could then also simplify
libgcc/config/nvptx/gomp-acc_on_device.c:acc_on_device to the very same
generic/simple implementation in terms of __builtin_acc_on_device.

Does that clarify?


Grüße,
 Thomas
diff mbox

Patch

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index a30f6a3..d5cc1ed 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,8 @@ 
 2015-08-03  Thomas Schwinge  <thomas@codesourcery.com>
 
+	* builtins.c (expand_builtin_on_device): Don't expect to be
+	expanded on host compiler.
+
 	* gimplify.c (oacc_default_clause): Remove in_code formal
 	parameter.  Adjust all users.
 
diff --git gcc/builtins.c gcc/builtins.c
index 9bc08f6..ce369a1 100644
--- gcc/builtins.c
+++ gcc/builtins.c
@@ -5880,39 +5880,45 @@  expand_stack_save (void)
 }
 
 
-/* Expand OpenACC acc_on_device.  This is expanded in the openacc
-   transform pass, but if the user has this outside of an offloaded
-   region, we'll find it here.  In that case we must be host or none.  */
+/* Expand OpenACC acc_on_device.  This is usually expanded in the
+   oacc_transform pass, earlier on, but if used outside of an offloaded region,
+   we'll find it here.  */
 
 static rtx
 expand_builtin_acc_on_device (tree exp, rtx target)
 {
-#ifdef ACCEL_COMPILER
-  gcc_unreachable ();
-#else
+#ifndef ACCEL_COMPILER
   gcc_assert (!get_oacc_fn_attrib (current_function_decl));
+#endif
   
   if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
     return NULL_RTX;
 
   tree arg = CALL_EXPR_ARG (exp, 0);
-  rtx val = expand_normal (arg);
+
+  /* Return (arg == v1 || arg == v2) ? 1 : 0.  */
+  machine_mode v_mode = TYPE_MODE (TREE_TYPE (arg));
+  rtx v = expand_normal (arg), v1, v2;
+#ifdef ACCEL_COMPILER
+  v1 = GEN_INT (GOMP_DEVICE_NOT_HOST);
+  v2 = GEN_INT (ACCEL_COMPILER_acc_device);
+#else
+  v1 = GEN_INT (GOMP_DEVICE_NONE);
+  v2 = GEN_INT (GOMP_DEVICE_HOST);
+#endif
   machine_mode target_mode = TYPE_MODE (integer_type_node);
   if (!target || !register_operand (target, target_mode))
     target = gen_reg_rtx (target_mode);
   emit_move_insn (target, const1_rtx);
   rtx_code_label *done_label = gen_label_rtx ();
-  do_compare_rtx_and_jump (val, GEN_INT (GOMP_DEVICE_HOST), EQ,
-			   false, GET_MODE (val), NULL_RTX,
+  do_compare_rtx_and_jump (v, v1, EQ, false, v_mode, NULL_RTX,
 			   NULL, done_label, PROB_EVEN);
-  do_compare_rtx_and_jump (val, GEN_INT (GOMP_DEVICE_NONE), EQ,
-			   false, GET_MODE (val), NULL_RTX,
+  do_compare_rtx_and_jump (v, v2, EQ, false, v_mode, NULL_RTX,
 			   NULL, done_label, PROB_EVEN);
   emit_move_insn (target, const0_rtx);
   emit_label (done_label);
 
   return target;
-#endif
 }
 
 /* Expand a thread-id/thread-count builtin for OpenACC.  */
diff --git gcc/omp-low.c gcc/omp-low.c
index ebf333f..701e8da 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -14510,12 +14510,9 @@  make_pass_late_lower_omp (gcc::context *ctxt)
   return new pass_late_lower_omp (ctxt);
 }
 
-/* Transform an acc_on_device call.  The std requires this folded at
+/* Transform an acc_on_device call.  OpenACC 2.0a requires this folded at
    compile time for constant operands.  We always fold it.  In an
-   offloaded function we're never 'none'.  We cannot detect
-   host_nonshm here, as that's a dynamic feature of the runtime.
-   However, users shouldn't be using host_nonshm anyway, only the
-   test harness.  */
+   offloaded function we're never 'none'.  */
 
 static void
 oacc_xform_on_device (gimple_stmt_iterator *gsi, gimple stmt)
diff --git libgcc/ChangeLog.gomp libgcc/ChangeLog.gomp
index 4aa45e7..085bfda 100644
--- libgcc/ChangeLog.gomp
+++ libgcc/ChangeLog.gomp
@@ -1,6 +1,12 @@ 
-2015-07-17  Nathan Sidwell  <nathan@codesourcery.com>
+2015-08-03  Thomas Schwinge  <thomas@codesourcery.com>
 
-	* config/nvptx/comp-acc_on_device.c: Include gomp-constants.h.
+	* config/nvptx/gomp-acc_on_device.c: Don't include
+	"gomp-constants.h".
+	(acc_on_device): Don't code directly here.
+
+2015-08-02  Nathan Sidwell  <nathan@codesourcery.com>
+
+	* config/nvptx/gomp-acc_on_device.c: Include gomp-constants.h.
 	(acc_on_device): Code directly here.
 
 2015-07-17  Nathan Sidwell  <nathan@codesourcery.com>
diff --git libgcc/config/nvptx/gomp-acc_on_device.c libgcc/config/nvptx/gomp-acc_on_device.c
index e89cd9b..db94350 100644
--- libgcc/config/nvptx/gomp-acc_on_device.c
+++ libgcc/config/nvptx/gomp-acc_on_device.c
@@ -1,14 +1,12 @@ 
-#include "gomp-constants.h"
+/* The compiler always attempts to expand acc_on_device, but if the
+   user disables the builtin, or calls it via a pointer, we have this
+   version.  */
 
-/* For when the builtin is explicitly disabled.  */
-int acc_on_device (int d)
+int
+acc_on_device (int dev)
 {
-  /* We can't use the builtin itself here, because that only expands
-     to device-like things inside offloaded compute regions, which
-     this isn't.  Even though it'll be executed on the device --
-     unless someone builds a host-side PTX compiler, which would be
-     very strange.  */
-  return d == GOMP_DEVICE_NOT_HOST || d == GOMP_DEVICE_NVIDIA_PTX;
+  /* Just rely on the compiler builtin.  */
+  return __builtin_acc_on_device (dev);
 }
 
 int acc_on_device_h_(int *d)
diff --git libgomp/oacc-init.c libgomp/oacc-init.c
index 79a643c..8dfe4a7 100644
--- libgomp/oacc-init.c
+++ libgomp/oacc-init.c
@@ -639,7 +639,7 @@  ialias (acc_set_device_num)
 int
 acc_on_device (int dev)
 {
-  /* It is safe to use the compiler builtin, as we're the host.  */
+  /* Just rely on the compiler builtin.  */
   return __builtin_acc_on_device (dev);
 }