diff mbox series

OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs [PR100280, PR104132, PR104133]

Message ID 877d99omoe.fsf@euler.schwinge.homeip.net
State New
Headers show
Series OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs [PR100280, PR104132, PR104133] | expand

Commit Message

Thomas Schwinge March 4, 2022, 1:46 p.m. UTC
Hi!

On 2022-03-01T17:46:20+0100, I wrote:
> On 2022-01-13T10:54:16+0100, I wrote:
>> On 2019-05-08T14:51:57+0100, Julian Brown <julian@codesourcery.com> wrote:
>>>  - The "addressable" bit is set during the kernels conversion pass for
>>>    variables that have "create" (alloc) clauses created for them in the
>>>    synthesised outer data region (instead of in the front-end, etc.,
>>>    where it can't be done accurately). Such variables actually have
>>>    their address taken during transformations made in a later pass
>>>    (omp-low, I think), but there's a phase-ordering problem that means
>>>    the flag should be set earlier.
>>
>> The actual issue is a bit different, but yes, there is a problem.
>> The related ICE has also been reported as <https://gcc.gnu.org/PR100280>
>> "ICE in lower_omp_target, at omp-low.c:12287".  (And I'm confused why we
>> didn't run into that with the OpenACC 'kernels' decomposition
>> originally.)  I've pushed to master branch
>> commit 9b32c1669aad5459dd053424f9967011348add83
>> "OpenACC 'kernels' decomposition: Mark variables used in synthesized data
>> clauses as addressable [PR100280]"

>> --- a/gcc/omp-oacc-kernels-decompose.cc
>> +++ b/gcc/omp-oacc-kernels-decompose.cc
>> @@ -793,7 +793,8 @@ make_data_region_try_statement (location_t loc, gimple *body)
>>
>>  /* If INNER_BIND_VARS holds variables, build an OpenACC data region with
>>     location LOC containing BODY and having 'create (var)' clauses for each
>> -   variable.  If INNER_CLEANUP is present, add a try-finally statement with
>> +   variable (as a side effect, such variables also get TREE_ADDRESSABLE set).
>> +   If INNER_CLEANUP is present, add a try-finally statement with
>>     this cleanup code in the finally block.  Return the new data region, or
>>     the original BODY if no data region was needed.  */
>>
>> @@ -842,6 +843,9 @@ maybe_build_inner_data_region (location_t loc, gimple *body,
>>         inner_data_clauses = new_clause;
>>
>>         prev_mapped_var = v;
>> +
>> +       /* See <https://gcc.gnu.org/PR100280>.  */
>> +       TREE_ADDRESSABLE (v) = 1;
>>       }
>>      }
>
> So, that's too simple.  ;-) [...]

> We're after gimplification, and must not just set 'TREE_ADDRESSABLE',
> because that may easily violate GIMPLE invariants, leading to ICEs later.
> There are a few open PRs, which my following changes are addressing.  To
> make "late" 'TREE_ADDRESSABLE' work, we have a precedent in OpenMP's
> 'gcc/omp-low.cc:task_shared_vars' handling, as Jakub had pointed to in
> discussion of <https://gcc.gnu.org/PR102330>.

> I'm thus proposing to generalize 'gcc/omp-low.cc:task_shared_vars' into
> 'make_addressable_vars', plus new 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE'
> that we then may use instead of the 'TREE_ADDRESSABLE (v) = 1;' quoted
> above (plus one or two additional ones to be introduced in later
> patches), and wire that up in 'gcc/omp-low.cc:scan_sharing_clauses', for
> 'OMP_CLAUSE_MAP': set 'TREE_ADDRESSABLE' and put into
> 'make_addressable_vars' for later fix-up.

Pushed to master branch commit 8935589b496f755e08cadf26d8ceddf0dd6e0968
"OMP lowering: Regimplify 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs
[PR100280, PR104132, PR104133]", see attached.


Grüße
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
diff mbox series

Patch

From 8935589b496f755e08cadf26d8ceddf0dd6e0968 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Tue, 15 Feb 2022 23:31:34 +0100
Subject: [PATCH] OMP lowering: Regimplify
 'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs [PR100280, PR104132, PR104133]

... by generalizing the existing 'gcc/omp-low.cc:task_shared_vars'.

Fix-up for commit 9b32c1669aad5459dd053424f9967011348add83
"OpenACC 'kernels' decomposition: Mark variables used in
synthesized data clauses as addressable [PR100280]".

	PR middle-end/100280
	PR middle-end/104132
	PR middle-end/104133
	gcc/
	* omp-low.cc (task_shared_vars): Rename to
	'make_addressable_vars'.  Adjust all users.
	(scan_sharing_clauses) <OMP_CLAUSE_MAP> Use it for
	'OMP_CLAUSE_MAP_DECL_MAKE_ADDRESSABLE' DECLs, too.
	gcc/testsuite/
	* c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Adjust.
	* c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104132-1.c: Likewise.
	* c-c++-common/goacc/kernels-decompose-pr104133-1.c: Likewise.
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c:
	Extend.
---
 gcc/omp-low.cc                                | 47 +++++++-------
 .../goacc/kernels-decompose-pr104061-1-3.c    | 11 +---
 .../goacc/kernels-decompose-pr104061-1-4.c    | 17 ++---
 .../goacc/kernels-decompose-pr104132-1.c      |  9 +--
 .../goacc/kernels-decompose-pr104133-1.c      |  9 +--
 .../kernels-decompose-1.c                     | 62 +++++++++++++++++--
 6 files changed, 95 insertions(+), 60 deletions(-)

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 6654bfd426e..5ce3a50709a 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -188,7 +188,7 @@  struct omp_context
 static splay_tree all_contexts;
 static int taskreg_nesting_level;
 static int target_nesting_level;
-static bitmap task_shared_vars;
+static bitmap make_addressable_vars;
 static bitmap global_nonaddressable_vars;
 static vec<omp_context *> taskreg_contexts;
 static vec<gomp_task *> task_cpyfns;
@@ -572,9 +572,9 @@  use_pointer_for_field (tree decl, omp_context *shared_ctx)
 	      /* Taking address of OUTER in lower_send_shared_vars
 		 might need regimplification of everything that uses the
 		 variable.  */
-	      if (!task_shared_vars)
-		task_shared_vars = BITMAP_ALLOC (NULL);
-	      bitmap_set_bit (task_shared_vars, DECL_UID (outer));
+	      if (!make_addressable_vars)
+		make_addressable_vars = BITMAP_ALLOC (NULL);
+	      bitmap_set_bit (make_addressable_vars, DECL_UID (outer));
 	      TREE_ADDRESSABLE (outer) = 1;
 	    }
 	  return true;
@@ -601,13 +601,13 @@  omp_copy_decl_2 (tree var, tree name, tree type, omp_context *ctx)
   else
     record_vars (copy);
 
-  /* If VAR is listed in task_shared_vars, it means it wasn't
-     originally addressable and is just because task needs to take
-     it's address.  But we don't need to take address of privatizations
+  /* If VAR is listed in make_addressable_vars, it wasn't
+     originally addressable, but was only later made so.
+     We don't need to take address of privatizations
      from that var.  */
   if (TREE_ADDRESSABLE (var)
-      && ((task_shared_vars
-	   && bitmap_bit_p (task_shared_vars, DECL_UID (var)))
+      && ((make_addressable_vars
+	   && bitmap_bit_p (make_addressable_vars, DECL_UID (var)))
 	  || (global_nonaddressable_vars
 	      && bitmap_bit_p (global_nonaddressable_vars, DECL_UID (var)))))
     TREE_ADDRESSABLE (copy) = 0;
@@ -1502,6 +1502,9 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	      gcc_checking_assert (DECL_P (decl));
 
 	      gcc_checking_assert (!TREE_ADDRESSABLE (decl));
+	      if (!make_addressable_vars)
+		make_addressable_vars = BITMAP_ALLOC (NULL);
+	      bitmap_set_bit (make_addressable_vars, DECL_UID (decl));
 	      TREE_ADDRESSABLE (decl) = 1;
 
 	      if (dump_enabled_p ())
@@ -2402,11 +2405,11 @@  finish_taskreg_scan (omp_context *ctx)
   if (ctx->record_type == NULL_TREE)
     return;
 
-  /* If any task_shared_vars were needed, verify all
+  /* If any make_addressable_vars were needed, verify all
      OMP_CLAUSE_SHARED clauses on GIMPLE_OMP_{PARALLEL,TASK,TEAMS}
      statements if use_pointer_for_field hasn't changed
      because of that.  If it did, update field types now.  */
-  if (task_shared_vars)
+  if (make_addressable_vars)
     {
       tree c;
 
@@ -2421,7 +2424,7 @@  finish_taskreg_scan (omp_context *ctx)
 	       the receiver side will use them directly.  */
 	    if (is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx)))
 	      continue;
-	    if (!bitmap_bit_p (task_shared_vars, DECL_UID (decl))
+	    if (!bitmap_bit_p (make_addressable_vars, DECL_UID (decl))
 		|| !use_pointer_for_field (decl, ctx))
 	      continue;
 	    tree field = lookup_field (decl, ctx);
@@ -14071,7 +14074,7 @@  lower_omp_teams (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
 /* Callback for lower_omp_1.  Return non-NULL if *tp needs to be
    regimplified.  If DATA is non-NULL, lower_omp_1 is outside
-   of OMP context, but with task_shared_vars set.  */
+   of OMP context, but with make_addressable_vars set.  */
 
 static tree
 lower_omp_regimplify_p (tree *tp, int *walk_subtrees,
@@ -14085,9 +14088,9 @@  lower_omp_regimplify_p (tree *tp, int *walk_subtrees,
       && DECL_HAS_VALUE_EXPR_P (t))
     return t;
 
-  if (task_shared_vars
+  if (make_addressable_vars
       && DECL_P (t)
-      && bitmap_bit_p (task_shared_vars, DECL_UID (t)))
+      && bitmap_bit_p (make_addressable_vars, DECL_UID (t)))
     return t;
 
   /* If a global variable has been privatized, TREE_CONSTANT on
@@ -14172,7 +14175,7 @@  lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   if (gimple_has_location (stmt))
     input_location = gimple_location (stmt);
 
-  if (task_shared_vars)
+  if (make_addressable_vars)
     memset (&wi, '\0', sizeof (wi));
 
   /* If we have issued syntax errors, avoid doing any heavy lifting.
@@ -14189,7 +14192,7 @@  lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
     case GIMPLE_COND:
       {
 	gcond *cond_stmt = as_a <gcond *> (stmt);
-	if ((ctx || task_shared_vars)
+	if ((ctx || make_addressable_vars)
 	    && (walk_tree (gimple_cond_lhs_ptr (cond_stmt),
 			   lower_omp_regimplify_p,
 			   ctx ? NULL : &wi, NULL)
@@ -14281,7 +14284,7 @@  lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       lower_omp_critical (gsi_p, ctx);
       break;
     case GIMPLE_OMP_ATOMIC_LOAD:
-      if ((ctx || task_shared_vars)
+      if ((ctx || make_addressable_vars)
 	  && walk_tree (gimple_omp_atomic_load_rhs_ptr (
 			  as_a <gomp_atomic_load *> (stmt)),
 			lower_omp_regimplify_p, ctx ? NULL : &wi, NULL))
@@ -14402,7 +14405,7 @@  lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
     default:
     regimplify:
-      if ((ctx || task_shared_vars)
+      if ((ctx || make_addressable_vars)
 	  && walk_gimple_op (stmt, lower_omp_regimplify_p,
 			     ctx ? NULL : &wi))
 	{
@@ -14466,10 +14469,10 @@  execute_lower_omp (void)
 
   if (all_contexts->root)
     {
-      if (task_shared_vars)
+      if (make_addressable_vars)
 	push_gimplify_context ();
       lower_omp (&body, NULL);
-      if (task_shared_vars)
+      if (make_addressable_vars)
 	pop_gimplify_context (NULL);
     }
 
@@ -14478,7 +14481,7 @@  execute_lower_omp (void)
       splay_tree_delete (all_contexts);
       all_contexts = NULL;
     }
-  BITMAP_FREE (task_shared_vars);
+  BITMAP_FREE (make_addressable_vars);
   BITMAP_FREE (global_nonaddressable_vars);
 
   /* If current function is a method, remove artificial dummy VAR_DECL created
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-3.c b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-3.c
index f41dda86122..e106fc32c4f 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-3.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-3.c
@@ -1,12 +1,6 @@ 
 /* { dg-additional-options "--param openacc-kernels=decompose" } */
 
-/* { dg-additional-options "-fchecking" }
-   { dg-ice TODO }
-   { dg-prune-output {D\.[0-9]+ = arr_0\.0 \+ k;} }
-   { dg-prune-output {during GIMPLE pass: lower} } */
-
 /* { dg-additional-options "-fcompare-debug" } -- w/o debug compiled first.
-   { dg-bogus {error: during '-fcompare-debug' recompilation} TODO { xfail *-*-* } 0 }
    { dg-bogus {error: [^\n\r]+: '-fcompare-debug' failure \(length\)} TODO { xfail *-*-* } 0 } */
 /* { dg-additional-options "-O1" } so that we may get some 'GIMPLE_DEBUG's.  */
 
@@ -35,11 +29,10 @@  foo (void)
     /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
 #pragma acc loop /* { dg-line l_loop_k1 } */
     /* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k1 } */
-    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k1 } */
-    /* { dg-bogus {note: variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {w/ debug} { xfail *-*-* } l_loop_k1 } */
+    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {w/o debug} { target *-*-* } l_loop_k1 }
+       { dg-bogus {note: variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {w/ debug} { xfail *-*-* } l_loop_k1 } */
     /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_k1 } */
     for (k = 0; k < 2; k++)
       arr_0 += k;
-      /* { dg-bogus {error: invalid operands in binary operation} {w/ debug} { xfail *-*-* } .-1 } */
   }
 }
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-4.c b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-4.c
index cde95a7b7ac..bedbb0a30eb 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-4.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104061-1-4.c
@@ -1,11 +1,7 @@ 
 /* { dg-additional-options "--param openacc-kernels=decompose" } */
 
-/* { dg-additional-options "-fchecking" }
-   { dg-ice TODO }
-   { dg-prune-output {D\.[0-9]+ = arr_0\.0 \+ k;} }
-   { dg-prune-output {during GIMPLE pass: lower} } */
-
-/* { dg-additional-options "-g -fcompare-debug" } -- w/ debug compiled first.  */
+/* { dg-additional-options "-g -fcompare-debug" } -- w/ debug compiled first.
+   { dg-bogus {error: [^\n\r]+: '-fcompare-debug' failure \(length\)} TODO { xfail *-*-* } 0 } */
 /* { dg-additional-options "-O1" } so that we may get some 'GIMPLE_DEBUG's.  */
 
 /* { dg-additional-options "-fopt-info-all-omp" } */
@@ -32,12 +28,11 @@  foo (void)
 
     /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
 #pragma acc loop /* { dg-line l_loop_k1 } */
-    /* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { xfail *-*-* } l_loop_k1 } */
-    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {} { xfail *-*-* } l_loop_k1 } */
-    /* { dg-bogus {note: variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {w/ debug} { xfail *-*-* } l_loop_k1 } */
-    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { xfail *-*-* } l_loop_k1 } */
+    /* { dg-note {variable 'k' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_loop_k1 } */
+    /* { dg-note {variable 'k' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} {w/o debug} { target *-*-* } l_loop_k1 }
+       { dg-bogus {note: variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {w/ debug} { xfail *-*-* } l_loop_k1 } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_k1 } */
     for (k = 0; k < 2; k++)
       arr_0 += k;
-      /* { dg-bogus {error: invalid operands in binary operation} {w/ debug} { xfail *-*-* } .-1 } */
   }
 }
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104132-1.c b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104132-1.c
index 4f38a83bb19..42ec4418e40 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104132-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104132-1.c
@@ -1,11 +1,5 @@ 
 /* { dg-additional-options "--param openacc-kernels=decompose" } */
 
-/* { dg-additional-options "-fchecking" }
-   { dg-ice TODO }
-   { dg-prune-output {k = 0 \+ \.offset\.[0-9]+;} }
-   { dg-prune-output {k = 0 \+ 2;} }
-   { dg-prune-output {during IPA pass: \*free_lang_data} } */
-
 /* { dg-additional-options "-fopt-info-all-omp" } */
 
 /* { dg-additional-options "--param=openacc-privatization=noisy" }
@@ -27,14 +21,15 @@  foo (void)
     /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
 #pragma acc loop /* { dg-line l_loop_k1 } */
     /* { dg-note {variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_k1 } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_k1 } */
     for (k = 0; k < 2; k++)
       arr_0 = k;
 
     /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
 #pragma acc loop /* { dg-line l_loop_k2 } */
     /* { dg-note {variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_k2 } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_k2 } */
     for (k = 0; k < 2; k++)
       arr_0 = k;
   }
 }
-/* { dg-bogus {error: non-register as LHS of binary operation} {} { xfail *-*-* } .-1 } */
diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104133-1.c b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104133-1.c
index 0499665777d..47ea2b92959 100644
--- a/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104133-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/kernels-decompose-pr104133-1.c
@@ -1,11 +1,5 @@ 
 /* { dg-additional-options "--param openacc-kernels=decompose" } */
 
-/* { dg-additional-options "-fchecking" }
-   { dg-ice TODO }
-   { dg-prune-output {D\.[0-9]+ = arr_0\.0 \+ k;} }
-   { dg-prune-output {D\.[0-9]+ = arr_0\.1 \+ k;} }
-   { dg-prune-output {during GIMPLE pass: lower} } */
-
 /* { dg-additional-options "-fopt-info-all-omp" } */
 
 /* { dg-additional-options "--param=openacc-privatization=noisy" }
@@ -29,14 +23,15 @@  foo (void)
     /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
 #pragma acc loop /* { dg-line l_loop_k1 } */
     /* { dg-note {variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_k1 } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_k1 } */
     for (k = 0; k < 2; k++)
       arr_0 += k;
 
     /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
 #pragma acc loop /* { dg-line l_loop_k2 } */
     /* { dg-note {variable 'k' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_k2 } */
+    /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_k2 } */
     for (k = 0; k < 2; k++)
       arr_0 += k;
-      /* { dg-bogus {error: invalid operands in binary operation} {} { xfail *-*-* } .-1 } */
   }
 }
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c
index 85c39871f94..049b3a44b03 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-decompose-1.c
@@ -7,19 +7,23 @@ 
 
 /* { dg-additional-options "--param=openacc-privatization=noisy" }
    { dg-additional-options "-foffload=--param=openacc-privatization=noisy" }
-   for testing/documenting aspects of that functionality.  */
+   Prune a few: uninteresting, and potentially varying depending on GCC configuration (data types):
+   { dg-prune-output {note: variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} } */
 
 /* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
    passed to 'incr' may be unset, and in that case, it will be set to [...]",
    so to maintain compatibility with earlier Tcl releases, we manually
    initialize counter variables:
-   { dg-line l_dummy[variable c_compute 0 c_loop_i 0] }
+   { dg-line l_dummy[variable c_compute 0 c_loop_c 0 c_loop_i 0] }
    { dg-message "dummy" "" { target iN-VAl-Id } l_dummy } to avoid
    "WARNING: dg-line var l_dummy defined, but not used".  */
 
 #undef NDEBUG
 #include <assert.h>
 
+static int g1;
+static int g2;
+
 int main()
 {
   int a = 0;
@@ -27,8 +31,12 @@  int main()
   (volatile int *) &a;
 #define N 123
   int b[N] = { 0 };
+  unsigned long long f1;
+  /*TODO See above.  */
+  (volatile void *) &f1;
 
 #pragma acc kernels /* { dg-line l_compute[incr c_compute] } */
+  /* { dg-note {variable 'g2\.0' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} {} { target *-*-* } l_compute$c_compute } */
   {
     /* { dg-note {beginning 'gang-single' part in OpenACC 'kernels' region} {} { target *-*-* } .+1 } */
     int c = 234;
@@ -46,11 +54,57 @@  int main()
 
     /* { dg-note {beginning 'gang-single' part in OpenACC 'kernels' region} {} { target *-*-* } .+1 } */
     a = c;
+
+    /* PR104132, PR104133 */
+    {
+      /* Use the 'kernels'-top-level 'int c' as loop variable.  */
+
+      /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
+#pragma acc loop /* { dg-line l_loop_c[incr c_loop_c] } */
+      /* { dg-note {variable 'c' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_c$c_loop_c } */
+      /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_c$c_loop_c } */
+      for (c = 0; c < N / 2; c++)
+	b[c] -= 10;
+
+      /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
+#pragma acc loop /* { dg-line l_loop_c[incr c_loop_c] } */
+      /* { dg-note {variable 'c' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_c$c_loop_c } */
+      /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_c$c_loop_c } */
+      for (c = 0; c < N / 2; c++)
+	g1 = c;
+
+      /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
+#pragma acc loop /* { dg-line l_loop_c[incr c_loop_c] } */
+      /* { dg-note {variable 'c' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_c$c_loop_c } */
+      /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_c$c_loop_c } */
+      for (c = 0; c <= N; c++)
+	g2 += c;
+
+    /* { dg-note {beginning 'gang-single' part in OpenACC 'kernels' region} {} { target *-*-* } .+1 } */
+      f1 = 1;
+      /* { dg-note {forwarded loop nest in OpenACC 'kernels' region to 'parloops' for analysis} {} { target *-*-* } .+1 } */
+#pragma acc loop /* { dg-line l_loop_c[incr c_loop_c] } */
+      /* { dg-note {variable 'c' in 'private' clause is candidate for adjusting OpenACC privatization level} {} { target *-*-* } l_loop_c$c_loop_c } */
+      /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_loop_c$c_loop_c } */
+      for (c = 20; c > 0; --c)
+	f1 *= c;
+
+      /* { dg-note {beginning 'parloops' part in OpenACC 'kernels' region} {} { target *-*-* } .+1 } */
+      if (c != 234)
+	__builtin_abort ();
+      /* { dg-optimized {assigned OpenACC seq loop parallelism} {} { target *-*-* } l_compute$c_compute } */
+    }
   }
 
-  for (int i = 0; i < N; ++i)
-    assert (b[i] == 234);
   assert (a == 234);
+  for (int i = 0; i < N; ++i)
+    if (i < N / 2)
+      assert (b[i] == 234 - 10);
+    else
+      assert (b[i] == 234);
+  assert (g1 == N / 2 - 1);
+  assert (g2 == N * (N + 1) / 2);
+  assert (f1 == 2432902008176640000ULL);
 
   return 0;
 }
-- 
2.34.1