diff mbox series

[libgomp] Fix chunk_size<1 for dynamic schedule

Message ID 13568991-7359-9149-04fa-cde2245f108c@codesourcery.com
State New
Headers show
Series [libgomp] Fix chunk_size<1 for dynamic schedule | expand

Commit Message

Chung-Lin Tang June 23, 2022, 3:47 p.m. UTC
Hi Jakub,
with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:

(1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
(2) chunk_size == 0:  infinite loop

The (2) behavior is obviously not desired. This patch fixes this by changing
the chunk_size initialization in gomp_loop_init to "max(1,chunk_size)"

The OMP_SCHEDULE parsing in libgomp/env.c has also been adjusted to reject
negative values.

Tested without regressions, and a new testcase for the infinite loop behavior added.
Okay for trunk?

Thanks,
Chung-Lin

libgomp/ChangeLog:
	* env.c (parse_schedule): Make negative values invalid for chunk_size.
	* loop.c (gomp_loop_init): For non-STATIC schedule and chunk_size <= 0,
	set initialized chunk_size to 1.

	* testsuite/libgomp.c/loop-28.c: New test.

Comments

Jakub Jelinek June 28, 2022, 2:06 p.m. UTC | #1
On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
> with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
> 
> (1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
> (2) chunk_size == 0:  infinite loop
> 
> The (2) behavior is obviously not desired. This patch fixes this by changing

Why?  It is a user error, undefined behavior, we shouldn't slow down valid
code for users who don't bother reading the standard.

E.g. OpenMP 5.1 [132:14] says clearly:
"chunk_size must be a loop invariant integer expression with a positive
value."
and omp_set_schedule for chunk_size < 1 should use a default value (which it
does).

For OMP_SCHEDULE the standard says it is implementation-defined what happens
if the format isn't the specified one, so I guess the env.c change
could be acceptable (though without it it is fine too), but the
loop.c change is wrong.  Note, if the loop.c change would be ok, you'd
need to also change loop_ull.c too.

	Jakub
Jakub Jelinek June 28, 2022, 3:07 p.m. UTC | #2
On Tue, Jun 28, 2022 at 04:06:14PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
> > with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
> > 
> > (1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
> > (2) chunk_size == 0:  infinite loop
> > 
> > The (2) behavior is obviously not desired. This patch fixes this by changing
> 
> Why?  It is a user error, undefined behavior, we shouldn't slow down valid
> code for users who don't bother reading the standard.
> 
> E.g. OpenMP 5.1 [132:14] says clearly:
> "chunk_size must be a loop invariant integer expression with a positive
> value."
> and omp_set_schedule for chunk_size < 1 should use a default value (which it
> does).
> 
> For OMP_SCHEDULE the standard says it is implementation-defined what happens
> if the format isn't the specified one, so I guess the env.c change
> could be acceptable (though without it it is fine too), but the

Though, seems we quietly transform the only problematic value (0) in there
to 1 for selected schedules which don't accept 0 as "unspecified" and for
the negative values, we'll have large ulong chunk sizes which is fine.

If we really want help people debugging their programs, we could introduce
something like -fsanitize=openmp that would add runtime instrumentation of a
lot of OpenMP restrictions and could diagnose it with nice diagnostics,
perhaps using some extra library and with runtime checks in generated code.

	Jakub
Chung-Lin Tang Aug. 4, 2022, 1:17 p.m. UTC | #3
On 2022/6/28 10:06 PM, Jakub Jelinek wrote:
> On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
>> with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
>>
>> (1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
>> (2) chunk_size == 0:  infinite loop
>>
>> The (2) behavior is obviously not desired. This patch fixes this by changing
> 
> Why?  It is a user error, undefined behavior, we shouldn't slow down valid
> code for users who don't bother reading the standard.

This is loop init code, not per-iteration. The overhead really isn't that much.

The question should be, if GCC having infinite loop behavior is reasonable,
even if it is undefined in the spec.

> E.g. OpenMP 5.1 [132:14] says clearly:
> "chunk_size must be a loop invariant integer expression with a positive
> value."
> and omp_set_schedule for chunk_size < 1 should use a default value (which it
> does).
> 
> For OMP_SCHEDULE the standard says it is implementation-defined what happens
> if the format isn't the specified one, so I guess the env.c change
> could be acceptable (though without it it is fine too), but the
> loop.c change is wrong.  Note, if the loop.c change would be ok, you'd
> need to also change loop_ull.c too.

I've updated the patch to add the same changes for libgomp/loop_ull.c and updated
the testcase too. Tested on mainline trunk without regressions.

Thanks,
Chung-Lin

libgomp/ChangeLog:

	* env.c (parse_schedule): Make negative values invalid for chunk_size.
	* loop.c (gomp_loop_init): For non-STATIC schedule and chunk_size <= 0,
	set initialized chunk_size to 1.
	* loop_ull.c (gomp_loop_ull_init): Likewise.

	* testsuite/libgomp.c/loop-28.c: New test.
diff --git a/libgomp/env.c b/libgomp/env.c
index 1c4ee894515..dff07617e15 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -182,6 +182,8 @@ parse_schedule (void)
     goto invalid;
 
   errno = 0;
+  if (*env == '-')
+    goto invalid;
   value = strtoul (env, &end, 10);
   if (errno || end == env)
     goto invalid;
diff --git a/libgomp/loop.c b/libgomp/loop.c
index be85162bb1e..018b4e9a8bd 100644
--- a/libgomp/loop.c
+++ b/libgomp/loop.c
@@ -41,7 +41,7 @@ gomp_loop_init (struct gomp_work_share *ws, long start, long end, long incr,
 		enum gomp_schedule_type sched, long chunk_size)
 {
   ws->sched = sched;
-  ws->chunk_size = chunk_size;
+  ws->chunk_size = (sched == GFS_STATIC || chunk_size > 1) ? chunk_size : 1;
   /* Canonicalize loops that have zero iterations to ->next == ->end.  */
   ws->end = ((incr > 0 && start > end) || (incr < 0 && start < end))
 	    ? start : end;
diff --git a/libgomp/loop_ull.c b/libgomp/loop_ull.c
index 602737296d4..74ddb1bd623 100644
--- a/libgomp/loop_ull.c
+++ b/libgomp/loop_ull.c
@@ -43,7 +43,7 @@ gomp_loop_ull_init (struct gomp_work_share *ws, bool up, gomp_ull start,
 		    gomp_ull chunk_size)
 {
   ws->sched = sched;
-  ws->chunk_size_ull = chunk_size;
+  ws->chunk_size_ull = (sched == GFS_STATIC || chunk_size > 1) ? chunk_size : 1;
   /* Canonicalize loops that have zero iterations to ->next == ->end.  */
   ws->end_ull = ((up && start > end) || (!up && start < end))
 		? start : end;
diff --git a/libgomp/testsuite/libgomp.c/loop-28.c b/libgomp/testsuite/libgomp.c/loop-28.c
new file mode 100644
index 00000000000..664842e27aa
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/loop-28.c
@@ -0,0 +1,21 @@
+/* { dg-do run } */
+/* { dg-timeout 10 } */
+
+void __attribute__((noinline))
+foo (int a[], int n, int chunk_size)
+{
+  #pragma omp parallel for schedule (dynamic,chunk_size)
+  for (int i = 0; i < n; i++)
+    a[i] = i;
+
+  #pragma omp parallel for schedule (dynamic,chunk_size)
+  for (unsigned long long i = 0; i < n; i++)
+    a[i] = i;
+}
+
+int main (void)
+{
+  int a[100];
+  foo (a, 100, 0);
+  return 0;
+}
Li, Pan2 via Gcc-patches Aug. 4, 2022, 1:31 p.m. UTC | #4
> On Aug 4, 2022, at 9:17 AM, Chung-Lin Tang <cltang@codesourcery.com> wrote:
> 
> On 2022/6/28 10:06 PM, Jakub Jelinek wrote:
>> On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
>>> with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
>>> 
>>> (1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
>>> (2) chunk_size == 0:  infinite loop
>>> 
>>> The (2) behavior is obviously not desired. This patch fixes this by changing
>> Why?  It is a user error, undefined behavior, we shouldn't slow down valid
>> code for users who don't bother reading the standard.
> 
> This is loop init code, not per-iteration. The overhead really isn't that much.
> 
> The question should be, if GCC having infinite loop behavior is reasonable,
> even if it is undefined in the spec.

I wouldn't think so.  The way I see "undefined code" is that you can't complain about "wrong code" produced by the compiler.  But for the compiler to malfunction on wrong input is an entirely differerent matter.  For one thing, it's hard to fix your code if the compiler fails.  How would you locate the offending source line?

	paul
Chung-Lin Tang Aug. 26, 2022, 8:15 a.m. UTC | #5
On 2022/8/4 9:31 PM, Koning, Paul wrote:
> 
> 
>> On Aug 4, 2022, at 9:17 AM, Chung-Lin Tang <cltang@codesourcery.com> wrote:
>>
>> On 2022/6/28 10:06 PM, Jakub Jelinek wrote:
>>> On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
>>>> with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
>>>>
>>>> (1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
>>>> (2) chunk_size == 0:  infinite loop
>>>>
>>>> The (2) behavior is obviously not desired. This patch fixes this by changing
>>> Why?  It is a user error, undefined behavior, we shouldn't slow down valid
>>> code for users who don't bother reading the standard.
>>
>> This is loop init code, not per-iteration. The overhead really isn't that much.
>>
>> The question should be, if GCC having infinite loop behavior is reasonable,
>> even if it is undefined in the spec.
> 
> I wouldn't think so.  The way I see "undefined code" is that you can't complain about "wrong code" produced by the compiler.  But for the compiler to malfunction on wrong input is an entirely differerent matter.  For one thing, it's hard to fix your code if the compiler fails.  How would you locate the offending source line?
> 
> 	paul

Ping?
Chung-Lin Tang Sept. 9, 2022, 10:08 a.m. UTC | #6
On 2022/8/26 4:15 PM, Chung-Lin Tang wrote:
> On 2022/8/4 9:31 PM, Koning, Paul wrote:
>>
>>
>>> On Aug 4, 2022, at 9:17 AM, Chung-Lin Tang <cltang@codesourcery.com> wrote:
>>>
>>> On 2022/6/28 10:06 PM, Jakub Jelinek wrote:
>>>> On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
>>>>> with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
>>>>>
>>>>> (1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
>>>>> (2) chunk_size == 0:  infinite loop
>>>>>
>>>>> The (2) behavior is obviously not desired. This patch fixes this by changing
>>>> Why?  It is a user error, undefined behavior, we shouldn't slow down valid
>>>> code for users who don't bother reading the standard.
>>>
>>> This is loop init code, not per-iteration. The overhead really isn't that much.
>>>
>>> The question should be, if GCC having infinite loop behavior is reasonable,
>>> even if it is undefined in the spec.
>>
>> I wouldn't think so.  The way I see "undefined code" is that you can't complain about "wrong code" produced by the compiler.  But for the compiler to malfunction on wrong input is an entirely differerent matter.  For one thing, it's hard to fix your code if the compiler fails.  How would you locate the offending source line?
>>
>> 	paul
> 
> Ping?

Ping x2.
Jakub Jelinek Sept. 13, 2022, 2:07 p.m. UTC | #7
On Thu, Aug 04, 2022 at 09:17:09PM +0800, Chung-Lin Tang wrote:
> On 2022/6/28 10:06 PM, Jakub Jelinek wrote:
> > On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote:
> > > with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next:
> > > 
> > > (1) chunk_size <= -1: wraps into large unsigned value, seems to work though.
> > > (2) chunk_size == 0:  infinite loop
> > > 
> > > The (2) behavior is obviously not desired. This patch fixes this by changing
> > 
> > Why?  It is a user error, undefined behavior, we shouldn't slow down valid
> > code for users who don't bother reading the standard.
> 
> This is loop init code, not per-iteration. The overhead really isn't that much.

But still, we slow down valid code for the sake of garbage.
That is a task for sanitizers, not production libraries.
> 
> The question should be, if GCC having infinite loop behavior is reasonable,
> even if it is undefined in the spec.

Sure, it is perfectly valid implementation of the undefined behavior.
UB can leads to tons of surprising behavior, hangs etc. and this is exactly
the same category.

On Thu, Aug 04, 2022 at 01:31:50PM +0000, Koning, Paul via Gcc-patches wrote:
> I wouldn't think so.  The way I see "undefined code" is that you can't complain
> about "wrong code" produced by the compiler.  But for the compiler to malfunction
> on wrong input is an entirely differerent matter.  For one thing, it's hard to fix
> your code if the compiler fails.  How would you locate the offending source line?

The compiler isn't malfunctioning here.  It is similar to calling a library
function with bogus arguments, say memcpy with NULL source or destination or
some invalid pointer not pointing anywhere valid, etc.
The spec clearly says zero or negative chunk size is not valid, you use it,
you get what you ask for.  Furthermore, it is easy to find out when it hangs
on which construct it is and check if that construct is valid.

I'm strongly against slowing valid code for this.

If you want to implement -fsanitize=openmp and either in that case perform
checks on the generated code side or link with an instrumented version of
libgomp that explains users what errors they do, that is fine.

	Jakub
diff mbox series

Patch

diff --git a/libgomp/env.c b/libgomp/env.c
index 1c4ee894515..dff07617e15 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -182,6 +182,8 @@  parse_schedule (void)
     goto invalid;
 
   errno = 0;
+  if (*env == '-')
+    goto invalid;
   value = strtoul (env, &end, 10);
   if (errno || end == env)
     goto invalid;
diff --git a/libgomp/loop.c b/libgomp/loop.c
index be85162bb1e..018b4e9a8bd 100644
--- a/libgomp/loop.c
+++ b/libgomp/loop.c
@@ -41,7 +41,7 @@  gomp_loop_init (struct gomp_work_share *ws, long start, long end, long incr,
 		enum gomp_schedule_type sched, long chunk_size)
 {
   ws->sched = sched;
-  ws->chunk_size = chunk_size;
+  ws->chunk_size = (sched == GFS_STATIC || chunk_size > 1) ? chunk_size : 1;
   /* Canonicalize loops that have zero iterations to ->next == ->end.  */
   ws->end = ((incr > 0 && start > end) || (incr < 0 && start < end))
 	    ? start : end;
diff --git a/libgomp/testsuite/libgomp.c/loop-28.c b/libgomp/testsuite/libgomp.c/loop-28.c
new file mode 100644
index 00000000000..e3f852046f4
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/loop-28.c
@@ -0,0 +1,17 @@ 
+/* { dg-do run } */
+/* { dg-timeout 10 } */
+
+void __attribute__((noinline))
+foo (int a[], int n, int chunk_size)
+{
+  #pragma omp parallel for schedule (dynamic,chunk_size)
+  for (int i = 0; i < n; i++)
+    a[i] = i;
+}
+
+int main (void)
+{
+  int a[100];
+  foo (a, 100, 0);
+  return 0;
+}