diff mbox

[RFC,PR68580] Handle pthread_create error in tsan testsuite

Message ID HE1PR07MB0905578859D3EC1C8AFFD5CFE4AC0@HE1PR07MB0905.eurprd07.prod.outlook.com
State New
Headers show

Commit Message

Bernd Edlinger Feb. 15, 2016, 11:29 a.m. UTC
On 15/02/16 08:18, Dmitry Vyukov wrote: 
> llvm tsan tests contain test.h file (probably what's called
> test_barrier.h in gcc), you can put the macro there. test.h should
> already be included into all tests.

Hmm.. as the person who introduced test_barrer.h (well before llvm had a test.h ;)
I must say, that although if gcc was first here, we will  probably change that to
match llvm's implementation for gcc-7.

I would not like to add more differences here without a very good reason.
I'd say, if Dmitry sees a reason to improve the error handling in test.h, that
is a good thing, and should go into llvm's tree first.

And independently of that I am looking at using llvm's test.h framework instead
of gcc's test_barrier.h for gcc-7 soon.

On 15/02/16 11:56, Tom de Vries wrote:
> On Mon, Feb 15, 2016 at 11:45 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>
>> I've tried to be as clear as possible in the RFC submission that I'm not
>> certain about the cause of the failure, and that the patch is proposing a
>> fix that would make that guessed failure cause explicit.
>>
>>> Sure pthread_create can fail, as malloc and mmap.
>>> But if that is the reason for the failure it would happen
>>> just randomly, everywhere.
>>>
>>> Why do you think that only this test case shows the problem?
>>>
>>
>> As I explained in the RFC submission, my reasoning there was that the test
>> is one of the very few test cases that tests the result of pthread_create
>> and then returns 0, which causes the failure in combination with
>> dg-shouldfail.
>>
>> But thinking about it some more, even if pthread_create would fail, causing
>> the testcase to fail in execution, allowing the execution test to pass due
>> to dg-shouldfail, presumably the dg-output test would still fail in that
>> case, so my reasoning was not sound.
>>
>> So I suppose you're right, indeed the pthread_create fail hypothesis is not
>> the most logical one.
>>
>> Still, the patch is an improvement irrespective of the PR that inspired it,
>> and perhaps a lot more library calls should be checked for errors that just
>> pthread_create.
>>
>>> I think Dmitry's comment may be right on the point.
>>
>>
>> If someone proposes that as a patch for the testcase, great. I'm more that
>> willing to test that in my setup to be able to claim 'bootstrapped and
>> reg-tested on x86_64' in the submission.
>>

No problem.  PR65400 was a GCC wrong code bug, so it makes no
sense to have the same test in llvm's tree, thus we are free to fix it on
our own, as we like.

Here is a patch that puts each value on it's own 8-byte aligned memory
location.  From my experience with tsan tests, sharing shadow memory
slots between v and q or o is the most likely explanation for the occasional
inability to spot the race condition on v, thus the test case fails, because
the return code is 0, and the expected output is not found.


Boot-strapped/regression tested on x86_64-linux-gnu.

OK for trunk?


Thanks
Bernd.

Comments

Dmitry Vyukov Feb. 15, 2016, 12:05 p.m. UTC | #1
On Mon, Feb 15, 2016 at 12:29 PM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> On 15/02/16 08:18, Dmitry Vyukov wrote:
>> llvm tsan tests contain test.h file (probably what's called
>> test_barrier.h in gcc), you can put the macro there. test.h should
>> already be included into all tests.
>
> Hmm.. as the person who introduced test_barrer.h (well before llvm had a test.h ;)
> I must say, that although if gcc was first here, we will  probably change that to
> match llvm's implementation for gcc-7.
>
> I would not like to add more differences here without a very good reason.
> I'd say, if Dmitry sees a reason to improve the error handling in test.h, that
> is a good thing, and should go into llvm's tree first.
>
> And independently of that I am looking at using llvm's test.h framework instead
> of gcc's test_barrier.h for gcc-7 soon.
>
> On 15/02/16 11:56, Tom de Vries wrote:
>> On Mon, Feb 15, 2016 at 11:45 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>>
>>> I've tried to be as clear as possible in the RFC submission that I'm not
>>> certain about the cause of the failure, and that the patch is proposing a
>>> fix that would make that guessed failure cause explicit.
>>>
>>>> Sure pthread_create can fail, as malloc and mmap.
>>>> But if that is the reason for the failure it would happen
>>>> just randomly, everywhere.
>>>>
>>>> Why do you think that only this test case shows the problem?
>>>>
>>>
>>> As I explained in the RFC submission, my reasoning there was that the test
>>> is one of the very few test cases that tests the result of pthread_create
>>> and then returns 0, which causes the failure in combination with
>>> dg-shouldfail.
>>>
>>> But thinking about it some more, even if pthread_create would fail, causing
>>> the testcase to fail in execution, allowing the execution test to pass due
>>> to dg-shouldfail, presumably the dg-output test would still fail in that
>>> case, so my reasoning was not sound.
>>>
>>> So I suppose you're right, indeed the pthread_create fail hypothesis is not
>>> the most logical one.
>>>
>>> Still, the patch is an improvement irrespective of the PR that inspired it,
>>> and perhaps a lot more library calls should be checked for errors that just
>>> pthread_create.
>>>
>>>> I think Dmitry's comment may be right on the point.
>>>
>>>
>>> If someone proposes that as a patch for the testcase, great. I'm more that
>>> willing to test that in my setup to be able to claim 'bootstrapped and
>>> reg-tested on x86_64' in the submission.
>>>
>
> No problem.  PR65400 was a GCC wrong code bug, so it makes no
> sense to have the same test in llvm's tree, thus we are free to fix it on
> our own, as we like.
>
> Here is a patch that puts each value on it's own 8-byte aligned memory
> location.  From my experience with tsan tests, sharing shadow memory
> slots between v and q or o is the most likely explanation for the occasional
> inability to spot the race condition on v, thus the test case fails, because
> the return code is 0, and the expected output is not found.
>
>
> Boot-strapped/regression tested on x86_64-linux-gnu.
>
> OK for trunk?


I don't know whether it will fire or not, but 4-byte variables that
are 8-byte aligned can still be collocated with something else. Making
vars 8-byte should be safer.
Bernd Edlinger Feb. 15, 2016, 12:44 p.m. UTC | #2
On 15/02/16 13:05, Dmitry Vyukov wrote:
> On Mon, Feb 15, 2016 at 12:29 PM, Bernd Edlinger
> <bernd.edlinger@hotmail.de> wrote:
>>
>> No problem.  PR65400 was a GCC wrong code bug, so it makes no
>> sense to have the same test in llvm's tree, thus we are free to fix it on
>> our own, as we like.
>>
>> Here is a patch that puts each value on it's own 8-byte aligned memory
>> location.  From my experience with tsan tests, sharing shadow memory
>> slots between v and q or o is the most likely explanation for the occasional
>> inability to spot the race condition on v, thus the test case fails, because
>> the return code is 0, and the expected output is not found.
>>
>>
>> Boot-strapped/regression tested on x86_64-linux-gnu.
>>
>> OK for trunk?
>
>
> I don't know whether it will fire or not, but 4-byte variables that
> are 8-byte aligned can still be collocated with something else. Making
> vars 8-byte should be safer.

Yes, but as PR65400 is a wrong code bug, I would not like to change more
than absolutely necessary to the test case, in order not to loose the ability
to check the original regression.

The test case does not have more than the 3 global values that you see.
nm a.out | sort shows that these are now separate:

0000000000601400 b barrier
0000000000601420 b barrier_wait
0000000000601428 B v
0000000000601430 B o
0000000000601438 B q
0000000000601440 B _end



Regards,
Bernd.
Dmitry Vyukov Feb. 15, 2016, 12:45 p.m. UTC | #3
On Mon, Feb 15, 2016 at 1:44 PM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> On 15/02/16 13:05, Dmitry Vyukov wrote:
>> On Mon, Feb 15, 2016 at 12:29 PM, Bernd Edlinger
>> <bernd.edlinger@hotmail.de> wrote:
>>>
>>> No problem.  PR65400 was a GCC wrong code bug, so it makes no
>>> sense to have the same test in llvm's tree, thus we are free to fix it on
>>> our own, as we like.
>>>
>>> Here is a patch that puts each value on it's own 8-byte aligned memory
>>> location.  From my experience with tsan tests, sharing shadow memory
>>> slots between v and q or o is the most likely explanation for the occasional
>>> inability to spot the race condition on v, thus the test case fails, because
>>> the return code is 0, and the expected output is not found.
>>>
>>>
>>> Boot-strapped/regression tested on x86_64-linux-gnu.
>>>
>>> OK for trunk?
>>
>>
>> I don't know whether it will fire or not, but 4-byte variables that
>> are 8-byte aligned can still be collocated with something else. Making
>> vars 8-byte should be safer.
>
> Yes, but as PR65400 is a wrong code bug, I would not like to change more
> than absolutely necessary to the test case, in order not to loose the ability
> to check the original regression.
>
> The test case does not have more than the 3 global values that you see.
> nm a.out | sort shows that these are now separate:
>
> 0000000000601400 b barrier
> 0000000000601420 b barrier_wait
> 0000000000601428 B v
> 0000000000601430 B o
> 0000000000601438 B q
> 0000000000601440 B _end


Looks good to me then.
Mike Stump Feb. 15, 2016, 7:22 p.m. UTC | #4
On Feb 15, 2016, at 3:29 AM, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
> And independently of that I am looking at using llvm's test.h framework instead
> of gcc's test_barrier.h for gcc-7 soon.

Here’s to hoping that we don’t back slide on:

  https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00436.html

Did they ever adopt a reliable scheme to test?
Dmitry Vyukov Feb. 15, 2016, 7:34 p.m. UTC | #5
On Mon, Feb 15, 2016 at 8:22 PM, Mike Stump <mikestump@comcast.net> wrote:
> On Feb 15, 2016, at 3:29 AM, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
>> And independently of that I am looking at using llvm's test.h framework instead
>> of gcc's test_barrier.h for gcc-7 soon.
>
> Here’s to hoping that we don’t back slide on:
>
>   https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00436.html
>
> Did they ever adopt a reliable scheme to test?

Yes, they did.
Btw, the pthread_barrier solution did not work on macos as it does not
implement pthread_barrier and libpthread.so.0 does not exist. That was
replaced with spin loop with usleep(100). That caused spurious "as if
synchronized via sleep" messages that broke tests. That was replaced
with busy loop. That caused timeouts of weak test bots. That was
replaced with loop with sched_yield. That caused tsan trace overflows
and "failed to restore stack trace" error messages. And that was
replaced with special support in tsan runtime which seems to work in
all contexts so far...
Tom de Vries Feb. 18, 2016, 11:36 a.m. UTC | #6
On 15/02/16 12:29, Bernd Edlinger wrote:
> Here is a patch that puts each value on it's own 8-byte aligned memory
> location.  From my experience with tsan tests, sharing shadow memory
> slots between v and q or o is the most likely explanation for the occasional
> inability to spot the race condition on v, thus the test case fails, because
> the return code is 0, and the expected output is not found.
>
> Boot-strapped/regression tested on x86_64-linux-gnu.
>
> OK for trunk?
>

Hi,

Could you add 'PR testsuite/68580' to the log entry when committing?

Thanks,
- Tom

> patch-pr68580.diff
>
>
> 2016-02-15  Bernd Edlinger<bernd.edlinger@hotmail.de>
>
> 	* c-c++-common/tsan/pr65400-1.c (v, q, o): Make 8-byte aligned.
>
> --- gcc/testsuite/c-c++-common/tsan/pr65400-1.c.jj	2015-03-19 08:53:38.000000000 +0100
> +++ gcc/testsuite/c-c++-common/tsan/pr65400-1.c	2016-02-15 11:09:18.852320827 +0100
> @@ -7,9 +7,9 @@
>   #include "tsan_barrier.h"
>
>   static pthread_barrier_t barrier;
> -int v;
> -int q;
> -int o;
> +int v __attribute__((aligned(8)));
> +int q __attribute__((aligned(8)));
> +int o __attribute__((aligned(8)));
>   extern void baz4 (int *);
>
>   __attribute__((noinline, noclone)) int
Bernd Edlinger Feb. 18, 2016, 8:19 p.m. UTC | #7
On 18.02.2016 12:36, Tom de Vries wrote:
> On 15/02/16 12:29, Bernd Edlinger wrote:
>> Here is a patch that puts each value on it's own 8-byte aligned memory
>> location.  From my experience with tsan tests, sharing shadow memory
>> slots between v and q or o is the most likely explanation for the
>> occasional
>> inability to spot the race condition on v, thus the test case fails,
>> because
>> the return code is 0, and the expected output is not found.
>>
>> Boot-strapped/regression tested on x86_64-linux-gnu.
>>
>> OK for trunk?
>>
>
> Hi,
>
> Could you add 'PR testsuite/68580' to the log entry when committing?
>

Yes, of course, thanks.

Could someone take the time and review this patch?
I don't think it can cause any trouble for gcc-6 and/or gcc-5
even at stage 4.

Is it OK for trunk and gcc-5-branch?

Thanks
Bernd.

> Thanks,
> - Tom
>
>> patch-pr68580.diff
>>
>>
>> 2016-02-15  Bernd Edlinger<bernd.edlinger@hotmail.de>
>>
>>     * c-c++-common/tsan/pr65400-1.c (v, q, o): Make 8-byte aligned.
>>
>> --- gcc/testsuite/c-c++-common/tsan/pr65400-1.c.jj    2015-03-19
>> 08:53:38.000000000 +0100
>> +++ gcc/testsuite/c-c++-common/tsan/pr65400-1.c    2016-02-15
>> 11:09:18.852320827 +0100
>> @@ -7,9 +7,9 @@
>>   #include "tsan_barrier.h"
>>
>>   static pthread_barrier_t barrier;
>> -int v;
>> -int q;
>> -int o;
>> +int v __attribute__((aligned(8)));
>> +int q __attribute__((aligned(8)));
>> +int o __attribute__((aligned(8)));
>>   extern void baz4 (int *);
>>
>>   __attribute__((noinline, noclone)) int
>
Jakub Jelinek Feb. 18, 2016, 9:31 p.m. UTC | #8
On Thu, Feb 18, 2016 at 08:19:22PM +0000, Bernd Edlinger wrote:
> > Could you add 'PR testsuite/68580' to the log entry when committing?
> >
> 
> Yes, of course, thanks.
> 
> Could someone take the time and review this patch?
> I don't think it can cause any trouble for gcc-6 and/or gcc-5
> even at stage 4.
> 
> Is it OK for trunk and gcc-5-branch?

Ok.

> >> 2016-02-15  Bernd Edlinger<bernd.edlinger@hotmail.de>
> >>
> >>     * c-c++-common/tsan/pr65400-1.c (v, q, o): Make 8-byte aligned.
> >>
> >> --- gcc/testsuite/c-c++-common/tsan/pr65400-1.c.jj    2015-03-19
> >> 08:53:38.000000000 +0100
> >> +++ gcc/testsuite/c-c++-common/tsan/pr65400-1.c    2016-02-15
> >> 11:09:18.852320827 +0100
> >> @@ -7,9 +7,9 @@
> >>   #include "tsan_barrier.h"
> >>
> >>   static pthread_barrier_t barrier;
> >> -int v;
> >> -int q;
> >> -int o;
> >> +int v __attribute__((aligned(8)));
> >> +int q __attribute__((aligned(8)));
> >> +int o __attribute__((aligned(8)));
> >>   extern void baz4 (int *);
> >>
> >>   __attribute__((noinline, noclone)) int
> >

	Jakub
diff mbox

Patch

2016-02-15  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	* c-c++-common/tsan/pr65400-1.c (v, q, o): Make 8-byte aligned.

--- gcc/testsuite/c-c++-common/tsan/pr65400-1.c.jj	2015-03-19 08:53:38.000000000 +0100
+++ gcc/testsuite/c-c++-common/tsan/pr65400-1.c	2016-02-15 11:09:18.852320827 +0100
@@ -7,9 +7,9 @@ 
 #include "tsan_barrier.h"
 
 static pthread_barrier_t barrier;
-int v;
-int q;
-int o;
+int v __attribute__((aligned(8)));
+int q __attribute__((aligned(8)));
+int o __attribute__((aligned(8)));
 extern void baz4 (int *);
 
 __attribute__((noinline, noclone)) int