Message ID | 9eb00ba1-e01e-3b47-7ac5-8092021d1114@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | rs6000: Retry tbegin. instructions that can fail intermittently | expand |
Hi! On Tue, Feb 15, 2022 at 01:03:09PM -0600, Peter Bergner wrote: > The HTM tbegin. instruction can fail intermittently due to many reasons. Just a few really. But, if the transaction fails it will appear as if the tbegin. had an error as well, is that what you are seeing? > This can lead to the htm-1.c testsuite test case FAILing from time to time. > The solution is to allow retrying the instruction a few times before aborting. That is the way any HTM code should be written in the first place (except for rollback-only transactions, but let's not go there -- besides, it is normal for those to fail as well, and there needs to be a fallback there as well :-) ) The patch is fine. Okay for trunk and backports (after soak time ofc). Thanks! Segher > gcc/testsuite/ > * gcc.target/powerpc/htm-1.c: Retry intermittent failing tbegins.
On 2/15/22 3:54 PM, Segher Boessenkool wrote: > On Tue, Feb 15, 2022 at 01:03:09PM -0600, Peter Bergner wrote: >> The HTM tbegin. instruction can fail intermittently due to many reasons. > > Just a few really. But, if the transaction fails it will appear as if > the tbegin. had an error as well, is that what you are seeing? Yeah, exactly. The transaction itself fails, which rolls us back to the tbegin. with an error return value so we fall into the else/error clause. > That is the way any HTM code should be written in the first place > (except for rollback-only transactions, but let's not go there -- > besides, it is normal for those to fail as well, and there needs to be a > fallback there as well :-) ) Agreed and I'm not sure why I didn't write it that way to begin with. Maybe I thought it was so simple that the likelihood of it failing was so small we'd never see it? Anyway, we do now, so... > The patch is fine. Okay for trunk and backports (after soak time ofc). > Thanks! Thanks, pushed. Peter
On Tue, Feb 15, 2022 at 04:59:45PM -0600, Peter Bergner wrote: > > That is the way any HTM code should be written in the first place > > (except for rollback-only transactions, but let's not go there -- > > besides, it is normal for those to fail as well, and there needs to be a > > fallback there as well :-) ) > > Agreed and I'm not sure why I didn't write it that way to begin with. > Maybe I thought it was so simple that the likelihood of it failing was > so small we'd never see it? Anyway, we do now, so... Yeah, and you perhaps were misled by not seeing it fail in any testing (it fails only .02% of the time you said). For that reason it helps to make testcases fail *more* often. That isn't very trivial to do with HTM of course. Since we don't do HTM anymore it will all fade away, and let's not bother, why am I typing still :-) Segher
diff --git a/gcc/testsuite/gcc.target/powerpc/htm-1.c b/gcc/testsuite/gcc.target/powerpc/htm-1.c index f27e32ca281..399a7ec8812 100644 --- a/gcc/testsuite/gcc.target/powerpc/htm-1.c +++ b/gcc/testsuite/gcc.target/powerpc/htm-1.c @@ -12,14 +12,21 @@ main (void) { long i; unsigned long mask = 0; + unsigned long retry_count = 0; repeat: if (__builtin_tbegin (0)) { mask++; + retry_count = 0; } else - abort(); + { + /* Retry a limited number of times before aborting. */ + if (retry_count++ < 10) + goto repeat; + abort (); + } if (mask == 1) {