Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.))

Message ID 87y1taencs.fsf@dem-tschwing-1.ger.mentorg.com
State New
Series Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.))

Commit Message

Thomas Schwinge Oct. 20, 2022, 11:38 a.m. UTC
Hi!

On 2022-10-18T07:41:29+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
> On Mon, Oct 17, 2022 at 4:47 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
>> On 2022-10-17T15:58:47+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
>> > On Mon, Oct 17, 2022 at 9:44 AM Thomas Schwinge <thomas@codesourcery.com> wrote:
>> >> On 2022-10-11T10:31:37+0200, Aldy Hernandez via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>> >> > When solving 0 = _15 & 1, we calculate _15 as:
>> >> >
>> >> >       [irange] int [-INF, -2][0, +INF] NONZERO 0xfffffffe
>> >> >
>> >> > The known value of _15 is [0, 1] NONZERO 0x1 which is intersected with
>> >> > the above, yielding:
>> >> >
>> >> >       [0, 1] NONZERO 0x0
>> >> >
>> >> > This eventually gets copied to a _Bool [0, 1] NONZERO 0x0.
>> >> >
>> >> > This is problematic because here we have a bool which is zero, but
>> >> > returns false for irange::zero_p, since the latter does not look at
>> >> > nonzero bits.  This causes logical_combine to assume the range is
>> >> > not-zero, and all hell breaks loose.
>> >> >
>> >> > I think we should just normalize a nonzero mask of 0 to [0, 0] at
>> >> > creation, thus avoiding all this.
>> >>
>> >> 1. This commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
>> >> "[PR107195] Set range to zero when nonzero mask is 0" broke a GCC/nvptx
>> >> offloading test case:
>> >>
>> >>     UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
>> >>     PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
>> >>     PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
>> >>     [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}"
>> >>
>> >> Same for C++.
>> >>
>> >> I'll later send a patch (for the test case!) to fix that up.
>> >>
>> >> 2. Looking into this, I found that this
>> >> commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
>> >> "[PR107195] Set range to zero when nonzero mask is 0" actually enables a
>> >> code transformation/optimization that GCC apparently has not been doing
>> >> before!  I've tried to capture that in the attached
>> >> "Add 'c-c++-common/torture/pr107195-1.c' [PR107195]".
>> >
>> > Nice.
>> >
>> >> Will you please verify that one?  In its current '#if 1' configuration,
>> >> it's all-PASS after commit
>> >> r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
>> >> "[PR107195] Set range to zero when nonzero mask is 0", whereas before, we
>> >> get two calls to 'foo', because GCC apparently didn't understand the
>> >> relation (optimization opportunity) between 'r *= 2;' and the subsequent
>> >> 'if (r & 1)'.
>> >
>> > Yeah, that looks correct.  We keep better track of nonzero masks.
>>
>> OK, next observation: this also works for split-up expressions
>> 'if ((r & 2) && (r & 1))' (same rationale as for 'if (r & 1)' alone).
>> I've added such a variant in my test case.
>
> Unless I'm missing something, your testcase doesn't have a body for
> foo[123], so GCC has no way to know what any of those functions did or
> what bits are set/unset.

Ah, there seems to be some confusion about what's happening here.  :-)

First, these functions, 'foo[...]', are '__attribute__((const))', and
their argument, 'r', doesn't change if the first 'foo[...]' call returns
zero.  Thus, GCC can infer that the second 'foo[...]' call also must
return zero, and thus may elide that second function call.  Second,
should the first 'foo[...]' call return non-zero, 'r *= 2;' is executed,
and thus GCC can infer that 'if (r & 1)' can never hold, and thus the
'if' branch is not executed, and thus it may elide the second function
call for that scenario, too.  Thus, the second function is completely
elided.
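
To make the two cases above concrete, here is a minimal sketch.  The body
of 'foo1' is hypothetical (the test case only declares it), and
'f1_single_call' and 'count_mismatches' are names made up for this
illustration.  It only checks that a single-call version computes the
same result as 'f1'; it says nothing about which GCC pass does the work.

```c
/* Hypothetical body for 'foo1'; the test case only declares it.  Any
   function without side effects whose result depends only on its
   argument would do.  */
__attribute__((const))
int foo1 (int r)
{
  return r % 3 == 0;
}

/* As in the test case: two calls of 'foo1' in the source.  */
int f1 (int r)
{
  if (foo1 (r))
    r *= 2;

  if (r & 1)
    r += foo1 (r);

  return r;
}

/* Hand-written equivalent with a single call of 'foo1'.  */
int f1_single_call (int r)
{
  if (foo1 (r))
    return r * 2;  /* Lowest bit is clear, so 'r & 1' is false.  */

  /* Here 'foo1 (r) == 0' and 'r' is unchanged, so 'r += foo1 (r)'
     would add zero.  */
  return r;
}

/* Count inputs in [-1000, 1000] for which the two versions disagree.  */
int count_mismatches (void)
{
  int n = 0;
  for (int r = -1000; r <= 1000; r++)
    if (f1 (r) != f1_single_call (r))
      n++;
  return n;
}
```

Calling 'count_mismatches ()' is expected to return zero for any choice
of 'foo1' that really is 'const'; the small input range avoids signed
overflow in 'r *= 2'.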

The attached "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" demonstrates
that this does work for 'if (r & 1)' in 'f1', 'foo1', and also does work
for 'if ((r & 2) && (r & 1))' in 'f2', 'foo2', but:

>> But: it doesn't work for logically equal 'if (r & 3)'.

... in 'f3', 'foo3'.

I understand 'r & 3' to be logically equivalent to '(r & 2) && (r & 1)',
right?

>> I've added such
>> an XFAILed variant in my test case.  Do you have guidance what needs to
>> be done to make such cases work, too?

Thus my question, where/how GCC would learn this?


Otherwise:

>> >> I've left in the other '#if' variants in case you'd like to experiment
>> >> with these, but would otherwise clean that up before pushing.
>> >>
>> >> Where does one put such a test case?
>> >>
>> >> Should the file be named 'pr107195' or something else?
>> >
>> > The aforementioned patch already has:
>> >
>> >             * gcc.dg/tree-ssa/pr107195-1.c: New test.
>> >             * gcc.dg/tree-ssa/pr107195-2.c: New test.
>> >
>> > So I would just add a pr107195-3.c test.
>>
>> But note that unlike yours in 'gcc.dg/tree-ssa/', I had put mine into
>> 'c-c++-common/torture/'.  That's so that we get C and C++ testing, and
>> all torture testing flag variants.  (... where we do see the optimization
>> happen starting at '-O1'.)  Do you think that is excessive, and a single
>> 'gcc.dg/tree-ssa/' test case, C only, '-O1' only is sufficient for this?
>> (I don't have much experience with test cases in such regions of GCC,
>> hence these questions.)
>
> My personal preference is tree-ssa since they are middle end tests.
> Also, since we're testing ranger, it primarily runs in DOM, VRP, evrp,
> and the backward threader, so no need to run it at multiple
> optimization levels.
>
> I suggested DOM, because I know ranger runs within DOM, so if the
> transformation is seen at -O1, it's likely to be done there.  Also,
> evrp/VRP don't run at -O1, so that's another hint it happened in DOM.
> This is a guess though, it could've been CCP setting a nonzero mask,
> which then ranger/DOM picked up.
>
> All in all, I'm in favor of putting tests as early as possible,
> otherwise any number of passes could perform a transformation that
> could lead to the same end result.  We are testing ranger, so the most
> likely place to put this test is in DOM at -O1, or in evrp/VRP[12] for
> -O2.
>
> Of course, this is my personal preference, and these are just general
> guidelines.  Perhaps others can opine.

Thanks, that's exactly what I needed to hear, and makes perfect sense to
me.  Updated "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" is attached.
OK to push?


Grüße
 Thomas


>> >> Do we scan 'optimized', or an earlier dump?
>> >>
>> >> At '-O1', the actual code transformation is visible already in the 'dom2'
>> >> dump:
>> >>
>> >>        <bb 3> [local count: 536870913]:
>> >>        gimple_assign <mult_expr, r_7, r_6(D), 2, NULL>
>> >>     +  gimple_assign <bit_and_expr, _11, r_7, 1, NULL>
>> >>     +  goto <bb 6>; [100.00%]
>> >>
>> >>     -  <bb 4> [local count: 1073741824]:
>> >>     -  # gimple_phi <r_4, r_6(D)(2), r_7(3)>
>> >>     +  <bb 4> [local count: 536870912]:
>> >>     +  # gimple_phi <r_4, r_6(D)(2)>
>> >>        gimple_assign <bit_and_expr, _2, r_4, 1, NULL>
>> >>        gimple_cond <ne_expr, _2, 0, NULL, NULL>
>> >>     -    goto <bb 5>; [50.00%]
>> >>     +    goto <bb 5>; [100.00%]
>> >>        else
>> >>     -    goto <bb 6>; [50.00%]
>> >>     +    goto <bb 6>; [0.00%]
>> >>
>> >>        <bb 5> [local count: 536870913]:
>> >>        gimple_call <foo, _3, r_4>
>> >>        gimple_assign <plus_expr, r_8, _3, r_4, NULL>
>> >>
>> >>        <bb 6> [local count: 1073741824]:
>> >>     -  # gimple_phi <r_5, r_4(4), r_8(5)>
>> >>     +  # gimple_phi <r_5, r_4(4), r_8(5), r_7(3)>
>> >>        gimple_return <r_5>
>> >>
>> >> And, the actual "avoid second call 'foo'" optimization is visible
>> >> starting with 'dom3':
>> >>
>> >>        <bb 3> [local count: 536870913]:
>> >>        gimple_assign <mult_expr, r_7, r_6(D), 2, NULL>
>> >>     +  goto <bb 6>; [100.00%]
>> >>
>> >>     -  <bb 4> [local count: 1073741824]:
>> >>     -  # gimple_phi <r_4, r_6(D)(2), r_7(3)>
>> >>     -  gimple_assign <bit_and_expr, _2, r_4, 1, NULL>
>> >>     +  <bb 4> [local count: 536870912]:
>> >>     +  gimple_assign <bit_and_expr, _2, r_6(D), 1, NULL>
>> >>        gimple_cond <ne_expr, _2, 0, NULL, NULL>
>> >>     -    goto <bb 5>; [50.00%]
>> >>     +    goto <bb 5>; [100.00%]
>> >>        else
>> >>     -    goto <bb 6>; [50.00%]
>> >>     +    goto <bb 6>; [0.00%]
>> >>
>> >>        <bb 5> [local count: 536870913]:
>> >>     -  gimple_call <foo, _3, r_4>
>> >>     -  gimple_assign <plus_expr, r_8, _3, r_4, NULL>
>> >>     +  gimple_assign <integer_cst, _3, 0, NULL, NULL>
>> >>     +  gimple_assign <ssa_name, r_8, r_6(D), NULL, NULL>
>> >>
>> >>        <bb 6> [local count: 1073741824]:
>> >>     -  # gimple_phi <r_5, r_4(4), r_8(5)>
>> >>     +  # gimple_phi <r_5, r_6(D)(4), r_6(D)(5), r_7(3)>
>> >>        gimple_return <r_5>
>> >>
>> >> ..., but I don't know if either of those would be stable/appropriate to
>> >> scan instead of 'optimized'?
>> >
>> > IMO, either dom3 or optimized is fine.
>>
>> OK, I left it at 'optimized', as I don't have any rationale why exactly
>> it should happen in 'dom3' already.  ;-)
>>
>>
>> Grüße
>>  Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

Comments

Aldy Hernandez Oct. 20, 2022, 12:23 p.m. UTC | #1
> I understand 'r & 3' to be logically equivalent to '(r & 2) && (r & 1)',
> right?

For r == 2, r & 3 == 2, whereas (r & 2) && (r & 1) == 0, so no?

Aldy
Thomas Schwinge Oct. 20, 2022, 7:22 p.m. UTC | #2
Hi!

On 2022-10-20T14:23:33+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
>> I understand 'r & 3' to be logically equivalent to '(r & 2) && (r & 1)',
>> right?
>
> For r == 2, r & 3 == 2, whereas (r & 2) && (r & 1) == 0, so no?

Thanks, and now please let me crawl back under my stone, embarrassing...
That'd rather be '(r & 2) || (r & 1)'.
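
A quick way to double-check this is to compare the three conditions as
truth values.  This is only a sketch; the helper names ('mask3',
'both_bits', 'either_bit' and the two counters) are made up for the
illustration and do not appear in the test case.

```c
/* Truth value of 'r & 3': bit 0 or bit 1 is set.  */
int mask3 (unsigned int r)
{
  return (r & 3) != 0;
}

/* Truth value of '(r & 2) && (r & 1)': both bits are set.  */
int both_bits (unsigned int r)
{
  return (r & 2) && (r & 1);
}

/* Truth value of '(r & 2) || (r & 1)': at least one bit is set.  */
int either_bit (unsigned int r)
{
  return (r & 2) || (r & 1);
}

/* Count values in [0, 255] where 'r & 3' and the '&&' form disagree.  */
int count_and_mismatches (void)
{
  int n = 0;
  for (unsigned int r = 0; r < 256; r++)
    if (mask3 (r) != both_bits (r))
      n++;
  return n;
}

/* Count values in [0, 255] where 'r & 3' and the '||' form disagree.  */
int count_or_mismatches (void)
{
  int n = 0;
  for (unsigned int r = 0; r < 256; r++)
    if (mask3 (r) != either_bit (r))
      n++;
  return n;
}
```

For 'r == 2', 'mask3' is true and 'both_bits' is false, which is the
counter-example given above; the '||' form agrees with 'r & 3' on every
value tried.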

Well, with that now clarified, how about the again updated
"Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?

Have I done something stupid again re 'f4b', XFAILed?


Grüße
 Thomas


Aldy Hernandez Oct. 20, 2022, 10:44 p.m. UTC | #3
On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
>
> Hi!
>
> On 2022-10-20T14:23:33+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
> >> I understand 'r & 3' to be logically equivalent to '(r & 2) && (r & 1)',
> >> right?
> >
> > For r == 2, r & 3 == 2, whereas (r & 2) && (r & 1) == 0, so no?
>
> Thanks, and now please let me crawl back under my stone, embarrassing...
> That'd rather be '(r & 2) || (r & 1)'.

No worries.  If there was a tally of how many times a GCC hacker has
to crawl under a stone, I'd have the record ;-).

>
> Well, with that now clarified, how about the again updated
> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?

I see 7 different tests in this patch.  Did the 6 that pass, fail
before my patch for PR107195 and are now working?   Cause unless
that's the case, they shouldn't be in a test named pr107195-3.c, but
somewhere else.

I see there's one XFAILed test in your patch, and this certainly
doesn't look like something that has anything to do with the patch I
submitted.  Perhaps you could open a PR with an enhancement request
for this one?

That being said...

/* { dg-additional-options -O1 } */
extern int
__attribute__((const))
foo4b (int);

int f4b (unsigned int r)
{
  if (foo4b (r))
    r *= 8U;

  if ((r / 2U) & 2U)
    r += foo4b (r);

  return r;
}
/* { dg-final { scan-tree-dump-times {gimple_call <foo4b,} 1 dom3 {
xfail *-*-* } } } */

At -O2, this is something PRE is doing,  so GCC already handles this.
However, you are suggesting this isn't handled at -O1 and should be??
None of the VRPs run at -O1 so ranger-vrp won't even get a chance.
However, DOM runs at -O1 and it uses ranger to do simple copy
propagation and some jump threading...so technically we could do
something...

DOM should be able to thread from the r *= 8U to the return because
the nonzero mask (known zeros) after the multiplication is 0xfffffff8,
which it could use to solve the second conditional as false.  This
would leave us with:

if (foo4b (r))
  {
    r *= 8U;
    return r;
  }
else
  {
    if ((r / 2U) & 2U)
      r += foo4b (r);
  }

...which exposes the fact that the second call to foo4b() has the same
"r" as the first one, so it could be folded.  I don't know whose job
it is to notice that two const calls have the same arguments, but ISTM
that if we thread the above correctly, someone should be able to clean
this up.  No clue whether this happens at -O1.

However... we're not threading this.  It looks like we're not keeping
track of nonzero bits (known zeros) through the division.  The
multiplication gives us 0xfffffff8 and we should be able to divide
that by 2 and get 0x7ffffffc which solves the second conditional to 0.
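
The mask arithmetic can be sketched with a toy model.  This is not
ranger's implementation; the function names are invented for the
illustration, and it assumes a 32-bit unsigned value where a 1 bit in
the mask means "this bit may be set" and a 0 bit means "known zero".

```c
#include <stdint.h>

/* Unsigned 'x * 8' is 'x << 3', so the possibly-set bits shift left.  */
uint32_t mask_after_mul8 (uint32_t mask)
{
  return mask << 3;
}

/* Unsigned 'x / 2' is 'x >> 1', so the possibly-set bits shift right.  */
uint32_t mask_after_div2 (uint32_t mask)
{
  return mask >> 1;
}

/* '(x & bits) != 0' is known to be false if none of the tested bits
   may be set according to the mask.  */
int known_false_and (uint32_t mask, uint32_t bits)
{
  return (mask & bits) == 0;
}

/* Brute-force sanity check on sample values: count cases where
   '((x * 8U) / 2U) & 2U' is nonzero.  */
int count_counterexamples (void)
{
  int n = 0;
  for (uint32_t x = 0; x < 100000; x++)
    if (((x * 8U) / 2U) & 2U)
      n++;
  return n;
}
```

Starting from an unconstrained mask 0xffffffff, 'mask_after_mul8' yields
0xfffffff8 and 'mask_after_div2' then yields 0x7ffffffc, whose bit 1 is
clear, so the '& 2U' test is known to be false in this model.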

So...maybe DOM+ranger could set things up for another pass to clean this up?

Either way, you could open an enhancement request, if only to keep
the nonzero mask up to date through the division.

Aldy
Thomas Schwinge Oct. 21, 2022, 8:38 a.m. UTC | #4
Hi!

On 2022-10-21T00:44:30+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
> On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
>> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?
>
> I see 7 different tests in this patch.  Did the 6 that pass, fail
> before my patch for PR107195 and are now working?   Cause unless
> that's the case, they shouldn't be in a test named pr107195-3.c, but
> somewhere else.

That's correct; I should've mentioned that I had verified this.  With the
code changes of commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0" reverted, we get:

    PASS: gcc.dg/tree-ssa/pr107195-3.c (test for excess errors)
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo1," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo2," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo3," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo4," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo5," 1
    FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo6," 1

..., and in 'pr107195-3.c.196t.dom3' instead see two calls of each
'foo[...]' function.

That's with this...

> I see there's one XFAILed test in your patch

... XFAILed test case removed, see the attached
"Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]";
OK now to push that version?


> and this certainly
> doesn't look like something that has anything to do with the patch I
> submitted.  Perhaps you could open a PR with an enhancement request
> for this one?
>
> That being said...
>
> /* { dg-additional-options -O1 } */
> extern int
> __attribute__((const))
> foo4b (int);
>
> int f4b (unsigned int r)
> {
>   if (foo4b (r))
>     r *= 8U;
>
>   if ((r / 2U) & 2U)
>     r += foo4b (r);
>
>   return r;
> }
> /* { dg-final { scan-tree-dump-times {gimple_call <foo4b,} 1 dom3 {
> xfail *-*-* } } } */
>
> At -O2, this is something PRE is doing,  so GCC already handles this.
> However, you are suggesting this isn't handled at -O1 and should be??

My thinking was that this optimization does work for 'r >> 1', but it
doesn't work for 'r / 2'.
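
For reference, a small sketch of why the two forms are interchangeable
for the 'unsigned int r' used in 'f4b', and why the same does not hold
in general for signed operands.  The function names are made up for
this illustration.

```c
#include <limits.h>

/* Count sample values, taken from both ends of the 'unsigned int'
   range, where 'r / 2U' and 'r >> 1' differ.  */
int count_unsigned_mismatches (void)
{
  int n = 0;
  for (unsigned int i = 0; i <= 100000U; i++)
    {
      unsigned int lo = i;
      unsigned int hi = UINT_MAX - i;
      if (lo / 2U != lo >> 1)
        n++;
      if (hi / 2U != hi >> 1)
        n++;
    }
  return n;
}

/* Signed division truncates toward zero (C99 and later), so '-3 / 2'
   is -1.  Right-shifting a negative value is implementation-defined;
   an arithmetic shift would give -2 instead, which is why the signed
   case needs more care.  */
int signed_div2 (int r)
{
  return r / 2;
}
```

'count_unsigned_mismatches ()' is expected to return zero, since
unsigned division by two and a logical right shift by one compute the
same value.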

> None of the VRPs run at -O1 so ranger-vrp won't even get a chance.
> However, DOM runs at -O1 and it uses ranger to do simple copy
> propagation and some jump threading...so technically we could do
> something...
>
> DOM should be able to thread from the r *= 8U to the return because
> the nonzero mask (known zeros) after the multiplication is 0xfffffff8,
> which it could use to solve the second conditional as false.  This
> would leave us with:
>
> if (foo4b (r))
>   {
>     r *= 8U;
>     return r;
>   }
> else
>   {
>     if ((r / 2U) & 2U)
>       r += foo4b (r);
>   }
>
> ...which exposes the fact that the second call to foo4b() has the same
> "r" as the first one, so it could be folded.  I don't know whose job
> it is to notice that two const calls have the same arguments, but ISTM
> that if we thread the above correctly, someone should be able to clean
> this up.  No clue whether this happens at -O1.
>
> However... we're not threading this.  It looks like we're not keeping
> track of nonzero bits (known zeros) through the division.  The
> multiplication gives us 0xfffffff8 and we should be able to divide
> that by 2 and get 0x7ffffffc which solves the second conditional to 0.
>
> So...maybe DOM+ranger could set things up for another pass to clean this up?
>
> Either way, you could open an enhancement request, if only to keep
> the nonzero mask up to date through the division.

I've thus filed <https://gcc.gnu.org/PR107342>
"Optimization opportunity where integer '/' corresponds to '>>'" for
continuing that investigation.


Grüße
 Thomas


Aldy Hernandez Oct. 21, 2022, 8:51 a.m. UTC | #5
On Fri, Oct 21, 2022 at 10:38 AM Thomas Schwinge
<thomas@codesourcery.com> wrote:
>
> Hi!
>
> On 2022-10-21T00:44:30+0200, Aldy Hernandez <aldyh@redhat.com> wrote:
> > On Thu, Oct 20, 2022 at 9:22 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
> >> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" attached?
> >
> > I see 7 different tests in this patch.  Did the 6 that pass, fail
> > before my patch for PR107195 and are now working?   Cause unless
> > that's the case, they shouldn't be in a test named pr107195-3.c, but
> > somewhere else.
>
> That's correct; I should've mentioned that I had verified this.  With the
> code changes of commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
> "[PR107195] Set range to zero when nonzero mask is 0" reverted, we get:
>
>     PASS: gcc.dg/tree-ssa/pr107195-3.c (test for excess errors)
>     FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo1," 1
>     FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo2," 1
>     FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo3," 1
>     FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo4," 1
>     FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo5," 1
>     FAIL: gcc.dg/tree-ssa/pr107195-3.c scan-tree-dump-times dom3 "gimple_call <foo6," 1
>
> ..., and in 'pr107195-3.c.196t.dom3' instead see two calls of each
> 'foo[...]' function.
>
> That's with this...
>
> > I see there's one XFAILed test in your patch
>
> ... XFAILed test case removed, see the attached
> "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]";
> OK now to push that version?

OK, thanks.

Patch

From 1e7943646059c13713b0b3f1e667be9de2c03d0f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Mon, 17 Oct 2022 09:10:03 +0200
Subject: [PATCH] Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]

... to display optimization performed as of recent
commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
"[PR107195] Set range to zero when nonzero mask is 0".

	PR tree-optimization/107195
	gcc/testsuite/
	* gcc.dg/tree-ssa/pr107195-3.c: New.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c | 73 ++++++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
new file mode 100644
index 00000000000..d3c5a31a904
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
@@ -0,0 +1,73 @@ 
+/* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'.  */
+
+/* { dg-additional-options -O1 } */
+/* { dg-additional-options -fdump-tree-dom3-raw } */
+
+
+extern int
+__attribute__((const))
+foo1 (int);
+
+int f1 (int r)
+{
+  if (foo1 (r)) /* If this first 'if' holds...  */
+    r *= 2;  /* ..., 'r' now has a zero-value lower-most bit...  */
+
+  if (r & 1) /* ..., so this second 'if' can never hold...  */
+    { /* ..., so this is unreachable.  */
+      /* In contrast, if the first 'if' does not hold ('foo1 (r) == 0'), the
+	 second 'if' may hold, but we know ('foo1' being 'const') that
+	 'foo1 (r) == 0', so don't have to re-evaluate it here: */
+      r += foo1 (r);
+      /* Thus, if optimizing, we only ever expect one call of 'foo1'.
+	 { dg-final { scan-tree-dump-times {gimple_call <foo1,} 1 dom3 } } */
+    }
+
+  return r;
+}
+
+
+extern int
+__attribute__((const))
+foo2 (int);
+
+int f2 (int r)
+{
+  if (foo2 (r)) /* If this first 'if' holds...  */
+    r *= 2;  /* ..., 'r' now has a zero-value lower-most bit...  */
+
+  if ((r & 2) && (r & 1)) /* ..., so 'r & 1' in this second 'if' can never hold...  */
+    { /* ..., so this is unreachable.  */
+      /* In contrast, if the first 'if' does not hold ('foo2 (r) == 0'), the
+	 second 'if' may hold, but we know ('foo2' being 'const') that
+	 'foo2 (r) == 0', so don't have to re-evaluate it here: */
+      r += foo2 (r);
+      /* Thus, if optimizing, we only ever expect one call of 'foo2'.
+	 { dg-final { scan-tree-dump-times {gimple_call <foo2,} 1 dom3 } } */
+    }
+
+  return r;
+}
+
+
+extern int
+__attribute__((const))
+foo3 (int);
+
+int f3 (int r)
+{
+  if (foo3 (r)) /* If this first 'if' holds...  */
+    r *= 2;  /* ..., 'r' now has a zero-value lower-most bit...  */
+
+  if (r & 3) /* ..., so this second 'if' can never hold...  */
+    { /* ..., so this is unreachable.  */
+      /* In contrast, if the first 'if' does not hold ('foo3 (r) == 0'), the
+	 second 'if' may hold, but we know ('foo3' being 'const') that
+	 'foo3 (r) == 0', so don't have to re-evaluate it here: */
+      r += foo3 (r);
+      /* Thus, if optimizing, we only ever expect one call of 'foo3'.
+	 { dg-final { scan-tree-dump-times {gimple_call <foo3,} 1 dom3 { xfail *-*-* } } } */
+    }
+
+  return r;
+}
-- 
2.25.1