diff mbox

[1/2] Feed bound computation to folder in loop split

Message ID VI1PR0802MB2176DBE6A7DBF9A434104979E7C30@VI1PR0802MB2176.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Bin Cheng June 14, 2017, 1:07 p.m. UTC
Hi,
Loop split forces intermediate computation to gimple operands all the time when
computing bound information.  This is not good since folding opportunities are
missed.  This patch fixes the issue by feeding all computation to folder and only
forcing to gimple operand at last.

Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin
2017-06-12  Bin Cheng  <bin.cheng@arm.com>

	* tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
	computation to folder, rather than force to gimple operands too
	early.
From 372dc98aa91fd495c98c2326f854eb5f2c76500b Mon Sep 17 00:00:00 2001
From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com>
Date: Fri, 2 Jun 2017 18:05:03 +0100
Subject: [PATCH 1/2] feed-bound-computation-to-folder-20170601.txt

---
 gcc/tree-ssa-loop-split.c | 77 ++++++++++++++++++-----------------------------
 1 file changed, 30 insertions(+), 47 deletions(-)

Comments

Richard Biener June 16, 2017, 10:49 a.m. UTC | #1
On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> Loop split forces intermediate computation to gimple operands all the time when
> computing bound information.  This is not good since folding opportunities are
> missed.  This patch fixes the issue by feeding all computation to folder and only
> forcing to gimple operand at last.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

Hm?  It uses gimple_build () which should do the same as fold_buildN in terms
of simplification.

So where does that not work?  It is supposed to be the prefered way and no
new code should use force_gimple_operand (unless dealing with generic
coming from other middle-end infrastructure like SCEV or niter analysis)

Richard.

>
> Thanks,
> bin
> 2017-06-12  Bin Cheng  <bin.cheng@arm.com>
>
>         * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
>         computation to folder, rather than force to gimple operands too
>         early.
Bin.Cheng June 16, 2017, 1:06 p.m. UTC | #2
On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> Loop split forces intermediate computation to gimple operands all the time when
>> computing bound information.  This is not good since folding opportunities are
>> missed.  This patch fixes the issue by feeding all computation to folder and only
>> forcing to gimple operand at last.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> Hm?  It uses gimple_build () which should do the same as fold_buildN in terms
> of simplification.
>
> So where does that not work?  It is supposed to be the prefered way and no
> new code should use force_gimple_operand (unless dealing with generic
> coming from other middle-end infrastructure like SCEV or niter analysis)
Hmm, current code calls force_gimpele operand several times which
causes the inefficiency.  The patch avoids that and does one call at
the end.

Thanks,
bin
>
> Richard.
>
>>
>> Thanks,
>> bin
>> 2017-06-12  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
>>         computation to folder, rather than force to gimple operands too
>>         early.
Richard Biener June 16, 2017, 1:10 p.m. UTC | #3
On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>> Hi,
>>> Loop split forces intermediate computation to gimple operands all the time when
>>> computing bound information.  This is not good since folding opportunities are
>>> missed.  This patch fixes the issue by feeding all computation to folder and only
>>> forcing to gimple operand at last.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>
>> Hm?  It uses gimple_build () which should do the same as fold_buildN in terms
>> of simplification.
>>
>> So where does that not work?  It is supposed to be the prefered way and no
>> new code should use force_gimple_operand (unless dealing with generic
>> coming from other middle-end infrastructure like SCEV or niter analysis)
> Hmm, current code calls force_gimpele operand several times which
> causes the inefficiency.  The patch avoids that and does one call at
> the end.

But it forces to the same sequence that is used for extending the expression
so folding should work.  Where do you see that it does not?  Note the
code uses gimple_build (), not gimple_build_assign ().

Richard.

> Thanks,
> bin
>>
>> Richard.
>>
>>>
>>> Thanks,
>>> bin
>>> 2017-06-12  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
>>>         computation to folder, rather than force to gimple operands too
>>>         early.
Bin.Cheng June 16, 2017, 1:31 p.m. UTC | #4
On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>> Hi,
>>>> Loop split forces intermediate computation to gimple operands all the time when
>>>> computing bound information.  This is not good since folding opportunities are
>>>> missed.  This patch fixes the issue by feeding all computation to folder and only
>>>> forcing to gimple operand at last.
>>>>
>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>> Hm?  It uses gimple_build () which should do the same as fold_buildN in terms
>>> of simplification.
>>>
>>> So where does that not work?  It is supposed to be the prefered way and no
>>> new code should use force_gimple_operand (unless dealing with generic
>>> coming from other middle-end infrastructure like SCEV or niter analysis)
>> Hmm, current code calls force_gimpele operand several times which
>> causes the inefficiency.  The patch avoids that and does one call at
>> the end.
>
> But it forces to the same sequence that is used for extending the expression
> so folding should work.  Where do you see that it does not?  Note the
> code uses gimple_build (), not gimple_build_assign ().
In spec2k6/hmmer, when building fast_algorithms.c with below command line:
./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
-fdump-tree-lsplit
The lsplit dump contains:
  <bb 11> [12.75%]:
  _124 = _197 + 1;
  _123 = _124 + -1;
  _115 = MIN_EXPR <_197, _124>;
Which is generated here.

Thanks,
bin
>
> Richard.
>
>> Thanks,
>> bin
>>>
>>> Richard.
>>>
>>>>
>>>> Thanks,
>>>> bin
>>>> 2017-06-12  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * tree-ssa-loop-split.c (compute_new_first_bound): Feed bound
>>>>         computation to folder, rather than force to gimple operands too
>>>>         early.
Richard Biener June 16, 2017, 4:16 p.m. UTC | #5
On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
><richard.guenther@gmail.com> wrote:
>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng <amker.cheng@gmail.com>
>wrote:
>>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com>
>wrote:
>>>>> Hi,
>>>>> Loop split forces intermediate computation to gimple operands all
>the time when
>>>>> computing bound information.  This is not good since folding
>opportunities are
>>>>> missed.  This patch fixes the issue by feeding all computation to
>folder and only
>>>>> forcing to gimple operand at last.
>>>>>
>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>
>>>> Hm?  It uses gimple_build () which should do the same as
>fold_buildN in terms
>>>> of simplification.
>>>>
>>>> So where does that not work?  It is supposed to be the prefered way
>and no
>>>> new code should use force_gimple_operand (unless dealing with
>generic
>>>> coming from other middle-end infrastructure like SCEV or niter
>analysis)
>>> Hmm, current code calls force_gimpele operand several times which
>>> causes the inefficiency.  The patch avoids that and does one call at
>>> the end.
>>
>> But it forces to the same sequence that is used for extending the
>expression
>> so folding should work.  Where do you see that it does not?  Note the
>> code uses gimple_build (), not gimple_build_assign ().
>In spec2k6/hmmer, when building fast_algorithms.c with below command
>line:
>./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>-fdump-tree-lsplit
>The lsplit dump contains:
>  <bb 11> [12.75%]:
>  _124 = _197 + 1;
>  _123 = _124 + -1;
>  _115 = MIN_EXPR <_197, _124>;
>Which is generated here.

That means we miss a pattern in match.PD to handle this case.

Richard.

>Thanks,
>bin
>>
>> Richard.
>>
>>> Thanks,
>>> bin
>>>>
>>>> Richard.
>>>>
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>> 2017-06-12  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>         * tree-ssa-loop-split.c (compute_new_first_bound): Feed
>bound
>>>>>         computation to folder, rather than force to gimple
>operands too
>>>>>         early.
Bin.Cheng June 16, 2017, 4:23 p.m. UTC | #6
On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>>On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
>><richard.guenther@gmail.com> wrote:
>>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng <amker.cheng@gmail.com>
>>wrote:
>>>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com>
>>wrote:
>>>>>> Hi,
>>>>>> Loop split forces intermediate computation to gimple operands all
>>the time when
>>>>>> computing bound information.  This is not good since folding
>>opportunities are
>>>>>> missed.  This patch fixes the issue by feeding all computation to
>>folder and only
>>>>>> forcing to gimple operand at last.
>>>>>>
>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>>
>>>>> Hm?  It uses gimple_build () which should do the same as
>>fold_buildN in terms
>>>>> of simplification.
>>>>>
>>>>> So where does that not work?  It is supposed to be the prefered way
>>and no
>>>>> new code should use force_gimple_operand (unless dealing with
>>generic
>>>>> coming from other middle-end infrastructure like SCEV or niter
>>analysis)
>>>> Hmm, current code calls force_gimpele operand several times which
>>>> causes the inefficiency.  The patch avoids that and does one call at
>>>> the end.
>>>
>>> But it forces to the same sequence that is used for extending the
>>expression
>>> so folding should work.  Where do you see that it does not?  Note the
>>> code uses gimple_build (), not gimple_build_assign ().
>>In spec2k6/hmmer, when building fast_algorithms.c with below command
>>line:
>>./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>>-fdump-tree-lsplit
>>The lsplit dump contains:
>>  <bb 11> [12.75%]:
>>  _124 = _197 + 1;
>>  _123 = _124 + -1;
>>  _115 = MIN_EXPR <_197, _124>;
>>Which is generated here.
>
> That means we miss a pattern in match.PD to handle this case.
I see.  I will withdraw this patch and look in that direction.

Thanks,
bin
>
> Richard.
>
>>Thanks,
>>bin
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> bin
>>>>>
>>>>> Richard.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> bin
>>>>>> 2017-06-12  Bin Cheng  <bin.cheng@arm.com>
>>>>>>
>>>>>>         * tree-ssa-loop-split.c (compute_new_first_bound): Feed
>>bound
>>>>>>         computation to folder, rather than force to gimple
>>operands too
>>>>>>         early.
>
Marc Glisse June 16, 2017, 4:48 p.m. UTC | #7
On Fri, 16 Jun 2017, Bin.Cheng wrote:

> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>>> On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng <amker.cheng@gmail.com>
>>> wrote:
>>>>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>>>> <richard.guenther@gmail.com> wrote:
>>>>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com>
>>> wrote:
>>>>>>> Hi,
>>>>>>> Loop split forces intermediate computation to gimple operands all
>>> the time when
>>>>>>> computing bound information.  This is not good since folding
>>> opportunities are
>>>>>>> missed.  This patch fixes the issue by feeding all computation to
>>> folder and only
>>>>>>> forcing to gimple operand at last.
>>>>>>>
>>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>>>
>>>>>> Hm?  It uses gimple_build () which should do the same as
>>> fold_buildN in terms
>>>>>> of simplification.
>>>>>>
>>>>>> So where does that not work?  It is supposed to be the prefered way
>>> and no
>>>>>> new code should use force_gimple_operand (unless dealing with
>>> generic
>>>>>> coming from other middle-end infrastructure like SCEV or niter
>>> analysis)
>>>>> Hmm, current code calls force_gimpele operand several times which
>>>>> causes the inefficiency.  The patch avoids that and does one call at
>>>>> the end.
>>>>
>>>> But it forces to the same sequence that is used for extending the
>>> expression
>>>> so folding should work.  Where do you see that it does not?  Note the
>>>> code uses gimple_build (), not gimple_build_assign ().
>>> In spec2k6/hmmer, when building fast_algorithms.c with below command
>>> line:
>>> ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>>> -fdump-tree-lsplit
>>> The lsplit dump contains:
>>>  <bb 11> [12.75%]:
>>>  _124 = _197 + 1;
>>>  _123 = _124 + -1;
>>>  _115 = MIN_EXPR <_197, _124>;
>>> Which is generated here.
>>
>> That means we miss a pattern in match.PD to handle this case.
> I see.  I will withdraw this patch and look in that direction.

For _123, we have

   /* (A +- CST1) +- CST2 -> A + CST3
or
/* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */


For _115, we have

/* min (a, a + CST) -> a where CST is positive.  */
/* min (a, a + CST) -> a + CST where CST is negative. */
(simplify
  (min:c @0 (plus@2 @0 INTEGER_CST@1))
   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
    (if (tree_int_cst_sgn (@1) > 0)
     @0
     @2)))

What is the type of all those SSA_NAMEs?
Bin.Cheng June 16, 2017, 4:58 p.m. UTC | #8
On Fri, Jun 16, 2017 at 5:48 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 16 Jun 2017, Bin.Cheng wrote:
>
>> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"
>>> <amker.cheng@gmail.com> wrote:
>>>>
>>>> On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>>
>>>>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng <amker.cheng@gmail.com>
>>>>
>>>> wrote:
>>>>>>
>>>>>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>
>>>>>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com>
>>>>
>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> Loop split forces intermediate computation to gimple operands all
>>>>
>>>> the time when
>>>>>>>>
>>>>>>>> computing bound information.  This is not good since folding
>>>>
>>>> opportunities are
>>>>>>>>
>>>>>>>> missed.  This patch fixes the issue by feeding all computation to
>>>>
>>>> folder and only
>>>>>>>>
>>>>>>>> forcing to gimple operand at last.
>>>>>>>>
>>>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>>>>
>>>>>>>
>>>>>>> Hm?  It uses gimple_build () which should do the same as
>>>>
>>>> fold_buildN in terms
>>>>>>>
>>>>>>> of simplification.
>>>>>>>
>>>>>>> So where does that not work?  It is supposed to be the prefered way
>>>>
>>>> and no
>>>>>>>
>>>>>>> new code should use force_gimple_operand (unless dealing with
>>>>
>>>> generic
>>>>>>>
>>>>>>> coming from other middle-end infrastructure like SCEV or niter
>>>>
>>>> analysis)
>>>>>>
>>>>>> Hmm, current code calls force_gimpele operand several times which
>>>>>> causes the inefficiency.  The patch avoids that and does one call at
>>>>>> the end.
>>>>>
>>>>>
>>>>> But it forces to the same sequence that is used for extending the
>>>>
>>>> expression
>>>>>
>>>>> so folding should work.  Where do you see that it does not?  Note the
>>>>> code uses gimple_build (), not gimple_build_assign ().
>>>>
>>>> In spec2k6/hmmer, when building fast_algorithms.c with below command
>>>> line:
>>>> ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>>>> -fdump-tree-lsplit
>>>> The lsplit dump contains:
>>>>  <bb 11> [12.75%]:
>>>>  _124 = _197 + 1;
>>>>  _123 = _124 + -1;
>>>>  _115 = MIN_EXPR <_197, _124>;
>>>> Which is generated here.
>>>
>>>
>>> That means we miss a pattern in match.PD to handle this case.
>>
>> I see.  I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
>   /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive.  */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>    (if (tree_int_cst_sgn (@1) > 0)
>     @0
>     @2)))
>
> What is the type of all those SSA_NAMEs?
Hi Marc,
Thanks for pointing out the exact patterns.  The variables are of int
type.  The redundant operation disappears in reduced test case though.

Thanks,
bin
>
> --
> Marc Glisse
Andrew Pinski June 16, 2017, 5:04 p.m. UTC | #9
On Fri, Jun 16, 2017 at 9:48 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 16 Jun 2017, Bin.Cheng wrote:
>
>> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On June 16, 2017 3:31:32 PM GMT+02:00, "Bin.Cheng"
>>> <amker.cheng@gmail.com> wrote:
>>>>
>>>> On Fri, Jun 16, 2017 at 2:10 PM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>>
>>>>> On Fri, Jun 16, 2017 at 3:06 PM, Bin.Cheng <amker.cheng@gmail.com>
>>>>
>>>> wrote:
>>>>>>
>>>>>> On Fri, Jun 16, 2017 at 11:49 AM, Richard Biener
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>
>>>>>>> On Wed, Jun 14, 2017 at 3:07 PM, Bin Cheng <Bin.Cheng@arm.com>
>>>>
>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> Loop split forces intermediate computation to gimple operands all
>>>>
>>>> the time when
>>>>>>>>
>>>>>>>> computing bound information.  This is not good since folding
>>>>
>>>> opportunities are
>>>>>>>>
>>>>>>>> missed.  This patch fixes the issue by feeding all computation to
>>>>
>>>> folder and only
>>>>>>>>
>>>>>>>> forcing to gimple operand at last.
>>>>>>>>
>>>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>>>>
>>>>>>>
>>>>>>> Hm?  It uses gimple_build () which should do the same as
>>>>
>>>> fold_buildN in terms
>>>>>>>
>>>>>>> of simplification.
>>>>>>>
>>>>>>> So where does that not work?  It is supposed to be the prefered way
>>>>
>>>> and no
>>>>>>>
>>>>>>> new code should use force_gimple_operand (unless dealing with
>>>>
>>>> generic
>>>>>>>
>>>>>>> coming from other middle-end infrastructure like SCEV or niter
>>>>
>>>> analysis)
>>>>>>
>>>>>> Hmm, current code calls force_gimpele operand several times which
>>>>>> causes the inefficiency.  The patch avoids that and does one call at
>>>>>> the end.
>>>>>
>>>>>
>>>>> But it forces to the same sequence that is used for extending the
>>>>
>>>> expression
>>>>>
>>>>> so folding should work.  Where do you see that it does not?  Note the
>>>>> code uses gimple_build (), not gimple_build_assign ().
>>>>
>>>> In spec2k6/hmmer, when building fast_algorithms.c with below command
>>>> line:
>>>> ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>>>> -fdump-tree-lsplit
>>>> The lsplit dump contains:
>>>>  <bb 11> [12.75%]:
>>>>  _124 = _197 + 1;
>>>>  _123 = _124 + -1;
>>>>  _115 = MIN_EXPR <_197, _124>;
>>>> Which is generated here.
>>>
>>>
>>> That means we miss a pattern in match.PD to handle this case.
>>
>> I see.  I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
>   /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive.  */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>    (if (tree_int_cst_sgn (@1) > 0)
>     @0
>     @2)))
>
> What is the type of all those SSA_NAMEs?

https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01352.html
which added the min/max patterns.  I forgot to get Naveen to mention I
saw this while looking into loop splitting and why I was adding them.

Thanks,
Andrew Pinski

>
> --
> Marc Glisse
Bin.Cheng July 24, 2017, 11:45 a.m. UTC | #10
On Fri, Jun 16, 2017 at 5:48 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 16 Jun 2017, Bin.Cheng wrote:
>
>> On Fri, Jun 16, 2017 at 5:16 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>>
>>> That means we miss a pattern in match.PD to handle this case.
>>
>> I see.  I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
>   /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive.  */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>    (if (tree_int_cst_sgn (@1) > 0)
>     @0
>     @2)))
>
> What is the type of all those SSA_NAMEs?
Hi,
From the debugging process, there are two issues preventing "(A +-
CST1) +- CST2 -> A + CST3" from being applied:
A) before we reach this pattern, there is pattern:

/* A - B -> A + (-B) if B is easily negatable.  */
(simplify
 (minus @0 negate_expr_p@1)
 (if (!FIXED_POINT_TYPE_P (type))
 (plus @0 (negate @1))))

which is matched and returned in gimple_simplify_MINUS_EXPR.  So does
pattern order matter here?

B) When folding "_124 - 1" on the basis of existing stmts sequence
like "_124 = _197 + 1;".  The corresponding gimple-match.c code is
like:
  if (gimple_nop_convert (op0, op0_pops, valueize))
    {
      tree o20 = op0_pops[0];
      switch (TREE_CODE (o20))
        {
    case SSA_NAME:
      if (do_valueize (valueize, o20) != NULL_TREE)
        {
          gimple *def_stmt = SSA_NAME_DEF_STMT (o20);
          if (gassign *def = dyn_cast <gassign *> (def_stmt))
            switch (gimple_assign_rhs_code (def))
          {
          case PLUS_EXPR:
            {
              tree o30 = gimple_assign_rhs1 (def);
              if ((o30 = do_valueize (valueize, o30)))
                {
              tree o31 = gimple_assign_rhs2 (def);
              if ((o31 = do_valueize (valueize, o31)))
                {
                  if (tree_swap_operands_p (o30, o31))
                    std::swap (o30, o31);
                  if (CONSTANT_CLASS_P (o31))
                {
                  if (CONSTANT_CLASS_P (op1))
                    {
                      {
/* #line 1392 "../../gcc/gcc/match.pd" */
                    tree captures[3] ATTRIBUTE_UNUSED = { o30, o31, op1 };
                    if (gimple_simplify_194 (res_code, res_ops, seq,
valueize, type, captures, PLUS_EXPR, MINUS_EXPR, MINUS_EXPR))
                      return true;
                      }
                    }
                    }
                }
                }
              break;
            }

Note we have:
(gdb) call debug_generic_expr(op0)
_124
(gdb) call debug_generic_expr(op1)
1
(gdb) call debug_gimple_stmt(def_stmt)
_124 = _197 + 1;
(gdb) call debug_generic_expr(o30)
_197

But since definition of _197 isn't in current stmt sequence, call "o31
= do_valueize (valueize, o31)" will return NULL.  As a result, it's
not matched.
Any ideas?  Thanks.

Thanks,
bin
Marc Glisse July 24, 2017, 12:16 p.m. UTC | #11
On Mon, 24 Jul 2017, Bin.Cheng wrote:

>> For _123, we have
>>
>>   /* (A +- CST1) +- CST2 -> A + CST3
>> or
>> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>>
>>
>> For _115, we have
>>
>> /* min (a, a + CST) -> a where CST is positive.  */
>> /* min (a, a + CST) -> a + CST where CST is negative. */
>> (simplify
>>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>>    (if (tree_int_cst_sgn (@1) > 0)
>>     @0
>>     @2)))
>>
>> What is the type of all those SSA_NAMEs?
> Hi,
> From the debugging process, there are two issues preventing "(A +-
> CST1) +- CST2 -> A + CST3" from being applied:
> A) before we reach this pattern, there is pattern:
>
> /* A - B -> A + (-B) if B is easily negatable.  */
> (simplify
> (minus @0 negate_expr_p@1)
> (if (!FIXED_POINT_TYPE_P (type))
> (plus @0 (negate @1))))
>
> which is matched and returned in gimple_simplify_MINUS_EXPR.  So does
> pattern order matter here?

That shouldn't be a problem, normally we always try to resimplify the 
result of the simplification, and the transformation should handle x+1+-1 
just as well as x+1-1. Is that not happening?

> B) When folding "_124 - 1" on the basis of existing stmts sequence
> like "_124 = _197 + 1;".  The corresponding gimple-match.c code is
> like:
[...]
> But since definition of _197 isn't in current stmt sequence, call "o31
> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
> not matched.

Ah, yes, that problem... Jakub was having a very similar issue a few
weeks ago, don't know if he found a solution. You could call
gimple_simplify directly with a different valueization function if
that's safe. Normally the simplification would wait until the next
forwprop pass.
Bin.Cheng July 24, 2017, 1:49 p.m. UTC | #12
On Mon, Jul 24, 2017 at 1:16 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>>> For _123, we have
>>>
>>>   /* (A +- CST1) +- CST2 -> A + CST3
>>> or
>>> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)).  */
>>>
>>>
>>> For _115, we have
>>>
>>> /* min (a, a + CST) -> a where CST is positive.  */
>>> /* min (a, a + CST) -> a + CST where CST is negative. */
>>> (simplify
>>>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>>>   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>>>    (if (tree_int_cst_sgn (@1) > 0)
>>>     @0
>>>     @2)))
>>>
>>> What is the type of all those SSA_NAMEs?
>>
>> Hi,
>> From the debugging process, there are two issues preventing "(A +-
>> CST1) +- CST2 -> A + CST3" from being applied:
>> A) before we reach this pattern, there is pattern:
>>
>> /* A - B -> A + (-B) if B is easily negatable.  */
>> (simplify
>> (minus @0 negate_expr_p@1)
>> (if (!FIXED_POINT_TYPE_P (type))
>> (plus @0 (negate @1))))
>>
>> which is matched and returned in gimple_simplify_MINUS_EXPR.  So does
>> pattern order matter here?
>
>
> That shouldn't be a problem, normally we always try to resimplify the result
> of the simplification, and the transformation should handle x+1+-1 just as
> well as x+1-1. Is that not happening?
Yes, it's doesn't matter.

>
>> B) When folding "_124 - 1" on the basis of existing stmts sequence
>> like "_124 = _197 + 1;".  The corresponding gimple-match.c code is
>> like:
>
> [...]
>>
>> But since definition of _197 isn't in current stmt sequence, call "o31
>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>> not matched.
>
>
> Ah, yes, that problem... Jakub was having a very similar issue a few
> weeks ago, don't know if he found a solution. You could call
> gimple_simplify directly with a different valueization function if
> that's safe. Normally the simplification would wait until the next
> forwprop pass.
Thanks for elaboration.
It's too late for next forwprop pass since we are in between loop
optimizations and need the simplified code for niter analysis.
Function compute_new_first_bound calls gimple_build several times,
it's not likely to replace all gimple_build with gimple_simplify?
Also gimple_simplify could return NULL_TREE, in which case I need to
call gimple_build_assign again.  At last, we don't have interface
fold_seq similar to fold_stmt either.
CCing Jakub if he found a solution.  Thanks.

Thanks,
bin
>
> --
> Marc Glisse
Marc Glisse July 24, 2017, 1:59 p.m. UTC | #13
On Mon, 24 Jul 2017, Bin.Cheng wrote:

> But since definition of _197 isn't in current stmt sequence, call "o31
> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
> not matched.

Wait, actually, how was your fold_build* version working? Why was the 
first addition "in the current generic tree" and why isn't it "in current 
stmt sequence"?
Bin.Cheng July 24, 2017, 2:06 p.m. UTC | #14
On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>> But since definition of _197 isn't in current stmt sequence, call "o31
>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>> not matched.
>
>
> Wait, actually, how was your fold_build* version working? Why was the first
> addition "in the current generic tree" and why isn't it "in current stmt
> sequence"?
Maybe I didn't express it clearly.  In compute_new_first_bound, we
have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
1" by calling gimple_build.  The definition of _197 is a PHI and isn't
in current stmt sequence.  For fold_build* version, it builds
expression "_197 + 1 - 1" and simplifies it.

Thanks,
bin
>
> --
> Marc Glisse
Marc Glisse July 24, 2017, 2:31 p.m. UTC | #15
On Mon, 24 Jul 2017, Bin.Cheng wrote:

> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>>
>>> But since definition of _197 isn't in current stmt sequence, call "o31
>>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>>> not matched.
>>
>>
>> Wait, actually, how was your fold_build* version working? Why was the first
>> addition "in the current generic tree" and why isn't it "in current stmt
>> sequence"?
> Maybe I didn't express it clearly.  In compute_new_first_bound, we
> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
> 1" by calling gimple_build.  The definition of _197 is a PHI and isn't
> in current stmt sequence.  For fold_build* version, it builds
> expression "_197 + 1 - 1" and simplifies it.

It seems like it shouldn't be relevant whether the definition of _197 is 
in the stmt sequence or not, but indeed we seem to generate a lot of calls 
to do_valueize... I had misunderstood the issue, sorry.

Strangely, for a pattern like
(simplify (plus @0 @1) @0)
we generate no call to valueize, while for the following
(simplify (plus @0 (minus @1 @2)) @0)
we generate 3 calls to do_valueize.

I think we need Richard to say what the intent is for the valueization 
function. It is used both to stop looking at defining stmt if the return 
is NULL, and to replace/optimize one SSA_NAME with another, but currently 
it seems hard to prevent looking at the defining statement without 
preventing from looking at the SSA_NAME at all.

I guess we'll need a fix in genmatch...
Bin.Cheng July 24, 2017, 2:37 p.m. UTC | #16
On Mon, Jul 24, 2017 at 3:31 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>>>
>>>> But since definition of _197 isn't in current stmt sequence, call "o31
>>>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>>>> not matched.
>>>
>>>
>>>
>>> Wait, actually, how was your fold_build* version working? Why was the
>>> first
>>> addition "in the current generic tree" and why isn't it "in current stmt
>>> sequence"?
>>
>> Maybe I didn't express it clearly.  In compute_new_first_bound, we
>> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
>> 1" by calling gimple_build.  The definition of _197 is a PHI and isn't
>> in current stmt sequence.  For fold_build* version, it builds
>> expression "_197 + 1 - 1" and simplifies it.
>
>
> It seems like it shouldn't be relevant whether the definition of _197 is in
> the stmt sequence or not, but indeed we seem to generate a lot of calls to
> do_valueize... I had misunderstood the issue, sorry.
Oh, no need at all, and thanks very much for all the explanation.
>
> Strangely, for a pattern like
> (simplify (plus @0 @1) @0)
> we generate no call to valueize, while for the following
> (simplify (plus @0 (minus @1 @2)) @0)
> we generate 3 calls to do_valueize.
>
> I think we need Richard to say what the intent is for the valueization
> function. It is used both to stop looking at defining stmt if the return is
> NULL, and to replace/optimize one SSA_NAME with another, but currently it
> seems hard to prevent looking at the defining statement without preventing
> from looking at the SSA_NAME at all.
Looks we don't really expand into def_stmt on leaf nodes, maybe
valueization can be saved in the case?
>
> I guess we'll need a fix in genmatch...
Yeah, then the original patch becomes unnecessary.  Thanks again!

Thanks,
bin
>
> --
> Marc Glisse
Marc Glisse July 24, 2017, 2:52 p.m. UTC | #17
On Mon, 24 Jul 2017, Bin.Cheng wrote:

> On Mon, Jul 24, 2017 at 3:31 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>>
>>> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>>
>>>> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>>>>
>>>>> But since definition of _197 isn't in current stmt sequence, call "o31
>>>>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>>>>> not matched.
>>>>
>>>>
>>>>
>>>> Wait, actually, how was your fold_build* version working? Why was the
>>>> first
>>>> addition "in the current generic tree" and why isn't it "in current stmt
>>>> sequence"?
>>>
>>> Maybe I didn't express it clearly.  In compute_new_first_bound, we
>>> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
>>> 1" by calling gimple_build.  The definition of _197 is a PHI and isn't
>>> in current stmt sequence.  For fold_build* version, it builds
>>> expression "_197 + 1 - 1" and simplifies it.
>>
>>
>> It seems like it shouldn't be relevant whether the definition of _197 is in
>> the stmt sequence or not, but indeed we seem to generate a lot of calls to
>> do_valueize... I had misunderstood the issue, sorry.
> Oh, no need at all, and thanks very much for all the explanation.
>>
>> Strangely, for a pattern like
>> (simplify (plus @0 @1) @0)
>> we generate no call to valueize, while for the following
>> (simplify (plus @0 (minus @1 @2)) @0)
>> we generate 3 calls to do_valueize.
>>
>> I think we need Richard to say what the intent is for the valueization
>> function. It is used both to stop looking at defining stmt if the return is
>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>> seems hard to prevent looking at the defining statement without preventing
>> from looking at the SSA_NAME at all.
> Looks we don't really expand into def_stmt on leaf nodes, maybe
> valueization can be saved in the case?

Passes with a lattice (say PRE) use valueization to replace an SSA_NAME 
with an equivalent one, not just to let it look through the defining 
statement, so I am not sure if we can get rid of those completely, or if 
we need two different do_valueize wrappers for the 2 types of uses.
Richard Biener July 25, 2017, 2:32 p.m. UTC | #18
On Mon, Jul 24, 2017 at 4:31 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>
>> On Mon, Jul 24, 2017 at 2:59 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> On Mon, 24 Jul 2017, Bin.Cheng wrote:
>>>
>>>> But since definition of _197 isn't in current stmt sequence, call "o31
>>>> = do_valueize (valueize, o31)" will return NULL.  As a result, it's
>>>> not matched.
>>>
>>>
>>>
>>> Wait, actually, how was your fold_build* version working? Why was the
>>> first
>>> addition "in the current generic tree" and why isn't it "in current stmt
>>> sequence"?
>>
>> Maybe I didn't express it clearly.  In compute_new_first_bound, we
>> have stmt sequence "_124 = _197 + 1", and we try to simplify "_124 -
>> 1" by calling gimple_build.  The definition of _197 is a PHI and isn't
>> in current stmt sequence.  For fold_build* version, it builds
>> expression "_197 + 1 - 1" and simplifies it.
>
>
> It seems like it shouldn't be relevant whether the definition of _197 is in
> the stmt sequence or not, but indeed we seem to generate a lot of calls to
> do_valueize... I had misunderstood the issue, sorry.
>
> Strangely, for a pattern like
> (simplify (plus @0 @1) @0)
> we generate no call to valueize,

Expected.  We expect the "toplevel" trees to be already valueized.

> while for the following
> (simplify (plus @0 (minus @1 @2)) @0)
> we generate 3 calls to do_valueize.
>
> I think we need Richard to say what the intent is for the valueization
> function. It is used both to stop looking at defining stmt if the return is
> NULL, and to replace/optimize one SSA_NAME with another, but currently it
> seems hard to prevent looking at the defining statement without preventing
> from looking at the SSA_NAME at all.

Yeah, this semantic overloading is an issue.  For gimple_build we have nothing
to "valueize" but we only use it to tell genmatch that it may not look at the
SSA_NAME_DEF_STMT.

> I guess we'll need a fix in genmatch...

I'll have a look tomorrow.

RIchard.

> --
> Marc Glisse
Marc Glisse July 25, 2017, 5:45 p.m. UTC | #19
On Tue, 25 Jul 2017, Richard Biener wrote:

>> I think we need Richard to say what the intent is for the valueization
>> function. It is used both to stop looking at defining stmt if the return is
>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>> seems hard to prevent looking at the defining statement without preventing
>> from looking at the SSA_NAME at all.
>
> Yeah, this semantic overloading is an issue.  For gimple_build we have nothing
> to "valueize" but we only use it to tell genmatch that it may not look at the
> SSA_NAME_DEF_STMT.
>
>> I guess we'll need a fix in genmatch...
>
> I'll have a look tomorrow.

My impression yesterday was that we could replace the current do_valueize 
wrapper by 2 wrappers (without touching the valueize callbacks):
- may_check_def_stmt, which returns a bool corresponding to the current 
do_valueize != NULL_TREE
- maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, 
it returns its argument unchanged.

Not very confident about it though.
Richard Biener July 26, 2017, 7:48 a.m. UTC | #20
On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Tue, 25 Jul 2017, Richard Biener wrote:
>
>>> I think we need Richard to say what the intent is for the valueization
>>> function. It is used both to stop looking at defining stmt if the return
>>> is
>>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>>> seems hard to prevent looking at the defining statement without
>>> preventing
>>> from looking at the SSA_NAME at all.
>>
>>
>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>> nothing
>> to "valueize" but we only use it to tell genmatch that it may not look at
>> the
>> SSA_NAME_DEF_STMT.
>>
>>> I guess we'll need a fix in genmatch...
>>
>>
>> I'll have a look tomorrow.
>
>
> My impression yesterday was that we could replace the current do_valueize
> wrapper by 2 wrappers (without touching the valueize callbacks):
> - may_check_def_stmt, which returns a bool corresponding to the current
> do_valueize != NULL_TREE
> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
> returns its argument unchanged.
>
> Not very confident about it though.

Note I've been there in the past (twice I think) but always ran into existing
latent issues.  So hopefully we've resolved those, I'm testing the following
simplified variant of what I had back in time.

It'll produce

  switch (TREE_CODE (op0))
    {
    case SSA_NAME:
      if (gimple *def_stmt = get_def (valueize, op0))
        {
          if (gassign *def = dyn_cast <gassign *> (def_stmt))
            switch (gimple_assign_rhs_code (def))
              {
              case MINUS_EXPR:
                {
                  tree o20 = gimple_assign_rhs1 (def);
                  o20 = do_valueize (valueize, o20);
                  tree o21 = gimple_assign_rhs2 (def);
                  o21 = do_valueize (valueize, o21);
                  if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
types_match (op1, o21)))
                    {

which also indents less which is nice.

Bootstrap/regtest running on x86_64-unknown-linux-gnu.

Richard.

2017-07-26  Richard Biener  <rguenther@suse.de>

        * gimple-match-head.c (do_valueize): Return OP if valueize
        returns NULL_TREE.
        (get_def): New helper to get at the def stmt of a SSA name
        if valueize allows.
        * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
        do_valueize to get at the def stmt.
        (dt_operand::gen_gimple_expr): Simplify do_valueize calls.


> --
> Marc Glisse
Richard Sandiford July 26, 2017, 9:08 a.m. UTC | #21
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>
>>>> I think we need Richard to say what the intent is for the valueization
>>>> function. It is used both to stop looking at defining stmt if the return
>>>> is
>>>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>>>> seems hard to prevent looking at the defining statement without
>>>> preventing
>>>> from looking at the SSA_NAME at all.
>>>
>>>
>>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>>> nothing
>>> to "valueize" but we only use it to tell genmatch that it may not look at
>>> the
>>> SSA_NAME_DEF_STMT.
>>>
>>>> I guess we'll need a fix in genmatch...
>>>
>>>
>>> I'll have a look tomorrow.
>>
>>
>> My impression yesterday was that we could replace the current do_valueize
>> wrapper by 2 wrappers (without touching the valueize callbacks):
>> - may_check_def_stmt, which returns a bool corresponding to the current
>> do_valueize != NULL_TREE
>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
>> returns its argument unchanged.
>>
>> Not very confident about it though.
>
> Note I've been there in the past (twice I think) but always ran into existing
> latent issues.  So hopefully we've resolved those, I'm testing the following
> simplified variant of what I had back in time.
>
> It'll produce
>
>   switch (TREE_CODE (op0))
>     {
>     case SSA_NAME:
>       if (gimple *def_stmt = get_def (valueize, op0))
>         {
>           if (gassign *def = dyn_cast <gassign *> (def_stmt))
>             switch (gimple_assign_rhs_code (def))
>               {
>               case MINUS_EXPR:
>                 {
>                   tree o20 = gimple_assign_rhs1 (def);
>                   o20 = do_valueize (valueize, o20);
>                   tree o21 = gimple_assign_rhs2 (def);
>                   o21 = do_valueize (valueize, o21);
>                   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
> types_match (op1, o21)))
>                     {
>
> which also indents less which is nice.
>
> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>
> Richard.
>
> 2017-07-26  Richard Biener  <rguenther@suse.de>
>
>         * gimple-match-head.c (do_valueize): Return OP if valueize
>         returns NULL_TREE.
>         (get_def): New helper to get at the def stmt of a SSA name
>         if valueize allows.
>         * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>         do_valueize to get at the def stmt.
>         (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
>
>> --
>> Marc Glisse
>
> 2017-07-26  Richard Biener  <rguenther@suse.de>
>
> 	* gimple-match-head.c (do_valueize): Return OP if valueize
> 	returns NULL_TREE.
> 	(get_def): New helper to get at the def stmt of a SSA name
> 	if valueize allows.
> 	* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
> 	do_valueize to get at the def stmt.
> 	(dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
> Index: gcc/gimple-match-head.c
> ===================================================================
> --- gcc/gimple-match-head.c	(revision 250518)
> +++ gcc/gimple-match-head.c	(working copy)
> @@ -779,10 +779,25 @@ inline tree
>  do_valueize (tree (*valueize)(tree), tree op)
>  {
>    if (valueize && TREE_CODE (op) == SSA_NAME)
> -    return valueize (op);
> +    {
> +      tree tem = valueize (op);
> +      if (tem)
> +	return tem;
> +    }
>    return op;
>  }
>  
> +/* Helper for the autogenerated code, get at the definition of NAME when
> +   VALUEIZE allows that.  */
> +
> +inline gimple *
> +get_def (tree (*valueize)(tree), tree name)
> +{
> +  if (valueize && ! valueize (name))
> +    return NULL;
> +  return SSA_NAME_DEF_STMT (name);
> +}

I realise this is preexisting, but why do we ignore the value returned
by valueize, even if it's different from NAME?  I struggled over that
with the old code when I hit the same problem with some SVE patches
(for which, thanks for the fix).

A comment might be good.

Thanks,
Richard
Marc Glisse July 26, 2017, 9:38 a.m. UTC | #22
On Wed, 26 Jul 2017, Richard Sandiford wrote:

> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>>
>>>>> I think we need Richard to say what the intent is for the valueization
>>>>> function. It is used both to stop looking at defining stmt if the return
>>>>> is
>>>>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>>>>> seems hard to prevent looking at the defining statement without
>>>>> preventing
>>>>> from looking at the SSA_NAME at all.
>>>>
>>>>
>>>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>>>> nothing
>>>> to "valueize" but we only use it to tell genmatch that it may not look at
>>>> the
>>>> SSA_NAME_DEF_STMT.
>>>>
>>>>> I guess we'll need a fix in genmatch...
>>>>
>>>>
>>>> I'll have a look tomorrow.
>>>
>>>
>>> My impression yesterday was that we could replace the current do_valueize
>>> wrapper by 2 wrappers (without touching the valueize callbacks):
>>> - may_check_def_stmt, which returns a bool corresponding to the current
>>> do_valueize != NULL_TREE
>>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
>>> returns its argument unchanged.
>>>
>>> Not very confident about it though.
>>
>> Note I've been there in the past (twice I think) but always ran into existing
>> latent issues.  So hopefully we've resolved those, I'm testing the following
>> simplified variant of what I had back in time.
>>
>> It'll produce
>>
>>   switch (TREE_CODE (op0))
>>     {
>>     case SSA_NAME:
>>       if (gimple *def_stmt = get_def (valueize, op0))
>>         {
>>           if (gassign *def = dyn_cast <gassign *> (def_stmt))
>>             switch (gimple_assign_rhs_code (def))
>>               {
>>               case MINUS_EXPR:
>>                 {
>>                   tree o20 = gimple_assign_rhs1 (def);
>>                   o20 = do_valueize (valueize, o20);
>>                   tree o21 = gimple_assign_rhs2 (def);
>>                   o21 = do_valueize (valueize, o21);
>>                   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
>> types_match (op1, o21)))
>>                     {
>>
>> which also indents less which is nice.
>>
>> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>>
>> Richard.
>>
>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>
>>         * gimple-match-head.c (do_valueize): Return OP if valueize
>>         returns NULL_TREE.
>>         (get_def): New helper to get at the def stmt of a SSA name
>>         if valueize allows.
>>         * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>         do_valueize to get at the def stmt.
>>         (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>
>>
>>> --
>>> Marc Glisse
>>
>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>
>> 	* gimple-match-head.c (do_valueize): Return OP if valueize
>> 	returns NULL_TREE.
>> 	(get_def): New helper to get at the def stmt of a SSA name
>> 	if valueize allows.
>> 	* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>> 	do_valueize to get at the def stmt.
>> 	(dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>
>> Index: gcc/gimple-match-head.c
>> ===================================================================
>> --- gcc/gimple-match-head.c	(revision 250518)
>> +++ gcc/gimple-match-head.c	(working copy)
>> @@ -779,10 +779,25 @@ inline tree
>>  do_valueize (tree (*valueize)(tree), tree op)
>>  {
>>    if (valueize && TREE_CODE (op) == SSA_NAME)
>> -    return valueize (op);
>> +    {
>> +      tree tem = valueize (op);
>> +      if (tem)
>> +	return tem;
>> +    }
>>    return op;
>>  }
>>
>> +/* Helper for the autogenerated code, get at the definition of NAME when
>> +   VALUEIZE allows that.  */
>> +
>> +inline gimple *
>> +get_def (tree (*valueize)(tree), tree name)
>> +{
>> +  if (valueize && ! valueize (name))
>> +    return NULL;
>> +  return SSA_NAME_DEF_STMT (name);
>> +}
>
> I realise this is preexisting, but why do we ignore the value returned
> by valueize, even if it's different from NAME?

My impression is that we expect it has already been valueized.
Richard Sandiford July 26, 2017, 9:45 a.m. UTC | #23
Marc Glisse <marc.glisse@inria.fr> writes:
> On Wed, 26 Jul 2017, Richard Sandiford wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>>>
>>>>>> I think we need Richard to say what the intent is for the valueization
>>>>>> function. It is used both to stop looking at defining stmt if the return
>>>>>> is
>>>>>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>>>>>> seems hard to prevent looking at the defining statement without
>>>>>> preventing
>>>>>> from looking at the SSA_NAME at all.
>>>>>
>>>>>
>>>>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>>>>> nothing
>>>>> to "valueize" but we only use it to tell genmatch that it may not look at
>>>>> the
>>>>> SSA_NAME_DEF_STMT.
>>>>>
>>>>>> I guess we'll need a fix in genmatch...
>>>>>
>>>>>
>>>>> I'll have a look tomorrow.
>>>>
>>>>
>>>> My impression yesterday was that we could replace the current do_valueize
>>>> wrapper by 2 wrappers (without touching the valueize callbacks):
>>>> - may_check_def_stmt, which returns a bool corresponding to the current
>>>> do_valueize != NULL_TREE
>>>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
>>>> returns its argument unchanged.
>>>>
>>>> Not very confident about it though.
>>>
>>> Note I've been there in the past (twice I think) but always ran into existing
>>> latent issues.  So hopefully we've resolved those, I'm testing the following
>>> simplified variant of what I had back in time.
>>>
>>> It'll produce
>>>
>>>   switch (TREE_CODE (op0))
>>>     {
>>>     case SSA_NAME:
>>>       if (gimple *def_stmt = get_def (valueize, op0))
>>>         {
>>>           if (gassign *def = dyn_cast <gassign *> (def_stmt))
>>>             switch (gimple_assign_rhs_code (def))
>>>               {
>>>               case MINUS_EXPR:
>>>                 {
>>>                   tree o20 = gimple_assign_rhs1 (def);
>>>                   o20 = do_valueize (valueize, o20);
>>>                   tree o21 = gimple_assign_rhs2 (def);
>>>                   o21 = do_valueize (valueize, o21);
>>>                   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
>>> types_match (op1, o21)))
>>>                     {
>>>
>>> which also indents less which is nice.
>>>
>>> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>>>
>>> Richard.
>>>
>>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>>
>>>         * gimple-match-head.c (do_valueize): Return OP if valueize
>>>         returns NULL_TREE.
>>>         (get_def): New helper to get at the def stmt of a SSA name
>>>         if valueize allows.
>>>         * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>>         do_valueize to get at the def stmt.
>>>         (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>
>>>
>>>> --
>>>> Marc Glisse
>>>
>>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>>
>>> 	* gimple-match-head.c (do_valueize): Return OP if valueize
>>> 	returns NULL_TREE.
>>> 	(get_def): New helper to get at the def stmt of a SSA name
>>> 	if valueize allows.
>>> 	* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>> 	do_valueize to get at the def stmt.
>>> 	(dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>
>>> Index: gcc/gimple-match-head.c
>>> ===================================================================
>>> --- gcc/gimple-match-head.c	(revision 250518)
>>> +++ gcc/gimple-match-head.c	(working copy)
>>> @@ -779,10 +779,25 @@ inline tree
>>>  do_valueize (tree (*valueize)(tree), tree op)
>>>  {
>>>    if (valueize && TREE_CODE (op) == SSA_NAME)
>>> -    return valueize (op);
>>> +    {
>>> +      tree tem = valueize (op);
>>> +      if (tem)
>>> +	return tem;
>>> +    }
>>>    return op;
>>>  }
>>>
>>> +/* Helper for the autogenerated code, get at the definition of NAME when
>>> +   VALUEIZE allows that.  */
>>> +
>>> +inline gimple *
>>> +get_def (tree (*valueize)(tree), tree name)
>>> +{
>>> +  if (valueize && ! valueize (name))
>>> +    return NULL;
>>> +  return SSA_NAME_DEF_STMT (name);
>>> +}
>>
>> I realise this is preexisting, but why do we ignore the value returned
>> by valueize, even if it's different from NAME?
>
> My impression is that we expect it has already been valueized.

But in that case, why do we try to valueize it again?  It feels like
we're conflating two things.

Richard
Marc Glisse July 26, 2017, 9:57 a.m. UTC | #24
On Wed, 26 Jul 2017, Richard Sandiford wrote:

> Marc Glisse <marc.glisse@inria.fr> writes:
>> On Wed, 26 Jul 2017, Richard Sandiford wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>>>>
>>>>>>> I think we need Richard to say what the intent is for the valueization
>>>>>>> function. It is used both to stop looking at defining stmt if the return
>>>>>>> is
>>>>>>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>>>>>>> seems hard to prevent looking at the defining statement without
>>>>>>> preventing
>>>>>>> from looking at the SSA_NAME at all.
>>>>>>
>>>>>>
>>>>>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>>>>>> nothing
>>>>>> to "valueize" but we only use it to tell genmatch that it may not look at
>>>>>> the
>>>>>> SSA_NAME_DEF_STMT.
>>>>>>
>>>>>>> I guess we'll need a fix in genmatch...
>>>>>>
>>>>>>
>>>>>> I'll have a look tomorrow.
>>>>>
>>>>>
>>>>> My impression yesterday was that we could replace the current do_valueize
>>>>> wrapper by 2 wrappers (without touching the valueize callbacks):
>>>>> - may_check_def_stmt, which returns a bool corresponding to the current
>>>>> do_valueize != NULL_TREE
>>>>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
>>>>> returns its argument unchanged.
>>>>>
>>>>> Not very confident about it though.
>>>>
>>>> Note I've been there in the past (twice I think) but always ran into existing
>>>> latent issues.  So hopefully we've resolved those, I'm testing the following
>>>> simplified variant of what I had back in time.
>>>>
>>>> It'll produce
>>>>
>>>>   switch (TREE_CODE (op0))
>>>>     {
>>>>     case SSA_NAME:
>>>>       if (gimple *def_stmt = get_def (valueize, op0))
>>>>         {
>>>>           if (gassign *def = dyn_cast <gassign *> (def_stmt))
>>>>             switch (gimple_assign_rhs_code (def))
>>>>               {
>>>>               case MINUS_EXPR:
>>>>                 {
>>>>                   tree o20 = gimple_assign_rhs1 (def);
>>>>                   o20 = do_valueize (valueize, o20);
>>>>                   tree o21 = gimple_assign_rhs2 (def);
>>>>                   o21 = do_valueize (valueize, o21);
>>>>                   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
>>>> types_match (op1, o21)))
>>>>                     {
>>>>
>>>> which also indents less which is nice.
>>>>
>>>> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>>>>
>>>> Richard.
>>>>
>>>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>>>
>>>>         * gimple-match-head.c (do_valueize): Return OP if valueize
>>>>         returns NULL_TREE.
>>>>         (get_def): New helper to get at the def stmt of a SSA name
>>>>         if valueize allows.
>>>>         * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>>>         do_valueize to get at the def stmt.
>>>>         (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>>
>>>>
>>>>> --
>>>>> Marc Glisse
>>>>
>>>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>>>
>>>> 	* gimple-match-head.c (do_valueize): Return OP if valueize
>>>> 	returns NULL_TREE.
>>>> 	(get_def): New helper to get at the def stmt of a SSA name
>>>> 	if valueize allows.
>>>> 	* genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>>> 	do_valueize to get at the def stmt.
>>>> 	(dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>>
>>>> Index: gcc/gimple-match-head.c
>>>> ===================================================================
>>>> --- gcc/gimple-match-head.c	(revision 250518)
>>>> +++ gcc/gimple-match-head.c	(working copy)
>>>> @@ -779,10 +779,25 @@ inline tree
>>>>  do_valueize (tree (*valueize)(tree), tree op)
>>>>  {
>>>>    if (valueize && TREE_CODE (op) == SSA_NAME)
>>>> -    return valueize (op);
>>>> +    {
>>>> +      tree tem = valueize (op);
>>>> +      if (tem)
>>>> +	return tem;
>>>> +    }
>>>>    return op;
>>>>  }
>>>>
>>>> +/* Helper for the autogenerated code, get at the definition of NAME when
>>>> +   VALUEIZE allows that.  */
>>>> +
>>>> +inline gimple *
>>>> +get_def (tree (*valueize)(tree), tree name)
>>>> +{
>>>> +  if (valueize && ! valueize (name))
>>>> +    return NULL;
>>>> +  return SSA_NAME_DEF_STMT (name);
>>>> +}
>>>
>>> I realise this is preexisting, but why do we ignore the value returned
>>> by valueize, even if it's different from NAME?
>>
>> My impression is that we expect it has already been valueized.
>
> But in that case, why do we try to valueize it again?

We don't know if valueize returned NULL and we kept the argument as is 
(should not look at def stmt) or if valueize actually returned something. 
This way we can also have valueize replace a value with another, and still 
forbid looking at the def stmt of the replacement value.

> It feels like we're conflating two things.

We are. The question is how much trouble that causes, compared to the 
complication of for instance having 2 callbacks.
Richard Biener July 26, 2017, 11:13 a.m. UTC | #25
On Wed, Jul 26, 2017 at 11:57 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Wed, 26 Jul 2017, Richard Sandiford wrote:
>
>> Marc Glisse <marc.glisse@inria.fr> writes:
>>>
>>> On Wed, 26 Jul 2017, Richard Sandiford wrote:
>>>>
>>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>>>
>>>>> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse <marc.glisse@inria.fr>
>>>>> wrote:
>>>>>>
>>>>>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>>>>>
>>>>>>>> I think we need Richard to say what the intent is for the
>>>>>>>> valueization
>>>>>>>> function. It is used both to stop looking at defining stmt if the
>>>>>>>> return
>>>>>>>> is
>>>>>>>> NULL, and to replace/optimize one SSA_NAME with another, but
>>>>>>>> currently it
>>>>>>>> seems hard to prevent looking at the defining statement without
>>>>>>>> preventing
>>>>>>>> from looking at the SSA_NAME at all.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yeah, this semantic overloading is an issue.  For gimple_build we
>>>>>>> have
>>>>>>> nothing
>>>>>>> to "valueize" but we only use it to tell genmatch that it may not
>>>>>>> look at
>>>>>>> the
>>>>>>> SSA_NAME_DEF_STMT.
>>>>>>>
>>>>>>>> I guess we'll need a fix in genmatch...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I'll have a look tomorrow.
>>>>>>
>>>>>>
>>>>>>
>>>>>> My impression yesterday was that we could replace the current
>>>>>> do_valueize
>>>>>> wrapper by 2 wrappers (without touching the valueize callbacks):
>>>>>> - may_check_def_stmt, which returns a bool corresponding to the
>>>>>> current
>>>>>> do_valueize != NULL_TREE
>>>>>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE,
>>>>>> it
>>>>>> returns its argument unchanged.
>>>>>>
>>>>>> Not very confident about it though.
>>>>>
>>>>>
>>>>> Note I've been there in the past (twice I think) but always ran into
>>>>> existing
>>>>> latent issues.  So hopefully we've resolved those, I'm testing the
>>>>> following
>>>>> simplified variant of what I had back in time.
>>>>>
>>>>> It'll produce
>>>>>
>>>>>   switch (TREE_CODE (op0))
>>>>>     {
>>>>>     case SSA_NAME:
>>>>>       if (gimple *def_stmt = get_def (valueize, op0))
>>>>>         {
>>>>>           if (gassign *def = dyn_cast <gassign *> (def_stmt))
>>>>>             switch (gimple_assign_rhs_code (def))
>>>>>               {
>>>>>               case MINUS_EXPR:
>>>>>                 {
>>>>>                   tree o20 = gimple_assign_rhs1 (def);
>>>>>                   o20 = do_valueize (valueize, o20);
>>>>>                   tree o21 = gimple_assign_rhs2 (def);
>>>>>                   o21 = do_valueize (valueize, o21);
>>>>>                   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
>>>>> types_match (op1, o21)))
>>>>>                     {
>>>>>
>>>>> which also indents less which is nice.
>>>>>
>>>>> Bootstrap/regtest running on x86_64-unknown-linux-gnu.
>>>>>
>>>>> Richard.
>>>>>
>>>>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>>>>
>>>>>         * gimple-match-head.c (do_valueize): Return OP if valueize
>>>>>         returns NULL_TREE.
>>>>>         (get_def): New helper to get at the def stmt of a SSA name
>>>>>         if valueize allows.
>>>>>         * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>>>>         do_valueize to get at the def stmt.
>>>>>         (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>>>
>>>>>
>>>>>> --
>>>>>> Marc Glisse
>>>>>
>>>>>
>>>>> 2017-07-26  Richard Biener  <rguenther@suse.de>
>>>>>
>>>>>         * gimple-match-head.c (do_valueize): Return OP if valueize
>>>>>         returns NULL_TREE.
>>>>>         (get_def): New helper to get at the def stmt of a SSA name
>>>>>         if valueize allows.
>>>>>         * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>>>>>         do_valueize to get at the def stmt.
>>>>>         (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>>>>>
>>>>> Index: gcc/gimple-match-head.c
>>>>> ===================================================================
>>>>> --- gcc/gimple-match-head.c     (revision 250518)
>>>>> +++ gcc/gimple-match-head.c     (working copy)
>>>>> @@ -779,10 +779,25 @@ inline tree
>>>>>  do_valueize (tree (*valueize)(tree), tree op)
>>>>>  {
>>>>>    if (valueize && TREE_CODE (op) == SSA_NAME)
>>>>> -    return valueize (op);
>>>>> +    {
>>>>> +      tree tem = valueize (op);
>>>>> +      if (tem)
>>>>> +       return tem;
>>>>> +    }
>>>>>    return op;
>>>>>  }
>>>>>
>>>>> +/* Helper for the autogenerated code, get at the definition of NAME
>>>>> when
>>>>> +   VALUEIZE allows that.  */
>>>>> +
>>>>> +inline gimple *
>>>>> +get_def (tree (*valueize)(tree), tree name)
>>>>> +{
>>>>> +  if (valueize && ! valueize (name))
>>>>> +    return NULL;
>>>>> +  return SSA_NAME_DEF_STMT (name);
>>>>> +}
>>>>
>>>>
>>>> I realise this is preexisting, but why do we ignore the value returned
>>>> by valueize, even if it's different from NAME?
>>>
>>>
>>> My impression is that we expect it has already been valueized.
>>
>>
>> But in that case, why do we try to valueize it again?
>
>
> We don't know if valueize returned NULL and we kept the argument as is
> (should not look at def stmt) or if valueize actually returned something.
> This way we can also have valueize replace a value with another, and still
> forbid looking at the def stmt of the replacement value.
>
>> It feels like we're conflating two things.

Yes we are...

> We are. The question is how much trouble that causes, compared to the
> complication of for instance having 2 callbacks.

Or doing (have a patch for that) magics on the return value (smuggling
in a bit on whether we may look at the def).

Doing that and adjusting the few callers we have that a) valueize
and b) return NULL sometimes ran into some bugs so I never
finished this off.  The current patch is fixing a related issue, so if it
goes through (testing still runs) I can maybe dust-off that "proper" fix...

Richard.

> --
> Marc Glisse
Richard Biener July 26, 2017, 11:45 a.m. UTC | #26
On Wed, Jul 26, 2017 at 9:48 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 25, 2017 at 7:45 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> On Tue, 25 Jul 2017, Richard Biener wrote:
>>
>>>> I think we need Richard to say what the intent is for the valueization
>>>> function. It is used both to stop looking at defining stmt if the return
>>>> is
>>>> NULL, and to replace/optimize one SSA_NAME with another, but currently it
>>>> seems hard to prevent looking at the defining statement without
>>>> preventing
>>>> from looking at the SSA_NAME at all.
>>>
>>>
>>> Yeah, this semantic overloading is an issue.  For gimple_build we have
>>> nothing
>>> to "valueize" but we only use it to tell genmatch that it may not look at
>>> the
>>> SSA_NAME_DEF_STMT.
>>>
>>>> I guess we'll need a fix in genmatch...
>>>
>>>
>>> I'll have a look tomorrow.
>>
>>
>> My impression yesterday was that we could replace the current do_valueize
>> wrapper by 2 wrappers (without touching the valueize callbacks):
>> - may_check_def_stmt, which returns a bool corresponding to the current
>> do_valueize != NULL_TREE
>> - maybe_valueize, which tries to valueize, but if it gets a NULL_TREE, it
>> returns its argument unchanged.
>>
>> Not very confident about it though.
>
> Note I've been there in the past (twice I think) but always ran into existing
> latent issues.  So hopefully we've resolved those, I'm testing the following
> simplified variant of what I had back in time.
>
> It'll produce
>
>   switch (TREE_CODE (op0))
>     {
>     case SSA_NAME:
>       if (gimple *def_stmt = get_def (valueize, op0))
>         {
>           if (gassign *def = dyn_cast <gassign *> (def_stmt))
>             switch (gimple_assign_rhs_code (def))
>               {
>               case MINUS_EXPR:
>                 {
>                   tree o20 = gimple_assign_rhs1 (def);
>                   o20 = do_valueize (valueize, o20);
>                   tree o21 = gimple_assign_rhs2 (def);
>                   o21 = do_valueize (valueize, o21);
>                   if (op1 == o21 || (operand_equal_p (op1, o21, 0) &&
> types_match (op1, o21)))
>                     {
>
> which also indents less which is nice.
>
> Bootstrap/regtest running on x86_64-unknown-linux-gnu.

Now applied with

        * gcc/testsuite/gcc.dg/pr70920-2.c: Adjust for transform already
        happening in ccp1.
        * gcc/testsuite/gcc.dg/pr70920-4.c: Likewise.

that shows it's working (CCP is one of the passes eventually running
into this issue).

Richard.

> Richard.
>
> 2017-07-26  Richard Biener  <rguenther@suse.de>
>
>         * gimple-match-head.c (do_valueize): Return OP if valueize
>         returns NULL_TREE.
>         (get_def): New helper to get at the def stmt of a SSA name
>         if valueize allows.
>         * genmatch.c (dt_node::gen_kids_1): Use get_def instead of
>         do_valueize to get at the def stmt.
>         (dt_operand::gen_gimple_expr): Simplify do_valueize calls.
>
>
>> --
>> Marc Glisse
diff mbox

Patch

diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
index e77f2bf..f8fe8e6 100644
--- a/gcc/tree-ssa-loop-split.c
+++ b/gcc/tree-ssa-loop-split.c
@@ -396,53 +396,38 @@  compute_new_first_bound (gimple_seq *stmts, struct tree_niter_desc *niter,
 {
   /* The niter structure contains the after-increment IV, we need
      the loop-enter base, so subtract STEP once.  */
-  tree controlbase = force_gimple_operand (niter->control.base,
-					   stmts, true, NULL_TREE);
+  tree controlbase = niter->control.base;
   tree controlstep = niter->control.step;
-  tree enddiff;
+  tree enddiff, end = niter->bound;
+  tree type;
+
+  /* Compute end-beg.  */
   if (POINTER_TYPE_P (TREE_TYPE (controlbase)))
     {
-      controlstep = gimple_build (stmts, NEGATE_EXPR,
-				  TREE_TYPE (controlstep), controlstep);
-      enddiff = gimple_build (stmts, POINTER_PLUS_EXPR,
-			      TREE_TYPE (controlbase),
-			      controlbase, controlstep);
+      controlstep = fold_build1 (NEGATE_EXPR,
+				 TREE_TYPE (controlstep), controlstep);
+      enddiff = fold_build_pointer_plus (controlbase, controlstep);
+
+      type = unsigned_type_for (enddiff);
+      enddiff = fold_build1 (NEGATE_EXPR, type, fold_convert (type, enddiff));
+      end = fold_convert (type, end);
+      enddiff = fold_build2 (PLUS_EXPR, type, end, enddiff);
+      enddiff = fold_convert (sizetype, enddiff);
     }
   else
-    enddiff = gimple_build (stmts, MINUS_EXPR,
-			    TREE_TYPE (controlbase),
-			    controlbase, controlstep);
-
-  /* Compute end-beg.  */
-  gimple_seq stmts2;
-  tree end = force_gimple_operand (niter->bound, &stmts2,
-					true, NULL_TREE);
-  gimple_seq_add_seq_without_update (stmts, stmts2);
-  if (POINTER_TYPE_P (TREE_TYPE (enddiff)))
     {
-      tree tem = gimple_convert (stmts, sizetype, enddiff);
-      tem = gimple_build (stmts, NEGATE_EXPR, sizetype, tem);
-      enddiff = gimple_build (stmts, POINTER_PLUS_EXPR,
-			      TREE_TYPE (enddiff),
-			      end, tem);
+      enddiff = fold_build2 (MINUS_EXPR, TREE_TYPE (controlbase),
+			     controlbase, controlstep);
+      enddiff = fold_build2 (MINUS_EXPR, TREE_TYPE (enddiff), end, enddiff);
     }
-  else
-    enddiff = gimple_build (stmts, MINUS_EXPR, TREE_TYPE (enddiff),
-			    end, enddiff);
 
   /* Compute guard_init + (end-beg).  */
   tree newbound;
-  enddiff = gimple_convert (stmts, TREE_TYPE (guard_init), enddiff);
   if (POINTER_TYPE_P (TREE_TYPE (guard_init)))
-    {
-      enddiff = gimple_convert (stmts, sizetype, enddiff);
-      newbound = gimple_build (stmts, POINTER_PLUS_EXPR,
-			       TREE_TYPE (guard_init),
-			       guard_init, enddiff);
-    }
+    newbound = fold_build_pointer_plus (guard_init, enddiff);
   else
-    newbound = gimple_build (stmts, PLUS_EXPR, TREE_TYPE (guard_init),
-			     guard_init, enddiff);
+    newbound = fold_build2 (PLUS_EXPR, TREE_TYPE (guard_init), guard_init,
+			    fold_convert (TREE_TYPE (guard_init), enddiff));
 
   /* Depending on the direction of the IVs the new bound for the first
      loop is the minimum or maximum of old bound and border.
@@ -467,20 +452,18 @@  compute_new_first_bound (gimple_seq *stmts, struct tree_niter_desc *niter,
 
   if (addbound)
     {
-      tree type2 = TREE_TYPE (newbound);
-      if (POINTER_TYPE_P (type2))
-	type2 = sizetype;
-      newbound = gimple_build (stmts,
-			       POINTER_TYPE_P (TREE_TYPE (newbound))
-			       ? POINTER_PLUS_EXPR : PLUS_EXPR,
-			       TREE_TYPE (newbound),
-			       newbound,
-			       build_int_cst (type2, addbound));
+      type = TREE_TYPE (newbound);
+      if (POINTER_TYPE_P (type))
+	type = sizetype;
+      newbound = fold_build2 (POINTER_TYPE_P (TREE_TYPE (newbound))
+				? POINTER_PLUS_EXPR : PLUS_EXPR,
+			      TREE_TYPE (newbound),
+			      newbound,
+			      build_int_cst (type, addbound));
     }
 
-  tree newend = gimple_build (stmts, minmax, TREE_TYPE (border),
-			      border, newbound);
-  return newend;
+  newbound = fold_build2 (minmax, TREE_TYPE (border), border, newbound);
+  return force_gimple_operand (newbound, stmts, true, NULL_TREE);
 }
 
 /* Checks if LOOP contains an conditional block whose condition