diff mbox series

[v3] rs6000: Fix some issues in rs6000_can_inline_p [PR102059]

Message ID 85f65aa7-84d3-0e57-ccfb-b95aa24bbe8b@linux.ibm.com
State New
Headers show
Series [v3] rs6000: Fix some issues in rs6000_can_inline_p [PR102059] | expand

Commit Message

Kewen.Lin Jan. 5, 2022, 7:34 a.m. UTC
Hi,

This patch is to fix the inconsistent behaviors for non-LTO mode
and LTO mode.  As Martin pointed out, currently the function
rs6000_can_inline_p simply makes it inlinable if callee_tree is
NULL, but it's unexpected, we should use the command line options
from target_option_default_node as default.

It replaces rs6000_isa_flags with target_option_default_node when
caller_tree is NULL since it's more straightforward and doesn't
suffer from some bug not to keep rs6000_isa_flags as default.

It also extends the scope of the check for the case that callee
has explicit set options, inlining in test case pr102059-5.c can
happen unexpectedly before, it's fixed accordingly.

As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
can be neglected for always inlining, this patch also takes some
flags when the callee is attributed by always_inline.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586112.html

This patch is one re-post of this updated version[1] and also
rebased and adjusted on top of the related commit r12-6219.

Bootstrapped and regtested on powerpc64-linux-gnu P8 and
powerpc64le-linux-gnu P9 and P10.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586296.html

BR,
Kewen
-----
gcc/ChangeLog:

	PR target/102059
	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
	target_option_default_node and consider always_inline_safe flags.

gcc/testsuite/ChangeLog:

	PR target/102059
	* gcc.target/powerpc/pr102059-4.c: New test.
	* gcc.target/powerpc/pr102059-5.c: New test.
	* gcc.target/powerpc/pr102059-6.c: New test.
	* gcc.target/powerpc/pr102059-7.c: New test.
	* gcc.target/powerpc/pr102059-8.c: New test.
	* gcc.dg/lto/pr102059-1_0.c: Remove unneeded option.
---
 gcc/config/rs6000/rs6000.c                    | 110 +++++++++++-------
 gcc/testsuite/gcc.dg/lto/pr102059-1_0.c       |   2 +-
 gcc/testsuite/gcc.target/powerpc/pr102059-4.c |  24 ++++
 gcc/testsuite/gcc.target/powerpc/pr102059-5.c |  20 ++++
 gcc/testsuite/gcc.target/powerpc/pr102059-6.c |  95 +++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr102059-7.c |  22 ++++
 gcc/testsuite/gcc.target/powerpc/pr102059-8.c |  22 ++++
 7 files changed, 255 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-5.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-6.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-7.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-8.c

Comments

Kewen.Lin Jan. 19, 2022, 6:39 a.m. UTC | #1
Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587635.html

BR,
Kewen

on 2022/1/5 下午3:34, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> This patch is to fix the inconsistent behaviors for non-LTO mode
> and LTO mode.  As Martin pointed out, currently the function
> rs6000_can_inline_p simply makes it inlinable if callee_tree is
> NULL, but it's unexpected, we should use the command line options
> from target_option_default_node as default.
> 
> It replaces rs6000_isa_flags with target_option_default_node when
> caller_tree is NULL since it's more straightforward and doesn't
> suffer from some bug not to keep rs6000_isa_flags as default.
> 
> It also extends the scope of the check for the case that callee
> has explicit set options, inlining in test case pr102059-5.c can
> happen unexpectedly before, it's fixed accordingly.
> 
> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
> can be neglected for always inlining, this patch also takes some
> flags when the callee is attributed by always_inline.
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586112.html
> 
> This patch is one re-post of this updated version[1] and also
> rebased and adjusted on top of the related commit r12-6219.
> 
> Bootstrapped and regtested on powerpc64-linux-gnu P8 and
> powerpc64le-linux-gnu P9 and P10.
> 
> Is it ok for trunk?
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586296.html
> 
> BR,
> Kewen
> -----
> gcc/ChangeLog:
> 
> 	PR target/102059
> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
> 	target_option_default_node and consider always_inline_safe flags.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR target/102059
> 	* gcc.target/powerpc/pr102059-4.c: New test.
> 	* gcc.target/powerpc/pr102059-5.c: New test.
> 	* gcc.target/powerpc/pr102059-6.c: New test.
> 	* gcc.target/powerpc/pr102059-7.c: New test.
> 	* gcc.target/powerpc/pr102059-8.c: New test.
> 	* gcc.dg/lto/pr102059-1_0.c: Remove unneeded option.
> 
>
Kewen.Lin Jan. 26, 2022, 2:21 a.m. UTC | #2
Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587635.html

BR,
Kewen

> 
> on 2022/1/5 下午3:34, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> This patch is to fix the inconsistent behaviors for non-LTO mode
>> and LTO mode.  As Martin pointed out, currently the function
>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>> NULL, but it's unexpected, we should use the command line options
>> from target_option_default_node as default.
>>
>> It replaces rs6000_isa_flags with target_option_default_node when
>> caller_tree is NULL since it's more straightforward and doesn't
>> suffer from some bug not to keep rs6000_isa_flags as default.
>>
>> It also extends the scope of the check for the case that callee
>> has explicit set options, inlining in test case pr102059-5.c can
>> happen unexpectedly before, it's fixed accordingly.
>>
>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>> can be neglected for always inlining, this patch also takes some
>> flags when the callee is attributed by always_inline.
>>
>> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
>> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586112.html
>>
>> This patch is one re-post of this updated version[1] and also
>> rebased and adjusted on top of the related commit r12-6219.
>>
>> Bootstrapped and regtested on powerpc64-linux-gnu P8 and
>> powerpc64le-linux-gnu P9 and P10.
>>
>> Is it ok for trunk?
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586296.html
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>> 	PR target/102059
>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>> 	target_option_default_node and consider always_inline_safe flags.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR target/102059
>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>> 	* gcc.target/powerpc/pr102059-5.c: New test.
>> 	* gcc.target/powerpc/pr102059-6.c: New test.
>> 	* gcc.target/powerpc/pr102059-7.c: New test.
>> 	* gcc.target/powerpc/pr102059-8.c: New test.
>> 	* gcc.dg/lto/pr102059-1_0.c: Remove unneeded option.
>>
>>
>
Kewen.Lin Feb. 7, 2022, 5:59 a.m. UTC | #3
Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587635.html

BR,
Kewen

>> on 2022/1/5 下午3:34, Kewen.Lin via Gcc-patches wrote:
>>> Hi,
>>>
>>> This patch is to fix the inconsistent behaviors for non-LTO mode
>>> and LTO mode.  As Martin pointed out, currently the function
>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>>> NULL, but it's unexpected, we should use the command line options
>>> from target_option_default_node as default.
>>>
>>> It replaces rs6000_isa_flags with target_option_default_node when
>>> caller_tree is NULL since it's more straightforward and doesn't
>>> suffer from some bug not to keep rs6000_isa_flags as default.
>>>
>>> It also extends the scope of the check for the case that callee
>>> has explicit set options, inlining in test case pr102059-5.c can
>>> happen unexpectedly before, it's fixed accordingly.
>>>
>>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>>> can be neglected for always inlining, this patch also takes some
>>> flags when the callee is attributed by always_inline.
>>>
>>> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
>>> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586112.html
>>>
>>> This patch is one re-post of this updated version[1] and also
>>> rebased and adjusted on top of the related commit r12-6219.
>>>
>>> Bootstrapped and regtested on powerpc64-linux-gnu P8 and
>>> powerpc64le-linux-gnu P9 and P10.
>>>
>>> Is it ok for trunk?
>>>
>>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586296.html
>>>
>>> BR,
>>> Kewen
>>> -----
>>> gcc/ChangeLog:
>>>
>>> 	PR target/102059
>>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>>> 	target_option_default_node and consider always_inline_safe flags.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 	PR target/102059
>>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>>> 	* gcc.target/powerpc/pr102059-5.c: New test.
>>> 	* gcc.target/powerpc/pr102059-6.c: New test.
>>> 	* gcc.target/powerpc/pr102059-7.c: New test.
>>> 	* gcc.target/powerpc/pr102059-8.c: New test.
>>> 	* gcc.dg/lto/pr102059-1_0.c: Remove unneeded option.
>>>
>>>
Segher Boessenkool Feb. 7, 2022, 7:53 a.m. UTC | #4
Hi!

On Wed, Jan 05, 2022 at 03:34:41PM +0800, Kewen.Lin wrote:
> This patch is to fix the inconsistent behaviors for non-LTO mode
> and LTO mode.  As Martin pointed out, currently the function
> rs6000_can_inline_p simply makes it inlinable if callee_tree is
> NULL, but it's unexpected, we should use the command line options
> from target_option_default_node as default.

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -25379,55 +25379,87 @@ rs6000_update_ipa_fn_target_info (unsigned int &info, const gimple *stmt)
>  static bool
>  rs6000_can_inline_p (tree caller, tree callee)
>  {
> -  bool ret = false;
>    tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>    tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
>  
> -  /* If the callee has no option attributes, then it is ok to inline.  */
> +  /* If the caller/callee has option attributes, then use them.
> +     Otherwise, use the command line options.  */
>    if (!callee_tree)
> -    ret = true;
> -
> -  else
> -    {
> -      HOST_WIDE_INT caller_isa;
> -      struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
> -      HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
> -      HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
> +    callee_tree = target_option_default_node;
> +  if (!caller_tree)
> +    caller_tree = target_option_default_node;
> +
> +  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
> +  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
> +  HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
> +  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
> +
> +  bool always_inline
> +    = DECL_DISREGARD_INLINE_LIMITS (callee)
> +      && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee));
> +
> +  /* Some features can be tolerated for always inlines.  */
> +  unsigned HOST_WIDE_INT always_inline_safe_mask
> +    /* Fusion option masks.  */
> +    = OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION
> +      | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION
> +      | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL
> +      | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG
> +      | OPTION_MASK_P10_FUSION_2ADD
> +      /* Like fusion, some option masks which are just for optimization.  */
> +      | OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT;

Why is this?  I would expect only two or three options to be *not* safe!

You have a strange mix of internal options (the FUSION* things) and some
other options that do not change the semantics at all.  But this is true
for almost *all* options we have!  What would not allow a callee that is
allowed to use some newer ISA's insns into a caller that does not allow
them?  Both when it ends up using those insns and when it does not, the
end result is valid.  If there is something internal to GCC that causes
ICEs we need to do something about *that*!

> +  /* Some features are totally safe for inlining (or always inlines),
> +     let's exclude them from the following checkings.  */
> +  HOST_WIDE_INT safe_mask = always_inline ? always_inline_safe_mask : 0;
> +  cgraph_node *callee_node = cgraph_node::get (callee);
> +  if (ipa_fn_summaries && ipa_fn_summaries->get (callee_node) != NULL)
> +    {
> +      unsigned int info = ipa_fn_summaries->get (callee_node)->target_info;
> +      if ((info & RS6000_FN_TARGET_INFO_HTM) == 0)
> +	safe_mask |= OPTION_MASK_HTM;
> +    }

always_inline means *always*.  If the compiler cannot do this it should
not, but then error; in all other cases it should do it.

This patch is pretty much equal to not allowing any inlining if the
caller and callee have not identical compilation options.  So sure, it
causes us to not have any unwanted inlining, but it does that by
preventing all other inlining at the same time.


Segher
Kewen.Lin Feb. 7, 2022, 9:42 a.m. UTC | #5
Hi Segher,

Thanks for the comments!

on 2022/2/7 下午3:53, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jan 05, 2022 at 03:34:41PM +0800, Kewen.Lin wrote:
>> This patch is to fix the inconsistent behaviors for non-LTO mode
>> and LTO mode.  As Martin pointed out, currently the function
>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>> NULL, but it's unexpected, we should use the command line options
>> from target_option_default_node as default.
> 
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -25379,55 +25379,87 @@ rs6000_update_ipa_fn_target_info (unsigned int &info, const gimple *stmt)
>>  static bool
>>  rs6000_can_inline_p (tree caller, tree callee)
>>  {
>> -  bool ret = false;
>>    tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>>    tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
>>  
>> -  /* If the callee has no option attributes, then it is ok to inline.  */
>> +  /* If the caller/callee has option attributes, then use them.
>> +     Otherwise, use the command line options.  */
>>    if (!callee_tree)
>> -    ret = true;
>> -
>> -  else
>> -    {
>> -      HOST_WIDE_INT caller_isa;
>> -      struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
>> -      HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
>> -      HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
>> +    callee_tree = target_option_default_node;
>> +  if (!caller_tree)
>> +    caller_tree = target_option_default_node;
>> +
>> +  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
>> +  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
>> +  HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
>> +  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
>> +
>> +  bool always_inline
>> +    = DECL_DISREGARD_INLINE_LIMITS (callee)
>> +      && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee));
>> +
>> +  /* Some features can be tolerated for always inlines.  */
>> +  unsigned HOST_WIDE_INT always_inline_safe_mask
>> +    /* Fusion option masks.  */
>> +    = OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION
>> +      | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION
>> +      | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL
>> +      | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG
>> +      | OPTION_MASK_P10_FUSION_2ADD
>> +      /* Like fusion, some option masks which are just for optimization.  */
>> +      | OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT;
> 
> Why is this?  I would expect only two or three options to be *not* safe!
> 
> You have a strange mix of internal options (the FUSION* things) and some
> other options that do not change the semantics at all.  But this is true
> for almost *all* options we have!  What would not allow a callee that is
> allowed to use some newer ISA's insns into a caller that does not allow
> them?  Both when it ends up using those insns and when it does not, the
> end result is valid.  If there is something internal to GCC that causes
> ICEs we need to do something about *that*!
> 

The reason why I used these options is that all of them are mainly for
performance purpose, they are not to control if some insns are available
to generate.  Sorry that if they look strangely mixed.

IMHO it's not "true for almost *all* options we have".  For the
options at the beginning of rs6000 options manual page.

  -mpowerpc-gpopt  [guarding bifs]
  -mpowerpc-gfxopt [guarding bifs]
  -mpowerpc64
  -mmfcrf
  -mpopcntb  [guarding bifs]
  -mpopcntd  [guarding bifs]
  -mfprnd
  -mcmpb  [guarding bifs]
  -mhard-dfp  [guarding bifs]
  ...

Imagining that if the callee has some built-in functions or some inline
assembly which requires the related hardware feature provided, I think we
shouldn't claim they are safe even with always_inline.  For example, for
built-in functions, there will be some error messages without related
features supported.

>> +  /* Some features are totally safe for inlining (or always inlines),
>> +     let's exclude them from the following checkings.  */
>> +  HOST_WIDE_INT safe_mask = always_inline ? always_inline_safe_mask : 0;
>> +  cgraph_node *callee_node = cgraph_node::get (callee);
>> +  if (ipa_fn_summaries && ipa_fn_summaries->get (callee_node) != NULL)
>> +    {
>> +      unsigned int info = ipa_fn_summaries->get (callee_node)->target_info;
>> +      if ((info & RS6000_FN_TARGET_INFO_HTM) == 0)
>> +	safe_mask |= OPTION_MASK_HTM;
>> +    }
> 
> always_inline means *always*.  If the compiler cannot do this it should
> not, but then error; in all other cases it should do it.
> 

Maybe there are some user cases we want to tolerate?  Like this PR?

> This patch is pretty much equal to not allowing any inlining if the
> caller and callee have not identical compilation options.  So sure, it
> causes us to not have any unwanted inlining, but it does that by
> preventing all other inlining at the same time.
> 

hmm, no, IMHO the current implementation is more strict and does what you
described here, this patch is to try to relax it a bit for always_inline
inlining, it makes us not count some flags into mismatch checking.

If you are talking about the code:

>>    if (!callee_tree)
>> -    ret = true;

This is one existing bug which causes the inconsistence between LTO and
non-LTO mode.

BR,
Kewen
Li, Pan2 via Gcc-patches Feb. 8, 2022, 8:29 p.m. UTC | #6
Hi!

From some discussion today, I think we want to limit the scope of
this patch to just the power8-fusion flag that's causing trouble for
now, given stage 4.  We've talked about making power8-fusion a do-
nothing flag, since it doesn't add much benefit now and probably
shouldn't be a separate flag anyway.  Having it as a meaningless
flag makes it more palatable to add an exception for it in the
inlining path.

Others, feel free to weigh in.

Thanks,
Bill

On 1/5/22 1:34 AM, Kewen.Lin wrote:
> Hi,
>
> This patch is to fix the inconsistent behaviors for non-LTO mode
> and LTO mode.  As Martin pointed out, currently the function
> rs6000_can_inline_p simply makes it inlinable if callee_tree is
> NULL, but it's unexpected, we should use the command line options
> from target_option_default_node as default.
>
> It replaces rs6000_isa_flags with target_option_default_node when
> caller_tree is NULL since it's more straightforward and doesn't
> suffer from some bug not to keep rs6000_isa_flags as default.
>
> It also extends the scope of the check for the case that callee
> has explicit set options, inlining in test case pr102059-5.c can
> happen unexpectedly before, it's fixed accordingly.
>
> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
> can be neglected for always inlining, this patch also takes some
> flags when the callee is attributed by always_inline.
>
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586112.html
>
> This patch is one re-post of this updated version[1] and also
> rebased and adjusted on top of the related commit r12-6219.
>
> Bootstrapped and regtested on powerpc64-linux-gnu P8 and
> powerpc64le-linux-gnu P9 and P10.
>
> Is it ok for trunk?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586296.html
>
> BR,
> Kewen
> -----
> gcc/ChangeLog:
>
> 	PR target/102059
> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
> 	target_option_default_node and consider always_inline_safe flags.
>
> gcc/testsuite/ChangeLog:
>
> 	PR target/102059
> 	* gcc.target/powerpc/pr102059-4.c: New test.
> 	* gcc.target/powerpc/pr102059-5.c: New test.
> 	* gcc.target/powerpc/pr102059-6.c: New test.
> 	* gcc.target/powerpc/pr102059-7.c: New test.
> 	* gcc.target/powerpc/pr102059-8.c: New test.
> 	* gcc.dg/lto/pr102059-1_0.c: Remove unneeded option.
>
>
Kewen.Lin Feb. 9, 2022, 2:23 a.m. UTC | #7
Hi Bill,

on 2022/2/9 上午4:29, Bill Schmidt wrote:
> Hi!
> 
> From some discussion today, I think we want to limit the scope of
> this patch to just the power8-fusion flag that's causing trouble for
> now, given stage 4.  We've talked about making power8-fusion a do-
> nothing flag, since it doesn't add much benefit now and probably
> shouldn't be a separate flag anyway.  Having it as a meaningless
> flag makes it more palatable to add an exception for it in the
> inlining path.
> 
> Others, feel free to weigh in.
> 

Many thanks for caring about this.  It's a good idea to handle this
power8-fusion specially for now.

BR,
Kewen

> Thanks,
> Bill
> 
> On 1/5/22 1:34 AM, Kewen.Lin wrote:
>> Hi,
>>
>> This patch is to fix the inconsistent behaviors for non-LTO mode
>> and LTO mode.  As Martin pointed out, currently the function
>> rs6000_can_inline_p simply makes it inlinable if callee_tree is
>> NULL, but it's unexpected, we should use the command line options
>> from target_option_default_node as default.
>>
>> It replaces rs6000_isa_flags with target_option_default_node when
>> caller_tree is NULL since it's more straightforward and doesn't
>> suffer from some bug not to keep rs6000_isa_flags as default.
>>
>> It also extends the scope of the check for the case that callee
>> has explicit set options, inlining in test case pr102059-5.c can
>> happen unexpectedly before, it's fixed accordingly.
>>
>> As Richi/Mike pointed out, some tuning flags like MASK_P8_FUSION
>> can be neglected for always inlining, this patch also takes some
>> flags when the callee is attributed by always_inline.
>>
>> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578552.html
>> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586112.html
>>
>> This patch is one re-post of this updated version[1] and also
>> rebased and adjusted on top of the related commit r12-6219.
>>
>> Bootstrapped and regtested on powerpc64-linux-gnu P8 and
>> powerpc64le-linux-gnu P9 and P10.
>>
>> Is it ok for trunk?
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586296.html
>>
>> BR,
>> Kewen
>> -----
>> gcc/ChangeLog:
>>
>> 	PR target/102059
>> 	* config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with
>> 	target_option_default_node and consider always_inline_safe flags.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	PR target/102059
>> 	* gcc.target/powerpc/pr102059-4.c: New test.
>> 	* gcc.target/powerpc/pr102059-5.c: New test.
>> 	* gcc.target/powerpc/pr102059-6.c: New test.
>> 	* gcc.target/powerpc/pr102059-7.c: New test.
>> 	* gcc.target/powerpc/pr102059-8.c: New test.
>> 	* gcc.dg/lto/pr102059-1_0.c: Remove unneeded option.
>>
>>
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 7d07b47d9e3..60e131f2191 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -25379,55 +25379,87 @@  rs6000_update_ipa_fn_target_info (unsigned int &info, const gimple *stmt)
 static bool
 rs6000_can_inline_p (tree caller, tree callee)
 {
-  bool ret = false;
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
 
-  /* If the callee has no option attributes, then it is ok to inline.  */
+  /* If the caller/callee has option attributes, then use them.
+     Otherwise, use the command line options.  */
   if (!callee_tree)
-    ret = true;
-
-  else
-    {
-      HOST_WIDE_INT caller_isa;
-      struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
-      HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
-      HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
+    callee_tree = target_option_default_node;
+  if (!caller_tree)
+    caller_tree = target_option_default_node;
+
+  struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
+  struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+  HOST_WIDE_INT caller_isa = caller_opts->x_rs6000_isa_flags;
+  HOST_WIDE_INT callee_isa = callee_opts->x_rs6000_isa_flags;
+
+  bool always_inline
+    = DECL_DISREGARD_INLINE_LIMITS (callee)
+      && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee));
+
+  /* Some features can be tolerated for always inlines.  */
+  unsigned HOST_WIDE_INT always_inline_safe_mask
+    /* Fusion option masks.  */
+    = OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION
+      | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION
+      | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL
+      | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG
+      | OPTION_MASK_P10_FUSION_2ADD
+      /* Like fusion, some option masks which are just for optimization.  */
+      | OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT;
+
+  /* Some features are totally safe for inlining (or always inlines),
+     let's exclude them from the following checkings.  */
+  HOST_WIDE_INT safe_mask = always_inline ? always_inline_safe_mask : 0;
+  cgraph_node *callee_node = cgraph_node::get (callee);
+  if (ipa_fn_summaries && ipa_fn_summaries->get (callee_node) != NULL)
+    {
+      unsigned int info = ipa_fn_summaries->get (callee_node)->target_info;
+      if ((info & RS6000_FN_TARGET_INFO_HTM) == 0)
+	safe_mask |= OPTION_MASK_HTM;
+    }
+
+  callee_isa &= ~safe_mask;
+
+  /* The callee's options must be a subset of the caller's options, i.e.
+     a vsx function may inline an altivec function, but a no-vsx function
+     must not inline a vsx function.  */
+  if ((caller_isa & callee_isa) != callee_isa)
+    {
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
+    }
 
-      /* If the caller has option attributes, then use them.
-	 Otherwise, use the command line options.  */
-      if (caller_tree)
-	caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags;
-      else
-	caller_isa = rs6000_isa_flags;
+  /* For those options that the callee has explicitly enabled or disabled,
+     then we must enforce that the callee's and caller's options match
+     exactly; see PR70010.  */
+  HOST_WIDE_INT explicit_isa = callee_opts->x_rs6000_isa_flags_explicit;
+  explicit_isa &= ~safe_mask;
 
-      cgraph_node *callee_node = cgraph_node::get (callee);
-      if (ipa_fn_summaries && ipa_fn_summaries->get (callee_node) != NULL)
-	{
-	  unsigned int info = ipa_fn_summaries->get (callee_node)->target_info;
-	  if ((info & RS6000_FN_TARGET_INFO_HTM) == 0)
-	    {
-	      callee_isa &= ~OPTION_MASK_HTM;
-	      explicit_isa &= ~OPTION_MASK_HTM;
-	    }
-	}
-
-      /* The callee's options must be a subset of the caller's options, i.e.
-	 a vsx function may inline an altivec function, but a no-vsx function
-	 must not inline a vsx function.  However, for those options that the
-	 callee has explicitly enabled or disabled, then we must enforce that
-	 the callee's and caller's options match exactly; see PR70010.  */
-      if (((caller_isa & callee_isa) == callee_isa)
-	  && (caller_isa & explicit_isa) == (callee_isa & explicit_isa))
-	ret = true;
+  if ((caller_isa & explicit_isa) != (callee_isa & explicit_isa))
+    {
+      if (TARGET_DEBUG_TARGET)
+	fprintf (stderr,
+		 "rs6000_can_inline_p:, caller %s, callee %s, cannot "
+		 "inline since callee's options set isn't a subset of "
+		 "caller's options set by considering callee's "
+		 "explicitly set options.\n",
+		 get_decl_name (caller), get_decl_name (callee));
+      return false;
     }
 
   if (TARGET_DEBUG_TARGET)
-    fprintf (stderr, "rs6000_can_inline_p:, caller %s, callee %s, %s inline\n",
-	     get_decl_name (caller), get_decl_name (callee),
-	     (ret ? "can" : "cannot"));
+    fprintf (stderr,
+	     "rs6000_can_inline_p:, caller %s, callee %s, can inline.\n",
+	     get_decl_name (caller), get_decl_name (callee));
 
-  return ret;
+  return true;
 }
 
 /* Allocate a stack temp and fixup the address so it meets the particular
diff --git a/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c b/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
index e1004f5cfbf..b215b701097 100644
--- a/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
+++ b/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
@@ -1,7 +1,7 @@ 
 /* { dg-lto-do link } */
 /* { dg-skip-if "power10 and above only" { ! { power10_ok } } } */
 /* -Wno-attributes suppresses always_inline warnings.  */
-/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -Wno-attributes -mno-power8-fusion" } } */
+/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -Wno-attributes" } } */
 
 int __attribute__ ((always_inline))
 foo1 (int *b)
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-4.c b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
new file mode 100644
index 00000000000..d2a002cf141
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+
+/* Verify it emits inlining error msg at non-LTO mode.  */
+
+#include <htmintrin.h>
+
+static inline int __attribute__ ((always_inline))
+foo (int *b) /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  int res = _HTM_STATE(__builtin_ttest());
+  *b += res;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int
+bar (int *a)
+{
+  *a = foo (a); /* { dg-message "called from here" } */
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-5.c b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
new file mode 100644
index 00000000000..1d5d6c38bf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-5.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-vsx" } */
+
+/* Verify it emits inlining error msg when the callee has explicit
+   disabling option from command line.  */
+
+vector int c, a, b;
+
+static inline void __attribute__ ((__always_inline__))
+foo () /* { dg-error "inlining failed in call to .* target specific option mismatch" } */
+{
+  c = a + b;
+}
+
+__attribute__ ((target ("vsx")))
+int main ()
+{
+  foo (); /* { dg-message "called from here" } */
+  c = a + b;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-6.c b/gcc/testsuite/gcc.target/powerpc/pr102059-6.c
new file mode 100644
index 00000000000..9684cab986a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-6.c
@@ -0,0 +1,95 @@ 
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since some mismatched
+   features are considered as safe for always_inline.  */
+
+/* 1. Callee enables Power8 fusion implicitly, while caller
+      with Power9 doesn't support power8 fusion at all.  */
+
+__attribute__ ((always_inline))
+int callee1 (int *b)
+{
+  *b += 1;
+  return *b;
+}
+
+#pragma GCC target "cpu=power9"
+int caller1 (int *a)
+{
+  *a = callee1 (a);
+  return 0;
+}
+
+/* 2. Caller enables indirect toc save feature while callee
+      disables it explicitly.  */
+
+#pragma GCC target "save-toc-indirect"
+__attribute__ ((always_inline))
+int callee2 (int *b)
+{
+  *b += 2;
+  return *b;
+}
+
+#pragma GCC target "no-save-toc-indirect"
+int caller2 (int *a)
+{
+  *a = callee2 (a);
+  return 0;
+}
+
+/* 3. Caller disables Power10 fusion explicitly, while callee
+      still supports it as Power10 turns it on by default.  */
+
+#pragma GCC target "cpu=power10"
+__attribute__ ((always_inline))
+int callee3 (int *b)
+{
+  *b += 3;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,no-power10-fusion"
+int caller3 (int *a)
+{
+  *a = callee3 (a);
+  return 0;
+}
+
+/* 4. Caller enables Power10 fusion implicitly, while callee
+      disables it explicitly.  */
+
+#pragma GCC target "no-power10-fusion"
+__attribute__ ((always_inline))
+int callee4 (int *b)
+{
+  *b += 4;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10"
+int caller4 (int *a)
+{
+  *a = callee4 (a);
+  return 0;
+}
+
+/* 5. Caller disables pcrel-opt while callee enables it explicitly.  */
+
+#pragma GCC target "cpu=power10,no-pcrel-opt"
+__attribute__ ((always_inline))
+int callee5 (int *b)
+{
+  *b += 5;
+  return *b;
+}
+
+#pragma GCC target "cpu=power10,pcrel-opt"
+int caller5 (int *a)
+{
+  *a = callee5 (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-7.c b/gcc/testsuite/gcc.target/powerpc/pr102059-7.c
new file mode 100644
index 00000000000..0f27f2ce7d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-7.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* -Wno-attributes suppresses always_inline warnings.  */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -Wno-attributes" } */
+
+/* Verify it doesn't emit inlining error msg since the flag power8
+   fusion is considered as safe for always_inline, it's still safe
+   even the flag is set explicitly.  */
+
+__attribute__ ((always_inline))
+int foo (int *b)
+{
+  *b += 10;
+  return *b;
+}
+
+#pragma GCC target "power8-fusion"
+int bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-8.c b/gcc/testsuite/gcc.target/powerpc/pr102059-8.c
new file mode 100644
index 00000000000..826ce88025f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-8.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_htm_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -mhtm" } */
+
+/* Verify the inlining won't perform when the callee requires
+   some target feature which isn't supported by caller, even
+   if the callee doesn't have any target attributes or pragmas.
+   If the inlining performs here, the compilation will fail.  */
+
+int
+foo (int *b)
+{
+  *b += __builtin_ttest ();
+  return *b;
+}
+
+__attribute__ ((target ("no-htm"))) int
+bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}