diff mbox series

rs6000: Handle pcrel sibcalls to longcall functions [PR104894]

Message ID 23cc030c-608a-8593-71a9-ec90f13938d0@linux.ibm.com
State New
Headers show
Series rs6000: Handle pcrel sibcalls to longcall functions [PR104894] | expand

Commit Message

Peter Bergner April 5, 2022, 10:06 p.m. UTC
Before PCREL in POWER10, we were not allowed to perform sibcalls to longcall
functions since callee's return would skip the TOC restore in the caller.
However, with PCREL we can now safely perform a sibling call to longcall
functions.  The problem with the current code in rs6000_sibcall_aix is that it
asserts we do not have a longcall and fixing that, it generates a direct call
to a PLT stub, when it should generate an indirect sibcall due to -fno-plt.
The solution here is to check for a pcrel longcall and emit an indirect sibcall
using an inline plt stub in that case.

This passed bootstrap and regtesting on powerpc64le-linux with no regressions.
Ok for trunk?

This is marked as a GCC 11/12 regression.  Ok for a GCC 11 backport after
some burn-in on trunk?

Peter


gcc/
	PR target/104894
	* config/rs6000/rs6000.cc (rs6000_sibcall_aix): Handle pcrel sibcalls
	to longcall functions.

gcc/testsuite/
	PR target/104894
	* gcc.target/powerpc/pr104894.c: New test.
	* gcc.target/powerpc/pr104894-2.c: New test.

Comments

Segher Boessenkool April 5, 2022, 10:32 p.m. UTC | #1
Hi!

On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
> Before PCREL in POWER10, we were not allowed to perform sibcalls to longcall
> functions since callee's return would skip the TOC restore in the caller.
> However, with PCREL we can now safely perform a sibling call to longcall
> functions.  The problem with the current code in rs6000_sibcall_aix is that it
> asserts we do not have a longcall and fixing that, it generates a direct call
> to a PLT stub, when it should generate an indirect sibcall due to -fno-plt.
> The solution here is to check for a pcrel longcall and emit an indirect sibcall
> using an inline plt stub in that case.

> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -25659,11 +25659,21 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
>    rtx r12 = NULL_RTX;
>    rtx func_addr = func_desc;
>  
> -  gcc_assert (INTVAL (cookie) == 0);
> -
>    if (global_tlsarg)
>      tlsarg = global_tlsarg;
>  
> +  /* Handle longcall attributes.  */
> +  if ((INTVAL (cookie) & CALL_LONG) != 0
> +      && GET_CODE (func_desc) == SYMBOL_REF)

Don't say "!= 0" here please.

  if (INTVAL (cookie) & CALL_LONG && SYMBOL_REF_P (func_desc))

You can put parens around that "A & B" if you don't trust the reader (or
the compiler ;-) ) to know the precedence rules

(Hrm, you just c'n'p-ed this, but :-) )

> +    {
> +      /* Only PCREL can do a sibling call to a longcall function
> +	 because we don't need to restore the TOC register.  */

Don't say "Only", it is the "New" syndrome: it is out of date before you
know it :-(  You can just leave it away here.

> +      gcc_assert (rs6000_pcrel_p ());
> +      func_desc = rs6000_longcall_ref (func_desc, tlsarg);
> +    }
> +  else
> +    gcc_assert (INTVAL (cookie) == 0);

So in the old code the cookie could *only* contain the CALL_LONG flag,
now it can contain any others as long as it has that flag as well.
Please fix.

Not every LONG_CALL needs a TOC restore though?  I won't ask you to
make it work in those cases of course, but change the comment to not be
as misleading?  Taking the "Only" out is good already I think :-)

You probably should have the same condition here as actually doing a
longcall as well, something involving SYMBOL_REF_FUNCTION_P?

Thanks,


Segher
Peter Bergner April 6, 2022, 3:33 a.m. UTC | #2
On 4/5/22 5:32 PM, Segher Boessenkool wrote:
> On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
>>  
>> +  /* Handle longcall attributes.  */
>> +  if ((INTVAL (cookie) & CALL_LONG) != 0
>> +      && GET_CODE (func_desc) == SYMBOL_REF)
> 
> Don't say "!= 0" here please.
> 
>   if (INTVAL (cookie) & CALL_LONG && SYMBOL_REF_P (func_desc))

Yes, cut & paste.  I'll change it.



>> +    {
>> +      /* Only PCREL can do a sibling call to a longcall function
>> +	 because we don't need to restore the TOC register.  */
> 
> Don't say "Only", it is the "New" syndrome: it is out of date before you
> know it :-(  You can just leave it away here.

Ok, I can drop "Only" and keep the rest the same.




>> +      gcc_assert (rs6000_pcrel_p ());
>> +      func_desc = rs6000_longcall_ref (func_desc, tlsarg);
>> +    }
>> +  else
>> +    gcc_assert (INTVAL (cookie) == 0);
> 
> So in the old code the cookie could *only* contain the CALL_LONG flag,
> now it can contain any others as long as it has that flag as well.
> Please fix.

No, the old code only allowed INTVAL (cookie) == 0, which means no
attributes are allowed.  The new code now allows the CALL_LONG attribute
iff the function is a SYMBOL_REF.  This is only allowed for pcrel calls
though.  I debated on whether to do a gcc_assert on rs6000_pcrel_p() or
fold the rs6000_pcrel_p() into the if () and let the original assert
on INTVAL (cookie) == 0 catch the illegal uses.  It's up to you on
what you prefer.



> Not every LONG_CALL needs a TOC restore though?  

I believe if the function we're calling has the CALL_LONG attribute
set, we have to assume that the TOC needs to be restored.  Now whether
the actual callee's TOC at run time is different or not from the callers
is another question.  We're just forced to assume they are different.


> I won't ask you to make it work in those cases of course, but change
> the comment to not be as misleading?  Taking the "Only" out is good
> already I think :-)

Yes, as I said above, I will drop "Only" from the comment.




> You probably should have the same condition here as actually doing a
> longcall as well, something involving SYMBOL_REF_FUNCTION_P?

I believe if we're here in rs6000_sibcall_aix() and func_desc is a
SYMBOL_REF, then it also must be SYMBOL_REF_FUNCTION_P, correct?
Otherwise, why would we be attempting to do a sibcall to it?

Peter
Peter Bergner April 6, 2022, 7:33 p.m. UTC | #3
On 4/5/22 10:33 PM, Peter Bergner via Gcc-patches wrote:
> On 4/5/22 5:32 PM, Segher Boessenkool wrote:
>> On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
>>>  
>>> +  /* Handle longcall attributes.  */
>>> +  if ((INTVAL (cookie) & CALL_LONG) != 0
>>> +      && GET_CODE (func_desc) == SYMBOL_REF)
>>
>> Don't say "!= 0" here please.
>>
>>   if (INTVAL (cookie) & CALL_LONG && SYMBOL_REF_P (func_desc))
> 
> Yes, cut & paste.  I'll change it.
[snip]
>> Don't say "Only", it is the "New" syndrome: it is out of date before you
>> know it :-(  You can just leave it away here.
> 
> Ok, I can drop "Only" and keep the rest the same.

So the updated change looks like below with the ChangeLog entry and tests being the same:

Is this better and ok for trunk?

Peter


@@ -25659,11 +25659,20 @@ rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
   rtx r12 = NULL_RTX;
   rtx func_addr = func_desc;
 
-  gcc_assert (INTVAL (cookie) == 0);
-
   if (global_tlsarg)
     tlsarg = global_tlsarg;
 
+  /* Handle longcall attributes.  */
+  if (INTVAL (cookie) & CALL_LONG && SYMBOL_REF_P (func_desc))
+    {
+      /* PCREL can do a sibling call to a longcall function
+	 because we don't need to restore the TOC register.  */
+      gcc_assert (rs6000_pcrel_p ());
+      func_desc = rs6000_longcall_ref (func_desc, tlsarg);
+    }
+  else
+    gcc_assert (INTVAL (cookie) == 0);
+
   /* For ELFv2, r12 and CTR need to hold the function address
      for an indirect call.  */
   if (GET_CODE (func_desc) != SYMBOL_REF && DEFAULT_ABI == ABI_ELFv2)
Segher Boessenkool April 11, 2022, 9:13 p.m. UTC | #4
On Wed, Apr 06, 2022 at 02:33:52PM -0500, Peter Bergner wrote:
> On 4/5/22 10:33 PM, Peter Bergner via Gcc-patches wrote:
> > On 4/5/22 5:32 PM, Segher Boessenkool wrote:
> >> On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
> So the updated change looks like below with the ChangeLog entry and tests being the same:

Please don't send patches in existing threads, it confuses things.

> Is this better and ok for trunk?

Assuming you write a good changelog for this, it is fine.  Thanks!

Btw.  This code now is harder to read and understand and change than it
has to be, because you want to make the code (or the changes to the
code) as small as possible.  This is not a good tradeoff.


Segher
Peter Bergner April 11, 2022, 10:08 p.m. UTC | #5
On 4/11/22 4:13 PM, Segher Boessenkool wrote:
> On Wed, Apr 06, 2022 at 02:33:52PM -0500, Peter Bergner wrote:
>> On 4/5/22 10:33 PM, Peter Bergner via Gcc-patches wrote:
>>> On 4/5/22 5:32 PM, Segher Boessenkool wrote:
>>>> On Tue, Apr 05, 2022 at 05:06:50PM -0500, Peter Bergner wrote:
>> So the updated change looks like below with the ChangeLog entry and tests being the same:
> 
> Please don't send patches in existing threads, it confuses things.

It wasn't really meant as a patch, but more of a confirmation I made
the changes you wanted...and a ping. :-) 




>> Is this better and ok for trunk?
> 
> Assuming you write a good changelog for this, it is fine.  Thanks!

Done and pushed.  Thanks!   We need this on GCC11 and GCC10 as well.
With GCC11 due soon, I'd like this in there.  Ok for backports after
a day or so of trunk burn-in?

Peter
Segher Boessenkool April 11, 2022, 10:19 p.m. UTC | #6
On Mon, Apr 11, 2022 at 05:08:04PM -0500, Peter Bergner wrote:
> Done and pushed.  Thanks!   We need this on GCC11 and GCC10 as well.
> With GCC11 due soon, I'd like this in there.  Ok for backports after
> a day or so of trunk burn-in?

Yes, thanks!  Please make sure you have tested things very thoroughly if
you do quick backports like this (although the only thing that seems to
be in danger are longcall things, and does anyone use those?) (don't
answer that :-) )


Segher
Segher Boessenkool April 11, 2022, 10:30 p.m. UTC | #7
On Tue, Apr 05, 2022 at 10:33:14PM -0500, Peter Bergner wrote:
> On 4/5/22 5:32 PM, Segher Boessenkool wrote:
> >> +      gcc_assert (rs6000_pcrel_p ());
> >> +      func_desc = rs6000_longcall_ref (func_desc, tlsarg);
> >> +    }
> >> +  else
> >> +    gcc_assert (INTVAL (cookie) == 0);
> > 
> > So in the old code the cookie could *only* contain the CALL_LONG flag,
> > now it can contain any others as long as it has that flag as well.
> > Please fix.
> 
> No, the old code only allowed INTVAL (cookie) == 0, which means no
> attributes are allowed.

>  The new code now allows the CALL_LONG attribute
> iff the function is a SYMBOL_REF.  This is only allowed for pcrel calls
> though.

Ah, tricky.

>  I debated on whether to do a gcc_assert on rs6000_pcrel_p() or
> fold the rs6000_pcrel_p() into the if () and let the original assert
> on INTVAL (cookie) == 0 catch the illegal uses.  It's up to you on
> what you prefer.

For future changes, likely it is best if you split the pcrel and
non-pcrel paths further.

> > Not every LONG_CALL needs a TOC restore though?  
> 
> I believe if the function we're calling has the CALL_LONG attribute
> set, we have to assume that the TOC needs to be restored.

Not if we know the called function is in the same object?  If we are
doing long calls anyway there isn't much point in optimising anything
anymore, but don't say "have to" or such then :-)

> > You probably should have the same condition here as actually doing a
> > longcall as well, something involving SYMBOL_REF_FUNCTION_P?
> 
> I believe if we're here in rs6000_sibcall_aix() and func_desc is a
> SYMBOL_REF, then it also must be SYMBOL_REF_FUNCTION_P, correct?
> Otherwise, why would we be attempting to do a sibcall to it?

It doesn't hurt to be a bit defensive in programming.  It helps making
future changes easier, too :-)

Maybe we should have a utility function for this?  That helps preventing
microoptimisations as well :-P


Segher
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index cb18db06a2d..d38a1d61cfe 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -25659,11 +25659,21 @@  rs6000_sibcall_aix (rtx value, rtx func_desc, rtx tlsarg, rtx cookie)
   rtx r12 = NULL_RTX;
   rtx func_addr = func_desc;
 
-  gcc_assert (INTVAL (cookie) == 0);
-
   if (global_tlsarg)
     tlsarg = global_tlsarg;
 
+  /* Handle longcall attributes.  */
+  if ((INTVAL (cookie) & CALL_LONG) != 0
+      && GET_CODE (func_desc) == SYMBOL_REF)
+    {
+      /* Only PCREL can do a sibling call to a longcall function
+	 because we don't need to restore the TOC register.  */
+      gcc_assert (rs6000_pcrel_p ());
+      func_desc = rs6000_longcall_ref (func_desc, tlsarg);
+    }
+  else
+    gcc_assert (INTVAL (cookie) == 0);
+
   /* For ELFv2, r12 and CTR need to hold the function address
      for an indirect call.  */
   if (GET_CODE (func_desc) != SYMBOL_REF && DEFAULT_ABI == ABI_ELFv2)
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104894.c b/gcc/testsuite/gcc.target/powerpc/pr104894.c
new file mode 100644
index 00000000000..f46fe88168f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104894.c
@@ -0,0 +1,20 @@ 
+/* PR target/104894 */
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fno-plt" } */
+
+/* Verify we do not ICE on the following test case and that we emit an
+   indirect sibcall, with r12 and CTR containing the function address.  */
+
+void foo (void);
+
+void
+bar (void)
+{
+  foo ();
+}
+
+/* { dg-final { scan-assembler-times {\mmtctr 12\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mbctr\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mbl\M} } } */
+/* { dg-final { scan-assembler-not {\mbctrl\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104894-2.c b/gcc/testsuite/gcc.target/powerpc/pr104894-2.c
new file mode 100644
index 00000000000..d1a011ef4d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104894-2.c
@@ -0,0 +1,22 @@ 
+/* PR target/104894 */
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fno-plt" } */
+
+/* Verify we do not ICE on the following test case and that we emit one
+   indirect call and one indirect sibcall, with r12 and CTR containing
+   the function addresses.  */
+
+void foo (void);
+
+void
+bar (void)
+{
+  foo ();
+  foo ();
+}
+
+/* { dg-final { scan-assembler-times {\mmtctr 12\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mbctrl\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mbctr\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mbl\M} } } */