diff mbox series

Darwini, X86: Adjust call clobbers to allow for lazy-binding [PR100152].

Message ID 29DF56B6-3186-40BD-A2DF-26279E73E7BB@sandoe.co.uk
State New
Headers show
Series Darwini, X86: Adjust call clobbers to allow for lazy-binding [PR100152]. | expand

Commit Message

Iain Sandoe July 4, 2021, 8:08 p.m. UTC
Hi,

(I’m not going to defend the status quo here, it seems a bit prone
 to confusing a user [different interposition behaviour between the
 inlined and non-inlined cases] however, this is what the platform
 compilers implement).

----

We allow public functions defined in a TU to bind locally for PIC
code (the default) on 64bit Mach-O.

If such functions are not inlined, we cannot tell at compile-time if
they might be called via the lazy symbol resolver (this can depend on
options given at link-time).  Therefore, we must assume that the lazy
resolver could be used which clobbers R11 and R10.

The solution here is similar in form to the one used for veneer regs
on Arm (but I’m open to alternate suggestions).

tested on X86_64-darwin, linux
OK for master?
Iain

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

PR target/100152 - [10/11/12 Regression] used caller-saved register not preserved across a call.

        PR target/100152

gcc/ChangeLog:

	* config/i386/i386-expand.c (ix86_expand_call): If a call is
	to a non-local-binding, or local but to a public symbol, then
	assume that it might be indirected via the lazy symbol binder.
	Mark R10 and R10 as clobbered in that case.
---
 gcc/config/i386/i386-expand.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Iain Sandoe July 9, 2021, 8:25 a.m. UTC | #1
(early) ping;
if possible I’d like to get this onto master in time to back-port for 11.2.

> On 4 Jul 2021, at 21:08, Iain Sandoe <iain@sandoe.co.uk> wrote:
> 
> Hi,
> 
> (I’m not going to defend the status quo here, it seems a bit prone
> to confusing a user [different interposition behaviour between the
> inlined and non-inlined cases] however, this is what the platform
> compilers implement).
> 
> ----
> 
> We allow public functions defined in a TU to bind locally for PIC
> code (the default) on 64bit Mach-O.
> 
> If such functions are not inlined, we cannot tell at compile-time if
> they might be called via the lazy symbol resolver (this can depend on
> options given at link-time).  Therefore, we must assume that the lazy
> resolver could be used which clobbers R11 and R10.
> 
> The solution here is similar in form to the one used for veneer regs
> on Arm (but I’m open to alternate suggestions).
> 
> tested on X86_64-darwin, linux
> OK for master?
> Iain
> 
> Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
> 
> PR target/100152 - [10/11/12 Regression] used caller-saved register not preserved across a call.
> 
>        PR target/100152
> 
> gcc/ChangeLog:
> 
> 	* config/i386/i386-expand.c (ix86_expand_call): If a call is
> 	to a non-local-binding, or local but to a public symbol, then
> 	assume that it might be indirected via the lazy symbol binder.
> 	Mark R10 and R10 as clobbered in that case.
> ---
> gcc/config/i386/i386-expand.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> index b37642e35ee..1b860e027b0 100644
> --- a/gcc/config/i386/i386-expand.c
> +++ b/gcc/config/i386/i386-expand.c
> @@ -8380,6 +8380,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
>     pop = NULL;
>   gcc_assert (!TARGET_64BIT || !pop);
> 
> +  rtx addr = XEXP (fnaddr, 0);
>   if (TARGET_MACHO && !TARGET_64BIT)
>     {
> #if TARGET_MACHO
> @@ -8392,7 +8393,6 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
>       /* Static functions and indirect calls don't need the pic register.  Also,
> 	 check if PLT was explicitly avoided via no-plt or "noplt" attribute, making
> 	 it an indirect call.  */
> -      rtx addr = XEXP (fnaddr, 0);
>       if (flag_pic
> 	  && GET_CODE (addr) == SYMBOL_REF
> 	  && !SYMBOL_REF_LOCAL_P (addr))
> @@ -8555,6 +8555,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
> 	}
>     }
> 
> +  if (TARGET_MACHO && TARGET_64BIT && !sibcall
> +      && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr))
> +	  || !fndecl || TREE_PUBLIC (fndecl)))
> +    {
> +      /* We allow public functions defined in a TU to bind locally for PIC
> +	 code (the default) on 64bit Mach-O.
> +	 If such functions are not inlined, we cannot tell at compile-time if
> +	 they will be called via the lazy symbol resolver (this can depend on
> +	 options given at link-time).  Therefore, we must assume that the lazy
> +	 resolver could be used which clobbers R11 and R10.  */
> +      clobber_reg (&use, gen_rtx_REG (DImode, R11_REG));
> +      clobber_reg (&use, gen_rtx_REG (DImode, R10_REG));
> +    }
> +
>   if (vec_len > 1)
>     call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec));
>   rtx_insn *call_insn = emit_call_insn (call);
> -- 
> 2.24.1
>
Uros Bizjak July 9, 2021, 8:57 a.m. UTC | #2
On Fri, Jul 9, 2021 at 10:25 AM Iain Sandoe <iain@sandoe.co.uk> wrote:
>
> (early) ping;
> if possible I’d like to get this onto master in time to back-port for 11.2.
>
> > On 4 Jul 2021, at 21:08, Iain Sandoe <iain@sandoe.co.uk> wrote:
> >
> > Hi,
> >
> > (I’m not going to defend the status quo here, it seems a bit prone
> > to confusing a user [different interposition behaviour between the
> > inlined and non-inlined cases] however, this is what the platform
> > compilers implement).
> >
> > ----
> >
> > We allow public functions defined in a TU to bind locally for PIC
> > code (the default) on 64bit Mach-O.
> >
> > If such functions are not inlined, we cannot tell at compile-time if
> > they might be called via the lazy symbol resolver (this can depend on
> > options given at link-time).  Therefore, we must assume that the lazy
> > resolver could be used which clobbers R11 and R10.
> >
> > The solution here is similar in form to the one used for veneer regs
> > on Arm (but I’m open to alternate suggestions).
> >
> > tested on X86_64-darwin, linux
> > OK for master?
> > Iain
> >
> > Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
> >
> > PR target/100152 - [10/11/12 Regression] used caller-saved register not preserved across a call.
> >
> >        PR target/100152
> >
> > gcc/ChangeLog:
> >
> >       * config/i386/i386-expand.c (ix86_expand_call): If a call is
> >       to a non-local-binding, or local but to a public symbol, then
> >       assume that it might be indirected via the lazy symbol binder.
> >       Mark R10 and R10 as clobbered in that case.

LGTM, but this is Darwin specific patch, so you could approve it yourself.

Uros.

> > ---
> > gcc/config/i386/i386-expand.c | 16 +++++++++++++++-
> > 1 file changed, 15 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
> > index b37642e35ee..1b860e027b0 100644
> > --- a/gcc/config/i386/i386-expand.c
> > +++ b/gcc/config/i386/i386-expand.c
> > @@ -8380,6 +8380,7 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
> >     pop = NULL;
> >   gcc_assert (!TARGET_64BIT || !pop);
> >
> > +  rtx addr = XEXP (fnaddr, 0);
> >   if (TARGET_MACHO && !TARGET_64BIT)
> >     {
> > #if TARGET_MACHO
> > @@ -8392,7 +8393,6 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
> >       /* Static functions and indirect calls don't need the pic register.  Also,
> >        check if PLT was explicitly avoided via no-plt or "noplt" attribute, making
> >        it an indirect call.  */
> > -      rtx addr = XEXP (fnaddr, 0);
> >       if (flag_pic
> >         && GET_CODE (addr) == SYMBOL_REF
> >         && !SYMBOL_REF_LOCAL_P (addr))
> > @@ -8555,6 +8555,20 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
> >       }
> >     }
> >
> > +  if (TARGET_MACHO && TARGET_64BIT && !sibcall
> > +      && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr))
> > +       || !fndecl || TREE_PUBLIC (fndecl)))
> > +    {
> > +      /* We allow public functions defined in a TU to bind locally for PIC
> > +      code (the default) on 64bit Mach-O.
> > +      If such functions are not inlined, we cannot tell at compile-time if
> > +      they will be called via the lazy symbol resolver (this can depend on
> > +      options given at link-time).  Therefore, we must assume that the lazy
> > +      resolver could be used which clobbers R11 and R10.  */
> > +      clobber_reg (&use, gen_rtx_REG (DImode, R11_REG));
> > +      clobber_reg (&use, gen_rtx_REG (DImode, R10_REG));
> > +    }
> > +
> >   if (vec_len > 1)
> >     call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec));
> >   rtx_insn *call_insn = emit_call_insn (call);
> > --
> > 2.24.1
> >
>
Iain Sandoe July 9, 2021, 9:09 a.m. UTC | #3
> On 9 Jul 2021, at 09:57, Uros Bizjak <ubizjak@gmail.com> wrote:
> 
> On Fri, Jul 9, 2021 at 10:25 AM Iain Sandoe <iain@sandoe.co.uk> wrote:
>> 
>> (early) ping;
>> if possible I’d like to get this onto master in time to back-port for 11.2.
>> 
>>> On 4 Jul 2021, at 21:08, Iain Sandoe <iain@sandoe.co.uk> wrote:
>>> 
>>> Hi,
>>> 
>>> (I’m not going to defend the status quo here, it seems a bit prone
>>> to confusing a user [different interposition behaviour between the
>>> inlined and non-inlined cases] however, this is what the platform
>>> compilers implement).
>>> 
>>> ----
>>> 
>>> We allow public functions defined in a TU to bind locally for PIC
>>> code (the default) on 64bit Mach-O.
>>> 
>>> If such functions are not inlined, we cannot tell at compile-time if
>>> they might be called via the lazy symbol resolver (this can depend on
>>> options given at link-time).  Therefore, we must assume that the lazy
>>> resolver could be used which clobbers R11 and R10.
>>> 
>>> The solution here is similar in form to the one used for veneer regs
>>> on Arm (but I’m open to alternate suggestions).
>>> 
>>> tested on X86_64-darwin, linux
>>> OK for master?
>>> Iain
>>> 
>>> Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
>>> 
>>> PR target/100152 - [10/11/12 Regression] used caller-saved register not preserved across a call.
>>> 
>>>       PR target/100152
>>> 
>>> gcc/ChangeLog:
>>> 
>>>      * config/i386/i386-expand.c (ix86_expand_call): If a call is
>>>      to a non-local-binding, or local but to a public symbol, then
>>>      assume that it might be indirected via the lazy symbol binder.
>>>      Mark R10 and R10 as clobbered in that case.
> 
> LGTM, but this is Darwin specific patch, so you could approve it yourself.

thanks, maybe I should have made it “RFC” (I was wondering if there was
any better way to do this).  Anyway, I’ll get it in and baking on master.

thanks again,
Iain
diff mbox series

Patch

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index b37642e35ee..1b860e027b0 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -8380,6 +8380,7 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
     pop = NULL;
   gcc_assert (!TARGET_64BIT || !pop);
 
+  rtx addr = XEXP (fnaddr, 0);
   if (TARGET_MACHO && !TARGET_64BIT)
     {
 #if TARGET_MACHO
@@ -8392,7 +8393,6 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
       /* Static functions and indirect calls don't need the pic register.  Also,
 	 check if PLT was explicitly avoided via no-plt or "noplt" attribute, making
 	 it an indirect call.  */
-      rtx addr = XEXP (fnaddr, 0);
       if (flag_pic
 	  && GET_CODE (addr) == SYMBOL_REF
 	  && !SYMBOL_REF_LOCAL_P (addr))
@@ -8555,6 +8555,20 @@  ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
 	}
     }
 
+  if (TARGET_MACHO && TARGET_64BIT && !sibcall
+      && ((GET_CODE (addr) == SYMBOL_REF && !SYMBOL_REF_LOCAL_P (addr))
+	  || !fndecl || TREE_PUBLIC (fndecl)))
+    {
+      /* We allow public functions defined in a TU to bind locally for PIC
+	 code (the default) on 64bit Mach-O.
+	 If such functions are not inlined, we cannot tell at compile-time if
+	 they will be called via the lazy symbol resolver (this can depend on
+	 options given at link-time).  Therefore, we must assume that the lazy
+	 resolver could be used which clobbers R11 and R10.  */
+      clobber_reg (&use, gen_rtx_REG (DImode, R11_REG));
+      clobber_reg (&use, gen_rtx_REG (DImode, R10_REG));
+    }
+
   if (vec_len > 1)
     call = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (vec_len, vec));
   rtx_insn *call_insn = emit_call_insn (call);