diff mbox series

i386: Skip ENDBR32 at nested function entry

Message ID 20200210192235.16926-1-hjl.tools@gmail.com
State New
Headers show
Series i386: Skip ENDBR32 at nested function entry | expand

Commit Message

H.J. Lu Feb. 10, 2020, 7:22 p.m. UTC
Since nested function isn't only called directly, there is ENDBR32 at
function entry and we need to skip it for direct jump in trampoline.

Tested on Linux/x86-64 CET machine with and without -m32.

gcc/

	PR target/93656
	* config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
	nested function entry.

gcc/testsuite/

	PR target/93656
	* gcc.target/i386/pr93656.c: New test.
---
 gcc/config/i386/i386.c                  | 7 ++++++-
 gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c

Comments

Uros Bizjak Feb. 10, 2020, 7:40 p.m. UTC | #1
On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Since nested function isn't only called directly, there is ENDBR32 at
> function entry and we need to skip it for direct jump in trampoline.

Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?

Uros.

> Tested on Linux/x86-64 CET machine with and without -m32.
>
> gcc/
>
>         PR target/93656
>         * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at
>         nested function entry.
>
> gcc/testsuite/
>
>         PR target/93656
>         * gcc.target/i386/pr93656.c: New test.
> ---
>  gcc/config/i386/i386.c                  | 7 ++++++-
>  gcc/testsuite/gcc.target/i386/pr93656.c | 4 ++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr93656.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 44bc0e0176a..dbcae244acb 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -16839,9 +16839,14 @@ ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
>          the stack, we need to skip the first insn which pushes the
>          (call-saved) register static chain; this push is 1 byte.  */
>        offset += 5;
> +      int skip = MEM_P (chain) ? 1 : 0;
> +      /* Since nested function isn't only called directly, there is
> +        ENDBR32 at function entry and we need to skip it.  */
> +      if (need_endbr)
> +       skip += 4;
>        disp = expand_binop (SImode, sub_optab, fnaddr,
>                            plus_constant (Pmode, XEXP (m_tramp, 0),
> -                                         offset - (MEM_P (chain) ? 1 : 0)),
> +                                         offset - skip),
>                            NULL_RTX, 1, OPTAB_DIRECT);
>        emit_move_insn (mem, disp);
>      }
> diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c
> new file mode 100644
> index 00000000000..f0ac8c8edaa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr93656.c
> @@ -0,0 +1,4 @@
> +/* { dg-do run { target { ia32 && cet } } } */
> +/* { dg-options "-O2 -fcf-protection" } */
> +
> +#include "pr67770.c"
> --
> 2.24.1
>
H.J. Lu Feb. 10, 2020, 7:52 p.m. UTC | #2
On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > Since nested function isn't only called directly, there is ENDBR32 at
> > function entry and we need to skip it for direct jump in trampoline.
>
> Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?
>

ix86_trampoline_init has

     /* Compute offset from the end of the jmp to the target function.
         In the case in which the trampoline stores the static chain on
         the stack, we need to skip the first insn which pushes the
         (call-saved) register static chain; this push is 1 byte.  */
      offset += 5;
      disp = expand_binop (SImode, sub_optab, fnaddr,
                           plus_constant (Pmode, XEXP (m_tramp, 0),
                                          offset - (MEM_P (chain) ? 1 : 0)),
                           NULL_RTX, 1, OPTAB_DIRECT);
      emit_move_insn (mem, disp);

Without CET, we got

0000011 <bar.1878>:
  11: 56                    push   %esi
  12: 55                    push   %ebp   <<<<<< trampoline jumps here.
  13: 89 e5                mov    %esp,%ebp
  15: 83 ec 08              sub    $0x8,%esp

With CET, if bar isn't only called directly, we got

00000015 <bar.1878>:
  15: f3 0f 1e fb          endbr32
  19: 56                    push   %esi
  1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.
  1b: 89 e5                mov    %esp,%ebp
  1d: 83 ec 08              sub    $0x8,%esp

We need to add 4 bytes for trampoline to skip endbr32.

Here is the updated patch to check if nested function isn't only
called directly,
Uros Bizjak Feb. 10, 2020, 8:01 p.m. UTC | #3
On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > Since nested function isn't only called directly, there is ENDBR32 at
> > > function entry and we need to skip it for direct jump in trampoline.
> >
> > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?
> >
>
> ix86_trampoline_init has
>
>      /* Compute offset from the end of the jmp to the target function.
>          In the case in which the trampoline stores the static chain on
>          the stack, we need to skip the first insn which pushes the
>          (call-saved) register static chain; this push is 1 byte.  */
>       offset += 5;
>       disp = expand_binop (SImode, sub_optab, fnaddr,
>                            plus_constant (Pmode, XEXP (m_tramp, 0),
>                                           offset - (MEM_P (chain) ? 1 : 0)),
>                            NULL_RTX, 1, OPTAB_DIRECT);
>       emit_move_insn (mem, disp);
>
> Without CET, we got
>
> 0000011 <bar.1878>:
>   11: 56                    push   %esi
>   12: 55                    push   %ebp   <<<<<< trampoline jumps here.
>   13: 89 e5                mov    %esp,%ebp
>   15: 83 ec 08              sub    $0x8,%esp
>
> With CET, if bar isn't only called directly, we got
>
> 00000015 <bar.1878>:
>   15: f3 0f 1e fb          endbr32
>   19: 56                    push   %esi
>   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.
>   1b: 89 e5                mov    %esp,%ebp
>   1d: 83 ec 08              sub    $0x8,%esp
>
> We need to add 4 bytes for trampoline to skip endbr32.
>
> Here is the updated patch to check if nested function isn't only
> called directly,

Please figure out the final patch. I don't want to waste my time
reviewing different version every half hour. Ping me in a couple of
days.

Uros.

>
>
> --
> H.J.
H.J. Lu Feb. 12, 2020, 12:21 p.m. UTC | #4
On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > >
> > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > >
> > > > Since nested function isn't only called directly, there is ENDBR32 at
> > > > function entry and we need to skip it for direct jump in trampoline.
> > >
> > > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?
> > >
> >
> > ix86_trampoline_init has
> >
> >      /* Compute offset from the end of the jmp to the target function.
> >          In the case in which the trampoline stores the static chain on
> >          the stack, we need to skip the first insn which pushes the
> >          (call-saved) register static chain; this push is 1 byte.  */
> >       offset += 5;
> >       disp = expand_binop (SImode, sub_optab, fnaddr,
> >                            plus_constant (Pmode, XEXP (m_tramp, 0),
> >                                           offset - (MEM_P (chain) ? 1 : 0)),
> >                            NULL_RTX, 1, OPTAB_DIRECT);
> >       emit_move_insn (mem, disp);
> >
> > Without CET, we got
> >
> > 0000011 <bar.1878>:
> >   11: 56                    push   %esi
> >   12: 55                    push   %ebp   <<<<<< trampoline jumps here.
> >   13: 89 e5                mov    %esp,%ebp
> >   15: 83 ec 08              sub    $0x8,%esp
> >
> > With CET, if bar isn't only called directly, we got
> >
> > 00000015 <bar.1878>:
> >   15: f3 0f 1e fb          endbr32
> >   19: 56                    push   %esi
> >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.
> >   1b: 89 e5                mov    %esp,%ebp
> >   1d: 83 ec 08              sub    $0x8,%esp
> >
> > We need to add 4 bytes for trampoline to skip endbr32.
> >
> > Here is the updated patch to check if nested function isn't only
> > called directly,
>
> Please figure out the final patch. I don't want to waste my time
> reviewing different version every half hour. Ping me in a couple of
> days.

This is the final version:

https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html

You can try the testcase in the patch on any machine with CET binutils
since ENDBR32 is nop on none-CET machines.  Without this patch,
the test will fail.

Thanks.
Uros Bizjak Feb. 13, 2020, 8:29 a.m. UTC | #5
On Wed, Feb 12, 2020 at 1:21 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Mon, Feb 10, 2020 at 12:01 PM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Mon, Feb 10, 2020 at 8:53 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > On Mon, Feb 10, 2020 at 11:40 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > > >
> > > > On Mon, Feb 10, 2020 at 8:22 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> > > > >
> > > > > Since nested function isn't only called directly, there is ENDBR32 at
> > > > > function entry and we need to skip it for direct jump in trampoline.
> > > >
> > > > Hm, I'm afraid I don't understand this comment. Can you perhaps rephrase it?
> > > >
> > >
> > > ix86_trampoline_init has
> > >
> > >      /* Compute offset from the end of the jmp to the target function.
> > >          In the case in which the trampoline stores the static chain on
> > >          the stack, we need to skip the first insn which pushes the
> > >          (call-saved) register static chain; this push is 1 byte.  */
> > >       offset += 5;
> > >       disp = expand_binop (SImode, sub_optab, fnaddr,
> > >                            plus_constant (Pmode, XEXP (m_tramp, 0),
> > >                                           offset - (MEM_P (chain) ? 1 : 0)),
> > >                            NULL_RTX, 1, OPTAB_DIRECT);
> > >       emit_move_insn (mem, disp);
> > >
> > > Without CET, we got
> > >
> > > 0000011 <bar.1878>:
> > >   11: 56                    push   %esi
> > >   12: 55                    push   %ebp   <<<<<< trampoline jumps here.
> > >   13: 89 e5                mov    %esp,%ebp
> > >   15: 83 ec 08              sub    $0x8,%esp
> > >
> > > With CET, if bar isn't only called directly, we got
> > >
> > > 00000015 <bar.1878>:
> > >   15: f3 0f 1e fb          endbr32
> > >   19: 56                    push   %esi
> > >   1a: 55                    push   %ebp   <<<<<<<< trampoline jumps here.
> > >   1b: 89 e5                mov    %esp,%ebp
> > >   1d: 83 ec 08              sub    $0x8,%esp
> > >
> > > We need to add 4 bytes for trampoline to skip endbr32.
> > >
> > > Here is the updated patch to check if nested function isn't only
> > > called directly,
> >
> > Please figure out the final patch. I don't want to waste my time
> > reviewing different version every half hour. Ping me in a couple of
> > days.
>
> This is the final version:
>
> https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00586.html
>
> You can try the testcase in the patch on any machine with CET binutils
> since ENDBR32 is nop on none-CET machines.  Without this patch,
> the test will fail.

Please rephrase the comment. I don't understand what it tries to say.

Uros.
diff mbox series

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 44bc0e0176a..dbcae244acb 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16839,9 +16839,14 @@  ix86_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
 	 the stack, we need to skip the first insn which pushes the
 	 (call-saved) register static chain; this push is 1 byte.  */
       offset += 5;
+      int skip = MEM_P (chain) ? 1 : 0;
+      /* Since nested function isn't only called directly, there is
+	 ENDBR32 at function entry and we need to skip it.  */
+      if (need_endbr)
+	skip += 4;
       disp = expand_binop (SImode, sub_optab, fnaddr,
 			   plus_constant (Pmode, XEXP (m_tramp, 0),
-					  offset - (MEM_P (chain) ? 1 : 0)),
+					  offset - skip),
 			   NULL_RTX, 1, OPTAB_DIRECT);
       emit_move_insn (mem, disp);
     }
diff --git a/gcc/testsuite/gcc.target/i386/pr93656.c b/gcc/testsuite/gcc.target/i386/pr93656.c
new file mode 100644
index 00000000000..f0ac8c8edaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr93656.c
@@ -0,0 +1,4 @@ 
+/* { dg-do run { target { ia32 && cet } } } */
+/* { dg-options "-O2 -fcf-protection" } */
+
+#include "pr67770.c"