diff mbox series

[x86] Fix PR target/99264

Message ID 2049701.N6SdOS5uIv@fomalhaut
State New
Headers show
Series [x86] Fix PR target/99264 | expand

Commit Message

Eric Botcazou Feb. 26, 2021, 3:31 p.m. UTC
Hi,

this wrong-code PR for the C++ compiler on x86-64/Windows is a regression
in GCC 9 and later, but the underlying issue has probably been there since
SEH was implemented and is exposed by this comment in config/i386/winnt.c:

  /* SEH records saves relative to the "current" stack pointer, whether
     or not there's a frame pointer in place.  This tracks the current
     stack pointer offset from the CFA.  */
  HOST_WIDE_INT sp_offset;

That's not what the (current) Microsoft documentation says; instead it says:

  /* SEH records offsets relative to the lowest address of the fixed stack
     allocation.  If there is no frame pointer, these offsets are from the
     stack pointer; if there is a frame pointer, these offsets are from the
     value of the stack pointer when the frame pointer was established, i.e.
     the frame pointer minus the offset in the .seh_setframe directive.

That's why the implementation is correct only under the condition that the 
frame pointer be established *after* the fixed stack allocation; as a matter 
of fact, that's clearly the model underpinning SEH, but is the opposite of 
what is done e.g. on Linux.

However the issue is mostly papered over in practice because:

  1. SEH forces use_fast_prologue_epilogue to false, which in turns forces 
save_regs_using_mov to false, so the general regs are always pushed when they 
need to be saved, which eliminates the offset computation for them.

  2. As soon as a frame is larger than 240 bytes, the frame pointer is fixed 
arbitrarily to 128 bytes above the stack pointer, which of course requires 
that it be established after the fixed stack allocation.

So you need a very small frame clobbering one of the call-saved XMM registers 
in order to generate wrong SEH unwind info.

The attached fix makes sure that the frame pointer is always established after 
the fixed stack allocation by pointing it at or below the lowest used register 
save area, i.e. the SSE save area, and removing the special early saves in the 
prologue; the end result is a uniform prologue sequence for SEH whatever the 
frame size.  And it avoids a weird discrepancy between cases where the number 
of saved general regs is even and cases where it is odd.

Tested on x86_64-w64-mingw32, OK for mainline, 10 and 9 branches?


2021-02-26 Eric Botcazou  <ebotcazou@adacore.com>

	PR target/99264
	* config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
	point the hard frame pointer to the SSE register save area instead
	of the general register save area.  Perform only minimal adjustment
	for small frames if it is initially not correctly aligned.
	(ix86_expand_prologue): Remove early saves for a SEH target.
	* config/i386/winnt.c (struct seh_frame_state): Document constraint.


2021-02-26 Eric Botcazou  <ebotcazou@adacore.com>

	* g++.dg/eh/seh-xmm-unwind.C: New test.

Comments

Uros Bizjak Feb. 28, 2021, 5:42 p.m. UTC | #1
On Fri, Feb 26, 2021 at 5:14 PM Eric Botcazou <botcazou@adacore.com> wrote:
>
> Hi,
>
> this wrong-code PR for the C++ compiler on x86-64/Windows is a regression
> in GCC 9 and later, but the underlying issue has probably been there since
> SEH was implemented and is exposed by this comment in config/i386/winnt.c:
>
>   /* SEH records saves relative to the "current" stack pointer, whether
>      or not there's a frame pointer in place.  This tracks the current
>      stack pointer offset from the CFA.  */
>   HOST_WIDE_INT sp_offset;
>
> That's not what the (current) Microsoft documentation says; instead it says:
>
>   /* SEH records offsets relative to the lowest address of the fixed stack
>      allocation.  If there is no frame pointer, these offsets are from the
>      stack pointer; if there is a frame pointer, these offsets are from the
>      value of the stack pointer when the frame pointer was established, i.e.
>      the frame pointer minus the offset in the .seh_setframe directive.
>
> That's why the implementation is correct only under the condition that the
> frame pointer be established *after* the fixed stack allocation; as a matter
> of fact, that's clearly the model underpinning SEH, but is the opposite of
> what is done e.g. on Linux.
>
> However the issue is mostly papered over in practice because:
>
>   1. SEH forces use_fast_prologue_epilogue to false, which in turns forces
> save_regs_using_mov to false, so the general regs are always pushed when they
> need to be saved, which eliminates the offset computation for them.
>
>   2. As soon as a frame is larger than 240 bytes, the frame pointer is fixed
> arbitrarily to 128 bytes above the stack pointer, which of course requires
> that it be established after the fixed stack allocation.
>
> So you need a very small frame clobbering one of the call-saved XMM registers
> in order to generate wrong SEH unwind info.
>
> The attached fix makes sure that the frame pointer is always established after
> the fixed stack allocation by pointing it at or below the lowest used register
> save area, i.e. the SSE save area, and removing the special early saves in the
> prologue; the end result is a uniform prologue sequence for SEH whatever the
> frame size.  And it avoids a weird discrepancy between cases where the number
> of saved general regs is even and cases where it is odd.
>
> Tested on x86_64-w64-mingw32, OK for mainline, 10 and 9 branches?
>
>
> 2021-02-26 Eric Botcazou  <ebotcazou@adacore.com>
>
>         PR target/99264
>         * config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
>         point the hard frame pointer to the SSE register save area instead
>         of the general register save area.  Perform only minimal adjustment
>         for small frames if it is initially not correctly aligned.
>         (ix86_expand_prologue): Remove early saves for a SEH target.
>         * config/i386/winnt.c (struct seh_frame_state): Document constraint.
>
>
> 2021-02-26 Eric Botcazou  <ebotcazou@adacore.com>
>
>         * g++.dg/eh/seh-xmm-unwind.C: New test.

LGTM.

Thanks,
Uros.

> --
> Eric Botcazou
Eric Botcazou March 3, 2021, 11:34 a.m. UTC | #2
> LGTM.

Thanks.  However, I overlooked an issue with pathologically large frames 
(larger than SEH_MAX_FRAME_SIZE, i.e. 2GB and for which we cannot emit CFI 
directives) that is visible in the gnat.dg testsuite under the form of an ICE.

Tested on x86_64-w64-mingw32, applied on mainline/10/9 branches as obvious.


2021-03-03 Eric Botcazou  <ebotcazou@adacore.com>

	PR target/99234
	* config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
	point back the hard frame pointer to its default location when the
	frame is larger than SEH_MAX_FRAME_SIZE.
diff mbox series

Patch

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index a8f8bcd9105..25d943c1759 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6184,11 +6184,6 @@  ix86_compute_frame_layout (void)
   offset += frame->nregs * UNITS_PER_WORD;
   frame->reg_save_offset = offset;
 
-  /* On SEH target, registers are pushed just before the frame pointer
-     location.  */
-  if (TARGET_SEH)
-    frame->hard_frame_pointer_offset = offset;
-
   /* Calculate the size of the va-arg area (not including padding, if any).  */
   frame->va_arg_size = ix86_varargs_gpr_size + ix86_varargs_fpr_size;
 
@@ -6355,14 +6350,21 @@  ix86_compute_frame_layout (void)
      the unwind data structure.  */
   if (TARGET_SEH)
     {
-      HOST_WIDE_INT diff;
+      /* Force the frame pointer to point at or below the lowest register save
+	 area, see the SEH code in config/i386/winnt.c for the rationale.  */
+      frame->hard_frame_pointer_offset = frame->sse_reg_save_offset;
 
-      /* If we can leave the frame pointer where it is, do so.  Also, returns
+      /* If we can leave the frame pointer where it is, do so.  Also, return
 	 the establisher frame for __builtin_frame_address (0).  */
-      diff = frame->stack_pointer_offset - frame->hard_frame_pointer_offset;
-      if (diff <= SEH_MAX_FRAME_SIZE
-	  && (diff > 240 || (diff & 15) != 0)
-	  && !crtl->accesses_prior_frames)
+      const HOST_WIDE_INT diff
+	= frame->stack_pointer_offset - frame->hard_frame_pointer_offset;
+      if (diff <= 255)
+	{
+	  /* The resulting diff will be a multiple of 16 lower than 255,
+	     i.e. at most 240 as required by the unwind data structure.  */
+	  frame->hard_frame_pointer_offset += (diff & 15);
+	}
+      else if (diff <= SEH_MAX_FRAME_SIZE && !crtl->accesses_prior_frames)
 	{
 	  /* Ideally we'd determine what portion of the local stack frame
 	     (within the constraint of the lowest 240) is most heavily used.
@@ -8070,17 +8072,6 @@  ix86_expand_prologue (void)
       insn = emit_insn (gen_push (hard_frame_pointer_rtx));
       RTX_FRAME_RELATED_P (insn) = 1;
 
-      /* Push registers now, before setting the frame pointer
-	 on SEH target.  */
-      if (!int_registers_saved
-	  && TARGET_SEH
-	  && !frame.save_regs_using_mov)
-	{
-	  ix86_emit_save_regs ();
-	  int_registers_saved = true;
-	  gcc_assert (m->fs.sp_offset == frame.reg_save_offset);
-	}
-
       if (m->fs.sp_offset == frame.hard_frame_pointer_offset)
 	{
 	  insn = emit_move_insn (hard_frame_pointer_rtx, stack_pointer_rtx);
diff --git a/gcc/config/i386/winnt.c b/gcc/config/i386/winnt.c
index eae3ee2ef9a..201f69e74c4 100644
--- a/gcc/config/i386/winnt.c
+++ b/gcc/config/i386/winnt.c
@@ -830,9 +830,20 @@  i386_pe_asm_lto_end (void)
 
 struct seh_frame_state
 {
-  /* SEH records saves relative to the "current" stack pointer, whether
-     or not there's a frame pointer in place.  This tracks the current
-     stack pointer offset from the CFA.  */
+  /* SEH records offsets relative to the lowest address of the fixed stack
+     allocation.  If there is no frame pointer, these offsets are from the
+     stack pointer; if there is a frame pointer, these offsets are from the
+     value of the stack pointer when the frame pointer was established, i.e.
+     the frame pointer minus the offset in the .seh_setframe directive.
+
+     We do not distinguish these two cases, i.e. we consider that the offsets
+     are always relative to the "current" stack pointer.  This means that we
+     need to perform the fixed stack allocation before establishing the frame
+     pointer whenever there are registers to be saved, and this is guaranteed
+     by the prologue provided that we force the frame pointer to point at or
+     below the lowest used register save area, see ix86_compute_frame_layout.
+
+     This tracks the current stack pointer offset from the CFA.  */
   HOST_WIDE_INT sp_offset;
 
   /* The CFA is located at CFA_REG + CFA_OFFSET.  */