diff mbox series

Fix PR target/84277

Message ID 4011576.KGStx97QI1@polaris
State New
Headers show
Series Fix PR target/84277 | expand

Commit Message

Eric Botcazou March 4, 2018, 6:55 p.m. UTC
Hi,

this is the breakage of SEH on 64-bit Windows introduced by defaulting to the 
-freorder-blocks-and-partition optimization.  It is fixed by:
  1. Defining ASM_DECLARE_COLD_FUNCTION_NAME & ASM_DECLARE_COLD_FUNCTION_SIZE,
  2. Emitting a nop in one more case for SEH,
  3. Splitting the exception table into hot and cold parts; that's necessary 
because the LSDA is referenced directly in the SEH scheme.

Tested on x86-64/Windows, x86-64/Linux and IA-64/Linux, OK for mainline?


2018-03-04  Eric Botcazou  <ebotcazou@adacore.com>

	PR target/84277
	* except.h (output_function_exception_table): Adjust prototype.
	* except.c (output_function_exception_table): Remove FNNAME parameter and
	add SECTION parameter.  Ouput one part of the table at a time.
	* final.c (final_scan_insn_1) <NOTE_INSN_SWITCH_TEXT_SECTIONS>: Output
	the first part of the exception table and emit unwind directives, if any.
	* config/i386/i386-protos.h (i386_pe_end_cold_function): Declare.
	(i386_pe_seh_cold_init): Likewise.
	* config/i386/cygming.h (ASM_DECLARE_COLD_FUNCTION_NAME): New macro.
	(ASM_DECLARE_COLD_FUNCTION_SIZE): Likewise.
	* config/i386/i386.c (x86_expand_epilogue): Fix wording in comment.
	(ix86_output_call_insn): Emit a nop in one more case for SEH.
	* config/i386/winnt.c: Include except.h.
	(struct seh_frame_state): Add reg_offset, after_prologue, in_cold_section.
	(i386_pe_seh_end_prologue): Set seh->after_prologue.
	(i386_pe_seh_cold_init): New function.
	(i386_pe_seh_fini): Add COLD parameter and bail out if it is not equal
	to seh->in_cold_section.
	(seh_emit_push): Record the offset of the push.
	(seh_emit_save): Record the offet of the save.
	(i386_pe_seh_unwind_emit): Deal with NOTE_INSN_SWITCH_TEXT_SECTIONS.
	Test seh->after_prologue to disregard the epilogue.
	(i386_pe_end_function): Pass FALSE to i386_pe_seh_fini.
	(i386_pe_end_cold_function): New function.

Comments

Eric Botcazou March 6, 2018, 5:29 p.m. UTC | #1
> this is the breakage of SEH on 64-bit Windows introduced by defaulting to
> the -freorder-blocks-and-partition optimization.  It is fixed by:
>   1. Defining ASM_DECLARE_COLD_FUNCTION_NAME &
> ASM_DECLARE_COLD_FUNCTION_SIZE, 2. Emitting a nop in one more case for SEH,
>   3. Splitting the exception table into hot and cold parts; that's necessary
> because the LSDA is referenced directly in the SEH scheme.
> 
> Tested on x86-64/Windows, x86-64/Linux and IA-64/Linux, OK for mainline?

Also tested on x86/Windows to verify that it doesn't break anything.

Can anyone approve the middle-end bits?  The rest effectively affects only the 
-freorder-blocks-and-partition optimization on x86-64/Windows, which has never 
worked in C++ or Ada, so I don't think that it can do any harm...
Richard Biener March 7, 2018, 8:32 a.m. UTC | #2
On Tue, Mar 6, 2018 at 6:29 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> this is the breakage of SEH on 64-bit Windows introduced by defaulting to
>> the -freorder-blocks-and-partition optimization.  It is fixed by:
>>   1. Defining ASM_DECLARE_COLD_FUNCTION_NAME &
>> ASM_DECLARE_COLD_FUNCTION_SIZE, 2. Emitting a nop in one more case for SEH,
>>   3. Splitting the exception table into hot and cold parts; that's necessary
>> because the LSDA is referenced directly in the SEH scheme.
>>
>> Tested on x86-64/Windows, x86-64/Linux and IA-64/Linux, OK for mainline?
>
> Also tested on x86/Windows to verify that it doesn't break anything.
>
> Can anyone approve the middle-end bits?  The rest effectively affects only the
> -freorder-blocks-and-partition optimization on x86-64/Windows, which has never
> worked in C++ or Ada, so I don't think that it can do any harm...

For the middle-end part I'd like to see output_function_exception_table take
an enum with two members with appropriate name - I see there wasn't
documentation for the function but your change doesn't make semantics
more clear, esp.

-  output_one_function_exception_table (0);
-  if (crtl->eh.call_site_record_v[1])
-    output_one_function_exception_table (1);
+  output_one_function_exception_table (section);

looks like we now might output only once while we did twice before...

Looking at the two changed callers doesn't shed enough light on this either...

Richard.

> --
> Eric Botcazou
Eric Botcazou March 7, 2018, 9:28 a.m. UTC | #3
> For the middle-end part I'd like to see output_function_exception_table take
> an enum with two members with appropriate name - I see there wasn't
> documentation for the function but your change doesn't make semantics more
> clear, esp.
> 
> -  output_one_function_exception_table (0);
> -  if (crtl->eh.call_site_record_v[1])
> -    output_one_function_exception_table (1);
> +  output_one_function_exception_table (section);
> 
> looks like we now might output only once while we did twice before...
> 
> Looking at the two changed callers doesn't shed enough light on this
> either...

See output_one_function_exception_table itself which has the same argument and 
the other functions in the file: 0 corresponds to the table for hot part while 
1 corresponds to the table for the cold part (if any).  Before the change, 
output_function_exception_table outputs both parts at once whereas, after the 
change, it outputs one part at a time, and thus is called twice from final.c.

I can certainly add the missing documentation for this whole section stuff.
Richard Biener March 7, 2018, 10:28 a.m. UTC | #4
On Wed, Mar 7, 2018 at 10:28 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> For the middle-end part I'd like to see output_function_exception_table take
>> an enum with two members with appropriate name - I see there wasn't
>> documentation for the function but your change doesn't make semantics more
>> clear, esp.
>>
>> -  output_one_function_exception_table (0);
>> -  if (crtl->eh.call_site_record_v[1])
>> -    output_one_function_exception_table (1);
>> +  output_one_function_exception_table (section);
>>
>> looks like we now might output only once while we did twice before...
>>
>> Looking at the two changed callers doesn't shed enough light on this
>> either...
>
> See output_one_function_exception_table itself which has the same argument and
> the other functions in the file: 0 corresponds to the table for hot part while
> 1 corresponds to the table for the cold part (if any).  Before the change,
> output_function_exception_table outputs both parts at once whereas, after the
> change, it outputs one part at a time, and thus is called twice from final.c.

I see.

> I can certainly add the missing documentation for this whole section stuff.

Please.  Ok with that change.

Richard.

> --
> Eric Botcazou
diff mbox series

Patch

Index: config/i386/cygming.h
===================================================================
--- config/i386/cygming.h	(revision 258231)
+++ config/i386/cygming.h	(working copy)
@@ -312,10 +312,27 @@  do {						\
 #define ASM_DECLARE_FUNCTION_NAME(FILE, NAME, DECL) \
   i386_pe_start_function (FILE, NAME, DECL)
 
+/* Write the extra assembler code needed to declare the name of a
+   cold function partition properly.  */
+
+#undef ASM_DECLARE_COLD_FUNCTION_NAME
+#define ASM_DECLARE_COLD_FUNCTION_NAME(FILE, NAME, DECL)	\
+  do								\
+    {								\
+      i386_pe_declare_function_type (FILE, NAME, 0);		\
+      i386_pe_seh_cold_init (FILE, NAME);			\
+      ASM_OUTPUT_LABEL (FILE, NAME);				\
+    }								\
+  while (0)
+
 #undef ASM_DECLARE_FUNCTION_SIZE
 #define ASM_DECLARE_FUNCTION_SIZE(FILE,NAME,DECL) \
   i386_pe_end_function (FILE, NAME, DECL)
 
+#undef ASM_DECLARE_COLD_FUNCTION_SIZE
+#define ASM_DECLARE_COLD_FUNCTION_SIZE(FILE,NAME,DECL) \
+  i386_pe_end_cold_function (FILE, NAME, DECL)
+
 /* Add an external function to the list of functions to be declared at
    the end of the file.  */
 #define ASM_OUTPUT_EXTERNAL(FILE, DECL, NAME)				\
Index: config/i386/i386-protos.h
===================================================================
--- config/i386/i386-protos.h	(revision 258231)
+++ config/i386/i386-protos.h	(working copy)
@@ -256,6 +256,7 @@  extern void i386_pe_asm_output_aligned_d
 extern void i386_pe_file_end (void);
 extern void i386_pe_start_function (FILE *, const char *, tree);
 extern void i386_pe_end_function (FILE *, const char *, tree);
+extern void i386_pe_end_cold_function (FILE *, const char *, tree);
 extern void i386_pe_assemble_visibility (tree, int);
 extern tree i386_pe_mangle_decl_assembler_name (tree, tree);
 extern tree i386_pe_mangle_assembler_name (const char *);
@@ -263,6 +264,7 @@  extern void i386_pe_record_stub (const c
 
 extern void i386_pe_seh_init (FILE *);
 extern void i386_pe_seh_end_prologue (FILE *);
+extern void i386_pe_seh_cold_init (FILE *, const char *);
 extern void i386_pe_seh_unwind_emit (FILE *, rtx_insn *);
 extern void i386_pe_seh_emit_except_personality (rtx);
 extern void i386_pe_seh_init_sections (void);
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 258231)
+++ config/i386/i386.c	(working copy)
@@ -14702,9 +14702,9 @@  ix86_expand_epilogue (int style)
       if (TARGET_SEH)
 	{
 	  /* Prevent a catch region from being adjacent to the standard
-	     epilogue sequence.  Unfortuantely crtl->uses_eh_lsda nor
-	     several other flags that would be interesting to test are
-	     not yet set up.  */
+	     epilogue sequence.  Unfortunately neither crtl->uses_eh_lsda
+	     nor several other flags that would be interesting to test are
+	     set up yet.  */
 	  if (flag_non_call_exceptions)
 	    emit_insn (gen_nops (const1_rtx));
 	  else
@@ -29206,6 +29206,14 @@  ix86_output_call_insn (rtx_insn *insn, r
 
       for (i = NEXT_INSN (insn); i ; i = NEXT_INSN (i))
 	{
+	  /* Prevent a catch region from being adjacent to a jump that would
+	     be interpreted as an epilogue sequence by the unwinder.  */
+	  if (JUMP_P(i) && CROSSING_JUMP_P (i))
+	    {
+	      seh_nop_p = true;
+	      break;
+	    }
+	    
 	  /* If we get to another real insn, we don't need the nop.  */
 	  if (INSN_P (i))
 	    break;
Index: config/i386/winnt.c
===================================================================
--- config/i386/winnt.c	(revision 258231)
+++ config/i386/winnt.c	(working copy)
@@ -36,6 +36,7 @@  along with GCC; see the file COPYING3.
 #include "emit-rtl.h"
 #include "cgraph.h"
 #include "lto-streamer.h"
+#include "except.h"
 #include "output.h"
 #include "varasm.h"
 #include "lto-section-names.h"
@@ -820,6 +821,15 @@  struct seh_frame_state
   /* The CFA is located at CFA_REG + CFA_OFFSET.  */
   HOST_WIDE_INT cfa_offset;
   rtx cfa_reg;
+
+  /* The offset wrt the CFA where register N has been saved.  */
+  HOST_WIDE_INT reg_offset[FIRST_PSEUDO_REGISTER];
+
+  /* True if we are past the end of the epilogue.  */
+  bool after_prologue;
+
+  /* True if we are in the cold section.  */
+  bool in_cold_section;
 };
 
 /* Set up data structures beginning output for SEH.  */
@@ -850,10 +860,26 @@  i386_pe_seh_init (FILE *f)
   fputc ('\n', f);
 }
 
+/* Emit an assembler directive for the end of the prologue.  */
+
 void
 i386_pe_seh_end_prologue (FILE *f)
 {
+  if (!TARGET_SEH)
+    return;
+  if (cfun->is_thunk)
+    return;
+  cfun->machine->seh->after_prologue = true;
+  fputs ("\t.seh_endprologue\n", f);
+}
+
+/* Emit assembler directives to reconstruct the SEH state.  */
+
+void
+i386_pe_seh_cold_init (FILE *f, const char *name)
+{
   struct seh_frame_state *seh;
+  HOST_WIDE_INT offset;
 
   if (!TARGET_SEH)
     return;
@@ -861,19 +887,56 @@  i386_pe_seh_end_prologue (FILE *f)
     return;
   seh = cfun->machine->seh;
 
-  XDELETE (seh);
-  cfun->machine->seh = NULL;
+  fputs ("\t.seh_proc\t", f);
+  assemble_name (f, name);
+  fputc ('\n', f);
+
+  offset = seh->sp_offset - INCOMING_FRAME_SP_OFFSET;
+  if (offset > 0 && offset < SEH_MAX_FRAME_SIZE)
+    fprintf (f, "\t.seh_stackalloc\t" HOST_WIDE_INT_PRINT_DEC "\n", offset);
+
+  for (int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+    if (seh->reg_offset[regno] > 0)
+      {
+	fputs ((SSE_REGNO_P (regno) ? "\t.seh_savexmm\t"
+	       : GENERAL_REGNO_P (regno) ?  "\t.seh_savereg\t"
+		 : (gcc_unreachable (), "")), f);
+	print_reg (gen_rtx_REG (DImode, regno), 0, f);
+	fprintf (f, ", " HOST_WIDE_INT_PRINT_DEC "\n",
+		 seh->sp_offset - seh->reg_offset[regno]);
+      }
+
+  if (seh->cfa_reg != stack_pointer_rtx)
+    {
+      offset = seh->sp_offset - seh->cfa_offset;
+
+      gcc_assert ((offset & 15) == 0);
+      gcc_assert (IN_RANGE (offset, 0, 240));
+
+      fputs ("\t.seh_setframe\t", f);
+      print_reg (seh->cfa_reg, 0, f);
+      fprintf (f, ", " HOST_WIDE_INT_PRINT_DEC "\n", offset);
+    }
 
   fputs ("\t.seh_endprologue\n", f);
 }
 
+/* Emit an assembler directive for the end of the function.  */
+
 static void
-i386_pe_seh_fini (FILE *f)
+i386_pe_seh_fini (FILE *f, bool cold)
 {
+  struct seh_frame_state *seh;
+
   if (!TARGET_SEH)
     return;
   if (cfun->is_thunk)
     return;
+  seh = cfun->machine->seh;
+  if (cold != seh->in_cold_section)
+    return;
+  XDELETE (seh);
+  cfun->machine->seh = NULL;
   fputs ("\t.seh_endproc\n", f);
 }
 
@@ -882,11 +945,12 @@  i386_pe_seh_fini (FILE *f)
 static void
 seh_emit_push (FILE *f, struct seh_frame_state *seh, rtx reg)
 {
-  unsigned int regno = REGNO (reg);
+  const unsigned int regno = REGNO (reg);
 
   gcc_checking_assert (GENERAL_REGNO_P (regno));
 
   seh->sp_offset += UNITS_PER_WORD;
+  seh->reg_offset[regno] = seh->sp_offset;
   if (seh->cfa_reg == stack_pointer_rtx)
     seh->cfa_offset += UNITS_PER_WORD;
 
@@ -901,9 +965,11 @@  static void
 seh_emit_save (FILE *f, struct seh_frame_state *seh,
 	       rtx reg, HOST_WIDE_INT cfa_offset)
 {
-  unsigned int regno = REGNO (reg);
+  const unsigned int regno = REGNO (reg);
   HOST_WIDE_INT offset;
 
+  seh->reg_offset[regno] = cfa_offset;
+
   /* Negative save offsets are of course not supported, since that
      would be a store below the stack pointer and thus clobberable.  */
   gcc_assert (seh->sp_offset >= cfa_offset);
@@ -1112,15 +1178,21 @@  i386_pe_seh_unwind_emit (FILE *asm_out_f
   if (!TARGET_SEH)
     return;
 
-  /* We free the SEH data once done with the prologue.  Ignore those
-     RTX_FRAME_RELATED_P insns that are associated with the epilogue.  */
   seh = cfun->machine->seh;
-  if (seh == NULL)
-    return;
+  if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_SWITCH_TEXT_SECTIONS)
+    {
+      fputs ("\t.seh_endproc\n", asm_out_file);
+      seh->in_cold_section = true;
+      return;
+    }
 
   if (NOTE_P (insn) || !RTX_FRAME_RELATED_P (insn))
     return;
 
+  /* Skip RTX_FRAME_RELATED_P insns that are associated with the epilogue.  */
+  if (seh->after_prologue)
+    return;
+
   for (note = REG_NOTES (insn); note ; note = XEXP (note, 1))
     {
       switch (REG_NOTE_KIND (note))
@@ -1227,8 +1299,13 @@  i386_pe_start_function (FILE *f, const c
 void
 i386_pe_end_function (FILE *f, const char *, tree)
 {
-  i386_pe_seh_fini (f);
+  i386_pe_seh_fini (f, false);
 }
 
+void
+i386_pe_end_cold_function (FILE *f, const char *, tree)
+{
+  i386_pe_seh_fini (f, true);
+}
 
 #include "gt-winnt.h"
Index: except.c
===================================================================
--- except.c	(revision 258231)
+++ except.c	(working copy)
@@ -3168,12 +3168,16 @@  output_one_function_exception_table (int
 }
 
 void
-output_function_exception_table (const char *fnname)
+output_function_exception_table (int section)
 {
+  const char *fnname = get_fnname_from_decl (current_function_decl);
   rtx personality = get_personality_function (current_function_decl);
 
   /* Not all functions need anything.  */
-  if (! crtl->uses_eh_lsda)
+  if (!crtl->uses_eh_lsda)
+    return;
+
+  if (section == 1 && !crtl->eh.call_site_record_v[1])
     return;
 
   if (personality)
@@ -3189,9 +3193,7 @@  output_function_exception_table (const c
   /* If the target wants a label to begin the table, emit it here.  */
   targetm.asm_out.emit_except_table_label (asm_out_file);
 
-  output_one_function_exception_table (0);
-  if (crtl->eh.call_site_record_v[1])
-    output_one_function_exception_table (1);
+  output_one_function_exception_table (section);
 
   switch_to_section (current_function_section ());
 }
Index: except.h
===================================================================
--- except.h	(revision 258231)
+++ except.h	(working copy)
@@ -229,7 +229,7 @@  extern void remove_eh_handler (eh_region
 extern void remove_unreachable_eh_regions (sbitmap);
 
 extern bool current_function_has_exception_handlers (void);
-extern void output_function_exception_table (const char *);
+extern void output_function_exception_table (int);
 
 extern rtx expand_builtin_eh_pointer (tree);
 extern rtx expand_builtin_eh_filter (tree);
Index: final.c
===================================================================
--- final.c	(revision 258231)
+++ final.c	(working copy)
@@ -2265,6 +2265,11 @@  final_scan_insn_1 (rtx_insn *insn, FILE
 	case NOTE_INSN_SWITCH_TEXT_SECTIONS:
 	  maybe_output_next_view (seen);
 
+	  output_function_exception_table (0);
+
+	  if (targetm.asm_out.unwind_emit)
+	    targetm.asm_out.unwind_emit (asm_out_file, insn);
+
 	  in_cold_section_p = !in_cold_section_p;
 
 	  if (in_cold_section_p)
@@ -4672,7 +4677,7 @@  rest_of_handle_final (void)
   /* The IA-64 ".handlerdata" directive must be issued before the ".endp"
      directive that closes the procedure descriptor.  Similarly, for x64 SEH.
      Otherwise it's not strictly necessary, but it doesn't hurt either.  */
-  output_function_exception_table (fnname);
+  output_function_exception_table (crtl->has_bb_partition ? 1 : 0);
 
   assemble_end_function (current_function_decl, fnname);