Fix PR 60040

Submitted by Senthil Kumar Selvaraj on April 28, 2016, 6:57 a.m.

Details

Message ID 87potafcf1.fsf@atmel.com
State New
Headers show

Commit Message

Senthil Kumar Selvaraj April 28, 2016, 6:57 a.m.
Bernd Schmidt writes:

> On 04/15/2016 02:52 PM, Senthil Kumar Selvaraj wrote:
>>
>> For both testcases in the PR, reload fails to take into account that
>> FP-SP elimination can no longer be performed, and tries to find reload
>> regs for an rtx generated when FP-SP elimination was valid.
>>
>> 1. reload initializes elim table with FP->SP elimination enabled.
>> 2. alter_reg for a pseudo allocates a stack slot for the pseudo, and sets
>>     reg_equiv_memory_loc to frame_pointer_rtx plus offset. It also sets
>>     something_was_spilled to true.
>> 3. The main reload loop starts, and it resets something_was_spilled to false.
>> 4. reload calls eliminate_regs for the pseudo and sets reg_equiv_address to
>>     (mem(SP + offset)).
>> 5. calculate_needs_all_insns pushes a reload for SP (for the AVR target,
>>     SP cannot be a pointer reg - it needs to be reloaded into X Y or Z regs).
>> 6. update_eliminables_and_spill calls targetm.frame_pointer_required,
>>     which returns true. That causes can_eliminate for FP->SP to be reset
>>     to zero, and FP to be added to bad_spill_regs_global. For the AVR,
>>     FP is Y, one of the 3 pointer regs. reload also notes that something
>>     has changed, and that the loop needs to run again.
>> 7. reload still calls select_reload_regs, and find_regs fails to find a
>>     pointer reg to reload SP, which is unnecessary as FP->SP elimination
>>     had been disabled anyway in (6).
>>
>> IOW, reload fails to find pointer regs for an RTL expression that was
>> created when FP->SP elimination was true, even after it turns out that
>> the elimination can't be done after all. The patch tries to detect that
>> - if it knows the loop is going to run again, it silences the failure.
>>
>> Also note that at a different point in the loop, the reload loop starts
>> over if something_was_spilled (line 982-986). If set outside the reload
>> loop by alter_reg, it gets reset at (3) - not sure why. I'd think a
>> "continue" after update_eliminables_and_spill (line 1019-1022) would
>> also work - haven't tested it though.
>
> That's what I was going to ask next. I think if that works for you, I 
> think that's an approach that would make more sense.
>
> However, it looks like after this call, and the other one in the same 
> loop, should probably be calling finish_spills. It looks like an 
> oversight in the existing code that this doesn't happen for the existing 
> continue. Please try adding it in both places.

Tried that out - that works too. Bootstrap and regression tests on
x86_64 went fine, and there were no regressions for the avr target as
well - the extra three passes that showed up with the initial patch show
up now too.

I couldn't put a continue before select_reload_regs though.
save_call_clobbered_regs sets up a few things that are cleaned up in
delete_caller_save_insns a little later in the loop. So I only exclude
select_reload_regs - finish_spills and cleanup will happen in both cases.

Does this patch look ok? If yes, could you commit it for me please? I
don't have commit access.

Regards
Senthil

gcc/ChangeLog

2016-04-28  Senthil Kumar Selvaraj  <senthil_kumar.selvaraj@atmel.com>

	PR target/60040
	* reload1.c (reload): Call finish_spills before
	restarting reload loop. Skip select_reload_regs
	if update_eliminables_and_spill returns true.

gcc/testsuite/ChangeLog

2016-04-28  Sebastian Huber  <sebastian.huber@embedded-brains.de>
	Matthijs Kooijman  <matthijs@stdin.nl>
	Senthil Kumar Selvaraj  <senthil_kumar.selvaraj@atmel.com>

	PR target/60040
	* gcc.target/avr/pr60040-1.c: New.
	* gcc.target/avr/pr60040-2.c: New.

Comments

Joern Wolfgang Rennecke April 28, 2016, 7:09 a.m.
On 28/04/16 07:57, Senthil Kumar Selvaraj wrote:
> diff --git libcilkrts/ChangeLog libcilkrts/ChangeLog
> index 8fada8a..ed26a3a 100644
> --- libcilkrts/ChangeLog
> +++ libcilkrts/ChangeLog
> @@ -1,9 +1,3 @@
> -2016-04-26  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
> -
> -	PR target/60290
> -	* Makefile.am (GENERAL_FLAGS): Add -funwind-tables.
> -	* Makefile.in: Regenerate.
> -
>   2015-11-09  Igor Zamyatin  <igor.zamyatin@intel.com>
>   
>   	PR target/66326
This does not appear related, and it would be the wrong way to back out 
a patch in the FSF repo
even if you wanted to, so I suppose this must be unintentional.

Patch hide | download patch | download mbox

diff --git gcc/reload1.c gcc/reload1.c
index c2800f8..d6fcece 100644
--- gcc/reload1.c
+++ gcc/reload1.c
@@ -981,7 +981,8 @@  reload (rtx_insn *first, int global)
       /* If we allocated another stack slot, redo elimination bookkeeping.  */
       if (something_was_spilled || starting_frame_size != get_frame_size ())
 	{
-	  update_eliminables_and_spill ();
+	  if (update_eliminables_and_spill ())
+	    finish_spills (global);
 	  continue;
 	}
 
@@ -1021,10 +1022,12 @@  reload (rtx_insn *first, int global)
 	  did_spill = 1;
 	  something_changed = 1;
 	}
-
-      select_reload_regs ();
-      if (failure)
-	goto failed;
+      else
+	{
+	  select_reload_regs ();
+	  if (failure)
+	    goto failed;
+	}
 
       if (insns_need_reload != 0 || did_spill)
 	something_changed |= finish_spills (global);
diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog
index 65fe0d0..bbaa3a2 100644
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -1,7 +1,3 @@ 
-2016-04-26  Bernd Schmidt  <bschmidt@redhat.com>
-
-	* gcc.target/i386/lzcnt-1.c: Allow a different lzcntw output register.
-
 2016-04-25  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/70780
diff --git gcc/testsuite/gcc.target/avr/pr60040-1.c gcc/testsuite/gcc.target/avr/pr60040-1.c
new file mode 100644
index 0000000..4fe296b
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr60040-1.c
@@ -0,0 +1,29 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+
+float dhistory[10];
+float test;
+
+float getSlope(float history[]) {
+  float sumx = 0;
+  float sumy = 0;
+  float sumxy = 0;
+  float sumxsq = 0;
+  float rate = 0;
+  int n = 10;
+
+  int i;
+  for (i=1; i< 11; i++) {
+    sumx = sumx + i;
+    sumy = sumy + history[i-1];
+    sumy = sumy + history[i-1];
+    sumxsq = sumxsq + (i*i);
+  }
+
+  rate = sumy+sumx+sumxsq;
+  return rate;
+}
+
+void loop() {
+  test = getSlope(dhistory);
+}
diff --git gcc/testsuite/gcc.target/avr/pr60040-2.c gcc/testsuite/gcc.target/avr/pr60040-2.c
new file mode 100644
index 0000000..c40d49f
--- /dev/null
+++ gcc/testsuite/gcc.target/avr/pr60040-2.c
@@ -0,0 +1,112 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef unsigned char __uint8_t;
+typedef short unsigned int __uint16_t;
+typedef long unsigned int __uint32_t;
+typedef __uint8_t uint8_t ;
+typedef __uint16_t uint16_t ;
+typedef __uint32_t uint32_t ;
+typedef __builtin_va_list __gnuc_va_list;
+typedef __gnuc_va_list va_list;
+typedef enum rtems_blkdev_request_op {
+  RTEMS_BLKDEV_REQ_READ,
+} rtems_fdisk_segment_desc;
+typedef struct rtems_fdisk_driver_handlers
+{
+  int (*blank) (const rtems_fdisk_segment_desc* sd,
+                uint32_t device,
+                uint32_t segment,
+                uint32_t offset,
+                uint32_t size);
+} rtems_fdisk_driver_handlers;
+typedef struct rtems_fdisk_page_desc
+{
+  uint16_t flags;
+  uint32_t block;
+} rtems_fdisk_page_desc;
+typedef struct rtems_fdisk_segment_ctl
+{
+  rtems_fdisk_page_desc* page_descriptors;
+  uint32_t pages_active;
+} rtems_fdisk_segment_ctl;
+typedef struct rtems_fdisk_segment_ctl_queue
+{
+} rtems_fdisk_segment_ctl_queue;
+typedef struct rtems_fdisk_device_ctl
+{
+  uint32_t flags;
+  uint8_t* copy_buffer;
+} rtems_flashdisk;
+
+extern void rtems_fdisk_error (const char *, ...);
+extern int rtems_fdisk_seg_write(const rtems_flashdisk*,
+                                 rtems_fdisk_segment_ctl*,
+                                 uint32_t,
+                                 const rtems_fdisk_page_desc* page_desc,
+				 uint32_t);
+
+void rtems_fdisk_printf (const rtems_flashdisk* fd, const char *format, ...)
+{
+  {
+    va_list args;
+    __builtin_va_start(args,format);
+  }
+}
+static int
+rtems_fdisk_seg_blank_check (const rtems_flashdisk* fd,
+                             rtems_fdisk_segment_ctl* sc,
+                             uint32_t offset,
+                             uint32_t size)
+{
+  uint32_t device;
+  uint32_t segment;
+  const rtems_fdisk_segment_desc* sd;
+  const rtems_fdisk_driver_handlers* ops;
+  return ops->blank (sd, device, segment, offset, size);
+}
+static int
+rtems_fdisk_seg_write_page_desc (const rtems_flashdisk* fd,
+                                 rtems_fdisk_segment_ctl* sc,
+                                 uint32_t page,
+                                 const rtems_fdisk_page_desc* page_desc)
+{
+  uint32_t offset = page * sizeof (rtems_fdisk_page_desc);
+  if ((fd->flags & (1 << 3)))
+  {
+    int ret = rtems_fdisk_seg_blank_check (fd, sc,
+                                           offset,
+                                           sizeof (rtems_fdisk_page_desc));
+  }
+  return rtems_fdisk_seg_write (fd, sc, offset,
+                                page_desc, sizeof (rtems_fdisk_page_desc));
+}
+void
+rtems_fdisk_recycle_segment (rtems_flashdisk* fd,
+                                    rtems_fdisk_segment_ctl* ssc,
+                                    rtems_fdisk_segment_ctl* dsc,
+                                    uint32_t *pages)
+{
+  int ret;
+  uint32_t spage;
+  uint32_t used = 0;
+  uint32_t active = 0;
+  {
+    rtems_fdisk_page_desc* spd = &ssc->page_descriptors[spage];
+    {
+      rtems_fdisk_page_desc* dpd;
+      uint32_t dpage;
+      dpd = &dsc->page_descriptors[dpage];
+      *dpd = *spd;
+      ret = rtems_fdisk_seg_write_page_desc (fd,
+                                             dsc,
+                                             dpage, dpd);
+    }
+  }
+  rtems_fdisk_printf (fd, "ssc end: %d-%d: p=%ld, a=%ld, u=%ld",
+                      pages, active, used);
+  {
+    rtems_fdisk_error ("compacting: ssc pages not 0: %d",
+                       ssc->pages_active);
+  }
+}
diff --git gcc/testsuite/gcc.target/i386/lzcnt-1.c gcc/testsuite/gcc.target/i386/lzcnt-1.c
index c7054d1..f6240d1b 100644
--- gcc/testsuite/gcc.target/i386/lzcnt-1.c
+++ gcc/testsuite/gcc.target/i386/lzcnt-1.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -mlzcnt " } */
-/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)\[ad\]\[xi\]" } } */
+/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)ax" } } */
 
 #include <x86intrin.h>
 
diff --git libcilkrts/ChangeLog libcilkrts/ChangeLog
index 8fada8a..ed26a3a 100644
--- libcilkrts/ChangeLog
+++ libcilkrts/ChangeLog
@@ -1,9 +1,3 @@ 
-2016-04-26  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
-
-	PR target/60290
-	* Makefile.am (GENERAL_FLAGS): Add -funwind-tables.
-	* Makefile.in: Regenerate.
-
 2015-11-09  Igor Zamyatin  <igor.zamyatin@intel.com>
 
 	PR target/66326
diff --git libcilkrts/Makefile.am libcilkrts/Makefile.am
index 4f944dd..70538a2 100644
--- libcilkrts/Makefile.am
+++ libcilkrts/Makefile.am
@@ -43,9 +43,6 @@  GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime -I$(top_srcdir)/
 # Enable Intel Cilk Plus extension
 GENERAL_FLAGS += -fcilkplus
 
-# Always generate unwind tables
-GENERAL_FLAGS += -funwind-tables
-
 AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99
 AM_CPPFLAGS = $(GENERAL_FLAGS)
 AM_LDFLAGS = $(XLDFLAGS)
diff --git libcilkrts/Makefile.in libcilkrts/Makefile.in
index a25d1c6..629aa6a 100644
--- libcilkrts/Makefile.in
+++ libcilkrts/Makefile.in
@@ -371,11 +371,9 @@  ACLOCAL_AMFLAGS = -I .. -I ../config
 # GENERAL_FLAGS += -D_Cilk_spawn="" -D_Cilk_sync="" -D_Cilk_for=for
 
 # Enable Intel Cilk Plus extension
-
-# Always generate unwind tables
 GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime \
 	-I$(top_srcdir)/runtime/config/$(config_dir) \
-	-DIN_CILK_RUNTIME=1 -fcilkplus -funwind-tables
+	-DIN_CILK_RUNTIME=1 -fcilkplus
 AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99
 AM_CPPFLAGS = $(GENERAL_FLAGS)
 AM_LDFLAGS = $(XLDFLAGS)