[LRA] Fix wrong-code PR 91109 take 2
diff mbox series

Message ID AM6PR10MB25666954C106C135C17DAB65E4AC0@AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM
State New
Headers show
Series
  • [LRA] Fix wrong-code PR 91109 take 2
Related show

Commit Message

Bernd Edlinger Aug. 15, 2019, 7:46 p.m. UTC
Hi,

as discussed in the PR 91109 audit trail,
my previous patch missed a case where no spilling is necessary,
but the re-materialized instruction has now scratch regs without
a hard register assignment.  And thus the LRA pass falls out of
the loop pre-maturely.

Fixed by checking for scratch regs with no assignment
and continuing the loop in that case.


Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.
Is it OK for trunk?


Thanks
Bernd.

Comments

Bernd Edlinger Aug. 16, 2019, 11:53 a.m. UTC | #1
On 8/15/19 9:46 PM, Bernd Edlinger wrote:
> Hi,
> 
> as discussed in the PR 91109 audit trail,
> my previous patch missed a case where no spilling is necessary,
> but the re-materialized instruction has now scratch regs without
> a hard register assignment.  And thus the LRA pass falls out of
> the loop pre-maturely.
> 
> Fixed by checking for scratch regs with no assignment
> and continuing the loop in that case.
> 
> 
> Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.
> Is it OK for trunk?
> 

Aehm, sorry, I forgot to ask, but is it also OK for the gcc-9 branch ?

Thanks
Bernd.
Vladimir Makarov Aug. 16, 2019, 3:06 p.m. UTC | #2
On 2019-08-15 3:46 p.m., Bernd Edlinger wrote:
> Hi,
>
> as discussed in the PR 91109 audit trail,
> my previous patch missed a case where no spilling is necessary,
> but the re-materialized instruction has now scratch regs without
> a hard register assignment.  And thus the LRA pass falls out of
> the loop pre-maturely.
>
> Fixed by checking for scratch regs with no assignment
> and continuing the loop in that case.
>
>
> Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf.
> Is it OK for trunk?

Sorry, I am afraid this patch can make LRA cycle forever in some cases.

The reason for this is an existing pattern (scratch "r,X").  So if LRA 
makes a choice for the 2nd alternative, it will be a former spilled 
scratch (such spilled pseudo is changed into scratch at the end of 
LRA).  In this case the constraint subpass satisfies all constraints. 
There are no changes at all but because there are spilled (not in remat 
subpass) former scratches we continue the loop.

I guess you need something more accurate interaction with remat subpass.
Vladimir Makarov Aug. 16, 2019, 3:18 p.m. UTC | #3
On 2019-08-16 11:06 a.m., Vladimir Makarov wrote:
>
> On 2019-08-15 3:46 p.m., Bernd Edlinger wrote:
>> Hi,
>>
>> as discussed in the PR 91109 audit trail,
>> my previous patch missed a case where no spilling is necessary,
>> but the re-materialized instruction has now scratch regs without
>> a hard register assignment.  And thus the LRA pass falls out of
>> the loop pre-maturely.
>>
>> Fixed by checking for scratch regs with no assignment
>> and continuing the loop in that case.
>>
>>
>> Boot-strapped and reg-tested on x86_64-pc-linux-gnu and 
>> arm-linux-gnueabihf.
>> Is it OK for trunk?
>
> Sorry, I am afraid this patch can make LRA cycle forever in some cases.
>
> The reason for this is an existing pattern (scratch "r,X").  So if LRA 
> makes a choice for the 2nd alternative, it will be a former spilled 
> scratch (such spilled pseudo is changed into scratch at the end of 
> LRA).  In this case the constraint subpass satisfies all constraints. 
> There are no changes at all but because there are spilled (not in 
> remat subpass) former scratches we continue the loop.
>
> I guess you need something more accurate interaction with remat subpass.
>
>
Sorry, I missed that your new code is run only if remat returns true 
which means some changes in it.  It was not seen in the patch context.  
So there will be no cycling as remat at some point stop to do changes.

The patch is ok for trunk and gcc-9 branch.

Thank you, Bernd

Patch
diff mbox series

2019-08-12  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR tree-optimization/91109
	* lra-int.h (lra_need_for_scratch_reg_p): Declare.
	* lra.c (lra): Use lra_need_for_scratch_reg_p.
	* lra-spills.c (lra_need_for_scratch_reg_p): New function.

Index: gcc/lra-int.h
===================================================================
--- gcc/lra-int.h	(revision 274168)
+++ gcc/lra-int.h	(working copy)
@@ -396,6 +396,7 @@  extern bool lra_coalesce (void);
 
 /* lra-spills.c:  */
 
+extern bool lra_need_for_scratch_reg_p (void);
 extern bool lra_need_for_spills_p (void);
 extern void lra_spill (void);
 extern void lra_final_code_change (void);
Index: gcc/lra-spills.c
===================================================================
--- gcc/lra-spills.c	(revision 274168)
+++ gcc/lra-spills.c	(working copy)
@@ -549,6 +549,19 @@  spill_pseudos (void)
     }
 }
 
+/* Return true if we need scratch reg assignments.  */
+bool
+lra_need_for_scratch_reg_p (void)
+{
+  int i; max_regno = max_reg_num ();
+
+  for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
+    if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0
+	&& lra_former_scratch_p (i))
+      return true;
+  return false;
+}
+
 /* Return true if we need to change some pseudos into memory.  */
 bool
 lra_need_for_spills_p (void)
Index: gcc/lra.c
===================================================================
--- gcc/lra.c	(revision 274168)
+++ gcc/lra.c	(working copy)
@@ -2567,7 +2567,11 @@  lra (FILE *f)
 	  lra_create_live_ranges (lra_reg_spill_p, true);
 	  live_p = true;
 	  if (! lra_need_for_spills_p ())
-	    break;
+	    {
+	      if (lra_need_for_scratch_reg_p ())
+		continue;
+	      break;
+	    }
 	}
       lra_spill ();
       /* Assignment of stack slots changes elimination offsets for