diff mbox

One more path to fix PR70478

Message ID 930934bd-0341-6607-3d7e-96c34cebbabc@redhat.com
State New
Headers show

Commit Message

Vladimir Makarov April 10, 2017, 3:05 p.m. UTC
This is the second try to fix

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478

   The first try patch triggered a latent bug and broke one Fortran 
testcase on x86-64.

   The patch was successfully bootstrapped on x86-64 and tested on 
x86-64, ppc64, and aarch64.

   Committed as rev. 246808.

Comments

Christophe Lyon April 11, 2017, 7:30 a.m. UTC | #1
Hi Vladimir,

On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote:
>   This is the second try to fix
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478
>
>   The first try patch triggered a latent bug and broke one Fortran testcase
> on x86-64.
>
>   The patch was successfully bootstrapped on x86-64 and tested on x86-64,
> ppc64, and aarch64.
>
>   Committed as rev. 246808.
>
>

This patch causes regression on arm*hf configurations:
  Executed from: gcc.target/arm/arm.exp
    gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times ldrh\\tr[0-9]+ 2
    gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times strh\\tr[0-9]+ 2
    gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
vld1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2
    gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
vmov\\.f16\\tr[0-9]+, s[0-9]+ 4
    gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
vmov\\.f16\\ts[0-9]+, r[0-9]+ 4
    gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
vst1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2

See http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/246809/report-build-info.html

Is it just a matter of adjusting the testcases? (note that there is no
regression when forcing either:
-march=armv5t
-mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard
in the runtestflags.

I would have to re--run the build/test manually to get the generated
code, let me know if it's needed.

Thanks,

Christophe
Vladimir Makarov April 11, 2017, 3:42 p.m. UTC | #2
On 04/11/2017 03:30 AM, Christophe Lyon wrote:
> Hi Vladimir,
>
> On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>    This is the second try to fix
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478
>>
>>    The first try patch triggered a latent bug and broke one Fortran testcase
>> on x86-64.
>>
>>    The patch was successfully bootstrapped on x86-64 and tested on x86-64,
>> ppc64, and aarch64.
>>
>>    Committed as rev. 246808.
>>
>>
> I would have to re--run the build/test manually to get the generated
> code, let me know if it's needed.
Yes, Christophe.  It would be helpful.  I've tried to reproduce it but I 
don't see the difference in the generated code.
Vladimir Makarov April 11, 2017, 5:27 p.m. UTC | #3
On 04/11/2017 11:42 AM, Vladimir Makarov wrote:
>
>
> Yes, Christophe.  It would be helpful.  I've tried to reproduce it but 
> I don't see the difference in the generated code.
>
Never mind.  I've reproduced it.  Thanks.
Vladimir Makarov April 11, 2017, 7:43 p.m. UTC | #4
On 04/11/2017 03:30 AM, Christophe Lyon wrote:
> Hi Vladimir,
>
> On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>    This is the second try to fix
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478
>>
>>    The first try patch triggered a latent bug and broke one Fortran testcase
>> on x86-64.
>>
>>    The patch was successfully bootstrapped on x86-64 and tested on x86-64,
>> ppc64, and aarch64.
>>
>>    Committed as rev. 246808.
>>
>>
> This patch causes regression on arm*hf configurations:
>    Executed from: gcc.target/arm/arm.exp
>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times ldrh\\tr[0-9]+ 2
>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times strh\\tr[0-9]+ 2
>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
> vld1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2
>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
> vmov\\.f16\\tr[0-9]+, s[0-9]+ 4
>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
> vmov\\.f16\\ts[0-9]+, r[0-9]+ 4
>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
> vst1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2
>
>
I've committed a patch which is supposed to fix the regression.
Christophe Lyon April 12, 2017, 6:09 a.m. UTC | #5
On 11 April 2017 at 21:43, Vladimir Makarov <vmakarov@redhat.com> wrote:
>
>
> On 04/11/2017 03:30 AM, Christophe Lyon wrote:
>>
>> Hi Vladimir,
>>
>> On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>>
>>>    This is the second try to fix
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478
>>>
>>>    The first try patch triggered a latent bug and broke one Fortran
>>> testcase
>>> on x86-64.
>>>
>>>    The patch was successfully bootstrapped on x86-64 and tested on
>>> x86-64,
>>> ppc64, and aarch64.
>>>
>>>    Committed as rev. 246808.
>>>
>>>
>> This patch causes regression on arm*hf configurations:
>>    Executed from: gcc.target/arm/arm.exp
>>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
>> ldrh\\tr[0-9]+ 2
>>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
>> strh\\tr[0-9]+ 2
>>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
>> vld1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2
>>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
>> vmov\\.f16\\tr[0-9]+, s[0-9]+ 4
>>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
>> vmov\\.f16\\ts[0-9]+, r[0-9]+ 4
>>      gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times
>> vst1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2
>>
>>
> I've committed a patch which is supposed to fix the regression.
>

I confirm it's now OK. Thanks for the prompt fix!

Christophe
diff mbox

Patch

Index: ChangeLog
===================================================================
--- ChangeLog	(revision 246807)
+++ ChangeLog	(working copy)
@@ -1,3 +1,12 @@ 
+2017-04-10  Vladimir Makarov  <vmakarov@redhat.com>
+
+	PR rtl-optimization/70478
+	* lra-constraints.c (curr_small_class_check): New.
+	(update_and_check_small_class_inputs): New.
+	(process_alt_operands): Update curr_small_class_check.  Disfavor
+	alternative insn memory operands.  Check available regs for small
+	class operands.
+
 2017-03-31  Matthew Fortune  <matthew.fortune@imgtec.com>
 
 	PR target/80057
Index: lra-constraints.c
===================================================================
--- lra-constraints.c	(revision 246789)
+++ lra-constraints.c	(working copy)
@@ -1852,6 +1852,42 @@  prohibited_class_reg_set_mode_p (enum re
 	  (temp, ira_prohibited_class_mode_regs[rclass][mode]));
 }
 
+
+/* Used to check validity info about small class input operands.  It
+   should be incremented at start of processing an insn
+   alternative.  */
+static unsigned int curr_small_class_check = 0;
+
+/* Update number of used inputs of class OP_CLASS for operand NOP.
+   Return true if we have more such class operands than the number of
+   available regs.  */
+static bool
+update_and_check_small_class_inputs (int nop, enum reg_class op_class)
+{
+  static unsigned int small_class_check[LIM_REG_CLASSES];
+  static int small_class_input_nums[LIM_REG_CLASSES];
+  
+  if (SMALL_REGISTER_CLASS_P (op_class)
+      /* We are interesting in classes became small because of fixing
+	 some hard regs, e.g. by an user through GCC options.  */
+      && hard_reg_set_intersect_p (reg_class_contents[op_class],
+				   ira_no_alloc_regs)
+      && (curr_static_id->operand[nop].type != OP_OUT
+	  || curr_static_id->operand[nop].early_clobber))
+    {
+      if (small_class_check[op_class] == curr_small_class_check)
+	small_class_input_nums[op_class]++;
+      else
+	{
+	  small_class_check[op_class] = curr_small_class_check;
+	  small_class_input_nums[op_class] = 1;
+	}
+      if (small_class_input_nums[op_class] > ira_class_hard_regs_num[op_class])
+	return true;
+    }
+  return false;
+}
+
 /* Major function to choose the current insn alternative and what
    operands should be reloaded and how.	 If ONLY_ALTERNATIVE is not
    negative we should consider only this alternative.  Return false if
@@ -1952,6 +1988,7 @@  process_alt_operands (int only_alternati
       if (!TEST_BIT (preferred, nalt))
 	continue;
 
+      curr_small_class_check++;
       overall = losers = addr_losers = 0;
       static_reject = reject = reload_nregs = reload_sum = 0;
       for (nop = 0; nop < n_operands; nop++)
@@ -2685,6 +2722,21 @@  process_alt_operands (int only_alternati
 		    }
 		}
 
+	      /* When we use memory operand, the insn should read the
+		 value from memory and even if we just wrote a value
+		 into the memory it is costly in comparison with an
+		 insn alternative which does not use memory
+		 (e.g. register or immediate operand).  */
+	      if (no_regs_p && offmemok)
+		{
+		  if (lra_dump_file != NULL)
+		    fprintf
+		      (lra_dump_file,
+		       "            Using memory insn operand %d: reject+=3\n",
+		       nop);
+		  reject += 3;
+		}
+	      
 #ifdef SECONDARY_MEMORY_NEEDED
 	      /* If reload requires moving value through secondary
 		 memory, it will need one more insn at least.  */
@@ -2747,6 +2799,14 @@  process_alt_operands (int only_alternati
               goto fail;
             }
 
+	  if (update_and_check_small_class_inputs (nop, this_alternative))
+	    {
+	      if (lra_dump_file != NULL)
+		fprintf (lra_dump_file,
+			 "            alt=%d, not enough small class regs -- refuse\n",
+			 nalt);
+	      goto fail;
+	    }
 	  curr_alt[nop] = this_alternative;
 	  COPY_HARD_REG_SET (curr_alt_set[nop], this_alternative_set);
 	  curr_alt_win[nop] = this_alternative_win;