diff mbox series

[V2] rs6000: re-enable web and rnreg with -funroll-loops

Message ID h48fthbtqa0.fsf_-_@genoa.aus.stglabs.ibm.com
State New
Headers show
Series [V2] rs6000: re-enable web and rnreg with -funroll-loops | expand

Commit Message

Jiufu Guo Dec. 23, 2019, 8:11 a.m. UTC
Jiufu Guo <guojiufu@linux.ibm.com> writes:

> Segher Boessenkool <segher@kernel.crashing.org> writes:
>
>> Hi!
>>
>> On Fri, Dec 20, 2019 at 02:34:05PM +0800, Jiufu Guo wrote:
>>> Previously, limited unrolling was enabled at O2 for powerpc in r278034.  At that
>>> time, -fweb and -frename-registers were not enabled together with -funroll-loops
>>> even for -O3.  After that, we notice there are some performance degradation on
>>> SPEC2006fp which caused by without web and rnreg.
>>
>> And 2017 was fine on all tests.  Annoying :-(
>>
>>> This patch enable -fweb
>>> and -frename-registers for -O3 to align original behavior before r278034.
>>
>> Okay.  Hopefully we can find a way to determine in what circumstances web
>> and rnreg help instead of hurt, but until then, the old behaviour is
>> certainly the safe choice.
>>
>>> --- a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c
>>> +++ b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-1.c
>>> @@ -2,6 +2,7 @@
>>>  /* Originator: Andrew Church <gcczilla@achurch.org> */
>>>  /* { dg-do run } */
>>>  /* { dg-require-effective-target untyped_assembly } */
>>> +/* { dg-additional-options "-fno-rename-registers" { target { powerpc*-*-* } } } */
>>
>> What is this for?  What happens without it?
> The reason of this fail is: -frename-registers does not work well with
> __builtin_return/__builtin_apply which need to save and restore
> registers which could not be renamed incorrectly.
>
> When this case runs with -O3, with this patch, -frename-registers is
> enabled. Originally, -frename-registers is enabled with -funroll-loops
> instead pure -O3. This change cause this case fail at -O3.
>

To align with original behavior better, I updated the patch and attached
it at the end of this mail.
The updated patch also pass bootstrap and regtests.

Is this patch ok for trunk?

Thanks,
Jiufu

>>
>> The rs6000/ parts are okay for trunk.  Thanks!
>>
>>
>> Segher

gcc/
2019-12-23  Jiufu Guo  <guojiufu@linux.ibm.com>

	* gcc/config/rs6000/rs6000.c
	(rs6000_option_override_internal): Enable -fweb and -frename-registers
	with -funroll-loops

---
 gcc/config/rs6000/rs6000.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Comments

Segher Boessenkool Dec. 23, 2019, 5:07 p.m. UTC | #1
On Mon, Dec 23, 2019 at 04:11:35PM +0800, Jiufu Guo wrote:
> To align with original behavior better, I updated the patch and attached
> it at the end of this mail.
> The updated patch also pass bootstrap and regtests.
> 
> Is this patch ok for trunk?

If this performs well, okay for trunk.  Thanks!


Segher


> 2019-12-23  Jiufu Guo  <guojiufu@linux.ibm.com>
> 
> 	* gcc/config/rs6000/rs6000.c
> 	(rs6000_option_override_internal): Enable -fweb and -frename-registers
> 	with -funroll-loops
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 23b6d99..dfba6b4 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4538,12 +4538,19 @@  rs6000_option_override_internal (bool global_init_p)
 			   param_sched_pressure_algorithm,
 			   SCHED_PRESSURE_MODEL);
 
-      /* Explicit -funroll-loops turns -munroll-only-small-loops off.  */
-      if (((global_options_set.x_flag_unroll_loops && flag_unroll_loops)
+      /* Explicit -funroll-loops turns -munroll-only-small-loops off, and
+	 turns -fweb and -frename-registers on.  */
+      if ((global_options_set.x_flag_unroll_loops && flag_unroll_loops)
 	   || (global_options_set.x_flag_unroll_all_loops
 	       && flag_unroll_all_loops))
-	  && !global_options_set.x_unroll_only_small_loops)
-	unroll_only_small_loops = 0;
+	{
+	  if (!global_options_set.x_unroll_only_small_loops)
+	    unroll_only_small_loops = 0;
+	  if (!global_options_set.x_flag_rename_registers)
+	    flag_rename_registers = 1;
+	  if (!global_options_set.x_flag_web)
+	    flag_web = 1;
+	}
 
       /* If using typedef char *va_list, signal that
 	 __builtin_va_start (&ap, 0) can be optimized to