diff mbox

[RFC:,AArch64] Parametrically set defaults for function and jump alignment

Message ID 1415961336-1986-1-git-send-email-james.greenhalgh@arm.com
State New
Headers show

Commit Message

James Greenhalgh Nov. 14, 2014, 10:35 a.m. UTC
Hi,

We currently do not set any interesting default values for jump and function
alignment in AArch64. I've made the formula for these values derive from
the issue rate of the processor as so:

  jumps: 4 * processor issue-rate (rounded down to nearest power of two)
  functions: 4 * processor issue-rate (rounded up to nearest power of two)

This is sensible for the ARMv8-a implementations I tested on. An
alternative patch would make these values new fields in the tuning
tables.

This happens to work well for some benchmarks and doesn't harm others.
The benefit swings depending on the existing alignment and the knock-on
effects.

Bootstrapped on aarch64-none-linux-gnu with no issues.

Does anyone have any thoughts or preferences as to how we set these
values in future? If not, OK For trunk?

Thanks,
James

---
2014-11-14  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64.c (aarch64_override_options): Set default
	alignments for functions and jumps.

Comments

Andrew Pinski Nov. 14, 2014, 10:42 a.m. UTC | #1
On Fri, Nov 14, 2014 at 2:35 AM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
>
> Hi,
>
> We currently do not set any interesting default values for jump and function
> alignment in AArch64. I've made the formula for these values derive from
> the issue rate of the processor as so:
>
>   jumps: 4 * processor issue-rate (rounded down to nearest power of two)
>   functions: 4 * processor issue-rate (rounded up to nearest power of two)
>
> This is sensible for the ARMv8-a implementations I tested on. An
> alternative patch would make these values new fields in the tuning
> tables.

I had submitted an alternative patch a few hours ago which allows the
tuning structure say what alignment is wanted for all three:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01615.html

>
> This happens to work well for some benchmarks and doesn't harm others.
> The benefit swings depending on the existing alignment and the knock-on
> effects.
>
> Bootstrapped on aarch64-none-linux-gnu with no issues.
>
> Does anyone have any thoughts or preferences as to how we set these
> values in future? If not, OK For trunk?

Also I notice you don't align loops, was that an oversight or just you
did not think it was needed?

Thanks,
Andrew


>
> Thanks,
> James
>
> ---
> 2014-11-14  James Greenhalgh  <james.greenhalgh@arm.com>
>
>         * config/aarch64/aarch64.c (aarch64_override_options): Set default
>         alignments for functions and jumps.
James Greenhalgh Nov. 14, 2014, 10:50 a.m. UTC | #2
On Fri, Nov 14, 2014 at 10:42:27AM +0000, Andrew Pinski wrote:
> On Fri, Nov 14, 2014 at 2:35 AM, James Greenhalgh
> <james.greenhalgh@arm.com> wrote:
> >
> > Hi,
> >
> > We currently do not set any interesting default values for jump and function
> > alignment in AArch64. I've made the formula for these values derive from
> > the issue rate of the processor as so:
> >
> >   jumps: 4 * processor issue-rate (rounded down to nearest power of two)
> >   functions: 4 * processor issue-rate (rounded up to nearest power of two)
> >
> > This is sensible for the ARMv8-a implementations I tested on. An
> > alternative patch would make these values new fields in the tuning
> > tables.
> 
> I had submitted an alternative patch a few hours ago which allows the
> tuning structure say what alignment is wanted for all three:
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01615.html

D'oh! I should have flicked through gcc-patches before hitting send!
I imagine I'm encoding similar logic to that you used when writing
this patch.

I'm happy with either approach, so I'll leave it to the maintainers to
decide which they prefer.

> >
> > This happens to work well for some benchmarks and doesn't harm others.
> > The benefit swings depending on the existing alignment and the knock-on
> > effects.
> >
> > Bootstrapped on aarch64-none-linux-gnu with no issues.
> >
> > Does anyone have any thoughts or preferences as to how we set these
> > values in future? If not, OK For trunk?
> 
> Also I notice you don't align loops, was that an oversight or just you
> did not think it was needed?

I didn't think it was needed, and saw some performance issues when
benchmarking with it on. I found these parameters to be very fickle
in the way they change performance. It is easy to end up with a
bunch of cold loops needlessly padding out the binary and therefore
the cache.

It might be that hard-coding loop alignment to 8 for all cores is
sensible, but I don't have data either way.

Cheers,
James

> > ---
> > 2014-11-14  James Greenhalgh  <james.greenhalgh@arm.com>
> >
> >         * config/aarch64/aarch64.c (aarch64_override_options): Set default
> >         alignments for functions and jumps.
>
Marcus Shawcroft Nov. 14, 2014, 11:11 a.m. UTC | #3
On 14 November 2014 10:50, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> On Fri, Nov 14, 2014 at 10:42:27AM +0000, Andrew Pinski wrote:
>> On Fri, Nov 14, 2014 at 2:35 AM, James Greenhalgh
>> <james.greenhalgh@arm.com> wrote:
>> >
>> > Hi,
>> >
>> > We currently do not set any interesting default values for jump and function
>> > alignment in AArch64. I've made the formula for these values derive from
>> > the issue rate of the processor as so:
>> >
>> >   jumps: 4 * processor issue-rate (rounded down to nearest power of two)
>> >   functions: 4 * processor issue-rate (rounded up to nearest power of two)
>> >
>> > This is sensible for the ARMv8-a implementations I tested on. An
>> > alternative patch would make these values new fields in the tuning
>> > tables.
>>
>> I had submitted an alternative patch a few hours ago which allows the
>> tuning structure say what alignment is wanted for all three:
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01615.html
>
> D'oh! I should have flicked through gcc-patches before hitting send!
> I imagine I'm encoding similar logic to that you used when writing
> this patch.
>
> I'm happy with either approach, so I'll leave it to the maintainers to
> decide which they prefer.

I think Andrews approach of making it adjustable per core makes sense.
Andrew can you split the alignment part of your patch from the fusion
part of your patch?

Cheers
/Marcus
Ramana Radhakrishnan Nov. 14, 2014, 11:31 a.m. UTC | #4
On Fri, Nov 14, 2014 at 11:11 AM, Marcus Shawcroft
<marcus.shawcroft@gmail.com> wrote:
> On 14 November 2014 10:50, James Greenhalgh <james.greenhalgh@arm.com> wrote:
>> On Fri, Nov 14, 2014 at 10:42:27AM +0000, Andrew Pinski wrote:
>>> On Fri, Nov 14, 2014 at 2:35 AM, James Greenhalgh
>>> <james.greenhalgh@arm.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > We currently do not set any interesting default values for jump and function
>>> > alignment in AArch64. I've made the formula for these values derive from
>>> > the issue rate of the processor as so:
>>> >
>>> >   jumps: 4 * processor issue-rate (rounded down to nearest power of two)
>>> >   functions: 4 * processor issue-rate (rounded up to nearest power of two)
>>> >
>>> > This is sensible for the ARMv8-a implementations I tested on. An
>>> > alternative patch would make these values new fields in the tuning
>>> > tables.
>>>
>>> I had submitted an alternative patch a few hours ago which allows the
>>> tuning structure say what alignment is wanted for all three:
>>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01615.html
>>
>> D'oh! I should have flicked through gcc-patches before hitting send!
>> I imagine I'm encoding similar logic to that you used when writing
>> this patch.
>>
>> I'm happy with either approach, so I'll leave it to the maintainers to
>> decide which they prefer.
>
> I think Andrews approach of making it adjustable per core makes sense.
> Andrew can you split the alignment part of your patch from the fusion
> part of your patch?

FWIW, I agree parameterizing this off issue width appears an interesting metric
but having a handle on these independently would probably be more
useful in the long run. And I think it would make sense for generic to have
8 bytes alignment where sensible.

I'm not sure about align-loops from benchmarking experiments in AArch32 land
a couple of years ago. IIRC it was very hard to stop the compiler from
just willy
nilly inserting too many nops. The situation in A64 may be different to this but
doesn't sound like it from James's experimentation so far.

Ramana

>
> Cheers
> /Marcus
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d4a8a2f..6b51885 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6493,6 +6493,32 @@  aarch64_override_options (void)
 #endif
     }
 
+  /* If we haven't been asked for any particular alignment for loops, jumps
+     and functions, choose defaults for the user.  We pick an alignment
+     which is word-size * issue-rate, rounded up to the nearest power of
+     two up for functions and down to the nearest power of two for
+     jumps.  */
+  if (!optimize_size)
+    {
+      if (align_jumps <= 0)
+	{
+	  /* Ideally, we want to be aligned to a block at least as large
+	     as the issue width of the processor, but too much padding
+	     risks wasting cache space.  Settle for the nearest power
+	     of below what we wanted.  */
+	  align_jumps = aarch64_tune_params->issue_rate * 4;
+	  align_jumps = 1 << floor_log2 (align_jumps);
+	}
+      if (align_functions <= 0)
+	{
+	  /* We want to be aligned to a block at least as large as the issue
+	     width of the processor.  */
+	  align_functions = aarch64_tune_params->issue_rate * 4;
+	  /* Round up to the nearest power of two.  */
+	  align_functions = 1 << ceil_log2 (align_functions);
+	}
+    }
+
   aarch64_override_options_after_change ();
 }