diff mbox

[RFC] introduce --param max-lto-partition for having an upper bound on partition size

Message ID CAAgBjMkjMcf7XTvBOtnTr-zcKu-rED3WcLY=a4iYoijOw3V3vQ@mail.gmail.com
State New
Headers show

Commit Message

Prathamesh Kulkarni April 1, 2016, 1:48 p.m. UTC
Hi,
The attached patch introduces param max-lto-partition which creates an upper
bound for partition size.

My primary motivation for this patch is to fix building chromium for arm
with -flto-partition=one.
Chromium fails to build with -flto-partition={none, one} with assembler error:
"branch out of range error"
because in both these cases LTO creates a single text section of 18 mb
which exceeds thumb's limit of 16 mb and arm backend emits a short
call if caller and callee are in same section.
This is binutils PR18625:
https://sourceware.org/bugzilla/show_bug.cgi?id=18625
With patch, chromium builds for -flto-partition=one (by creating more
than one but minimal number of partitions to honor 16 mb limit).
I haven't tested with -flto-partition=none but I suppose the build
will still fail for none, because it won't involve partitioning?  I am
not sure how to fix for none case.

As suggested by Jim in binutils PR18625, the proper fix would be to
implement branch relaxation in arm's port of gas, however I suppose
only LTO will realistically create such large sections,
and implementing branch relaxation appears to be quite complicated and
probably too much of
an effort for this single use case, so this patch serves as a
work-around to the issue.
I am looking into fine-tuning the param value for ARM backend to
roughly match limit
of 16 mb.

AFAIU, this would change semantics of --param n_lto_partitions (or
-flto-partition=one) from
"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
not desirable maybe we could add
another param/option ?
Cross-tested on arm*-*-*.
Would this patch be OK for stage-1 (after getting param value right
for ARM target) ?

Thanks,
Prathamesh

Comments

Richard Biener April 1, 2016, 5:32 p.m. UTC | #1
On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
>Hi,
>The attached patch introduces param max-lto-partition which creates an
>upper
>bound for partition size.
>
>My primary motivation for this patch is to fix building chromium for
>arm
>with -flto-partition=one.
>Chromium fails to build with -flto-partition={none, one} with assembler
>error:
>"branch out of range error"
>because in both these cases LTO creates a single text section of 18 mb
>which exceeds thumb's limit of 16 mb and arm backend emits a short
>call if caller and callee are in same section.
>This is binutils PR18625:
>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
>With patch, chromium builds for -flto-partition=one (by creating more
>than one but minimal number of partitions to honor 16 mb limit).
>I haven't tested with -flto-partition=none but I suppose the build
>will still fail for none, because it won't involve partitioning?  I am
>not sure how to fix for none case.
>
>As suggested by Jim in binutils PR18625, the proper fix would be to
>implement branch relaxation in arm's port of gas, however I suppose
>only LTO will realistically create such large sections,
>and implementing branch relaxation appears to be quite complicated and
>probably too much of
>an effort for this single use case, so this patch serves as a
>work-around to the issue.
>I am looking into fine-tuning the param value for ARM backend to
>roughly match limit
>of 16 mb.
>
>AFAIU, this would change semantics of --param n_lto_partitions (or
>-flto-partition=one) from
>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
>not desirable maybe we could add
>another param/option ?
>Cross-tested on arm*-*-*.
>Would this patch be OK for stage-1 (after getting param value right
>for ARM target) ?

What do you want to achieve?  Changing =one semantics doesn't look right to me.
Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).

Richard.

>
>Thanks,
>Prathamesh
Prathamesh Kulkarni April 4, 2016, 7:47 a.m. UTC | #2
On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
> On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
>>Hi,
>>The attached patch introduces param max-lto-partition which creates an
>>upper
>>bound for partition size.
>>
>>My primary motivation for this patch is to fix building chromium for
>>arm
>>with -flto-partition=one.
>>Chromium fails to build with -flto-partition={none, one} with assembler
>>error:
>>"branch out of range error"
>>because in both these cases LTO creates a single text section of 18 mb
>>which exceeds thumb's limit of 16 mb and arm backend emits a short
>>call if caller and callee are in same section.
>>This is binutils PR18625:
>>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
>>With patch, chromium builds for -flto-partition=one (by creating more
>>than one but minimal number of partitions to honor 16 mb limit).
>>I haven't tested with -flto-partition=none but I suppose the build
>>will still fail for none, because it won't involve partitioning?  I am
>>not sure how to fix for none case.
>>
>>As suggested by Jim in binutils PR18625, the proper fix would be to
>>implement branch relaxation in arm's port of gas, however I suppose
>>only LTO will realistically create such large sections,
>>and implementing branch relaxation appears to be quite complicated and
>>probably too much of
>>an effort for this single use case, so this patch serves as a
>>work-around to the issue.
>>I am looking into fine-tuning the param value for ARM backend to
>>roughly match limit
>>of 16 mb.
>>
>>AFAIU, this would change semantics of --param n_lto_partitions (or
>>-flto-partition=one) from
>>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
>>not desirable maybe we could add
>>another param/option ?
>>Cross-tested on arm*-*-*.
>>Would this patch be OK for stage-1 (after getting param value right
>>for ARM target) ?
>
> What do you want to achieve?  Changing =one semantics doesn't look right to me.
> Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
Well, chromium fails to build on ARM with -flto-partition={none, one}
because the size of text section created with LTO,
exceeds the limit of 16 mb for thumb2 which results in assembler
errors: "branch out of range".
I was trying to fix that by creating minimal number of partitions such
that size of each partition is not greater than section size limit.
I suppose in theory the problem could also present with balanced
partitioning if total_size / n_lto_partitions exceeds section size
limit,
although not sure if this will be a practical case.

Thanks,
Prathamesh
>
> Richard.
>
>>
>>Thanks,
>>Prathamesh
>
>
Richard Biener April 4, 2016, 8:26 a.m. UTC | #3
On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:

> On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
> > On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
> >>Hi,
> >>The attached patch introduces param max-lto-partition which creates an
> >>upper
> >>bound for partition size.
> >>
> >>My primary motivation for this patch is to fix building chromium for
> >>arm
> >>with -flto-partition=one.
> >>Chromium fails to build with -flto-partition={none, one} with assembler
> >>error:
> >>"branch out of range error"
> >>because in both these cases LTO creates a single text section of 18 mb
> >>which exceeds thumb's limit of 16 mb and arm backend emits a short
> >>call if caller and callee are in same section.
> >>This is binutils PR18625:
> >>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
> >>With patch, chromium builds for -flto-partition=one (by creating more
> >>than one but minimal number of partitions to honor 16 mb limit).
> >>I haven't tested with -flto-partition=none but I suppose the build
> >>will still fail for none, because it won't involve partitioning?  I am
> >>not sure how to fix for none case.
> >>
> >>As suggested by Jim in binutils PR18625, the proper fix would be to
> >>implement branch relaxation in arm's port of gas, however I suppose
> >>only LTO will realistically create such large sections,
> >>and implementing branch relaxation appears to be quite complicated and
> >>probably too much of
> >>an effort for this single use case, so this patch serves as a
> >>work-around to the issue.
> >>I am looking into fine-tuning the param value for ARM backend to
> >>roughly match limit
> >>of 16 mb.
> >>
> >>AFAIU, this would change semantics of --param n_lto_partitions (or
> >>-flto-partition=one) from
> >>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
> >>not desirable maybe we could add
> >>another param/option ?
> >>Cross-tested on arm*-*-*.
> >>Would this patch be OK for stage-1 (after getting param value right
> >>for ARM target) ?
> >
> > What do you want to achieve?  Changing =one semantics doesn't look right to me.
> > Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
>
> Well, chromium fails to build on ARM with -flto-partition={none, one}
> because the size of text section created with LTO,
> exceeds the limit of 16 mb for thumb2 which results in assembler
> errors: "branch out of range".
> I was trying to fix that by creating minimal number of partitions such
> that size of each partition is not greater than section size limit.

Ok, but you simply shouldn't use -flto-partition={none,one} if it doesn't 
work.  Note that "partition size" and text section size do not have
a 1:1 correspondence so a safe limit is hard to achieve anyway.

> I suppose in theory the problem could also present with balanced
> partitioning if total_size / n_lto_partitions exceeds section size
> limit,
> although not sure if this will be a practical case.

I guess an artificial testcase can easily hit this.  Or you can
hit this by adjusting --param lto-partitions to 1.  I think
adding a --param lto-max-partition is missing given that we
already have a --param lto-min-partition and the partitioning
algorithm tries to create lto-partitions partitions (but not smaller
than lto-min-partition) but it never creates more than lto-partitions
partitions as there is no upper bound on individual partition size.

This is also why lto-partitions has such a high default (to exploit
parallelism - but if there is only a very small number of CPU cores
available it doesn't make sense to split up so much for small programs).

That said, lto-partitions is a hint currently but also an upper bound
because we lack lto-max-partition.  Let's fix that instead.

Richard.
Prathamesh Kulkarni April 4, 2016, 11:19 a.m. UTC | #4
On 4 April 2016 at 13:56, Richard Biener <rguenther@suse.de> wrote:
> On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
>> > On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
>> >>Hi,
>> >>The attached patch introduces param max-lto-partition which creates an
>> >>upper
>> >>bound for partition size.
>> >>
>> >>My primary motivation for this patch is to fix building chromium for
>> >>arm
>> >>with -flto-partition=one.
>> >>Chromium fails to build with -flto-partition={none, one} with assembler
>> >>error:
>> >>"branch out of range error"
>> >>because in both these cases LTO creates a single text section of 18 mb
>> >>which exceeds thumb's limit of 16 mb and arm backend emits a short
>> >>call if caller and callee are in same section.
>> >>This is binutils PR18625:
>> >>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
>> >>With patch, chromium builds for -flto-partition=one (by creating more
>> >>than one but minimal number of partitions to honor 16 mb limit).
>> >>I haven't tested with -flto-partition=none but I suppose the build
>> >>will still fail for none, because it won't involve partitioning?  I am
>> >>not sure how to fix for none case.
>> >>
>> >>As suggested by Jim in binutils PR18625, the proper fix would be to
>> >>implement branch relaxation in arm's port of gas, however I suppose
>> >>only LTO will realistically create such large sections,
>> >>and implementing branch relaxation appears to be quite complicated and
>> >>probably too much of
>> >>an effort for this single use case, so this patch serves as a
>> >>work-around to the issue.
>> >>I am looking into fine-tuning the param value for ARM backend to
>> >>roughly match limit
>> >>of 16 mb.
>> >>
>> >>AFAIU, this would change semantics of --param n_lto_partitions (or
>> >>-flto-partition=one) from
>> >>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
>> >>not desirable maybe we could add
>> >>another param/option ?
>> >>Cross-tested on arm*-*-*.
>> >>Would this patch be OK for stage-1 (after getting param value right
>> >>for ARM target) ?
>> >
>> > What do you want to achieve?  Changing =one semantics doesn't look right to me.
>> > Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
>>
>> Well, chromium fails to build on ARM with -flto-partition={none, one}
>> because the size of text section created with LTO,
>> exceeds the limit of 16 mb for thumb2 which results in assembler
>> errors: "branch out of range".
>> I was trying to fix that by creating minimal number of partitions such
>> that size of each partition is not greater than section size limit.
>
> Ok, but you simply shouldn't use -flto-partition={none,one} if it doesn't
> work.  Note that "partition size" and text section size do not have
> a 1:1 correspondence so a safe limit is hard to achieve anyway.
>
>> I suppose in theory the problem could also present with balanced
>> partitioning if total_size / n_lto_partitions exceeds section size
>> limit,
>> although not sure if this will be a practical case.
>
> I guess an artificial testcase can easily hit this.  Or you can
> hit this by adjusting --param lto-partitions to 1.  I think
> adding a --param lto-max-partition is missing given that we
> already have a --param lto-min-partition and the partitioning
> algorithm tries to create lto-partitions partitions (but not smaller
> than lto-min-partition) but it never creates more than lto-partitions
> partitions as there is no upper bound on individual partition size.
>
> This is also why lto-partitions has such a high default (to exploit
> parallelism - but if there is only a very small number of CPU cores
> available it doesn't make sense to split up so much for small programs).
>
> That said, lto-partitions is a hint currently but also an upper bound
> because we lack lto-max-partition.  Let's fix that instead.
Um not sure if I understood correctly.
Do we want to constrain individual partition size by adding parameter
lto-max-partition
for balanced partitioning but not for -flto-partition=one
case (since latter would also change semantics of =one) ?

Thanks,
Prathamesh
>
> Richard.
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Richard Biener April 4, 2016, 11:42 a.m. UTC | #5
On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:

> On 4 April 2016 at 13:56, Richard Biener <rguenther@suse.de> wrote:
> > On Mon, 4 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 1 April 2016 at 23:02, Richard Biener <rguenther@suse.de> wrote:
> >> > On April 1, 2016 3:48:35 PM GMT+02:00, Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> wrote:
> >> >>Hi,
> >> >>The attached patch introduces param max-lto-partition which creates an
> >> >>upper
> >> >>bound for partition size.
> >> >>
> >> >>My primary motivation for this patch is to fix building chromium for
> >> >>arm
> >> >>with -flto-partition=one.
> >> >>Chromium fails to build with -flto-partition={none, one} with assembler
> >> >>error:
> >> >>"branch out of range error"
> >> >>because in both these cases LTO creates a single text section of 18 mb
> >> >>which exceeds thumb's limit of 16 mb and arm backend emits a short
> >> >>call if caller and callee are in same section.
> >> >>This is binutils PR18625:
> >> >>https://sourceware.org/bugzilla/show_bug.cgi?id=18625
> >> >>With patch, chromium builds for -flto-partition=one (by creating more
> >> >>than one but minimal number of partitions to honor 16 mb limit).
> >> >>I haven't tested with -flto-partition=none but I suppose the build
> >> >>will still fail for none, because it won't involve partitioning?  I am
> >> >>not sure how to fix for none case.
> >> >>
> >> >>As suggested by Jim in binutils PR18625, the proper fix would be to
> >> >>implement branch relaxation in arm's port of gas, however I suppose
> >> >>only LTO will realistically create such large sections,
> >> >>and implementing branch relaxation appears to be quite complicated and
> >> >>probably too much of
> >> >>an effort for this single use case, so this patch serves as a
> >> >>work-around to the issue.
> >> >>I am looking into fine-tuning the param value for ARM backend to
> >> >>roughly match limit
> >> >>of 16 mb.
> >> >>
> >> >>AFAIU, this would change semantics of --param n_lto_partitions (or
> >> >>-flto-partition=one) from
> >> >>"exactly n_lto_partitions" to "at-least n_lto_partitions". If that's
> >> >>not desirable maybe we could add
> >> >>another param/option ?
> >> >>Cross-tested on arm*-*-*.
> >> >>Would this patch be OK for stage-1 (after getting param value right
> >> >>for ARM target) ?
> >> >
> >> > What do you want to achieve?  Changing =one semantics doesn't look right to me.
> >> > Adding a param for maximum size sounds good in general, but only to increase the maximum number of partitions for =balanced (the default).
> >>
> >> Well, chromium fails to build on ARM with -flto-partition={none, one}
> >> because the size of text section created with LTO,
> >> exceeds the limit of 16 mb for thumb2 which results in assembler
> >> errors: "branch out of range".
> >> I was trying to fix that by creating minimal number of partitions such
> >> that size of each partition is not greater than section size limit.
> >
> > Ok, but you simply shouldn't use -flto-partition={none,one} if it doesn't
> > work.  Note that "partition size" and text section size do not have
> > a 1:1 correspondence so a safe limit is hard to achieve anyway.
> >
> >> I suppose in theory the problem could also present with balanced
> >> partitioning if total_size / n_lto_partitions exceeds section size
> >> limit,
> >> although not sure if this will be a practical case.
> >
> > I guess an artificial testcase can easily hit this.  Or you can
> > hit this by adjusting --param lto-partitions to 1.  I think
> > adding a --param lto-max-partition is missing given that we
> > already have a --param lto-min-partition and the partitioning
> > algorithm tries to create lto-partitions partitions (but not smaller
> > than lto-min-partition) but it never creates more than lto-partitions
> > partitions as there is no upper bound on individual partition size.
> >
> > This is also why lto-partitions has such a high default (to exploit
> > parallelism - but if there is only a very small number of CPU cores
> > available it doesn't make sense to split up so much for small programs).
> >
> > That said, lto-partitions is a hint currently but also an upper bound
> > because we lack lto-max-partition.  Let's fix that instead.
> Um not sure if I understood correctly.
> Do we want to constrain individual partition size by adding parameter
> lto-max-partition
> for balanced partitioning but not for -flto-partition=one
> case (since latter would also change semantics of =one) ?

Yes, I think so.

Richard.

> Thanks,
> Prathamesh
> >
> > Richard.
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> 
>
Jan Hubicka April 4, 2016, noon UTC | #6
> > Um not sure if I understood correctly.
> > Do we want to constrain individual partition size by adding parameter
> > lto-max-partition
> > for balanced partitioning but not for -flto-partition=one
> > case (since latter would also change semantics of =one) ?
> 
> Yes, I think so.

Yep, I agree.  Having partition one that produces multiple partitions doesn't seem to make much sense.
Given that we have such not so predictable target specific limits on size of single translation unit
we can handle, I suppose adding a resonable limit to the default balanced partitioning makes more sense.
One can always push the behaviour you intend by setting max partitions to 1 (I suppose max size should
have precedence over max partitions)

Honza
> 
> Richard.
> 
> > Thanks,
> > Prathamesh
> > >
> > > Richard.
> > >
> > > --
> > > Richard Biener <rguenther@suse.de>
> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
> > 
> > 
> 
> -- 
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
diff mbox

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c868490..f734d56 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3459,6 +3459,11 @@  arm_option_override (void)
 
   /* Init initial mode for testing.  */
   thumb_flipper = TARGET_THUMB;
+
+    maybe_set_param_value (MAX_PARTITION_SIZE,
+			  10000, /* FIXME: fine-tune this value to roughly match 16 mb limit.  */
+                           global_options.x_param_values,
+			   global_options_set.x_param_values);
 }
 
 static void
diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 9eb63c2..bc0c612 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -511,9 +511,20 @@  lto_balanced_map (int n_lto_partitions)
   varpool_order.qsort (varpool_node_cmp);
 
   /* Compute partition size and create the first partition.  */
+  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
+    fatal_error (input_location, "min partition size cannot be greater than max partition size");
+
   partition_size = total_size / n_lto_partitions;
   if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
     partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
+  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
+    {
+      n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
+      if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
+	n_lto_partitions++;
+      partition_size = total_size / n_lto_partitions;
+    }
+
   npartitions = 1;
   partition = new_partition ("");
   if (symtab->dump_file)
diff --git a/gcc/params.def b/gcc/params.def
index 9362c15..b6055ff 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1029,6 +1029,11 @@  DEFPARAM (MIN_PARTITION_SIZE,
 	  "Minimal size of a partition for LTO (in estimated instructions).",
 	  1000, 0, 0)
 
+DEFPARAM (MAX_PARTITION_SIZE,
+	  "lto-max-partition",
+	  "Maximal size of a partition for LTO (in estimated instructions).",
+	  INT_MAX, 0, INT_MAX)
+
 /* Diagnostic parameters.  */
 
 DEFPARAM (CXX_MAX_NAMESPACES_FOR_DIAGNOSTIC_HELP,