Patchwork Call GNU ld with -O*

login
register
mail settings
Submitter Marc Glisse
Date July 12, 2013, 2:54 p.m.
Message ID <alpine.DEB.2.02.1307121633360.28978@stedding.saclay.inria.fr>
Download mbox | patch
Permalink /patch/258759/
State New
Headers show

Comments

Marc Glisse - July 12, 2013, 2:54 p.m.
Hello,

this patch makes the driver pass -O2 to GNU ld if gcc was called with -O3 
(or more), -Ofast or -Os. Is there a cleaner way to test (optimize > 2 || 
optimize_size) in the driver?

I did a bootstrap on x86_64-unknown-linux-gnu and then manually tested by 
looking at the -v output.

I tried a -O3 bootstrap, but even setting CFLAGS, BOOT_CFLAGS, 
CFLAGS_FOT_TARGET, and the same 3 versions with CXXFLAGS and LDFLAGS, most 
stuff still used the default -O2 -g. And most libraries that are part of 
gcc explicitly use -Wl,-O1 for linking already.

2013-07-12  Marc Glisse  <marc.glisse@inria.fr>

 	PR driver/44080
 	* gcc.c (LINK_OPT_SPEC): New macro.
 	(LINK_COMMAND_SPEC): Use it.
Jakub Jelinek - July 12, 2013, 3:27 p.m.
On Fri, Jul 12, 2013 at 04:54:21PM +0200, Marc Glisse wrote:
> +/* GNU ld has -O1 and gold -O2, but we only pass it with -O3, -Os or -Ofast. */
> +#ifndef LINK_OPT_SPEC
> +#if HAVE_GNU_LD
> +#define LINK_OPT_SPEC "%{O*:%{!O0:%{!O1:%{!O2:%{!Og:%{!O:-O2}}}}}} "

Wouldn't something like "%{O*:%{O|O0|O1|O2|Og:;:-O2}}" work?

	Jakub
Thomas Schwinge - July 12, 2013, 4:49 p.m.
Hi!

On Fri, 12 Jul 2013 16:54:21 +0200 (CEST), Marc Glisse <marc.glisse@inria.fr> wrote:
> this patch makes the driver pass -O2 to GNU ld if gcc was called with -O3 
> (or more), -Ofast or -Os.

I wondered about this before: why are no -O flags are passed to the (GNU)
linker.

What's the rationale for the mapping you propose of GCC and linker
optimization flags?  Why not a one-to-one mapping?


Grüße,
 Thomas
Jakub Jelinek - July 12, 2013, 4:53 p.m.
On Fri, Jul 12, 2013 at 09:49:28AM -0700, Thomas Schwinge wrote:
> Hi!
> 
> On Fri, 12 Jul 2013 16:54:21 +0200 (CEST), Marc Glisse <marc.glisse@inria.fr> wrote:
> > this patch makes the driver pass -O2 to GNU ld if gcc was called with -O3 
> > (or more), -Ofast or -Os.
> 
> I wondered about this before: why are no -O flags are passed to the (GNU)
> linker.
> 
> What's the rationale for the mapping you propose of GCC and linker
> optimization flags?  Why not a one-to-one mapping?

Because there is no 1:1 mapping.  I bet ld doesn't grok -Os or -Ofast or
-Og, and what -O* does in the linker doesn't have direct correspondence to
what -O* does in the compiler.  Last time I've looked, -O* in the linker
just enabled some link time expensive computation of the hash table,
so enabling -O in the linker for -O1 in the compiler is definitely not a
good idea, similarly for -Og, those definitely want speed rather than better
hash table.

	Jakub
Thomas Schwinge - July 12, 2013, 5:02 p.m.
Hi!

On Fri, 12 Jul 2013 18:53:58 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Jul 12, 2013 at 09:49:28AM -0700, Thomas Schwinge wrote:
> > On Fri, 12 Jul 2013 16:54:21 +0200 (CEST), Marc Glisse <marc.glisse@inria.fr> wrote:
> > > this patch makes the driver pass -O2 to GNU ld if gcc was called with -O3 
> > > (or more), -Ofast or -Os.
> > 
> > I wondered about this before: why are no -O flags are passed to the (GNU)
> > linker.
> > 
> > What's the rationale for the mapping you propose of GCC and linker
> > optimization flags?  Why not a one-to-one mapping?
> 
> Because there is no 1:1 mapping.  I bet ld doesn't grok -Os or -Ofast or
> -Og, and what -O* does in the linker doesn't have direct correspondence to
> what -O* does in the compiler.  Last time I've looked, -O* in the linker
> just enabled some link time expensive computation of the hash table,
> so enabling -O in the linker for -O1 in the compiler is definitely not a
> good idea, similarly for -Og, those definitely want speed rather than better
> hash table.

OK, there currently is no one-to-one mapping -- but why not establish
one?  Then the linker would enable the expensive optimizations only for
-O2 and higher -- which would break backwards compatibility, bummer...
(Though, I could see potential uses in the linker for -Ofast and -Og,
too.)


Grüße,
 Thomas
Ian Taylor - July 12, 2013, 6:15 p.m.
On Fri, Jul 12, 2013 at 7:54 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>
> this patch makes the driver pass -O2 to GNU ld if gcc was called with -O3
> (or more), -Ofast or -Os.

I don't understand why that is a good idea.  The linker -O option is
only distantly related to the compiler -O option.  It was probably a
mistake to have a linker -O option at all.

Ian
Marc Glisse - July 12, 2013, 6:21 p.m.
On Fri, 12 Jul 2013, Ian Lance Taylor wrote:

> On Fri, Jul 12, 2013 at 7:54 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>
>> this patch makes the driver pass -O2 to GNU ld if gcc was called with -O3
>> (or more), -Ofast or -Os.
>
> I don't understand why that is a good idea.

Well, you thought it was 3 years ago ;-)
http://gcc.gnu.org/ml/gcc/2010-05/msg00193.html

> The linker -O option is only distantly related to the compiler -O 
> option.  It was probably a mistake to have a linker -O option at all.

Doesn't the linker produce something that is faster to link and/or 
smaller, with this flag?
Ian Taylor - July 12, 2013, 7:25 p.m.
On Fri, Jul 12, 2013 at 11:21 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 12 Jul 2013, Ian Lance Taylor wrote:
>
>> On Fri, Jul 12, 2013 at 7:54 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>>
>>> this patch makes the driver pass -O2 to GNU ld if gcc was called with -O3
>>> (or more), -Ofast or -Os.
>>
>>
>> I don't understand why that is a good idea.
>
>
> Well, you thought it was 3 years ago ;-)
> http://gcc.gnu.org/ml/gcc/2010-05/msg00193.html

Ah well, coming to it fresh I'm not sure it is.


>> The linker -O option is only distantly related to the compiler -O option.
>> It was probably a mistake to have a linker -O option at all.
>
>
> Doesn't the linker produce something that is faster to link and/or smaller,
> with this flag?

For gold I think it has two effects.  If you use compressed debug
sections, it will compress them with zlib level 9 rather than 1.  If
you use -O2 or greater it will optimize string tables (e.g., the table
holding the names of symbols) such that if the table stores two
strings S1 and S2, and S2 is a suffix of S1, it will store only S1,
and change pointers to S2 to point to the suffix of S1.  So you are
correct that the result will be smaller, but not faster to link.

People routinely compile with -O or -O2.  It's not clear to me that
they should routinely link with -O or -O2.  These decisions are not
the default in the linker because they aren't good tradeoffs for most
people.

Ian
Jakub Jelinek - July 12, 2013, 7:30 p.m.
On Fri, Jul 12, 2013 at 12:25:16PM -0700, Ian Lance Taylor wrote:
> For gold I think it has two effects.  If you use compressed debug
> sections, it will compress them with zlib level 9 rather than 1.  If

Marc's patch enabled it only for -O3/-Ofast (which are already compile time
expensive options, thus it perhaps it doesn't hurt that much to spend extra
time in the linker too) and -Os (then you are really looking for small,
and if ld -O2 provides that, then perhaps it is desirable too).

> you use -O2 or greater it will optimize string tables (e.g., the table
> holding the names of symbols) such that if the table stores two
> strings S1 and S2, and S2 is a suffix of S1, it will store only S1,
> and change pointers to S2 to point to the suffix of S1.  So you are
> correct that the result will be smaller, but not faster to link.

Oh, gold doesn't do this by default?  ld.bfd does (unless it changed since
I've implemented many years ago).

	Jakub
Ian Taylor - July 12, 2013, 7:41 p.m.
On Fri, Jul 12, 2013 at 12:30 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Jul 12, 2013 at 12:25:16PM -0700, Ian Lance Taylor wrote:
>> For gold I think it has two effects.  If you use compressed debug
>> sections, it will compress them with zlib level 9 rather than 1.  If
>
> Marc's patch enabled it only for -O3/-Ofast (which are already compile time
> expensive options, thus it perhaps it doesn't hurt that much to spend extra
> time in the linker too) and -Os (then you are really looking for small,
> and if ld -O2 provides that, then perhaps it is desirable too).

OK, let me put it this way: perhaps there is some set of linker
options that we should enable by default when linking with -O.  But I
don't see any particular reason that they are specifically the linker
options that are selected by -O.


>> you use -O2 or greater it will optimize string tables (e.g., the table
>> holding the names of symbols) such that if the table stores two
>> strings S1 and S2, and S2 is a suffix of S1, it will store only S1,
>> and change pointers to S2 to point to the suffix of S1.  So you are
>> correct that the result will be smaller, but not faster to link.
>
> Oh, gold doesn't do this by default?  ld.bfd does (unless it changed since
> I've implemented many years ago).

Correct: gold does not do it by default.  For us link time matters
most for large C++ programs.  For those programs, the optimization
only fired a tiny percentage of the time, but it took something like
4% of total link time.  It wasn't worth it by default.

Ian
Marc Glisse - July 12, 2013, 8:12 p.m.
On Fri, 12 Jul 2013, Ian Lance Taylor wrote:

> On Fri, Jul 12, 2013 at 12:30 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Fri, Jul 12, 2013 at 12:25:16PM -0700, Ian Lance Taylor wrote:
>>> For gold I think it has two effects.  If you use compressed debug
>>> sections, it will compress them with zlib level 9 rather than 1.  If
>>
>> Marc's patch enabled it only for -O3/-Ofast (which are already compile time
>> expensive options, thus it perhaps it doesn't hurt that much to spend extra
>> time in the linker too) and -Os (then you are really looking for small,
>> and if ld -O2 provides that, then perhaps it is desirable too).
>
> OK, let me put it this way: perhaps there is some set of linker
> options that we should enable by default when linking with -O.  But I
> don't see any particular reason that they are specifically the linker
> options that are selected by -O.

I completely agree with this. We could pass --compress-level=9 
--merge-string-suffix --optimize-hashtables and a number of others (names 
invented). -O2 just happens to include things (both for bfd and gold, but 
the original motivation was bfd) that to me make sense to enable when 
linking with gcc -O3 (highest level, not the day-to-day fast compilation). 
If you think they don't, then let's reject this patch. Do you have any 
suggestions for linker flags that we could pass for high optimization 
levels where we don't mind if the linker takes a bit longer? (I guess that 
if an option was suitable for gcc -O2, you might have made it the default 
in the linker, hence my focus on -O3)
Michael Matz - July 15, 2013, 2:48 p.m.
Hi,

On Fri, 12 Jul 2013, Ian Lance Taylor wrote:

> >> It was probably a mistake to have a linker -O option at all.
> >
> >
> > Doesn't the linker produce something that is faster to link and/or smaller,
> > with this flag?
> 
> For gold I think it has two effects.  If you use compressed debug
> sections, it will compress them with zlib level 9 rather than 1.  If
> you use -O2 or greater it will optimize string tables (e.g., the table
> holding the names of symbols) such that if the table stores two
> strings S1 and S2, and S2 is a suffix of S1, it will store only S1,
> and change pointers to S2 to point to the suffix of S1.  So you are
> correct that the result will be smaller, but not faster to link.

For bfd ld it has only one effect (except for alpha), namely to try to use 
shorter chains for the ELF hash buckets (string table optimization will be 
done unconditionally).  The resulting objects are a slight bit faster to 
link (I don't think I could ever actually measure any improvement, 
though), and since the introduction of .gnu.hash the whole dance is 
useless.

So, I'd also argue against calling the linker with -O or -O2 except if any 
interesting speedup can be demonstrated.


Ciao,
Michael.

Patch

Index: gcc.c

===================================================================
--- gcc.c	(revision 200921)

+++ gcc.c	(working copy)

@@ -715,36 +715,45 @@  proper position among the other output f

 /* Linker command line options for -fsanitize= late on the command line.  */
 #ifndef SANITIZER_SPEC
 #define SANITIZER_SPEC "\
 %{!nostdlib:%{!nodefaultlibs:%{fsanitize=address:" LIBASAN_SPEC "\
     %{static:%ecannot specify -static with -fsanitize=address}\
     %{fsanitize=thread:%e-fsanitize=address is incompatible with -fsanitize=thread}}\
     %{fsanitize=thread:" LIBTSAN_SPEC "\
     %{!pie:%{!shared:%e-fsanitize=thread linking must be done with -pie or -shared}}}}}"
 #endif
 
+/* GNU ld has -O1 and gold -O2, but we only pass it with -O3, -Os or -Ofast. */

+#ifndef LINK_OPT_SPEC

+#if HAVE_GNU_LD

+#define LINK_OPT_SPEC "%{O*:%{!O0:%{!O1:%{!O2:%{!Og:%{!O:-O2}}}}}} "

+#else

+#define LINK_OPT_SPEC ""

+#endif

+#endif

+

 /* -u* was put back because both BSD and SysV seem to support it.  */
 /* %{static:} simply prevents an error message if the target machine
    doesn't handle -static.  */
 /* We want %{T*} after %{L*} and %D so that it can be used to specify linker
    scripts which exist in user specified directories, or in standard
    directories.  */
 /* We pass any -flto flags on to the linker, which is expected
    to understand them.  In practice, this means it had better be collect2.  */
 /* %{e*} includes -export-dynamic; see comment in common.opt.  */
 #ifndef LINK_COMMAND_SPEC
 #define LINK_COMMAND_SPEC "\
 %{!fsyntax-only:%{!c:%{!M:%{!MM:%{!E:%{!S:\
     %(linker) " \
     LINK_PLUGIN_SPEC \
    "%{flto|flto=*:%<fcompare-debug*} \
-    %{flto} %{flto=*} %l " LINK_PIE_SPEC \

+    %{flto} %{flto=*} %l " LINK_PIE_SPEC LINK_OPT_SPEC \

    "%{fuse-ld=*:-fuse-ld=%*}\
     %X %{o*} %{e*} %{N} %{n} %{r}\
     %{s} %{t} %{u*} %{z} %{Z} %{!nostdlib:%{!nostartfiles:%S}}\
     %{static:} %{L*} %(mfwrap) %(link_libgcc) " SANITIZER_EARLY_SPEC " %o\
     %{fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
     %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
     %(mflib) " STACK_SPLIT_SPEC "\
     %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
     %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
     %{!nostdlib:%{!nostartfiles:%E}} %{T*} }}}}}}"