Patchwork Fix PR 51910, -frepo/linker demangling interaction

login
register
mail settings
Submitter Sandra Loosemore
Date Jan. 30, 2012, 12:08 a.m.
Message ID <4F25DF6A.3060608@codesourcery.com>
Download mbox | patch
Permalink /patch/138486/
State New
Headers show

Comments

Sandra Loosemore - Jan. 30, 2012, 12:08 a.m.
[Ooops, resending to include the attachment, this time....]

As discussed in PR c++/51910, my patch from last summer

http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01368.html

to make HAVE_LD_DEMANGLE the default when using GNU ld exposed a bug in 
collect2's link-time -frepo handling -- it depends on the linker passing 
back mangled names of undefined symbols to collect2.  This patch fixes 
that by explicitly adding --no-demangle to the options passed to the 
linker in the repository-processing loop.  Then the original settings 
are restored for the final link, so that (for instance) if the user 
requested a map file or if there are messages for real link errors, they 
respect the user-specified demangling options.  This patch therefore 
adds two extra link steps to the case where HAVE_LD_DEMANGLE is defined, 
there is repository information, and recompilation is necessary to 
resolve undefined symbols from the repository.  In other cases it adds 
no overhead.

Bootstrapped and regression-tested on i636 linux.  OK to check in?

-Sandra


2012-01-29  Sandra Loosemore <sandra@codesourcery.com>
	    Jason Merrill <jason@redhat.com>
	    Jakub Jelinek <jakub@redhat.com>

	PR c++/51910
	gcc/
	* tlink.c (do_tlink): Explicitly pass --no-demangle to linker
	for repo processing when HAVE_LD_DEMANGLE is defined.

	gcc/testsuite/
	* g++.dg/torture/pr51910.C: New testcase.
Jakub Jelinek - Jan. 30, 2012, 2:23 p.m.
On Sun, Jan 29, 2012 at 05:08:10PM -0700, Sandra Loosemore wrote:
> Bootstrapped and regression-tested on i636 linux.  OK to check in?

I think that is terribly expensive fix, for larger projects a single
ld invocation can take several minutes.  What exactly are advantages
of using the linker demangling as opposed to collect2 doing the
demangling as before?  Does it really justify penalizing -frepo
that much?

	Jakub
Jason Merrill - Jan. 30, 2012, 3:42 p.m.
On 01/30/2012 09:23 AM, Jakub Jelinek wrote:
> On Sun, Jan 29, 2012 at 05:08:10PM -0700, Sandra Loosemore wrote:
>> Bootstrapped and regression-tested on i636 linux.  OK to check in?
>
> I think that is terribly expensive fix, for larger projects a single
> ld invocation can take several minutes.

Agreed.  What if we always look for .rpo files, and then disable linker 
demangling if we find them?

Jason
Jakub Jelinek - Jan. 30, 2012, 4:11 p.m.
On Mon, Jan 30, 2012 at 10:42:13AM -0500, Jason Merrill wrote:
> On 01/30/2012 09:23 AM, Jakub Jelinek wrote:
> >On Sun, Jan 29, 2012 at 05:08:10PM -0700, Sandra Loosemore wrote:
> >>Bootstrapped and regression-tested on i636 linux.  OK to check in?
> >
> >I think that is terribly expensive fix, for larger projects a single
> >ld invocation can take several minutes.
> 
> Agreed.  What if we always look for .rpo files, and then disable
> linker demangling if we find them?

The *.rpo discovery can be many thousands of syscalls too for larger
links and I think -frepo is quite rarely used.

Can't tlink just take into account that sometimes different
symbols mangle the same, and handle those as a group (i.e. if
the linker output requests S::~S(), and *.rpo files contain
several _ZN1SD[0-9]Ev symbols, mark or tweak them all)?

	Jakub
Sandra Loosemore - Jan. 30, 2012, 4:26 p.m.
On 01/30/2012 07:23 AM, Jakub Jelinek wrote:
> On Sun, Jan 29, 2012 at 05:08:10PM -0700, Sandra Loosemore wrote:
>> Bootstrapped and regression-tested on i636 linux.  OK to check in?
>
> I think that is terribly expensive fix, for larger projects a single
> ld invocation can take several minutes.  What exactly are advantages
> of using the linker demangling as opposed to collect2 doing the
> demangling as before?  Does it really justify penalizing -frepo
> that much?

collect2 doesn't know how to demangle linker map files, and our 
experience has been that users are more likely to want a map file than 
to use -frepo.

-Sandra
Jason Merrill - Jan. 30, 2012, 8:26 p.m.
On 01/30/2012 11:11 AM, Jakub Jelinek wrote:
> The *.rpo discovery can be many thousands of syscalls too for larger
> links and I think -frepo is quite rarely used.

OK.

> Can't tlink just take into account that sometimes different
> symbols mangle the same, and handle those as a group (i.e. if
> the linker output requests S::~S(), and *.rpo files contain
> several _ZN1SD[0-9]Ev symbols, mark or tweak them all)?

Emitting one doesn't necessary imply emitting all of them.  Perhaps we 
could change the demangler to distinguish between the symbols, but of 
course that wouldn't affect current linkers.

What if we use linker demangling only if the user has specified creating 
a linker map?

Jason
Jakub Jelinek - Feb. 1, 2012, 1:56 p.m.
On Mon, Jan 30, 2012 at 03:26:32PM -0500, Jason Merrill wrote:
> On 01/30/2012 11:11 AM, Jakub Jelinek wrote:
> >Can't tlink just take into account that sometimes different
> >symbols mangle the same, and handle those as a group (i.e. if
> >the linker output requests S::~S(), and *.rpo files contain
> >several _ZN1SD[0-9]Ev symbols, mark or tweak them all)?
> 
> Emitting one doesn't necessary imply emitting all of them.  Perhaps

Sure, but would they ever be provided by different CUs?  If some CU
says it can provide _ZN1SD1Ev, doesn't it either always say that
it can provide _ZN1SD2Ev, or none of the CU is able to provide the latter
(at least in valid C++ without ODR violations)?
I meant if the linker says it wants S::~S(), you'd see that in CU25
the rpo file offers both of them and you'd mark both for compilation.

> we could change the demangler to distinguish between the symbols,
> but of course that wouldn't affect current linkers.
> 
> What if we use linker demangling only if the user has specified
> creating a linker map?

Do users really want to demangle linker maps?  I would never want that,
e.g. because it is ambiguous and less compact.
IMHO the best is just to back out the changes that introduced this
regression.

	Jakub
Jason Merrill - Feb. 1, 2012, 3:20 p.m.
On 02/01/2012 08:56 AM, Jakub Jelinek wrote:
> Sure, but would they ever be provided by different CUs?  If some CU
> says it can provide _ZN1SD1Ev, doesn't it either always say that
> it can provide _ZN1SD2Ev, or none of the CU is able to provide the latter
> (at least in valid C++ without ODR violations)?

Yes.

> I meant if the linker says it wants S::~S(), you'd see that in CU25
> the rpo file offers both of them and you'd mark both for compilation.

That might result in emitting variants that aren't needed, but I suppose 
that isn't so bad.  I'll give it a shot.

> Do users really want to demangle linker maps?  I would never want that,
> e.g. because it is ambiguous and less compact.

I'm surprised by that too, but apparently CodeSourcery clients do want that.

Jason
Sandra Loosemore - Feb. 1, 2012, 6:51 p.m.
On 02/01/2012 06:56 AM, Jakub Jelinek wrote:
>
> Do users really want to demangle linker maps?  I would never want that,
> e.g. because it is ambiguous and less compact.
> IMHO the best is just to back out the changes that introduced this
> regression.

I don't think this is even really a regression.  I haven't actually sat 
down to try to reproduce this, but backing out the previous patch would 
leave the -frepo behavior even more broken on Windows hosts than it is 
now; one of the bugs fixed by the previous patch was that on Windows 
collect2 *always* told ld to demangle names and there was no way for 
users to override this via a command-line option or setting 
COLLECT_NO_DEMANGLE.  And, I think the -frepo bug was present on Linux 
hosts too if you explicitly configured your GCC build with 
--with-demangler-in-ld, which is what you need to do if you want ld to 
be able to produce demangled link maps at all.

The patch I posted the other day does reconcile the problem when the 
user wants both a demangled link map and -frepo processing, although it 
does add the two extra link steps.  Maybe we can figure out a way to 
bypass the extra steps so that they only happen when the user has 
specified -Wl,-Map and not --no-demangle?  And/or make explicit -frepo 
on the link command line shortcut some of it?

(BTW, the CodeSourcery IDE adds -Wl,-Map to the link line by default 
because it's easier to always generate a link map than to have to 
explain to users how to get one.  Especially in the embedded space, it's 
something many of our users actually do look at.)

-Sandra
Jason Merrill - Feb. 1, 2012, 7:31 p.m.
On 02/01/2012 01:51 PM, Sandra Loosemore wrote:
> I don't think this is even really a regression.

The testcase works with 4.6 and not with pre-4.7; that makes it a 
regression.

> I haven't actually sat
> down to try to reproduce this, but backing out the previous patch would
> leave the -frepo behavior even more broken on Windows hosts than it is
> now; one of the bugs fixed by the previous patch was that on Windows
> collect2 *always* told ld to demangle names and there was no way for
> users to override this via a command-line option or setting
> COLLECT_NO_DEMANGLE. And, I think the -frepo bug was present on Linux
> hosts too if you explicitly configured your GCC build with
> --with-demangler-in-ld, which is what you need to do if you want ld to
> be able to produce demangled link maps at all.

Yep.

> (BTW, the CodeSourcery IDE adds -Wl,-Map to the link line by default
> because it's easier to always generate a link map than to have to
> explain to users how to get one. Especially in the embedded space, it's
> something many of our users actually do look at.)

Why do you want the link map to be demangled?  It seems more reliable to 
deal with mangled symbols; there's always c++filt if you want to check 
what the symbols demangle to.

Jason
Sandra Loosemore - Feb. 1, 2012, 8:50 p.m.
On 02/01/2012 12:31 PM, Jason Merrill wrote:
>
> Why do you want the link map to be demangled? It seems more reliable to
> deal with mangled symbols; there's always c++filt if you want to check
> what the symbols demangle to.

It's certainly more reliable for automated tools to deal with mangled 
symbols, but users (especially users who are not also compiler 
implementors!) have a very hard time translating mangled names into 
names that actually appear in their program source code.  Additionally, 
demangling is the documented default behavior of ld and it's not exactly 
user-friendly for the linker to completely ignore both its usual default 
and explicit requests for demangled output when invoked via collect2. 
Using gcc to link your program instead of invoking ld directly is 
supposed to make things *easier* for users, not harder.

-Sandra

Patch

Index: gcc/tlink.c
===================================================================
--- gcc/tlink.c	(revision 183674)
+++ gcc/tlink.c	(working copy)
@@ -777,23 +777,53 @@  do_tlink (char **ld_argv, char **object_
       /* Until collect does a better job of figuring out which are object
 	 files, assume that everything on the command line could be.  */
       if (read_repo_files (ld_argv))
-	while (exit && i++ < MAX_ITERATIONS)
-	  {
-	    if (tlink_verbose >= 3)
-	      {
-		dump_file (ldout, stdout);
-		dump_file (lderrout, stderr);
-	      }
-	    demangle_new_symbols ();
-	    if (! scan_linker_output (ldout)
-		&& ! scan_linker_output (lderrout))
-	      break;
-	    if (! recompile_files ())
-	      break;
-	    if (tlink_verbose)
-	      fprintf (stderr, _("collect: relinking\n"));
-	    exit = tlink_execute ("ld", ld_argv, ldout, lderrout);
-	  }
+	{
+	  char **ld1_argv = ld_argv;
+
+#ifdef HAVE_LD_DEMANGLE
+	  /* We must explicitly tell the linker not to demangle names
+	     and re-do the initial link attempt, since repository processing
+	     requires demangled error output from the linker.  */
+	  int argc = 0;
+	  while (ld_argv[argc])
+	    ++argc;
+	  ld1_argv = XNEWVEC (char *, argc + 2);
+	  memcpy (ld1_argv, ld_argv, sizeof (ld_argv[0]) * argc);
+	  ld1_argv[argc] = CONST_CAST (char *, "--no-demangle");
+	  ld1_argv[argc + 1] = NULL;
+	  if (tlink_verbose)
+	    fprintf (stderr, _("collect: relinking\n"));
+	  exit = tlink_execute ("ld", ld1_argv, ldout, lderrout);
+#endif
+
+	  while (exit && i++ < MAX_ITERATIONS)
+	    {
+	      if (tlink_verbose >= 3)
+		{
+		  dump_file (ldout, stdout);
+		  dump_file (lderrout, stderr);
+		}
+	      demangle_new_symbols ();
+	      if (! scan_linker_output (ldout)
+		  && ! scan_linker_output (lderrout))
+		break;
+	      if (! recompile_files ())
+		break;
+	      if (tlink_verbose)
+		fprintf (stderr, _("collect: relinking\n"));
+	      exit = tlink_execute ("ld", ld1_argv, ldout, lderrout);
+	    }
+
+#ifdef HAVE_LD_DEMANGLE
+	  /* Now redo the final link with the original options so that
+	     any remaining errors, the link map, etc are presented to
+	     the user with the original mangling options.*/
+	  XDELETEVEC (ld1_argv);
+	  if (tlink_verbose)
+	    fprintf (stderr, _("collect: relinking\n"));
+	  exit = tlink_execute ("ld", ld_argv, ldout, lderrout);
+#endif
+	}
     }
 
   dump_file (ldout, stdout);
Index: gcc/testsuite/g++.dg/torture/pr51910.C
===================================================================
--- gcc/testsuite/g++.dg/torture/pr51910.C	(revision 0)
+++ gcc/testsuite/g++.dg/torture/pr51910.C	(revision 0)
@@ -0,0 +1,19 @@ 
+// PR c++/51910 
+// Check that -frepo works in the presence of linker symbol demangling.
+//
+// { dg-options "-frepo -Wl,--demangle" }
+// { dg-require-host-local "" }
+// { dg-skip-if "dkms are not final links" { vxworks_kernel } }
+
+template<typename T>
+struct Foo
+{
+  virtual ~Foo() { }
+};
+
+int main( int, char*[] )
+{
+  Foo<int> test;
+}
+
+// { dg-final { cleanup-repo-files } }