Real fix for AIX exception handling

Message ID 6b0acae2-7243-efe1-6e78-4ff89f0eb25d@ssi-schaefer.com
State New

Commit Message

Michael Haubenwallner March 30, 2017, 8:11 a.m. UTC
On 03/29/2017 10:21 PM, David Edelsohn wrote:
> On Wed, Mar 29, 2017 at 3:50 PM, Jeff Law <law@redhat.com> wrote:
>> On 03/27/2017 09:41 AM, David Edelsohn wrote:
>>>>
>>>> As far as I have discovered, the real problem with AIX exception handling
>>>> is
>>>> that the exception landing pads are symbols that must not (but still are)
>>>> exported from shared libraries - even libstdc++.
>>>>
>>>> I'm wondering if attached libtool(!)-patch would fix even that GDB
>>>> problem
>>>> once applied to each(!) shared library creation procedure.
>>>>
>>>> However, each workaround still applies as long as there's a single shared
>>>> library involved that has not stopped exporting these symbols yet.
>>>>
>>>> Thoughts?
>>>>
>>>> Maybe gcc's collect2 should apply this additional symbol filter itself
>>>> calling (AIX) ld, rather than leaving this to each build system?
>>>>
>>>> * m4/libtool.m4 (_LT_LINKER_SHLIBS): On AIX, GNU g++ generates
>>>> _GLOBAL__ symbols as, amongst others, landing pads for C++ exceptions.
>>>> These symbols must not be exported from shared libraries, or exception
>>>> handling may break for applications with runtime linking enabled.
>>>
>>>
>>> Hi, Michael
>>>
>>> Thanks for the analysis.
>>>
>>> The problem with EH for GDB involves static linking, not runtime
>>> linking.
>>
>> That seems to be my understanding as well.
>>
>>> And there seems to be different behavior for GCC 4.8 and GCC
>>> 4.9.
>>
>> Could it perhaps be an IPA issue -- we know IPA can change symbol
>> scope/linkage  in some cases.  Maybe it's mucking things up.  Is there more
>> detail in a thread elsewhere on this issue?
> 
> The problem is GCC EH tables and static linking.  libstdc++ and the
> main application are ending up with two separate copies of the tables
> to register EH frames.

When statically linked, shouldn't collect2 add libstdc++'s EH frames to
the main executable's registration table again?
Or is libstdc++'s constructor called instead?

> Static linking worked in GCC 4.8, but not in GCC 4.9.  I have been
> trying to understand what changed and if GCC 4.8 worked by accident.

Wild guess:
When (and how) did you disable runtime linking (-G) for libstdc++?
Maybe there's a side effect related to -bsymbolic when statically linking
a shared object.

> Note that AIX does not install a separate libstdc++ static archive and
> instead statically links against the shared object.

Note that libtool's --with-aix-soname=svr4 would behave differently here...

> libstdc++
> apparently uses the EH table referenced when it was bound by collect2
> and the application uses the one created when the executable is
> created -- the libgcc_eh.a solution doesn't work.  Again, the question
> is why this apparently functioned correctly for GCC 4.8.
> 
> There was a change to constructor order around that time and that may
> have affected where EH frames were recorded.

Next wild guess: if libstdc++'s EH frames are registered by calling
libstdc++'s constructor even when statically linked, rather than being
added to the main executable's table, the two registered EH tables may
overlap each other - where the attached patch might help...

Thanks!
/haubi/

Comments

David Edelsohn March 30, 2017, 1 p.m. UTC | #1
On Thu, Mar 30, 2017 at 4:11 AM, Michael Haubenwallner
<michael.haubenwallner@ssi-schaefer.com> wrote:
> On 03/29/2017 10:21 PM, David Edelsohn wrote:
>> On Wed, Mar 29, 2017 at 3:50 PM, Jeff Law <law@redhat.com> wrote:
>>> On 03/27/2017 09:41 AM, David Edelsohn wrote:
>>>>>
>>>>> As far as I have discovered, the real problem with AIX exception handling
>>>>> is
>>>>> that the exception landing pads are symbols that must not (but still are)
>>>>> exported from shared libraries - even libstdc++.
>>>>>
>>>>> I'm wondering if attached libtool(!)-patch would fix even that GDB
>>>>> problem
>>>>> once applied to each(!) shared library creation procedure.
>>>>>
>>>>> However, each workaround still applies as long as there's a single shared
>>>>> library involved that has not stopped exporting these symbols yet.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Maybe gcc's collect2 should apply this additional symbol filter itself
>>>>> calling (AIX) ld, rather than leaving this to each build system?
>>>>>
>>>>> * m4/libtool.m4 (_LT_LINKER_SHLIBS): On AIX, GNU g++ generates
>>>>> _GLOBAL__ symbols as, amongst others, landing pads for C++ exceptions.
>>>>> These symbols must not be exported from shared libraries, or exception
>>>>> handling may break for applications with runtime linking enabled.
>>>>
>>>>
>>>> Hi, Michael
>>>>
>>>> Thanks for the analysis.
>>>>
>>>> The problem with EH for GDB involves static linking, not runtime
>>>> linking.
>>>
>>> That seems to be my understanding as well.
>>>
>>>> And there seems to be different behavior for GCC 4.8 and GCC
>>>> 4.9.
>>>
>>> Could it perhaps be an IPA issue -- we know IPA can change symbol
>>> scope/linkage  in some cases.  Maybe it's mucking things up.  Is there more
>>> detail in a thread elsewhere on this issue?
>>
>> The problem is GCC EH tables and static linking.  libstdc++ and the
>> main application are ending up with two separate copies of the tables
>> to register EH frames.
>
> When statically linked, shouldn't collect2 add libstdc++'s EH frames to
> the main executable's registration table again?
> Or is libstdc++'s constructor called instead?
>
>> Static linking worked in GCC 4.8, but not in GCC 4.9.  I have been
>> trying to understand what changed and if GCC 4.8 worked by accident.
>
> Wild guess:
> When (and how) did you disable runtime linking (-G) for libstdc++?
> Maybe there's a side effect related to -bsymbolic when statically linking
> a shared object.

Yes, two hypotheses are:

1) The removal of -G AIX linker option that allowed runtime overriding
of libstdc++ symbols and somehow allowed merging of symbols when
linking statically.

2) Change in order of initialization from AIX default breadth first to
force SVR4-like depth first.

>
>> Note that AIX does not install a separate libstdc++ static archive and
>> instead statically links against the shared object.
>
> Note that libtool's --with-aix-soname=svr4 would behave differently here...
>
>> libstdc++
>> apparently uses the EH table referenced when it was bound by collect2
>> and the application uses the one created when the executable is
>> created -- the libgcc_eh.a solution doesn't work.  Again, the question
>> is why this apparently functioned correctly for GCC 4.8.
>>
>> There was a change to constructor order around that time and that may
>> have affected where EH frames were recorded.
>
> Next wild guess: When libstdc++'s EH frames are registered calling
> libstdc++'s constructor even when statically linked rather than being
> added to main executable's table, both registered EH tables may overlap
> each other - where attached patch might help...

Thanks, David

Jeff Law March 31, 2017, 2:42 p.m. UTC | #2
On 03/30/2017 02:11 AM, Michael Haubenwallner wrote:
>
> When statically linked, shouldn't collect2 add libstdc++'s EH frames to
> the main executable's registration table again?
> Or is libstdc++'s constructor called instead?
I would think the latter -- because libstdc++ is already linked into a 
DSO.  If libstdc++ was an archive library, then it would be the former.

Jeff

Patch

From 7ed6bd3bba4e3161e761d4cdb8393ec9bcf98038 Mon Sep 17 00:00:00 2001
From: Michael Haubenwallner <michael.haubenwallner@ssi-schaefer.com>
Date: Mon, 8 Feb 2016 12:37:56 +0100
Subject: [PATCH] libgcc: On AIX, increase chances to find landing pads for
 exceptions.

* unwind-dw2-fde.c (_Unwind_Find_FDE): Stop assuming that registered
objects' address ranges do not overlap.
---
 libgcc/ChangeLog        |  6 ++++++
 libgcc/unwind-dw2-fde.c | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index 4bae69f..ffa6f60 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,9 @@ 
+2016-02-08  Michael Haubenwallner  <michael.haubenwallner@ssi-schaefer.com>
+
+	On AIX, increase chances to find landing pads for exceptions.
+	* unwind-dw2-fde.c (_Unwind_Find_FDE): Stop assuming that registered
+	objects' address ranges do not overlap.
+
 2017-03-10  John Marino  <gnugcc@marino.st>
 
 	* config/aarch64/freebsd-unwind.h: New file.
diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
index 02b588d..9cf2d65 100644
--- a/libgcc/unwind-dw2-fde.c
+++ b/libgcc/unwind-dw2-fde.c
@@ -1054,7 +1054,27 @@  _Unwind_Find_FDE (void *pc, struct dwarf_eh_bases *bases)
 	f = search_object (ob, pc);
 	if (f)
 	  goto fini;
+	/* In an ideal world, even on AIX, we could break here because
+	   objects would not overlap.  But the larger an application is,
+	   the more likely an "overlap" may happen (on AIX) because of:
+	   - Shared libraries do export the FDE symbols ("_GLOBAL__F*"),
+	     which is a bug in their build system, out of gcc's control.
+	   - Other shared libraries, or the main executable, may contain
+	     identical or similar object files - which is suboptimal, but
+	     may be intentional.  However, exporting their FDE symbols,
+	     which may have identical symbol names as in their original
+	     shared libraries, again is a bug in their build system, but
+	     still out of gcc's control.
+	   - When enabled, run time linking may redirect addresses of
+	     duplicate FDE symbols from their original shared library's
+	     address range into another shared library's or the main
+	     executable's address range, when they share the same FDE
+	     symbol name.
+	   This may cause address ranges registered by different
+	   objects to overlap.  */
+#if !(defined(_POWER) && defined(_AIX))
 	break;
+#endif
       }
 
   /* Classify and search the objects we've not yet processed.  */
-- 
2.10.2
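
To illustrate what the removed "break" changes, here is a minimal sketch
in C. It is a toy model, not the libgcc code: the types toy_fde and
toy_object, the functions toy_search_object, toy_find_fde and
toy_overlap_demo, and all addresses are hypothetical. It assumes, as the
patched loop appears to, that registered objects are kept in a list
ordered by descending start address and that lookup normally stops at the
first object whose start address is not above the pc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for libgcc's FDE and object records. */
struct toy_fde { uintptr_t start; uintptr_t len; };

struct toy_object
{
  uintptr_t pc_begin;            /* lowest address covered by this object */
  const struct toy_fde *fdes;    /* FDEs registered by this object */
  size_t count;
  const struct toy_object *next; /* list is sorted by descending pc_begin */
};

/* Linear scan for an FDE whose address range contains PC.  */
static const struct toy_fde *
toy_search_object (const struct toy_object *ob, uintptr_t pc)
{
  for (size_t i = 0; i < ob->count; i++)
    if (pc >= ob->fdes[i].start && pc - ob->fdes[i].start < ob->fdes[i].len)
      return &ob->fdes[i];
  return NULL;
}

/* Walk the object list.  With STOP_AT_FIRST set, give up after the first
   candidate object (ranges assumed disjoint); otherwise keep looking in
   the remaining objects, which tolerates overlapping ranges.  */
static const struct toy_fde *
toy_find_fde (const struct toy_object *seen, uintptr_t pc, int stop_at_first)
{
  for (const struct toy_object *ob = seen; ob; ob = ob->next)
    if (pc >= ob->pc_begin)
      {
        const struct toy_fde *f = toy_search_object (ob, pc);
        if (f)
          return f;
        if (stop_at_first)
          break;
      }
  return NULL;
}

/* Object A starts at 0x1000 but also owns an FDE at 0x3000, beyond the
   start of object B (0x2000), so the two objects' ranges overlap.  */
static const struct toy_fde fdes_a[] = { { 0x1000, 0x100 }, { 0x3000, 0x100 } };
static const struct toy_fde fdes_b[] = { { 0x2000, 0x100 } };
static const struct toy_object obj_a = { 0x1000, fdes_a, 2, NULL };
static const struct toy_object obj_b = { 0x2000, fdes_b, 1, &obj_a };

int
toy_overlap_demo (void)
{
  uintptr_t pc = 0x3010;

  /* Stopping at the first candidate (object B) misses A's FDE.  */
  assert (toy_find_fde (&obj_b, pc, 1) == NULL);
  /* Continuing the walk finds the FDE registered by object A.  */
  assert (toy_find_fde (&obj_b, pc, 0) == &fdes_a[1]);
  return 0;
}
```

Calling toy_overlap_demo() from a small main exercises both modes. The
sketch ignores locking, sorting and FDE classification entirely; it only
shows why a lookup that stops at the first candidate can miss a landing
pad once registered ranges overlap.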