diff mbox series

libaio: Fix library creation for ARC with -Os

Message ID 20180917204329.15787-1-abrodkin@synopsys.com
State Superseded
Headers show
Series libaio: Fix library creation for ARC with -Os | expand

Commit Message

Alexey Brodkin Sept. 17, 2018, 8:43 p.m. UTC
On ARC if "-Os" optimization is used compiler prefers to use so-called
millicode (basically function prologue & epilogue) implemented once and for
all in libgcc.a. And if we don't link with libgcc.a on DSO (read libaio.so)
creation those millicode functions won't be included, instead their
symbols will be referenced as they were in libgcc (HIDDEN).

And then when LD is about to link a final application it sees HIDDEN
symbol in libaio.so that is supposed to come from some static
libgcc.a... wait.. what? Shared library has unresolved symbol
implemented in static library? No, that's not right I guess :)

So to resolve that wrong sequence we just make sure libaio.so has
its own copy of used millicode functions implemented locally.

BTW I guess something similar might happen for PowerPC,
see https://git.buildroot.org/buildroot/commit/?id=ce6536ae500fc4ac0c201d5cb4edf39dd1b4d386
--------------------------->8------------------------
hidden symbol `_rest32gpr_30_x' in libgcc.a(e500crtresx32gpr.o) is referenced by DSO
--------------------------->8------------------------

Fixes: http://autobuild.buildroot.net/?reason=blktrace-1.2.0&arch=arc

Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Evgeniy Didin <Evgeniy.Didin@synopsys.com>
Cc: arc-buildroot@synopsys.com
---
 package/libaio/libaio.mk | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Thomas Petazzoni Sept. 18, 2018, 7:36 a.m. UTC | #1
Hello,

Thanks Alexey for working on this. I have a few questions below, to
better understand the problem and its solution.

On Mon, 17 Sep 2018 23:43:29 +0300, Alexey Brodkin wrote:
> On ARC if "-Os" optimization is used compiler prefers to use so-called
> millicode (basically function prologue & epilogue) implemented once and for
> all in libgcc.a.

Until this point, I follow.

> And if we don't link with libgcc.a on DSO (read libaio.so)

I don't understand "link with libgcc.A on DSO" and therefore the rest
of the explanation is a bit unclear to me.

> creation those millicode functions won't be included, instead their
> symbols will be referenced as they were in libgcc (HIDDEN).
> 
> And then when LD is about to link a final application it sees HIDDEN
> symbol in libaio.so that is supposed to come from some static
> libgcc.a... wait.. what? Shared library has unresolved symbol
> implemented in static library? No, that's not right I guess :)
> 
> So to resolve that wrong sequence we just make sure libaio.so has
> its own copy of used millicode functions implemented locally.

Why is this problem happening only with libaio, and not with other
packages ? Is this the final fix, or is this a temporary workaround ?

> BTW I guess something similar might happen for PowerPC,
> see https://git.buildroot.org/buildroot/commit/?id=ce6536ae500fc4ac0c201d5cb4edf39dd1b4d386
> --------------------------->8------------------------  
> hidden symbol `_rest32gpr_30_x' in libgcc.a(e500crtresx32gpr.o) is referenced by DSO
> --------------------------->8------------------------  

Ah, interesting. So there is really something special about libaio.so
forgetting to link with libgcc ? Should this also be changed to cover
PowerPC to remove the workaround ?

Best regards,

Thomas
Alexey Brodkin Sept. 18, 2018, 8:03 a.m. UTC | #2
Hi Thomas,

On Tue, 2018-09-18 at 09:36 +0200, Thomas Petazzoni wrote:
> Hello,
> 
> Thanks Alexey for working on this. I have a few questions below, to
> better understand the problem and its solution.
> 
> On Mon, 17 Sep 2018 23:43:29 +0300, Alexey Brodkin wrote:
> > On ARC if "-Os" optimization is used compiler prefers to use so-called
> > millicode (basically function prologue & epilogue) implemented once and for
> > all in libgcc.a.
> 
> Until this point, I follow.
> 
> > And if we don't link with libgcc.a on DSO (read libaio.so)
> 
> I don't understand "link with libgcc.A on DSO" and therefore the rest
> of the explanation is a bit unclear to me.

This is how libaio.so is linked now (before my patch) with
omitted some non-important options:
----------------------->8-------------------
arc-buildroot-linux-uclibc-gcc -shared -matomic -Os -nostdlib -nostartfiles -fPIC \
  -o libaio.so io_queue_init.os io_queue_release.os io_queue_wait.os io_queue_run.os \
  io_getevents.os io_submit.os io_cancel.os io_setup.os io_destroy.os raw_syscall.os compat-0_1.os
----------------------->8-------------------

And that's what happens with my patch:
----------------------->8-------------------
arc-buildroot-linux-uclibc-gcc -shared -matomic -Os -nostdlib -nostartfiles -fPIC \
  -o libaio.so io_queue_init.os io_queue_release.os io_queue_wait.os io_queue_run.os \
  io_getevents.os io_submit.os io_cancel.os io_setup.os io_destroy.os raw_syscall.os compat-0_1.os -lgcc
----------------------->8-------------------

Note "-lgcc" in the latter case.

Now let's see what happens with millicode symbols in both cases.
Pre-patch case:
----------------------->8-------------------
# arc-linux-gnu-readelf -s libaio.so | grep __st_r13_to_r17
    3: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __st_r13_to_r17
   41: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __st_r13_to_r17
----------------------->8-------------------

Note millicode's "__st_r13_to_r17" is GLOBAL UNDEFINED symbol.
Thus during linkage of the final application that symbol will
need to be resolved and LD finds it in libgcc.a and essentially
refuses to proceed as there's no way for shared library to
use a function that is implemented in a different static library.

Post-patch case:
----------------------->8-------------------
# arc-linux-gnu-readelf -s libaio.so | grep __st_r13_to_r17
   38: 000006cc    18 FUNC    LOCAL  DEFAULT    9 __st_r13_to_r17
----------------------->8-------------------

Note "__st_r13_to_r17" is locally defined so shared library doesn't
need anything from libgcc.a.

> > creation those millicode functions won't be included, instead their
> > symbols will be referenced as they were in libgcc (HIDDEN).
> > 
> > And then when LD is about to link a final application it sees HIDDEN
> > symbol in libaio.so that is supposed to come from some static
> > libgcc.a... wait.. what? Shared library has unresolved symbol
> > implemented in static library? No, that's not right I guess :)
> > 
> > So to resolve that wrong sequence we just make sure libaio.so has
> > its own copy of used millicode functions implemented locally.
> 
> Why is this problem happening only with libaio, and not with other
> packages ?

That's a good question :)
I may assume other shared libs gets linked with libgcc.a, i.e. all
symbols from libgcc including millicode get built into libXXX.so.

> Is this the final fix, or is this a temporary workaround ?

This is the correct fix for that problem we were seeing
with those packages that depend on libaio.

> > BTW I guess something similar might happen for PowerPC,
> > see 
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__git.buildroot.org_buildroot_commit_-3Fid-3Dce6536ae500fc4ac0c201d5cb4edf39dd1b4d386&d=DwICAg&c=DPL6_X_6JkXFx7AXWqB0tg&r=lqdeeSSEes0GFDDl656eViXO7breS55ytWkhpk5R81I&m=_sEB1SuyEghlCEki0fjXd9hCsCHOpiuQfkzhkrbxc5g&s=K49USLQoVu9LQm1rrUzqOt-E4XGsh4BcLKi0flkChnM&e=
> > --------------------------->8------------------------  
> > hidden symbol `_rest32gpr_30_x' in libgcc.a(e500crtresx32gpr.o) is referenced by DSO
> > --------------------------->8------------------------  
> 
> Ah, interesting. So there is really something special about libaio.so
> forgetting to link with libgcc?

Looks like that.

> Should this also be changed to cover PowerPC to remove the workaround ?

I think so.

Though essentially we'll need to:
 a) Reproduce PPC issue with
    commit ce6536ae500f "libaio: work-around for PowerPC issue"
    reverted.

 b) See if addition of "libgcc" fixes reproduced above problem.

I'll try to do that today.

-Alexey
Thomas Petazzoni Sept. 18, 2018, 9:15 a.m. UTC | #3
Hello,

On Tue, 18 Sep 2018 08:03:19 +0000, Alexey Brodkin wrote:

> This is how libaio.so is linked now (before my patch) with
> omitted some non-important options:
> ----------------------->8-------------------  
> arc-buildroot-linux-uclibc-gcc -shared -matomic -Os -nostdlib -nostartfiles -fPIC \
>   -o libaio.so io_queue_init.os io_queue_release.os io_queue_wait.os io_queue_run.os \
>   io_getevents.os io_submit.os io_cancel.os io_setup.os io_destroy.os raw_syscall.os compat-0_1.os
> ----------------------->8-------------------  
> 
> And that's what happens with my patch:
> ----------------------->8-------------------  
> arc-buildroot-linux-uclibc-gcc -shared -matomic -Os -nostdlib -nostartfiles -fPIC \
>   -o libaio.so io_queue_init.os io_queue_release.os io_queue_wait.os io_queue_run.os \
>   io_getevents.os io_submit.os io_cancel.os io_setup.os io_destroy.os raw_syscall.os compat-0_1.os -lgcc
> ----------------------->8-------------------  
> 
> Note "-lgcc" in the latter case.

Thank you, I do understand that passing LDFLAGS=-lgcc will add -lgcc to
the link command line :-)

The question is: why does libaio needs to be told explicitly to link
against libgcc, and not any other package.

And I believe the reason is because -nostdlib -nostartfiles are used.
Why is libaio using those flags? Is there a good reason ?

> Now let's see what happens with millicode symbols in both cases.
> Pre-patch case:
> ----------------------->8-------------------  
> # arc-linux-gnu-readelf -s libaio.so | grep __st_r13_to_r17
>     3: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __st_r13_to_r17
>    41: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __st_r13_to_r17
> ----------------------->8-------------------  
> 
> Note millicode's "__st_r13_to_r17" is GLOBAL UNDEFINED symbol.
> Thus during linkage of the final application that symbol will
> need to be resolved and LD finds it in libgcc.a and essentially
> refuses to proceed as there's no way for shared library to
> use a function that is implemented in a different static library.
> 
> Post-patch case:
> ----------------------->8-------------------  
> # arc-linux-gnu-readelf -s libaio.so | grep __st_r13_to_r17
>    38: 000006cc    18 FUNC    LOCAL  DEFAULT    9 __st_r13_to_r17
> ----------------------->8-------------------  
> 
> Note "__st_r13_to_r17" is locally defined so shared library doesn't
> need anything from libgcc.a.

Indeed.

> > Why is this problem happening only with libaio, and not with other
> > packages ?  
> 
> That's a good question :)
> I may assume other shared libs gets linked with libgcc.a, i.e. all
> symbols from libgcc including millicode get built into libXXX.so.

See above: -nostdlib -nostartfiles are used in libaio. That's the
difference. The question is why libaio is using those flags.

Best regards,

Thomas
Alexey Brodkin Sept. 18, 2018, 9:17 a.m. UTC | #4
Hi Thomas,

On Tue, 2018-09-18 at 11:15 +0200, Thomas Petazzoni wrote:
> Hello,
> 
> On Tue, 18 Sep 2018 08:03:19 +0000, Alexey Brodkin wrote:
> 
> > This is how libaio.so is linked now (before my patch) with
> > omitted some non-important options:
> > ----------------------->8-------------------  
> > arc-buildroot-linux-uclibc-gcc -shared -matomic -Os -nostdlib -nostartfiles -fPIC \
> >   -o libaio.so io_queue_init.os io_queue_release.os io_queue_wait.os io_queue_run.os \
> >   io_getevents.os io_submit.os io_cancel.os io_setup.os io_destroy.os raw_syscall.os compat-0_1.os
> > ----------------------->8-------------------  
> > 
> > And that's what happens with my patch:
> > ----------------------->8-------------------  
> > arc-buildroot-linux-uclibc-gcc -shared -matomic -Os -nostdlib -nostartfiles -fPIC \
> >   -o libaio.so io_queue_init.os io_queue_release.os io_queue_wait.os io_queue_run.os \
> >   io_getevents.os io_submit.os io_cancel.os io_setup.os io_destroy.os raw_syscall.os compat-0_1.os -lgcc
> > ----------------------->8-------------------  
> > 
> > Note "-lgcc" in the latter case.
> 
> Thank you, I do understand that passing LDFLAGS=-lgcc will add -lgcc to
> the link command line :-)
> 
> The question is: why does libaio needs to be told explicitly to link
> against libgcc, and not any other package.
> 
> And I believe the reason is because -nostdlib -nostartfiles are used.
> Why is libaio using those flags? Is there a good reason ?

I tried removing those 2 as the first thing but it made no difference
until I added "-lgcc" explicitly.

-Alexey
Thomas Petazzoni Sept. 18, 2018, 9:20 a.m. UTC | #5
Hello,

On Tue, 18 Sep 2018 11:15:06 +0200, Thomas Petazzoni wrote:

> And I believe the reason is because -nostdlib -nostartfiles are used.
> Why is libaio using those flags? Is there a good reason ?

BTW: https://patchwork.openembedded.org/patch/106979/. It is still
there in their tree:

  https://github.com/openembedded/openembedded-core/blob/master/meta/recipes-extended/libaio/libaio/system-linkage.patch

In Debian:

  https://sources.debian.org/patches/libaio/0.3.111-1/01_link_libs.patch/

In Gentoo:

  https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-libs/libaio/files/libaio-0.3.110-link-stdlib.patch

Perhaps we should use the same solution, and submit it upstream :-)

I looked at the Git history of libaio, and the first import of libaio
already had those flags, so there is no useful Git commit information
to explain those flags.

Best regards,

Thomas
Thomas Petazzoni Sept. 18, 2018, 9:26 a.m. UTC | #6
Hello,

On Tue, 18 Sep 2018 09:17:20 +0000, Alexey Brodkin wrote:

> > Thank you, I do understand that passing LDFLAGS=-lgcc will add -lgcc to
> > the link command line :-)
> > 
> > The question is: why does libaio needs to be told explicitly to link
> > against libgcc, and not any other package.
> > 
> > And I believe the reason is because -nostdlib -nostartfiles are used.
> > Why is libaio using those flags? Is there a good reason ?  
> 
> I tried removing those 2 as the first thing but it made no difference
> until I added "-lgcc" explicitly.

Then there is another toolchain bug, because linking against libgcc
should be done automatically as soon as a libgcc feature gets used by
the thing being compiled/linked.

Best regards,

Thomas
Alexey Brodkin Sept. 18, 2018, 9:30 a.m. UTC | #7
Cuper, Claus,

On Tue, 2018-09-18 at 11:26 +0200, Thomas Petazzoni wrote:
> Hello,
> 
> On Tue, 18 Sep 2018 09:17:20 +0000, Alexey Brodkin wrote:
> 
> > > Thank you, I do understand that passing LDFLAGS=-lgcc will add -lgcc to
> > > the link command line :-)
> > > 
> > > The question is: why does libaio needs to be told explicitly to link
> > > against libgcc, and not any other package.
> > > 
> > > And I believe the reason is because -nostdlib -nostartfiles are used.
> > > Why is libaio using those flags? Is there a good reason ?  
> > 
> > I tried removing those 2 as the first thing but it made no difference
> > until I added "-lgcc" explicitly.
> 
> Then there is another toolchain bug, because linking against libgcc
> should be done automatically as soon as a libgcc feature gets used by
> the thing being compiled/linked.

What do you think about that?

-Alexey
Alexey Brodkin Sept. 18, 2018, 12:44 p.m. UTC | #8
Hi Claus,

On Tue, 2018-09-18 at 14:12 +0200, Claudiu Zissulescu wrote:
> Hi,
> 
> Millicode is a special function call which doesn't behave like normal calls. So, once you use -fpic/dynamic and friends on the gcc driver, the
> latter assumes all the libs are ready to go via dynamic linker, but not miliicode which is using the regular non-plt relocations to resolve the
> millicode symbols. Hence they need to be explicitly linked.
> 
> Why the linker doesn't throw an error when detects such an odd situation? It goes to some historic reasons which are linked on how u-boot is
> compiled.

I'm not sure though what it has to do with U-Boot especially given U-Boot uses its own
[reduced] copy of libgcc.

> What it needs to be done to avoid this issue? Well, I would say don't use millicode which is more an optimization for bare-metal apps. If you really
> want to use it, then you need to link the libgcc explicitly.

Sure that's indeed an option. We'll just finally add "-mno-millicode" in TARGET_ABI for ARC and forget about that class of problems.
Also maybe it worth having "-mno-millicode" set by default for Linux toolchains so looking forward there will be no need to mess with millicode
manually.

-Alexey
Thomas Petazzoni Sept. 18, 2018, 1:58 p.m. UTC | #9
Hello,

On Tue, 18 Sep 2018 12:44:26 +0000, Alexey Brodkin wrote:

> Sure that's indeed an option. We'll just finally add "-mno-millicode"
> in TARGET_ABI for ARC and forget about that class of problems. Also
> maybe it worth having "-mno-millicode" set by default for Linux
> toolchains so looking forward there will be no need to mess with
> millicode manually.

Yes, that's indeed one solution, though the problem also exists for
PowerPC, so it would have been nice to understand it.

What I find weird is that why it happens only with libaio ? I mean
there are plenty of libraries built with -fpic, and libaio is the only
one causing this problem. The fact that libaio is using -nostdlib
-nostartfiles was an initial hint to explain why libgcc isn't brought
into the link, but as Alexey said, removing -nostdlib -nostartfiles
doesn't solve the problem.

Best regards,

Thomas
Alexey Brodkin Sept. 28, 2018, 1:36 p.m. UTC | #10
Hi Thomas,

On Tue, 2018-09-18 at 15:58 +0200, Thomas Petazzoni wrote:
> Hello,
> 
> On Tue, 18 Sep 2018 12:44:26 +0000, Alexey Brodkin wrote:
> 
> > Sure that's indeed an option. We'll just finally add "-mno-millicode"
> > in TARGET_ABI for ARC and forget about that class of problems. Also
> > maybe it worth having "-mno-millicode" set by default for Linux
> > toolchains so looking forward there will be no need to mess with
> > millicode manually.
> 
> Yes, that's indeed one solution, though the problem also exists for
> PowerPC, so it would have been nice to understand it.
> 
> What I find weird is that why it happens only with libaio ? I mean
> there are plenty of libraries built with -fpic, and libaio is the only
> one causing this problem. The fact that libaio is using -nostdlib
> -nostartfiles was an initial hint to explain why libgcc isn't brought
> into the link, but as Alexey said, removing -nostdlib -nostartfiles
> doesn't solve the problem.

Not sure what kind of pot I was smoking to make a conclusion that
removal of both "-nostdlib" and "nostartfiles" makes no difference.
Maybe my 

I returned back to this topic and decided to give it another shot based on
Debian fix you pointed me to [1]. And I do see it works for both PPC and ARC:
-------------------------------->8---------------------------------
# arc-linux-gcc -shared -Os -Wall -I. -fPIC -Wl,--version-script=libaio.map \
  -Wl,-soname=libaio.so.1 -o libaio.so.1.0.1 io_queue_init.os io_queue_release.os \
  io_queue_wait.os io_queue_run.os io_getevents.os io_submit.os io_cancel.os \
  io_setup.os io_destroy.os raw_syscall.os compat-0_1.os

# arc-linux-readelf -s libaio.so.1.0.1 | grep __ld
    59: 00000ab4    50 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r23_ret
    62: 00000ad0    22 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r16_ret
    63: 00000abc    42 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r21_ret
    65: 00000ac4    34 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r19_ret
    66: 00000ab8    46 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r22_ret
    77: 00000acc    26 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r17_ret
    83: 00000ab0    54 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r24_ret
    84: 00000ac8    30 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r18_ret
    86: 00000aac    58 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r25_ret
    88: 00000ac0    38 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r20_ret
    89: 00000ad4    18 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r15_ret
    94: 00000ad8    14 FUNC    LOCAL  DEFAULT   11 __ld_r13_to_r14_ret

# arc-linux-gcc -shared -Os -nostdlib -nostartfiles -Wall -I. -fPIC \
  -Wl,--version-script=libaio.map -Wl,-soname=libaio.so.1 -o libaio.so.1.0.1 \
  io_queue_init.os io_queue_release.os io_queue_wait.os io_queue_run.os \
  io_getevents.os io_submit.os io_cancel.os io_setup.os io_destroy.os \
  raw_syscall.os compat-0_1.os

#  arc-linux-readelf -s libaio.so.1.0.1 | grep __ld
     4: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r16_ret
     6: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r19_ret
     9: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r17_ret
    11: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r15_ret
    42: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r16_ret
    44: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r19_ret
    51: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r17_ret
    55: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __ld_r13_to_r15_ret
-------------------------------->8---------------------------------

You see in case of "-nostdlib" and "nostartfiles" absence symbol
__ld_rX_to_rY_ret is LOCAL while in presence of them 
__ld_rX_to_rY_ret is GLOBAL - and that's what you expected :)

That said I created a pull-request with corresponding fix for libaio [2]
and will shortly send a patch for Buildroot with it.

[1] https://sources.debian.org/patches/libaio/0.3.111-1/01_link_libs.patch/
[2] https://pagure.io/libaio/pull-request/7

-Alexey
diff mbox series

Patch

diff --git a/package/libaio/libaio.mk b/package/libaio/libaio.mk
index adb4d1c4b19e..2a452b0296c2 100644
--- a/package/libaio/libaio.mk
+++ b/package/libaio/libaio.mk
@@ -22,6 +22,13 @@  ifeq ($(BR2_powerpc),y)
 LIBAIO_CONFIGURE_OPTS += CFLAGS="$(subst -Os,-O2,$(TARGET_CFLAGS))"
 endif
 
+# Make sure millicode functions get their implmentations in final DSO
+# Otherwise later LD will righteously refuse to link an application saying:
+#  "hidden symbol __xxx in libgcc.a is referenced by DSO".
+ifeq ($(BR2_arc),y)
+LIBAIO_CONFIGURE_OPTS += LDFLAGS=-lgcc
+endif
+
 define LIBAIO_BUILD_CMDS
 	$(LIBAIO_CONFIGURE_OPTS) $(TARGET_MAKE_ENV) $(MAKE) -C $(@D)
 endef