Message ID | 20230302112519.914641-1-arsen@gentoo.org |
---|---|
State | New |
Headers | show |
Series | elf,nptl: Add -z lazy -z norelro to tests that need it | expand |
* Arsen Arsenović via Libc-alpha: > Some toolchains, such as that used on Gentoo Hardened, set -z now -z > relro out of the box. These flags break tests that rely on fixups in > underlinked libraries being applied after a dlopen happens. I'm surprised that -z norelro is ever required. Why isn't -z lazy enough? If ld.so crashes because it attempts to apply relocations after the fact, woudln't that be an ld.so bug (or a linker bug that sets up the RELRO segment incorrectly)? Thanks, Florian
Florian Weimer <fweimer@redhat.com> writes: > * Arsen Arsenović via Libc-alpha: > >> Some toolchains, such as that used on Gentoo Hardened, set -z now -z >> relro out of the box. These flags break tests that rely on fixups in >> underlinked libraries being applied after a dlopen happens. > > I'm surprised that -z norelro is ever required. Why isn't -z lazy > enough? If ld.so crashes because it attempts to apply relocations after > the fact, woudln't that be an ld.so bug (or a linker bug that sets up > the RELRO segment incorrectly)? Hm. Something went awry while I was debugging this. I looked at a test again just now and noticed that the symbols some of these tests were crashing on came from libc (dlopen here) while loading constload2 (which is dlopen'd from constload1). The backtrace contains a PLT trampoline which then fixups dlopen inside the RELRO segment. I take it dlopen@got[plt] is not supposed to be in the RELRO range? I could have sworn this failed when fixing up bar (void) as a result of constload2 dlopening constload3... but maybe that was a different failure. Let's put this patch on hold while I investigate further. FWIW, this should be easy to reproduce by building with CC='gcc -Wl,-z,relro,-z,now' or so, I think. Thanks, sorry about the fuss.
Arsen Arsenović <arsen@gentoo.org> writes: > Hm. Something went awry while I was debugging this. I looked at a test > again just now and noticed that the symbols some of these tests were > crashing on came from libc (dlopen here) while loading constload2 (which > is dlopen'd from constload1). The backtrace contains a PLT trampoline > which then fixups dlopen inside the RELRO segment. > > I take it dlopen@got[plt] is not supposed to be in the RELRO range? > > I could have sworn this failed when fixing up bar (void) as a result of > constload2 dlopening constload3... but maybe that was a different > failure. > > Let's put this patch on hold while I investigate further. > > FWIW, this should be easy to reproduce by building with CC='gcc > -Wl,-z,relro,-z,now' or so, I think. Ah, I think I see the issue: ~/gnu/glibc/b2$ diff -u0 shlib.lds.-Wl,-z,{lazy,now},-z,relro --- shlib.lds.-Wl,-z,lazy,-z,relro 2023-03-04 19:54:42.977032934 +0100 +++ shlib.lds.-Wl,-z,now,-z,relro 2023-03-04 18:57:03.195010040 +0100 @@ -1 +1 @@ -/* Script for -shared -z combreloc -z separate-code */ +/* Script for -shared -z combreloc -z separate-code -z relro -z now */ @@ -153,3 +153,2 @@ - .got : { *(.got) *(.igot) } - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); - .got.plt : { *(.got.plt) *(.igot.plt) } + .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } + . = DATA_SEGMENT_RELRO_END (0, .); ~/gnu/glibc/b2 1 $ The builds system assumes that all the flags used while building glibc use the same linker script, and that this will be the same linker script as the one that's used initially to generate shlib.lds. This is not true when -z relro is set and -z {now,lazy} are being varied. This also explains why the problem only arose after we introduced -Wl,-z,now. Do the tests even need to use this linker script? If not, it's probably best to just not use it for tests. I can wire that case up, if you think that is sensible. Thanks in advance.
* Arsen Arsenović: > Arsen Arsenović <arsen@gentoo.org> writes: > >> Hm. Something went awry while I was debugging this. I looked at a test >> again just now and noticed that the symbols some of these tests were >> crashing on came from libc (dlopen here) while loading constload2 (which >> is dlopen'd from constload1). The backtrace contains a PLT trampoline >> which then fixups dlopen inside the RELRO segment. >> >> I take it dlopen@got[plt] is not supposed to be in the RELRO range? >> >> I could have sworn this failed when fixing up bar (void) as a result of >> constload2 dlopening constload3... but maybe that was a different >> failure. >> >> Let's put this patch on hold while I investigate further. >> >> FWIW, this should be easy to reproduce by building with CC='gcc >> -Wl,-z,relro,-z,now' or so, I think. > > Ah, I think I see the issue: > > ~/gnu/glibc/b2$ diff -u0 shlib.lds.-Wl,-z,{lazy,now},-z,relro > --- shlib.lds.-Wl,-z,lazy,-z,relro 2023-03-04 19:54:42.977032934 +0100 > +++ shlib.lds.-Wl,-z,now,-z,relro 2023-03-04 18:57:03.195010040 +0100 > @@ -1 +1 @@ > -/* Script for -shared -z combreloc -z separate-code */ > +/* Script for -shared -z combreloc -z separate-code -z relro -z now */ > @@ -153,3 +153,2 @@ > - .got : { *(.got) *(.igot) } > - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); > - .got.plt : { *(.got.plt) *(.igot.plt) } > + .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } > + . = DATA_SEGMENT_RELRO_END (0, .); > ~/gnu/glibc/b2 1 $ > > The builds system assumes that all the flags used while building glibc > use the same linker script, and that this will be the same linker script > as the one that's used initially to generate shlib.lds. This is not > true when -z relro is set and -z {now,lazy} are being varied. > > This also explains why the problem only arose after we introduced > -Wl,-z,now. Interesting, thanks for looking into this. Does the issue go away if you configure with --enable-bind-now? That's what we are doing, and we don't see those test failures. Florian
Florian Weimer <fweimer@redhat.com> writes: > * Arsen Arsenović: > >> Arsen Arsenović <arsen@gentoo.org> writes: >> >>> Hm. Something went awry while I was debugging this. I looked at a test >>> again just now and noticed that the symbols some of these tests were >>> crashing on came from libc (dlopen here) while loading constload2 (which >>> is dlopen'd from constload1). The backtrace contains a PLT trampoline >>> which then fixups dlopen inside the RELRO segment. >>> >>> I take it dlopen@got[plt] is not supposed to be in the RELRO range? >>> >>> I could have sworn this failed when fixing up bar (void) as a result of >>> constload2 dlopening constload3... but maybe that was a different >>> failure. >>> >>> Let's put this patch on hold while I investigate further. >>> >>> FWIW, this should be easy to reproduce by building with CC='gcc >>> -Wl,-z,relro,-z,now' or so, I think. >> >> Ah, I think I see the issue: >> >> ~/gnu/glibc/b2$ diff -u0 shlib.lds.-Wl,-z,{lazy,now},-z,relro >> --- shlib.lds.-Wl,-z,lazy,-z,relro 2023-03-04 19:54:42.977032934 +0100 >> +++ shlib.lds.-Wl,-z,now,-z,relro 2023-03-04 18:57:03.195010040 +0100 >> @@ -1 +1 @@ >> -/* Script for -shared -z combreloc -z separate-code */ >> +/* Script for -shared -z combreloc -z separate-code -z relro -z now */ >> @@ -153,3 +153,2 @@ >> - .got : { *(.got) *(.igot) } >> - . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .); >> - .got.plt : { *(.got.plt) *(.igot.plt) } >> + .got : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) } >> + . = DATA_SEGMENT_RELRO_END (0, .); >> ~/gnu/glibc/b2 1 $ >> >> The builds system assumes that all the flags used while building glibc >> use the same linker script, and that this will be the same linker script >> as the one that's used initially to generate shlib.lds. This is not >> true when -z relro is set and -z {now,lazy} are being varied. >> >> This also explains why the problem only arose after we introduced >> -Wl,-z,now. > > Interesting, thanks for looking into this. > > Does the issue go away if you configure with --enable-bind-now? > > That's what we are doing, and we don't see those test failures. From a brief look, it would seem to still leave some failures which might be related (specifically the relro hardening tests). I haven't investigated. A patchset by Adhemerval seems to remove the linker script altogether piqued my interest. I'll test his patch on our toolchain, I suspect that it removes this problem altogether. I'll keep you posted. https://inbox.sourceware.org/libc-alpha/20221227211145.3765256-1-adhemerval.zanella@linaro.org/ Thanks, have a lovely day.
Arsen Arsenović <arsen@gentoo.org> writes: > Florian Weimer <fweimer@redhat.com> writes: > > From a brief look, it would seem to still leave some failures which > might be related (specifically the relro hardening tests). I haven't > investigated. > > A patchset by Adhemerval seems to remove the linker script altogether > piqued my interest. I'll test his patch on our toolchain, I suspect > that it removes this problem altogether. I'll keep you posted. > > https://inbox.sourceware.org/libc-alpha/20221227211145.3765256-1-adhemerval.zanella@linaro.org/ > > Thanks, have a lovely day. This patchset indeed removes the need for norelro. Will send a reroll tonight.
diff --git a/elf/Makefile b/elf/Makefile index 0d19964d42..3387b7e6c9 100644 --- a/elf/Makefile +++ b/elf/Makefile @@ -1190,6 +1190,11 @@ postclean-generated += $(objpfx)/dso-sort-tests-2.generated-makefile \ ifeq (yes,$(have-tunables)) $(eval $(call include_dsosort_tests,dso-sort-tests-1.def)) $(eval $(call include_dsosort_tests,dso-sort-tests-2.def)) + +# BZ15311 is intentionally underlinked. +LDFLAGS-tst-bz15311-b.so += -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-tst-bz15311-c.so += -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-tst-bz15311-d.so += -Wl,-z,lazy -Wl,-z,norelro endif check-abi: $(objpfx)check-abi-ld.out \ @@ -1514,6 +1519,20 @@ LDFLAGS-tst-initorderb2.so = -Wl,--no-as-needed LDFLAGS-tst-tlsmod5.so = -nostdlib -Wl,--no-as-needed LDFLAGS-tst-tlsmod6.so = -nostdlib -Wl,--no-as-needed +# The following tests are underlinked or rely on fixups being applied after a +# dlopen call. On toolchains that set -z now and -z relro by default, this +# leads to failures to load or fix up the executables being tested. +LDFLAGS-circlemod2.so = -Wl,-z,lazy +LDFLAGS-tst-tls20mod-bad.so = -Wl,-z,lazy +LDFLAGS-reldep6mod1.so += -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-constload2.so = -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-constload3.so = -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-dblloadmod3.so = -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-ifuncmod6.so = -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-ltglobmod2.so = -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-testobj1.so = -Wl,-z,lazy -Wl,-z,norelro +LDFLAGS-testobj6.so = -Wl,-z,lazy -Wl,-z,norelro + testobj1.so-no-z-defs = yes testobj3.so-no-z-defs = yes testobj4.so-no-z-defs = yes @@ -1612,6 +1631,7 @@ $(objpfx)multiload.out: $(objpfx)testobj1.so LDFLAGS-origtest = -rdynamic $(objpfx)origtest.out: $(objpfx)testobj1.so +$(objpfx)resolvfail.out: $(objpfx)testobj1.so ifeq ($(have-thread-library),yes) $(objpfx)resolvfail: $(shared-thread-library) endif @@ -2337,7 +2357,7 @@ LDFLAGS-tst-audit25a = -Wl,-z,lazy $(objpfx)tst-audit25mod1.so: $(objpfx)tst-audit25mod3.so LDFLAGS-tst-audit25mod1.so = -Wl,-z,now $(objpfx)tst-audit25mod2.so: $(objpfx)tst-audit25mod4.so -LDFLAGS-tst-audit25mod2.so = -Wl,-z,lazy +LDFLAGS-tst-audit25mod2.so = -Wl,-z,lazy -Wl,-z,norelro tst-audit25a-ARGS = -- $(host-test-program-cmd) $(objpfx)tst-audit25b.out: $(objpfx)tst-auditmod25.so diff --git a/nptl/Makefile b/nptl/Makefile index 8cec6faee3..140add412c 100644 --- a/nptl/Makefile +++ b/nptl/Makefile @@ -569,7 +569,7 @@ $(objpfx)tst-compat-forwarder: $(objpfx)tst-compat-forwarder-mod.so tst-mutex10-ENV = GLIBC_TUNABLES=glibc.elision.enable=1 # Protect against a build using -Wl,-z,now. -LDFLAGS-tst-audit-threads-mod1.so = -Wl,-z,lazy +LDFLAGS-tst-audit-threads-mod1.so = -Wl,-z,lazy -Wl,-z,norelro LDFLAGS-tst-audit-threads-mod2.so = -Wl,-z,lazy LDFLAGS-tst-audit-threads = -Wl,-z,lazy $(objpfx)tst-audit-threads: $(objpfx)tst-audit-threads-mod2.so