Message ID | YkuX4d0lVot5i27k@makrotopia.org |
---|---|
State | Superseded |
Delegated to: | Daniel Golle |
Series | [PATCH/RFC] kernel-defaults.mk: get rid of BuildID |
Daniel Golle <daniel@makrotopia.org> [2022-04-05 02:14:09]:

Hi,

thanks a lot for your and Paul's reproducibility efforts!

> diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
> index 1e82f7d739..9c8d5fbe97 100644
> --- a/include/kernel-defaults.mk
> +++ b/include/kernel-defaults.mk
> @@ -46,6 +46,7 @@ else
>  	if [ -d $(LINUX_DIR)/user_headers ]; then \
>  		rm -rf $(LINUX_DIR)/user_headers; \
>  	fi

BTW we already have the LINUX_VERMAGIC md5 hash, generated over the kernel config symbols:

 grep '=[ym]' $(LINUX_DIR)/.config.set | LC_ALL=C sort | $(MKHASH) md5 > $(LINUX_DIR)/.vermagic
 LINUX_VERMAGIC:=$(strip $(shell cat $(LINUX_DIR)/.vermagic 2>/dev/null))

So it makes me wonder if we could use something like this instead (untested):

> +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=none/g'

 +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=0x$(LINUX_VERMAGIC)/g'

From the ld(1) `--build-id=style` help:

 or "0xhexstring" to use a chosen bit string specified as an even number
 of hexadecimal digits ("-" and ":" characters between digit pairs are
 ignored).

Having some kind of build ID is sometimes handy, for example during troubleshooting.

Cheers,

Petr
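The idea above can be sketched outside the build system to see why such an ID would be stable across build hosts. This is only an illustration under stated assumptions: `md5sum` from coreutils stands in for OpenWrt's `$(MKHASH) md5`, and the function names `vermagic_hash` and `build_id_flag` are hypothetical helpers, not part of OpenWrt. An md5 digest is 32 hex digits, so it satisfies the "even number of hexadecimal digits" rule quoted from ld(1).

```shell
#!/bin/sh
# Sketch only: md5sum (coreutils) stands in for OpenWrt's $(MKHASH) md5,
# and the function names below are hypothetical.

# vermagic_hash FILE: hash the enabled (=y / =m) symbols of FILE.
# Sorting bytewise (LC_ALL=C) makes the result independent of line order
# and of the build host's locale.
vermagic_hash() {
	grep '=[ym]' "$1" | LC_ALL=C sort | md5sum | cut -d' ' -f1
}

# build_id_flag FILE: print the linker flag that uses the hash as a hex string.
build_id_flag() {
	printf '%s%s\n' '--build-id=0x' "$(vermagic_hash "$1")"
}

# Demo: two throwaway configs with the same symbols in a different order.
cfg_dir=$(mktemp -d)
printf 'CONFIG_A=y\nCONFIG_B=m\n# CONFIG_C is not set\n' > "$cfg_dir/one"
printf 'CONFIG_B=m\n# CONFIG_C is not set\nCONFIG_A=y\n' > "$cfg_dir/two"
build_id_flag "$cfg_dir/one"
build_id_flag "$cfg_dir/two"   # same flag as above: sorting removes order effects
```

Because the ID depends only on the enabled config symbols, two hosts building the same configuration would get the same value, while still keeping an identifier that is useful for troubleshooting.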
Hi,

> To investigate the issue of non-reproducible kernel images across
> buildhosts I compared the files built by OpenWrt's buildbots with
> Paul's rebuilder script running on his (Aarch64) Mac.

Sorry, there was a misunderstanding. I’m building on macOS to find extra issues, but the “rebuild” files I sent you earlier were created by the GitHub CI using Ubuntu on x86[1].

[1]: https://github.com/aparcar/openwrt-rebuilder/actions/runs/2089052638

> +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=0x$(LINUX_VERMAGIC)/g'

Looks good, I’ll try this.

Paul
Hi,
>> +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=0x$(LINUX_VERMAGIC)/g'
This doesn’t fly, since LINUX_VERMAGIC (read from .vermagic) is derived from the kernel configuration and is only available after the Configure step. I moved the sed call from Prepare to the end of Configure, and it works fine:
diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
index 1e82f7d739..63da2ea038 100644
--- a/include/kernel-defaults.mk
+++ b/include/kernel-defaults.mk
@@ -119,6 +119,7 @@ define Kernel/Configure/Default
 	}
 	$(_SINGLE) [ -d $(LINUX_DIR)/user_headers ] || $(KERNEL_MAKE) INSTALL_HDR_PATH=$(LINUX_DIR)/user_headers headers_install
 	grep '=[ym]' $(LINUX_DIR)/.config.set | LC_ALL=C sort | $(MKHASH) md5 > $(LINUX_DIR)/.vermagic
+	$(SED) "s/--build-id=.*/--build-id=0x$$$$(cat $(LINUX_DIR)/.vermagic)/g" $(LINUX_DIR)/Makefile
 endef
It works as expected:
ubuntu@primary:~/a$ cat /home/ubuntu/a/build_dir/target-x86_64_musl/linux-x86_64/linux-5.10.109/Makefile | grep build-id
KBUILD_LDFLAGS_MODULE += --build-id=0xc096cf71d9bd3c319494033a0e38394b
ubuntu@primary:~$ file kernela
kernela: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[md5/uuid]=c096cf71d9bd3c319494033a0e38394b, stripped
ubuntu@primary:~/a$ make -C target/linux val.LINUX_VERMAGIC
make: Entering directory '/home/ubuntu/a/target/linux'
c096cf71d9bd3c319494033a0e38394b
make: Leaving directory '/home/ubuntu/a/target/linux'
Best,
Paul
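The substitution from the patch above can be tried in isolation on a throwaway file. The sketch below is an illustration only: `set_build_id` is a hypothetical helper, the two sample lines are assumptions modeled on the grep output shown above and on the variables discussed in this thread, and a temp-file-plus-`mv` edit replaces OpenWrt's `$(SED)` wrapper.

```shell
#!/bin/sh
# Sketch only: set_build_id is a hypothetical helper, not an OpenWrt macro.

# set_build_id MAKEFILE HEXID: replace every --build-id=<style> with a fixed ID.
# The pattern consumes the rest of the line, as in the patch above.
set_build_id() {
	# Portable in-place edit: write to a temp file, then move it back.
	sed "s/--build-id=.*/--build-id=0x$2/g" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Throwaway sample file (assumed content, not the real kernel Makefile).
demo=$(mktemp)
cat > "$demo" <<'EOF'
KBUILD_LDFLAGS_MODULE += --build-id=sha1
LDFLAGS_vmlinux += --build-id=sha1
EOF

set_build_id "$demo" c096cf71d9bd3c319494033a0e38394b
grep -- '--build-id' "$demo"
```

Note that `.*` is greedy, so anything following `--build-id=` on the same line is dropped as well; that is harmless for lines like the samples but worth keeping in mind if the flag ever shares a line with other options.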
Please forgive my stupidity, I couldn't understand the last part of your recommendation:

Daniel Golle <daniel@makrotopia.org> wrote:
> Hence, to achieve reproducible builds we will either have to resort to
> identical containers/VMs for building or get rid of the BuildID hash
> altogether (or use a different build-id style)

A) identical containers/VMs
B) get rid of BuildID
C) use a different build-id style

> At this point, this seems to be what Debian is doing as well in order
> to achieve reproducible kernel builds, see[1].

And which is Debian doing? :-)

> [1]: https://wiki.debian.org/SameKernel#How_this_works

I read this page and I think it's temporarily (B), but it is unclear whether they are going to move to (A).

(I prefer a path that leads to the build-id meaning that the same versions of the same compilers on the same host OS were used, but it would be nice not to require the same compile of those compilers...)
On 05.04.22 03:14, Daniel Golle wrote:
> When building the Linux kernel, the linker generates a hash of all
> versions of tools involved in a build called BuildID in ELF header.
> This breaks reproducibility across different buildhosts even though
> OpenWrt builds the toolchain from source -- the build-id hash ends up
> to be the only thing which differs in the resulting builds.
>
> The cause is most likely a result of the build hosts' architectures,
> OSs and standard C libraries being different.
>
> While in theory it is true that tools may produce a different output
> depending on architecture, OS and libc of the buildhost, in practice
> this is (fortunately) hardly ever the case and hence it contradicts
> ld(1) which states:
>
> 'The "md5" and "sha1" styles produces an identifier that is always
> the same in an identical output file, but will be unique among all
> nonidentical output files.'
>
> (the kernel is using sha1 style build-id, rebuilding the kernel on a
> different buildhost results in everything being identical **except**
> for the build-id)
>
> Hence, to achieve reproducible builds we will either have to resort to
> identical containers/VMs for building or get rid of the BuildID hash
> altogether (or use a different build-id style)
>
> At this point, this seems to be what Debian is doing as well in order
> to achieve reproducible kernel builds, see[1].
>
> [1]: https://wiki.debian.org/SameKernel#How_this_works
> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
>
> diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
> index 1e82f7d739..9c8d5fbe97 100644
> --- a/include/kernel-defaults.mk
> +++ b/include/kernel-defaults.mk
> @@ -46,6 +46,7 @@ else
>  	if [ -d $(LINUX_DIR)/user_headers ]; then \
>  		rm -rf $(LINUX_DIR)/user_headers; \
>  	fi
> +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=none/g'

I don't like running sed on the Linux Makefile, as this interferes with creating patches for it. I think it would be better to simply override KBUILD_LDFLAGS_MODULE on the kernel/module build command line.

- Felix
On Tue, Apr 05, 2022 at 05:05:43PM +0200, Felix Fietkau wrote:
> On 05.04.22 03:14, Daniel Golle wrote:
> > When building the Linux kernel, the linker generates a hash of all
> > versions of tools involved in a build called BuildID in ELF header.
> > This breaks reproducibility across different buildhosts even though
> > OpenWrt builds the toolchain from source -- the build-id hash ends up
> > to be the only thing which differs in the resulting builds.
> >
> > The cause is most likely a result of the build hosts' architectures,
> > OSs and standard C libraries being different.
> >
> > While in theory it is true that tools may produce a different output
> > depending on architecture, OS and libc of the buildhost, in practice
> > this is (fortunately) hardly ever the case and hence it contradicts
> > ld(1) which states:
> >
> > 'The "md5" and "sha1" styles produces an identifier that is always
> > the same in an identical output file, but will be unique among all
> > nonidentical output files.'
> >
> > (the kernel is using sha1 style build-id, rebuilding the kernel on a
> > different buildhost results in everything being identical **except**
> > for the build-id)
> >
> > Hence, to achieve reproducible builds we will either have to resort to
> > identical containers/VMs for building or get rid of the BuildID hash
> > altogether (or use a different build-id style)
> >
> > At this point, this seems to be what Debian is doing as well in order
> > to achieve reproducible kernel builds, see[1].
> >
> > [1]: https://wiki.debian.org/SameKernel#How_this_works
> > Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> >
> > diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
> > index 1e82f7d739..9c8d5fbe97 100644
> > --- a/include/kernel-defaults.mk
> > +++ b/include/kernel-defaults.mk
> > @@ -46,6 +46,7 @@ else
> >  	if [ -d $(LINUX_DIR)/user_headers ]; then \
> >  		rm -rf $(LINUX_DIR)/user_headers; \
> >  	fi
> > +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=none/g'
> I don't like running sed on the linux Makefile, as this interferes with
> creating patches for it. I think it would be better to simply override
> KBUILD_LDFLAGS_MODULE on the kernel/module build command line.

You probably meant LDFLAGS_vmlinux, because from what I understand KBUILD_LDFLAGS_MODULE only applies when building modules but not when linking vmlinux.

As ld only cares about the last --build-id= parameter supplied, we can override it using KBUILD_LDFLAGS (which should apply to both vmlinux.elf and modules).

I haven't tried any of that yet though.
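The "last `--build-id=` wins" behavior described above is what would make an appended override work. The sketch below only models that rule in shell; it does not invoke ld, and `effective_build_id` is a hypothetical helper introduced for illustration. Whether the real linker and the kernel build behave this way for a given toolchain is, as stated above, untested.

```shell
#!/bin/sh
# Sketch only: a model of the "last one wins" rule, not a call into ld.

# effective_build_id ARGS...: print the value of the last --build-id= argument,
# or an empty line if none was given.
effective_build_id() {
	result=
	for arg in "$@"; do
		case $arg in
			--build-id=*) result=${arg#--build-id=} ;;
		esac
	done
	printf '%s\n' "$result"
}

# Default style first, override appended later on the command line.
effective_build_id --build-id=sha1 -z noexecstack --build-id=none   # prints "none"
```

Under this rule the override has to end up after the default on the final linker command line, which depends on the order in which the kernel Makefile assembles its flags.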
On 05.04.22 20:51, Daniel Golle wrote:
> On Tue, Apr 05, 2022 at 05:05:43PM +0200, Felix Fietkau wrote:
>> On 05.04.22 03:14, Daniel Golle wrote:
>> > When building the Linux kernel, the linker generates a hash of all
>> > versions of tools involved in a build called BuildID in ELF header.
>> > This breaks reproducibility across different buildhosts even though
>> > OpenWrt builds the toolchain from source -- the build-id hash ends up
>> > to be the only thing which differs in the resulting builds.
>> >
>> > The cause is most likely a result of the build hosts' architectures,
>> > OSs and standard C libraries being different.
>> >
>> > While in theory it is true that tools may produce a different output
>> > depending on architecture, OS and libc of the buildhost, in practice
>> > this is (fortunately) hardly ever the case and hence it contradicts
>> > ld(1) which states:
>> >
>> > 'The "md5" and "sha1" styles produces an identifier that is always
>> > the same in an identical output file, but will be unique among all
>> > nonidentical output files.'
>> >
>> > (the kernel is using sha1 style build-id, rebuilding the kernel on a
>> > different buildhost results in everything being identical **except**
>> > for the build-id)
>> >
>> > Hence, to achieve reproducible builds we will either have to resort to
>> > identical containers/VMs for building or get rid of the BuildID hash
>> > altogether (or use a different build-id style)
>> >
>> > At this point, this seems to be what Debian is doing as well in order
>> > to achieve reproducible kernel builds, see[1].
>> >
>> > [1]: https://wiki.debian.org/SameKernel#How_this_works
>> > Signed-off-by: Daniel Golle <daniel@makrotopia.org>
>> >
>> > diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
>> > index 1e82f7d739..9c8d5fbe97 100644
>> > --- a/include/kernel-defaults.mk
>> > +++ b/include/kernel-defaults.mk
>> > @@ -46,6 +46,7 @@ else
>> >  	if [ -d $(LINUX_DIR)/user_headers ]; then \
>> >  		rm -rf $(LINUX_DIR)/user_headers; \
>> >  	fi
>> > +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=none/g'
>> I don't like running sed on the linux Makefile, as this interferes with
>> creating patches for it. I think it would be better to simply override
>> KBUILD_LDFLAGS_MODULE on the kernel/module build command line.
>
> You probably meant LDFLAGS_vmlinux because from what I understand
> KBUILD_LDFLAGS_MODULE only applies when building modules but not when
> linking vmlinux.
> As ld only cares about the last mentioned --build-id= parameter
> supplied, we can override it using KBUILD_LDFLAGS (which should apply
> to both, vmlinux.elf as well as modules).
> I haven't tried any of that yet though.

Right, I overlooked that one. Either way, you likely need to patch the kernel in order to not have to override the full set of linker arguments. I still think explicit patching plus a variable override is preferable to sed-based Makefile patching.

- Felix
Daniel Golle <daniel@makrotopia.org> writes:

> You probably meant LDFLAGS_vmlinux because from what I understand
> KBUILD_LDFLAGS_MODULE only applies when building modules but not when
> linking vmlinux.
> As ld only cares about the last mentioned --build-id= parameter
> supplied, we can override it using KBUILD_LDFLAGS (which should apply
> to both, vmlinux.elf as well as modules).
> I haven't tried any of that yet though.

How about simply making this configurable upstream? If more than one distro needs it, then I can't imagine it should be a problem to get it accepted. Possibly with a "depends on EXPERT" or similar.

Bjørn
Hi,

> On 5. Apr 2022, at 21:33, Bjørn Mork <bjorn@mork.no> wrote:
>
> Daniel Golle <daniel@makrotopia.org> writes:
>
>> You probably meant LDFLAGS_vmlinux because from what I understand
>> KBUILD_LDFLAGS_MODULE only applies when building modules but not when
>> linking vmlinux.
>> As ld only cares about the last mentioned --build-id= parameter
>> supplied, we can override it using KBUILD_LDFLAGS (which should apply
>> to both, vmlinux.elf as well as modules).
>> I haven't tried any of that yet though.
>
> How about simply making this configurable upstream? If more than one
> distro needs it, then I can't imagine it should be a problem to get
> it accepted. Possibly with a "depends on EXPERT" or similar.
>

I created a patch which we could test downstream:

https://github.com/openwrt/openwrt/pull/9669

Since it’s used by friends over at Debian too, we might want to get it upstream.
diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
index 1e82f7d739..9c8d5fbe97 100644
--- a/include/kernel-defaults.mk
+++ b/include/kernel-defaults.mk
@@ -46,6 +46,7 @@ else
 	if [ -d $(LINUX_DIR)/user_headers ]; then \
 		rm -rf $(LINUX_DIR)/user_headers; \
 	fi
+	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=none/g'
 endef
 endif
When building the Linux kernel, the linker generates a hash of all
versions of tools involved in a build called BuildID in ELF header.
This breaks reproducibility across different buildhosts even though
OpenWrt builds the toolchain from source -- the build-id hash ends up
to be the only thing which differs in the resulting builds.

The cause is most likely a result of the build hosts' architectures,
OSs and standard C libraries being different.

While in theory it is true that tools may produce a different output
depending on architecture, OS and libc of the buildhost, in practice
this is (fortunately) hardly ever the case and hence it contradicts
ld(1) which states:

'The "md5" and "sha1" styles produces an identifier that is always
the same in an identical output file, but will be unique among all
nonidentical output files.'

(the kernel is using sha1 style build-id, rebuilding the kernel on a
different buildhost results in everything being identical **except**
for the build-id)

Hence, to achieve reproducible builds we will either have to resort to
identical containers/VMs for building or get rid of the BuildID hash
altogether (or use a different build-id style)

At this point, this seems to be what Debian is doing as well in order
to achieve reproducible kernel builds, see[1].

[1]: https://wiki.debian.org/SameKernel#How_this_works
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
To investigate the issue of non-reproducible kernel images across
buildhosts I compared the files built by OpenWrt's buildbots with
Paul's rebuilder script running on his (Aarch64) Mac.

The resulting bzImage binaries were compared by first using
scripts/extract-vmlinux to extract the ELF contents from the bzImages
and then comparing them using diffoscope:

Format-specific differences are supported for ELF binaries but no
file-specific differences were detected; falling back to a binary diff.
@@ -1254946,16 +1254946,16 @@
 01326210: 0400 0000 0800 0000 0c00 0000 5865 6e00  ............Xen.
 01326220: 0000 0000 0080 ffff 0400 0000 0800 0000  ................
 01326230: 0400 0000 5865 6e00 0000 0000 0000 0000  ....Xen.........
 01326240: 0400 0000 2000 0000 0500 0000 474e 5500  .... .......GNU.
 01326250: 0100 01c0 0400 0000 df00 0000 0000 0000  ................
 01326260: 0200 01c0 0400 0000 0700 0000 0000 0000  ................
 01326270: 0400 0000 1400 0000 0300 0000 474e 5500  ............GNU.
-01326280: b737 4be0 66fd dc7f 9fc8 24b6 6354 d8f9  .7K.f.....$.cT..
-01326290: a42c b698 0600 0000 0100 0000 0001 0000  .,..............
+01326280: 2008 eb34 e442 f784 6573 d75b 9bd2 b23f   ..4.B..es.[...?
+01326290: 536e a33e 0600 0000 0100 0000 0001 0000  Sn.>............
 013262a0: 4c69 6e75 7800 0000 0000 0000 0400 0000  Linux...........
 013262b0: 0800 0000 1200 0000 5865 6e00 b004 0001  ........Xen.....
 013262c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 013262d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 013262e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 013262f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 01326300: 0000 0000 0000 0000 0000 0000 0000 0000  ................

(all the remaining ELF is bit-by-bit identical)

Using file(1) revealed that the difference is exactly the build-id:

linux1.elf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=b7374be066fddc7f9fc824b66354d8f9a42cb698, stripped
linux2.elf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=2008eb34e442f7846573d75b9bd2b23f536ea33e, stripped

 include/kernel-defaults.mk | 1 +
 1 file changed, 1 insertion(+)
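The file(1) comparison above can be scripted when checking many rebuilt images. The following is a minimal sketch, assuming file(1) output in the format shown above; `build_id_of` is a hypothetical helper, and the two sample lines are copied from the output in this message rather than produced by running file(1).

```shell
#!/bin/sh
# Sketch only: build_id_of is a hypothetical helper for illustration.

# build_id_of LINE: extract the hex build ID from one line of file(1) output.
# Prints nothing if the line carries no BuildID[...] field.
build_id_of() {
	printf '%s\n' "$1" | sed -n 's/.*BuildID\[[^]]*]=\([0-9a-f]*\).*/\1/p'
}

# Sample lines taken from the file(1) output quoted above.
a='linux1.elf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=b7374be066fddc7f9fc824b66354d8f9a42cb698, stripped'
b='linux2.elf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=2008eb34e442f7846573d75b9bd2b23f536ea33e, stripped'

if [ "$(build_id_of "$a")" = "$(build_id_of "$b")" ]; then
	echo "build IDs match"
else
	echo "build IDs differ"
fi
```

For real files, the sample strings would be replaced by the output of `file` on each extracted ELF image.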