[PATCH/RFC] kernel-defaults.mk: get rid of BuildID

Message ID YkuX4d0lVot5i27k@makrotopia.org
State Superseded
Delegated to: Daniel Golle

Commit Message

Daniel Golle April 5, 2022, 1:14 a.m. UTC
When building the Linux kernel, the linker generates a hash of all
versions of tools involved in a build, called BuildID, in the ELF header.
This breaks reproducibility across different buildhosts even though
OpenWrt builds the toolchain from source -- the build-id hash ends up
being the only thing which differs in the resulting builds.

The cause is most likely a result of the build hosts' architectures,
OSs and standard C libraries being different.

While in theory it is true that tools may produce a different output
depending on architecture, OS and libc of the buildhost, in practice
this is (fortunately) hardly ever the case and hence it contradicts
ld(1) which states:

 'The "md5" and "sha1" styles produces an identifier that is always
  the same in an identical output file, but will be unique among all
  nonidentical output files.'

(the kernel is using sha1 style build-id, rebuilding the kernel on a
different buildhost results in everything being identical **except**
for the build-id)

Hence, to achieve reproducible builds we will either have to resort to
identical containers/VMs for building or get rid of the BuildID hash
altogether (or use a different build-id style).

At this point, this seems to be what Debian is doing as well in order
to achieve reproducible kernel builds, see[1].

[1]: https://wiki.debian.org/SameKernel#How_this_works
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
To investigate the issue of non-reproducible kernel images across
buildhosts I compared the files built by OpenWrt's buildbots with
Paul's rebuilder script running on his (Aarch64) Mac.
The resulting bzImage binaries were compared by first using
scripts/extract-vmlinux to extract the ELF contents from the bzImages
and then comparing them using diffoscope:

Format-specific differences are supported for ELF binaries but no file-specific differences were detected; falling back to a binary diff.
@@ -1254946,16 +1254946,16 @@
 01326210: 0400 0000 0800 0000 0c00 0000 5865 6e00  ............Xen.
 01326220: 0000 0000 0080 ffff 0400 0000 0800 0000  ................
 01326230: 0400 0000 5865 6e00 0000 0000 0000 0000  ....Xen.........
 01326240: 0400 0000 2000 0000 0500 0000 474e 5500  .... .......GNU.
 01326250: 0100 01c0 0400 0000 df00 0000 0000 0000  ................
 01326260: 0200 01c0 0400 0000 0700 0000 0000 0000  ................
 01326270: 0400 0000 1400 0000 0300 0000 474e 5500  ............GNU.
-01326280: b737 4be0 66fd dc7f 9fc8 24b6 6354 d8f9  .7K.f.....$.cT..
-01326290: a42c b698 0600 0000 0100 0000 0001 0000  .,..............
+01326280: 2008 eb34 e442 f784 6573 d75b 9bd2 b23f   ..4.B..es.[...?
+01326290: 536e a33e 0600 0000 0100 0000 0001 0000  Sn.>............
 013262a0: 4c69 6e75 7800 0000 0000 0000 0400 0000  Linux...........
 013262b0: 0800 0000 1200 0000 5865 6e00 b004 0001  ........Xen.....
 013262c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 013262d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 013262e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 013262f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
 01326300: 0000 0000 0000 0000 0000 0000 0000 0000  ................
(all the remaining ELF is bit-by-bit identical)
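
A quick way to confirm that only a handful of bytes differ, without a full
diffoscope run, is to count the differing byte positions with cmp(1). This is
only a sketch: count_differing_bytes is a made-up helper name, and the two
arguments would be the ELF files extracted as described above.

```shell
# Print the number of byte positions at which two files differ.
# cmp -l emits one line per differing byte and exits non-zero when the
# files differ, so its exit status is deliberately ignored here.
count_differing_bytes() {
    { cmp -l "$1" "$2" || true; } | wc -l | tr -d ' '
}

# Usage (assumes both files exist):
#   count_differing_bytes linux1.elf linux2.elf
```

A sha1 build-id is 20 bytes long, so a result of at most 20 is consistent
with the build-id being the only difference.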

Using file(1) revealed that the difference is exactly the build-id:
linux1.elf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=b7374be066fddc7f9fc824b66354d8f9a42cb698, stripped
linux2.elf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=2008eb34e442f7846573d75b9bd2b23f536ea33e, stripped
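
The same check can be scripted by pulling the BuildID field out of the
file(1) output above. A minimal sketch, assuming the output format shown in
this mail; extract_build_id is a hypothetical helper, not part of OpenWrt:

```shell
# Print the hex BuildID found in file(1) output read from stdin.
# Prints nothing if the line carries no BuildID field.
extract_build_id() {
    sed -n 's/.*BuildID\[[^]]*]=\([0-9a-f]*\).*/\1/p'
}

# Usage (assumes linux1.elf and linux2.elf exist):
#   [ "$(file linux1.elf | extract_build_id)" = "$(file linux2.elf | extract_build_id)" ]
```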

 include/kernel-defaults.mk | 1 +
 1 file changed, 1 insertion(+)

Comments

Petr Štetiar April 5, 2022, 6:28 a.m. UTC | #1
Daniel Golle <daniel@makrotopia.org> [2022-04-05 02:14:09]:

Hi,

thanks a lot for your and Paul's reproducibility efforts!

> diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
> index 1e82f7d739..9c8d5fbe97 100644
> --- a/include/kernel-defaults.mk
> +++ b/include/kernel-defaults.mk
> @@ -46,6 +46,7 @@ else
>  	if [ -d $(LINUX_DIR)/user_headers ]; then \
>  		rm -rf $(LINUX_DIR)/user_headers; \
>  	fi

BTW we likely have LINUX_VERMAGIC md5 hash generated over kernel config
symbols:

 grep '=[ym]' $(LINUX_DIR)/.config.set | LC_ALL=C sort | $(MKHASH) md5 > $(LINUX_DIR)/.vermagic
 LINUX_VERMAGIC:=$(strip $(shell cat $(LINUX_DIR)/.vermagic 2>/dev/null))

So it makes me wonder if we could use something like this instead (untested):

> +	$(SED) -i $(LINUX_DIR)/Makefile  -e 's/--build-id=.*/--build-id=none/g'
+	$(SED) -i $(LINUX_DIR)/Makefile  -e 's/--build-id=.*/--build-id=0x$(LINUX_VERMAGIC)/g'

From ld(1) `--build-id=style` help:

 or "0xhexstring" to use a chosen bit string specified as an even number of
 hexadecimal digits ("-" and ":" characters between digit pairs are ignored).

Having some kind of build ID is sometimes handy, for example during troubleshooting.
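
To make the idea concrete, here is a rough sketch of how the vermagic hash
could be turned into a fixed build-id option. It assumes md5sum as a stand-in
for $(MKHASH) md5, and both function names are invented for illustration:

```shell
# Hash the enabled (=y / =m) config symbols, independent of their order.
# md5sum stands in for OpenWrt's $(MKHASH) md5 here (assumption).
vermagic_from_config() {
    grep '=[ym]' "$1" | LC_ALL=C sort | md5sum | cut -d' ' -f1
}

# Turn that hash into the linker option suggested above.
build_id_flag() {
    printf '%s%s\n' '--build-id=0x' "$(vermagic_from_config "$1")"
}

# Usage (assumes a kernel config file at the given path):
#   build_id_flag path/to/.config.set
```

An md5 digest is 32 hexadecimal digits, so it satisfies the "even number of
hexadecimal digits" requirement quoted from ld(1). The resulting ID then
depends only on the set of enabled config symbols, not on the build host.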

Cheers,

Petr

Paul Spooren April 5, 2022, 9:34 a.m. UTC | #2
Hi,

> To investigate the issue of non-reproducible kernel images across
> buildhosts I compared the files built by OpenWrt's buildbots with
> Paul's rebuilder script running on his (Aarch64) Mac.

Sorry there was a misunderstanding. I’m building on macOS to find extra issues but the “rebuild” files I sent you earlier are created by the GitHub CI using Ubuntu on x86[1].

[1]: https://github.com/aparcar/openwrt-rebuilder/actions/runs/2089052638

> +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=0x$(LINUX_VERMAGIC)/g'

Looks good, I’ll try this.

Paul

Paul Spooren April 5, 2022, 1:11 p.m. UTC | #3
Hi,

>> +	$(SED) -i $(LINUX_DIR)/Makefile -e 's/--build-id=.*/--build-id=0x$(LINUX_VERMAGIC)/g'

This doesn’t fly since LINUX_VERMAGIC (based on .vermagic) is based on the Kernel configuration and only available after the Configuration step. I moved it from the Prepare to the end of Configuration and it works fine:

diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
index 1e82f7d739..63da2ea038 100644
--- a/include/kernel-defaults.mk
+++ b/include/kernel-defaults.mk
@@ -119,6 +119,7 @@ define Kernel/Configure/Default
        }
        $(_SINGLE) [ -d $(LINUX_DIR)/user_headers ] || $(KERNEL_MAKE) INSTALL_HDR_PATH=$(LINUX_DIR)/user_headers headers_install
        grep '=[ym]' $(LINUX_DIR)/.config.set | LC_ALL=C sort | $(MKHASH) md5 > $(LINUX_DIR)/.vermagic
+       $(SED) "s/--build-id=.*/--build-id=0x$$$$(cat $(LINUX_DIR)/.vermagic)/g" $(LINUX_DIR)/Makefile
 endef

It works as expected:

ubuntu@primary:~/a$ cat /home/ubuntu/a/build_dir/target-x86_64_musl/linux-x86_64/linux-5.10.109/Makefile | grep build-id
KBUILD_LDFLAGS_MODULE += --build-id=0xc096cf71d9bd3c319494033a0e38394b

ubuntu@primary:~$ file kernela 
kernela: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[md5/uuid]=c096cf71d9bd3c319494033a0e38394b, stripped

ubuntu@primary:~/a$ make -C target/linux val.LINUX_VERMAGIC
make: Entering directory '/home/ubuntu/a/target/linux'
c096cf71d9bd3c319494033a0e38394b
make: Leaving directory '/home/ubuntu/a/target/linux'
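
For anyone wanting to try the substitution outside the build system, the
step can be approximated as below. This is a sketch only: it reads the
Makefile text from stdin instead of editing in place, and rewrite_build_id
is a hypothetical name; the file passed as $1 plays the role of .vermagic.

```shell
# Rewrite every --build-id=... option in Makefile text read from stdin,
# using the hash stored in the file given as $1 (stand-in for .vermagic).
rewrite_build_id() {
    hash=$(cat "$1")
    sed "s/--build-id=.*/--build-id=0x${hash}/g"
}

# Usage (assumes both files exist):
#   rewrite_build_id .vermagic < Makefile > Makefile.new
```

Note that the pattern ends in `.*`, so anything following the option on the
same line is replaced as well.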

Best,
Paul

Michael Richardson April 5, 2022, 3:04 p.m. UTC | #4
Please forgive my stupidity, but I couldn't understand the last part of your recommendation:

Daniel Golle <daniel@makrotopia.org> wrote:
    > Hence, to achieve reproducible builds we will either have to resort to
    > identical containers/VMs for building or get rid of the BuildID hash
    > altogether (or use a different build-id style)

A) identical containers/VMs
B) get rid of BuildID
C) use a different build-id style

    > At this point, this seems to be what Debian is doing as well in order
    > to achieve reproducible kernel builds, see[1].

And which is Debian doing? :-)

    > [1]: https://wiki.debian.org/SameKernel#How_this_works

I read this page and I think it's temporarily (B), but unclear if they are
going to (A).


(I prefer a path that leads to the build-id meaning that the same versions of
the same compilers on the same host OS were used, but it would be nice not to
require the same compile of those compilers...)

Felix Fietkau April 5, 2022, 3:05 p.m. UTC | #5
On 05.04.22 03:14, Daniel Golle wrote:
> diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
> index 1e82f7d739..9c8d5fbe97 100644
> --- a/include/kernel-defaults.mk
> +++ b/include/kernel-defaults.mk
> @@ -46,6 +46,7 @@ else
>   	if [ -d $(LINUX_DIR)/user_headers ]; then \
>   		rm -rf $(LINUX_DIR)/user_headers; \
>   	fi
> +	$(SED) -i $(LINUX_DIR)/Makefile  -e 's/--build-id=.*/--build-id=none/g'
I don't like running sed on the linux Makefile, as this interferes with 
creating patches for it. I think it would be better to simply override 
KBUILD_LDFLAGS_MODULE on the kernel/module build command line.

- Felix

Daniel Golle April 5, 2022, 6:51 p.m. UTC | #6
On Tue, Apr 05, 2022 at 05:05:43PM +0200, Felix Fietkau wrote:
> On 05.04.22 03:14, Daniel Golle wrote:
> > diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
> > index 1e82f7d739..9c8d5fbe97 100644
> > --- a/include/kernel-defaults.mk
> > +++ b/include/kernel-defaults.mk
> > @@ -46,6 +46,7 @@ else
> >   	if [ -d $(LINUX_DIR)/user_headers ]; then \
> >   		rm -rf $(LINUX_DIR)/user_headers; \
> >   	fi
> > +	$(SED) -i $(LINUX_DIR)/Makefile  -e 's/--build-id=.*/--build-id=none/g'
> I don't like running sed on the linux Makefile, as this interferes with
> creating patches for it. I think it would be better to simply override
> KBUILD_LDFLAGS_MODULE on the kernel/module build command line.

You probably meant LDFLAGS_vmlinux because from what I understand
KBUILD_LDFLAGS_MODULE only applies when building modules but not when
linking vmlinux.
As ld only cares about the last mentioned --build-id= parameter
supplied, we can override it using KBUILD_LDFLAGS (which should apply
to both, vmlinux.elf as well as modules).
I haven't tried any of that yet though.
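
The "last --build-id= wins" behaviour described above can be modelled as
follows. This is purely an illustration of the claim and does not invoke ld;
effective_build_id is a made-up helper:

```shell
# Print the build-id style that would take effect if the linker honours
# only the last --build-id= option on its command line (as assumed above).
effective_build_id() {
    style=""
    for arg in "$@"; do
        case "$arg" in
            --build-id=*) style=${arg#--build-id=} ;;
        esac
    done
    printf '%s\n' "$style"
}

# Usage:
#   effective_build_id --build-id=sha1 -X --build-id=none
```

Under that assumption, appending an option via a later-expanded variable is
enough to override the one hard-coded in the kernel Makefile.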

Felix Fietkau April 5, 2022, 7:13 p.m. UTC | #7
On 05.04.22 20:51, Daniel Golle wrote:
> On Tue, Apr 05, 2022 at 05:05:43PM +0200, Felix Fietkau wrote:
>> On 05.04.22 03:14, Daniel Golle wrote:
>> > diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
>> > index 1e82f7d739..9c8d5fbe97 100644
>> > --- a/include/kernel-defaults.mk
>> > +++ b/include/kernel-defaults.mk
>> > @@ -46,6 +46,7 @@ else
>> >   	if [ -d $(LINUX_DIR)/user_headers ]; then \
>> >   		rm -rf $(LINUX_DIR)/user_headers; \
>> >   	fi
>> > +	$(SED) -i $(LINUX_DIR)/Makefile  -e 's/--build-id=.*/--build-id=none/g'
>> I don't like running sed on the linux Makefile, as this interferes with
>> creating patches for it. I think it would be better to simply override
>> KBUILD_LDFLAGS_MODULE on the kernel/module build command line.
> 
> You probably meant LDFLAGS_vmlinux because from what I understand
> KBUILD_LDFLAGS_MODULE only applies when building modules but not when
> linking vmlinux.
> As ld only cares about the last mentioned --build-id= parameter
> supplied, we can override it using KBUILD_LDFLAGS (which should apply
> to both, vmlinux.elf as well as modules).
> I haven't tried any of that yet though.

Right, I overlooked that one. Either way, you likely need to patch the 
kernel in order to not have to override the full set of linker 
arguments. I still think explicit patching + variable override is 
preferable over sed based Makefile patching.

- Felix

Bjørn Mork April 5, 2022, 7:33 p.m. UTC | #8
Daniel Golle <daniel@makrotopia.org> writes:

> You probably meant LDFLAGS_vmlinux because from what I understand
> KBUILD_LDFLAGS_MODULE only applies when building modules but not when
> linking vmlinux.
> As ld only cares about the last mentioned --build-id= parameter
> supplied, we can override it using KBUILD_LDFLAGS (which should apply
> to both, vmlinux.elf as well as modules).
> I haven't tried any of that yet though.

How about simply making this configurable upstream?  If more than one
distro needs it, then I can't imagine it should be a problem to get
it accepted.  Possibly with a "depends on EXPERT" or similar.


Bjørn

Paul Spooren April 7, 2022, 10:29 a.m. UTC | #9
Hi,

> On 5. Apr 2022, at 21:33, Bjørn Mork <bjorn@mork.no> wrote:
> 
> Daniel Golle <daniel@makrotopia.org> writes:
> 
>> You probably meant LDFLAGS_vmlinux because from what I understand
>> KBUILD_LDFLAGS_MODULE only applies when building modules but not when
>> linking vmlinux.
>> As ld only cares about the last mentioned --build-id= parameter
>> supplied, we can override it using KBUILD_LDFLAGS (which should apply
>> to both, vmlinux.elf as well as modules).
>> I haven't tried any of that yet though.
> 
> How about simply making this configurable upstream?  If more than one
> distro needs it, then I can't imagine it should be a problem to get
> it accepted.  Possibly with a "depends on EXPERT" or similar.
> 

I created a patch which we could test downstream:

https://github.com/openwrt/openwrt/pull/9669

Since it’s used by friends over at Debian too, we might want to get it upstream.

Patch

diff --git a/include/kernel-defaults.mk b/include/kernel-defaults.mk
index 1e82f7d739..9c8d5fbe97 100644
--- a/include/kernel-defaults.mk
+++ b/include/kernel-defaults.mk
@@ -46,6 +46,7 @@  else
 	if [ -d $(LINUX_DIR)/user_headers ]; then \
 		rm -rf $(LINUX_DIR)/user_headers; \
 	fi
+	$(SED) -i $(LINUX_DIR)/Makefile  -e 's/--build-id=.*/--build-id=none/g'
   endef
 endif