diff mbox series

[PULL,5/5] linux-user: Fix qemu-arm to run static armhf binaries

Message ID 20230719155235.244478-6-deller@gmx.de
State New
Headers show
Series [PULL,1/5] linux-user: Fix qemu brk() to not zero bytes on current page | expand

Commit Message

Helge Deller July 19, 2023, 3:52 p.m. UTC
qemu-user crashes immediately when running static binaries on the armhf
architecture. The problem is the memory layout where the executable is
loaded before the interpreter library, in which case the reserved brk
region clashes with the interpreter code and is released before qemu
tries to start the program.

At load time qemu calculates a brk value for interpreter and executable
each.  The fix is to choose the higher one of both.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Andreas Schwab <schwab@suse.de>
Cc: qemu-stable@nongnu.org
Reported-by:  Venkata.Pyla@toshiba-tsip.com
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981
---
 linux-user/elfload.c | 7 +++++++
 1 file changed, 7 insertions(+)

--
2.41.0

Comments

Michael Tokarev July 21, 2023, 3:14 p.m. UTC | #1
19.07.2023 18:52, Helge Deller wrote:
> qemu-user crashes immediately when running static binaries on the armhf
> architecture. The problem is the memory layout where the executable is
> loaded before the interpreter library, in which case the reserved brk
> region clashes with the interpreter code and is released before qemu
> tries to start the program.
> 
> At load time qemu calculates a brk value for interpreter and executable
> each.  The fix is to choose the higher one of both.
> 
> Signed-off-by: Helge Deller <deller@gmx.de>
> Cc: Andreas Schwab <schwab@suse.de>
> Cc: qemu-stable@nongnu.org
> Reported-by:  Venkata.Pyla@toshiba-tsip.com
> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981
> ---
>   linux-user/elfload.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index a26200d9f3..94951630b1 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
> 
>       if (elf_interpreter) {
>           load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
> +        /*
> +         * adjust brk address if the interpreter was loaded above the main
> +         * executable, e.g. happens with static binaries on armhf
> +         */
> +        if (interp_info.brk > info->brk) {
> +            info->brk = interp_info.brk;
> +        }

So, this is kinda amusing.
This broke arm64, ppc64el and s390x:

arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null'
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault

(it was just a quick test from debian qemu-user package).

Reverting this patch makes it work again..

*Sigh*.

(This is currently broken in debian unstable too, - this is how I
noticed the breakage).

Thanks,

/mjt
Michael Tokarev July 21, 2023, 3:20 p.m. UTC | #2
21.07.2023 18:14, Michael Tokarev пишет:
> 19.07.2023 18:52, Helge Deller wrote:
>> qemu-user crashes immediately when running static binaries on the armhf
>> architecture. The problem is the memory layout where the executable is
>> loaded before the interpreter library, in which case the reserved brk
>> region clashes with the interpreter code and is released before qemu
>> tries to start the program.
>>
>> At load time qemu calculates a brk value for interpreter and executable
>> each.  The fix is to choose the higher one of both.
>>
>> Signed-off-by: Helge Deller <deller@gmx.de>
>> Cc: Andreas Schwab <schwab@suse.de>
>> Cc: qemu-stable@nongnu.org
>> Reported-by:  Venkata.Pyla@toshiba-tsip.com
>> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981
>> ---
>>   linux-user/elfload.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
>> index a26200d9f3..94951630b1 100644
>> --- a/linux-user/elfload.c
>> +++ b/linux-user/elfload.c
>> @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
>>
>>       if (elf_interpreter) {
>>           load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
>> +        /*
>> +         * adjust brk address if the interpreter was loaded above the main
>> +         * executable, e.g. happens with static binaries on armhf
>> +         */
>> +        if (interp_info.brk > info->brk) {
>> +            info->brk = interp_info.brk;
>> +        }

I've added printf() inside this if() block, on arm64 it produces:

   fixing brk: interp_info.brk=0x5502875358 info=>brk=0x5500032ec0

FWIW,

/mjt
Helge Deller July 21, 2023, 9:37 p.m. UTC | #3
On 7/21/23 17:14, Michael Tokarev wrote:
> 19.07.2023 18:52, Helge Deller wrote:
>> qemu-user crashes immediately when running static binaries on the armhf
>> architecture. The problem is the memory layout where the executable is
>> loaded before the interpreter library, in which case the reserved brk
>> region clashes with the interpreter code and is released before qemu
>> tries to start the program.
>>
>> At load time qemu calculates a brk value for interpreter and executable
>> each.  The fix is to choose the higher one of both.
>>
>> Signed-off-by: Helge Deller <deller@gmx.de>
>> Cc: Andreas Schwab <schwab@suse.de>
>> Cc: qemu-stable@nongnu.org
>> Reported-by:  Venkata.Pyla@toshiba-tsip.com
>> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981
>> ---
>>   linux-user/elfload.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
>> index a26200d9f3..94951630b1 100644
>> --- a/linux-user/elfload.c
>> +++ b/linux-user/elfload.c
>> @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
>>
>>       if (elf_interpreter) {
>>           load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
>> +        /*
>> +         * adjust brk address if the interpreter was loaded above the main
>> +         * executable, e.g. happens with static binaries on armhf
>> +         */
>> +        if (interp_info.brk > info->brk) {
>> +            info->brk = interp_info.brk;
>> +        }
>
> So, this is kinda amusing.
> This broke arm64, ppc64el and s390x:
>
> arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null'
> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> Segmentation fault
>
> (it was just a quick test from debian qemu-user package).
>
> Reverting this patch makes it work again..
>
> *Sigh*.

Argh, that's really unfortunate.
I just tested myself.
Running static busybox binary did work for me:
# ./qemu-aarch64 busybox
BusyBox v1.30.1 (Debian 1:1.30.1-6+b3) multi-call binary.
BusyBox is copyrighted by many authors between 1998-2015.
....

I'd like to test dynamic binary as well, but I'm currently failing
to set up an aarch64 chroot here.
Sadly I won't have time to do any further testing until sunday evening
(travelling over the weekend).
Maybe someone else can try? I leave it up to Peter if he wants to revert
that patch right now, or if it can wait a few days until I'm back?

Helge
Michael Tokarev July 22, 2023, 7:37 a.m. UTC | #4
22.07.2023 00:37, Helge Deller wrote:
..
>> So, this is kinda amusing.
>> This broke arm64, ppc64el and s390x:
>>
>> arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null'
>> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
>> Segmentation fault

Note: I run it on native arm64, so it is arm64 on arm64, -
this is a quick test I had from the debian qemu autopkgtest (which
run /bin/ls under qemu and natively and compares the result).  I
haven't tried to reproduce it locally on amd64 host, - I did it on
a debian arm64 porterbox (which I was logged on anyway to debug a
different issue on armel, with qemu git tree already cloned).

> Argh, that's really unfortunate.
> I just tested myself.
> Running static busybox binary did work for me:
> # ./qemu-aarch64 busybox

It didn't trigger with ls, but it did when I run something from
emulated /bin/sh.

This whole email was more like a heads-up/warning, to collect more
details later, - and maybe someone knows what the problem is already
if it is obvious.

..
> Maybe someone else can try? I leave it up to Peter if he wants to revert
> that patch right now, or if it can wait a few days until I'm back?

For the time being, how about a quick-n-hacky band-aid, to include
this fixup only where the original prob actually triggered in the
first place?

Like, if the target is armel?  Something like

#if defined(TARGET_ARM) && !defined(TARGET_AARCH64)

or what's the right preprocessor condition is?

Thanks,

/mjt
Helge Deller July 24, 2023, 9:45 a.m. UTC | #5
On 7/21/23 17:14, Michael Tokarev wrote:
> 19.07.2023 18:52, Helge Deller wrote:
>> qemu-user crashes immediately when running static binaries on the armhf
>> architecture. The problem is the memory layout where the executable is
>> loaded before the interpreter library, in which case the reserved brk
>> region clashes with the interpreter code and is released before qemu
>> tries to start the program.
>>
>> At load time qemu calculates a brk value for interpreter and executable
>> each.  The fix is to choose the higher one of both.
>>
>> Signed-off-by: Helge Deller <deller@gmx.de>
>> Cc: Andreas Schwab <schwab@suse.de>
>> Cc: qemu-stable@nongnu.org
>> Reported-by:  Venkata.Pyla@toshiba-tsip.com
>> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1040981
>> ---
>>   linux-user/elfload.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
>> index a26200d9f3..94951630b1 100644
>> --- a/linux-user/elfload.c
>> +++ b/linux-user/elfload.c
>> @@ -3615,6 +3615,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
>>
>>       if (elf_interpreter) {
>>           load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
>> +        /*
>> +         * adjust brk address if the interpreter was loaded above the main
>> +         * executable, e.g. happens with static binaries on armhf
>> +         */
>> +        if (interp_info.brk > info->brk) {
>> +            info->brk = interp_info.brk;
>> +        }
>
> So, this is kinda amusing.
> This broke arm64, ppc64el and s390x:
>
> arm64$ ./qemu-aarch64 /bin/sh -c '/bin/ls -dCFl *[t]* >/dev/null'
> qemu: uncaught target signal 11 (Segmentation fault) - core dumped
> Segmentation fault
>
> (it was just a quick test from debian qemu-user package).
>
> Reverting this patch makes it work again..

It seems the whole loading of binaries on aarch64 is somewhat wrong.
My patch just triggers the whole thing to blow up.

This is how the memory-map looks on physical hardware:
  deller@amdahl:~$ uname -a
Linux amdahl 5.10.0-23-arm64 #1 SMP Debian 5.10.179-2 (2023-07-14) aarch64 GNU/Linux
deller@amdahl:~$ cat /proc/self/maps
aaaaafb70000-aaaaafb78000 r-xp 00000000 fe:02 131743                     /bin/cat
aaaaafb88000-aaaaafb89000 r--p 00008000 fe:02 131743                     /bin/cat
aaaaafb89000-aaaaafb8a000 rw-p 00009000 fe:02 131743                     /bin/cat
aaaae022b000-aaaae024c000 rw-p 00000000 00:00 0                          [heap]
ffff9ce78000-ffff9ce9a000 rw-p 00000000 00:00 0
ffff9ce9a000-ffff9cff6000 r-xp 00000000 fe:02 790863                     /lib/aarch64-linux-gnu/libc-2.31.so
ffff9cff6000-ffff9d005000 ---p 0015c000 fe:02 790863                     /lib/aarch64-linux-gnu/libc-2.31.so
ffff9d005000-ffff9d009000 r--p 0015b000 fe:02 790863                     /lib/aarch64-linux-gnu/libc-2.31.so
ffff9d009000-ffff9d00b000 rw-p 0015f000 fe:02 790863                     /lib/aarch64-linux-gnu/libc-2.31.so
ffff9d00b000-ffff9d00e000 rw-p 00000000 00:00 0
ffff9d00e000-ffff9d02f000 r-xp 00000000 fe:02 790820                     /lib/aarch64-linux-gnu/ld-2.31.so
ffff9d033000-ffff9d035000 rw-p 00000000 00:00 0
ffff9d03c000-ffff9d03e000 r--p 00000000 00:00 0                          [vvar]
ffff9d03e000-ffff9d03f000 r-xp 00000000 00:00 0                          [vdso]
ffff9d03f000-ffff9d040000 r--p 00021000 fe:02 790820                     /lib/aarch64-linux-gnu/ld-2.31.so
ffff9d040000-ffff9d042000 rw-p 00022000 fe:02 790820                     /lib/aarch64-linux-gnu/ld-2.31.so
ffffe9ea3000-ffffe9ec4000 rw-p 00000000 00:00 0                          [stack]

this is on qemu-linux-user-aarch64:
I have no name!@paq:/# cat /proc/self/maps
5500000000-5500009000 r-xp 00000000 08:01 2380521                        /usr/bin/cat
5500009000-550001f000 ---p 00000000 00:00 0
550001f000-5500020000 r--p 0000f000 08:01 2380521                        /usr/bin/cat
5500020000-5500021000 rw-p 00010000 08:01 2380521                        /usr/bin/cat
5500021000-5500042000 rw-p 00000000 00:00 0
5502021000-5502022000 ---p 00000000 00:00 0
5502022000-5502822000 rw-p 00000000 00:00 0                              [stack]
5502822000-5502848000 r-xp 00000000 08:01 2382563                        /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
5502848000-5502860000 ---p 00000000 00:00 0
5502860000-5502862000 r--p 0002e000 08:01 2382563                        /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
5502862000-5502864000 rw-p 00030000 08:01 2382563                        /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
5502864000-5502865000 r-xp 00000000 00:00 0
5502865000-5502867000 rw-p 00000000 00:00 0
5502870000-55029f7000 r-xp 00000000 08:01 2382566                        /usr/lib/aarch64-linux-gnu/libc.so.6
55029f7000-5502a0d000 ---p 00187000 08:01 2382566                        /usr/lib/aarch64-linux-gnu/libc.so.6
5502a0d000-5502a10000 r--p 0018d000 08:01 2382566                        /usr/lib/aarch64-linux-gnu/libc.so.6
5502a10000-5502a12000 rw-p 00190000 08:01 2382566                        /usr/lib/aarch64-linux-gnu/libc.so.6
5502a12000-5502a1f000 rw-p 00000000 00:00 0

Here I got:
interp_info.brk 0x5502863358
info->brk       0x5500020458
diff            0x2842f00   40 MB

I think we need to make sure that the shared libs (ld, glibc) gets loaded
at the top of the memory to free up heap space. Right now (without my patch)
there were 40MB heap, with my patch if clashed with ld.so.

Helge
diff mbox series

Patch

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a26200d9f3..94951630b1 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3615,6 +3615,13 @@  int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)

     if (elf_interpreter) {
         load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
+        /*
+         * adjust brk address if the interpreter was loaded above the main
+         * executable, e.g. happens with static binaries on armhf
+         */
+        if (interp_info.brk > info->brk) {
+            info->brk = interp_info.brk;
+        }

         /* If the program interpreter is one of these two, then assume
            an iBCS2 image.  Otherwise assume a native linux image.  */