configure: Use lld --image-base for --disable-pie user mode binaries

Message ID 20191116052815.nop3xkmd4umqsdsb@google.com
State New

Commit Message

Fangrui Song Nov. 16, 2019, 5:28 a.m. UTC
For lld, --image-base is the preferred way to set the base address.
lld does not actually implement -Ttext-segment, but treats it as an alias for
-Ttext. -Ttext-segment=0x60000000 combined with --no-rosegment can
create a 1.6GB executable.

Fix the problem by using --image-base for lld. GNU ld and gold will
still get -Ttext-segment. Also delete the ld --verbose fallback introduced
in 2013, which is no longer relevant or correct (the default linker
script has changed).

Signed-off-by: Fangrui Song <i@maskray.me>
---
  configure | 33 ++++++++++++---------------------
  1 file changed, 12 insertions(+), 21 deletions(-)
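
For a sense of scale, the address in the commit message can be converted to a byte count. This is illustrative arithmetic only and is not part of the patch. It shows that 0x60000000 bytes is roughly 1.6GB, which is consistent with an output file that has to span the range up to that address. The exact section layout that produces the large file is not spelled out in the message, so none is assumed here.

```shell
#!/bin/sh
# Illustrative arithmetic only; not part of the patch.
# base is the link address quoted in the commit message.
base=0x60000000

# Shell arithmetic accepts the 0x prefix, so this yields the decimal value.
bytes=$(( $base ))

# Decimal megabytes (10^6 bytes) and binary mebibytes (2^20 bytes).
mb=$(( bytes / 1000000 ))
mib=$(( bytes / 1048576 ))

echo "$base = $bytes bytes = $mb MB = $mib MiB"
```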

Comments

Fangrui Song Nov. 20, 2019, 9:02 p.m. UTC | #1
Ping :)
Fangrui Song Nov. 27, 2019, 6:36 p.m. UTC | #2
Ping :)
Alex Bennée Nov. 27, 2019, 7:01 p.m. UTC | #3
Fangrui Song <i@maskray.me> writes:

> For lld, --image-base is the preferred way to set the base address.
> lld does not actually implement -Ttext-segment, but treats it as an alias for
> -Ttext. -Ttext-segment=0x60000000 combined with --no-rosegment can
> create a 1.6GB executable.
>
> Fix the problem by using --image-base for lld. GNU ld and gold will
> still get -Ttext-segment. Also delete the ld --verbose fallback introduced
> in 2013, which is no longer relevant or correct (the default linker
> script has changed).
>
> Signed-off-by: Fangrui Song <i@maskray.me>

This patch no longer applies cleanly to configure so I couldn't test it.

> ---
>   configure | 33 ++++++++++++---------------------
>   1 file changed, 12 insertions(+), 21 deletions(-)
>
> diff --git a/configure b/configure
> index 6099be1d84..2d45af0d09 100755
> --- a/configure
> +++ b/configure
> @@ -6336,43 +6336,34 @@ fi
>   
>   # Probe for the need for relocating the user-only binary.
>   if ( [ "$linux_user" = yes ] || [ "$bsd_user" = yes ] ) && [ "$pie" = no ]; then
> -  textseg_addr=
> +  image_base=
>     case "$cpu" in
>       arm | i386 | ppc* | s390* | sparc* | x86_64 | x32)
> -      # ??? Rationale for choosing this address
> -      textseg_addr=0x60000000
> +      # An arbitrary address that makes it unlikely to collide with user
> +      # programs.
> +      image_base=0x60000000

The comment probably belongs up above, where we define the empty variable,
unless it really is specifically about these targets.

Renaming textseg_addr seems like unnecessary churn for this patch. 

>         ;;
>       mips)
>         # A 256M aligned address, high in the address space, with enough
>         # room for the code_gen_buffer above it before the stack.
> -      textseg_addr=0x60000000
> +      image_base=0x60000000
>         ;;
>     esac
> -  if [ -n "$textseg_addr" ]; then
> +  if [ -n "$image_base" ]; then
>       cat > $TMPC <<EOF
>       int main(void) { return 0; }
>   EOF
> -    textseg_ldflags="-Wl,-Ttext-segment=$textseg_addr"
> -    if ! compile_prog "" "$textseg_ldflags"; then
> -      # In case ld does not support -Ttext-segment, edit the default linker
> -      # script via sed to set the .text start addr.  This is needed on FreeBSD
> -      # at least.
> -      if ! $ld --verbose >/dev/null 2>&1; then
> +    image_base_ldflags="-Wl,--image-base=$image_base"
> +    if ! compile_prog "" "$image_base_ldflags"; then
> +      image_base_ldflags="-Wl,-Ttext-segment=$image_base"
> +      if ! compile_prog "" "$image_base_ldflags"; then
>           error_exit \
>               "We need to link the QEMU user mode binaries at a" \
>               "specific text address. Unfortunately your linker" \
> -            "doesn't support either the -Ttext-segment option or" \
> -            "printing the default linker script with --verbose." \
> +            "supports neither --image-base nor -Ttext-segment. " \
>               "If you don't want the user mode binaries, pass the" \
>               "--disable-user option to configure."
>         fi
> -
> -      $ld --verbose | sed \
> -        -e '1,/==================================================/d' \
> -        -e '/==================================================/,$d' \
> -        -e "s/[.] = [0-9a-fx]* [+] SIZEOF_HEADERS/. = $textseg_addr + SIZEOF_HEADERS/" \
> -        -e "s/__executable_start = [0-9a-fx]*/__executable_start = $textseg_addr/" > config-host.ld
> -      textseg_ldflags="-Wl,-T../config-host.ld"
>       fi
>     fi
>   fi
> @@ -7945,7 +7936,7 @@ if test "$gprof" = "yes" ; then
>   fi
>   
>   if test "$target_linux_user" = "yes" || test "$target_bsd_user" = "yes" ; then
> -  ldflags="$ldflags $textseg_ldflags"
> +  ldflags="$ldflags $image_base_ldflags"
>   fi
>   
>   # Newer kernels on s390 check for an S390_PGSTE program header and
Richard Henderson Dec. 1, 2019, 9:48 p.m. UTC | #4
On 11/27/19 6:36 PM, Fangrui Song wrote:
> On 2019-11-20, Fangrui Song wrote:
>> On 2019-11-15, Fangrui Song wrote:
>>> For lld, --image-base is the preferred way to set the base address.
>>> lld does not actually implement -Ttext-segment, but treats it as an alias for
>>> -Ttext. -Ttext-segment=0x60000000 combined with --no-rosegment can
>>> create a 1.6GB executable.
>>>
>>> Fix the problem by using --image-base for lld. GNU ld and gold will
>>> still get -Ttext-segment. Also delete the ld --verbose fallback introduced
>>> in 2013, which is no longer relevant or correct (the default linker
>>> script has changed).
>>>
>>> Signed-off-by: Fangrui Song <i@maskray.me>
>>> ---
>>> configure | 33 ++++++++++++---------------------
>>> 1 file changed, 12 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/configure b/configure
>>> index 6099be1d84..2d45af0d09 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -6336,43 +6336,34 @@ fi
>>>
>>> # Probe for the need for relocating the user-only binary.
>>> if ( [ "$linux_user" = yes ] || [ "$bsd_user" = yes ] ) && [ "$pie" = no ];
>>> then
>>> -  textseg_addr=
>>> +  image_base=
>>>   case "$cpu" in
>>>     arm | i386 | ppc* | s390* | sparc* | x86_64 | x32)
>>> -      # ??? Rationale for choosing this address
>>> -      textseg_addr=0x60000000
>>> +      # An arbitrary address that makes it unlikely to collide with user
>>> +      # programs.

Please don't replace this ??? with an arbitrary rationale, which clearly
doesn't apply to all of these hosts.

>>> +      image_base=0x60000000
>>>       ;;
>>>     mips)
>>>       # A 256M aligned address, high in the address space, with enough
>>>       # room for the code_gen_buffer above it before the stack.

This is the only one with a proper rationale.

That said, I'm not sure that the proper way to handle this issue with lld is to
drop this code entirely.

The best way to handle the underlying issue -- address conflict between
interpreter and guest binary -- is PIE, for which this code is skipped.

After that, we go to some pain to choose a guest_base address that allows the
guest binary to load around the interpreter's reserved addresses.

So what's left that this messing about with link addresses buys us?


r~
Fangrui Song Dec. 2, 2019, 4:06 a.m. UTC | #5
Thanks for reviewing this patch!

On 2019-12-01, Richard Henderson wrote:
>On 11/27/19 6:36 PM, Fangrui Song wrote:
>> On 2019-11-20, Fangrui Song wrote:
>>> On 2019-11-15, Fangrui Song wrote:
>>>> For lld, --image-base is the preferred way to set the base address.
>>>> lld does not actually implement -Ttext-segment, but treats it as an alias for
>>>> -Ttext. -Ttext-segment=0x60000000 combined with --no-rosegment can
>>>> create a 1.6GB executable.
>>>>
>>>> Fix the problem by using --image-base for lld. GNU ld and gold will
>>>> still get -Ttext-segment. Also delete the ld --verbose fallback introduced
>>>> in 2013, which is no longer relevant or correct (the default linker
>>>> script has changed).
>>>>
>>>> Signed-off-by: Fangrui Song <i@maskray.me>
>>>> ---
>>>> configure | 33 ++++++++++++---------------------
>>>> 1 file changed, 12 insertions(+), 21 deletions(-)
>>>>
>>>> diff --git a/configure b/configure
>>>> index 6099be1d84..2d45af0d09 100755
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -6336,43 +6336,34 @@ fi
>>>>
>>>> # Probe for the need for relocating the user-only binary.
>>>> if ( [ "$linux_user" = yes ] || [ "$bsd_user" = yes ] ) && [ "$pie" = no ];
>>>> then
>>>> -  textseg_addr=
>>>> +  image_base=
>>>>   case "$cpu" in
>>>>     arm | i386 | ppc* | s390* | sparc* | x86_64 | x32)
>>>> -      # ??? Rationale for choosing this address
>>>> -      textseg_addr=0x60000000
>>>> +      # An arbitrary address that makes it unlikely to collide with user
>>>> +      # programs.
>
>Please don't replace this ??? with an arbitrary rationale, which clearly
>doesn't apply to all of these hosts.

In
https://lists.nongnu.org/archive/html/qemu-devel/2019-11/msg04646.html
it was suggested to move the comment around a bit.
I am now puzzled about where and what I should say in the comment.
Can you (or other maintainers) kindly edit the comment for me?
I do not know enough about qemu to provide a good rationale here.

>>>> +      image_base=0x60000000
>>>>       ;;
>>>>     mips)
>>>>       # A 256M aligned address, high in the address space, with enough
>>>>       # room for the code_gen_buffer above it before the stack.
>
>This is the only one with a proper rationale.
>
>That said, I'm not sure that the proper way to handle this issue with lld is to
>drop this code entirely.

The patch stops relying on -Ttext-segment, which lld does not support,
and uses --image-base instead.

Due to the prevalence of -z separate-code in GNU ld, -Ttext-segment is
no longer appropriate. I suggested that GNU linkers implement the
feature https://sourceware.org/bugzilla/show_bug.cgi?id=25207 .

What gets deleted is the sed script. As I explained in the commit
message, it is no longer relevant. It probably applies to an old GNU ld
that FreeBSD used. FreeBSD has switched to lld now.

>The best way to handle the underlying issue -- address conflict between
>interpreter and guest binary -- is PIE, for which this code is skipped.
>
>After that, we go to some pain to choose a guest_base address that allows the
>guest binary to load around the interpreter's reserved addresses.
>
>So what's left that this messing about with link addresses buys us?

I agree that --enable-pie will be a better solution, but dropping the
support now will break at least FreeBSD. Its kernel supports running an
ET_DYN executable but it does not perform address randomization.
--disable-pie also appears to be used by the ChromeOS developers who
reported https://bugs.llvm.org/show_bug.cgi?id=43997 . I can tell them
that migrating to --enable-pie is the way forward.
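
The probe order discussed in this thread (try --image-base first, then fall back to -Ttext-segment) can be sketched as a small standalone script. This is a minimal sketch, not QEMU's configure code: select_base_flag and try_link are hypothetical names, and try_link assumes that "$CC" (defaulting to cc) is a working compiler driver and that mktemp -d is available.

```shell
#!/bin/sh
# Sketch only: try_link and select_base_flag are hypothetical helpers,
# not functions from QEMU's configure script.

# try_link FLAG: succeed if the toolchain links a trivial program with FLAG.
# Assumes "$CC" (default: cc) is a working compiler driver.
try_link() {
    tmpdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmpdir/t.c"
    "${CC:-cc}" -o "$tmpdir/t" "$tmpdir/t.c" "$1" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return $status
}

# select_base_flag ADDR [PROBE]: print the first candidate flag that PROBE
# accepts, preferring --image-base over -Ttext-segment. Returns non-zero
# if neither flag is accepted.
select_base_flag() {
    addr=$1
    probe=${2:-try_link}
    for flag in "-Wl,--image-base=$addr" "-Wl,-Ttext-segment=$addr"; do
        if "$probe" "$flag"; then
            printf '%s\n' "$flag"
            return 0
        fi
    done
    return 1
}
```

A caller might run `flag=$(select_base_flag 0x60000000) || echo "neither flag is supported"`. The optional second argument lets the selection logic be exercised with a stub probe, without invoking a real linker.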

Patch

diff --git a/configure b/configure
index 6099be1d84..2d45af0d09 100755
--- a/configure
+++ b/configure
@@ -6336,43 +6336,34 @@  fi
  
  # Probe for the need for relocating the user-only binary.
  if ( [ "$linux_user" = yes ] || [ "$bsd_user" = yes ] ) && [ "$pie" = no ]; then
-  textseg_addr=
+  image_base=
    case "$cpu" in
      arm | i386 | ppc* | s390* | sparc* | x86_64 | x32)
-      # ??? Rationale for choosing this address
-      textseg_addr=0x60000000
+      # An arbitrary address that makes it unlikely to collide with user
+      # programs.
+      image_base=0x60000000
        ;;
      mips)
        # A 256M aligned address, high in the address space, with enough
        # room for the code_gen_buffer above it before the stack.
-      textseg_addr=0x60000000
+      image_base=0x60000000
        ;;
    esac
-  if [ -n "$textseg_addr" ]; then
+  if [ -n "$image_base" ]; then
      cat > $TMPC <<EOF
      int main(void) { return 0; }
  EOF
-    textseg_ldflags="-Wl,-Ttext-segment=$textseg_addr"
-    if ! compile_prog "" "$textseg_ldflags"; then
-      # In case ld does not support -Ttext-segment, edit the default linker
-      # script via sed to set the .text start addr.  This is needed on FreeBSD
-      # at least.
-      if ! $ld --verbose >/dev/null 2>&1; then
+    image_base_ldflags="-Wl,--image-base=$image_base"
+    if ! compile_prog "" "$image_base_ldflags"; then
+      image_base_ldflags="-Wl,-Ttext-segment=$image_base"
+      if ! compile_prog "" "$image_base_ldflags"; then
          error_exit \
              "We need to link the QEMU user mode binaries at a" \
              "specific text address. Unfortunately your linker" \
-            "doesn't support either the -Ttext-segment option or" \
-            "printing the default linker script with --verbose." \
+            "supports neither --image-base nor -Ttext-segment. " \
              "If you don't want the user mode binaries, pass the" \
              "--disable-user option to configure."
        fi
-
-      $ld --verbose | sed \
-        -e '1,/==================================================/d' \
-        -e '/==================================================/,$d' \
-        -e "s/[.] = [0-9a-fx]* [+] SIZEOF_HEADERS/. = $textseg_addr + SIZEOF_HEADERS/" \
-        -e "s/__executable_start = [0-9a-fx]*/__executable_start = $textseg_addr/" > config-host.ld
-      textseg_ldflags="-Wl,-T../config-host.ld"
      fi
    fi
  fi
@@ -7945,7 +7936,7 @@  if test "$gprof" = "yes" ; then
  fi
  
  if test "$target_linux_user" = "yes" || test "$target_bsd_user" = "yes" ; then
-  ldflags="$ldflags $textseg_ldflags"
+  ldflags="$ldflags $image_base_ldflags"
  fi
  
  # Newer kernels on s390 check for an S390_PGSTE program header and
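
The mips comment in the first hunk describes 0x60000000 as a 256M aligned address. That claim can be checked with a few lines of shell arithmetic. This is an illustrative sketch, not part of the patch, and the variable names are chosen here for the example.

```shell
#!/bin/sh
# Illustrative check only; not part of the patch.
addr=0x60000000
align=$(( 256 * 1024 * 1024 ))   # 256M in bytes

# An address is aligned when dividing by the alignment leaves no remainder.
remainder=$(( $addr % $align ))
multiple=$(( $addr / $align ))

if [ "$remainder" -eq 0 ]; then
    echo "$addr is $multiple x 256M, so it is 256M aligned"
else
    echo "$addr is not 256M aligned (remainder $remainder)"
fi
```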