diff mbox

[RFC] Relocatable host tools

Message ID CAAXf6LX2uJ7p+SzdZq2LKL0JsHXB0PzO_1Lx5tTjcBfDq-7wZg@mail.gmail.com
State Not Applicable
Headers show

Commit Message

Thomas De Schampheleire Dec. 11, 2015, 2:11 p.m. UTC
Hi,

Recently, a few patches by Yann E. Morin have entered Buildroot
master, which make sure that all host tools (output/host/usr/bin/...)
have a correct RPATH encoded, namely
/absolute/buildroot/output/host/usr/lib.

I believe this is one step closer to having a relocatable host SDK for
Buildroot. While I'm not claiming (nor have I tested) a completely
relocatable host SDK, by changing the RPATH from an absolute path to
'$ORIGIN/../lib'.

'$ORIGIN' here is a literal string encoded in the ELF binary and
replaced at runtime by the location of the binary. So, a binary
originally created in /home/foo/buildroot/output/host/usr/bin and run
from there will expect its libraries in
/home/foo/buildroot/output/host/usr/lib, but the same binary copied
and run from /tmp/hostsdk/bin will expect its libraries in
/tmp/hostsdk/lib.

One could even set a combined RPATH:
    $ORIGIN/../lib:/absolute/buildroot/output/host/usr/lib
Then, if one copies the binary but not the entire host directory,
things still work as 'expected'.

Reliably passing the string $ORIGIN from buildroot make to each
package such that the resulting binary still has the untouched $ORIGIN
string proves to be difficult. Depending on how the package Makefile
is treating the LDFLAGS, the dollar will be expanded or not.

My original attempt was to change package/Makefile.in directly and
pass -Wl,-rpath='\$$\$$ORIGIN/../lib' which worked for most cases, but
e.g. not for host-kmod (where the combination \$$\$$ became a shell $$
which expanded to a PID). The web is full of other developers that
face the same type of problems.

The alternative solution is to change the RPATH /after/ the package is
built, e.g. using patchelf. I have successfully used
    patchelf --set-rpath '$ORIGIN/../lib' <host-binary>
(this in fact sets RUNPATH rather than RPATH, which is fine too).

Exactly how/where this fits into the build process is up for
discussion. For my proof-of-concept I have abused the check-host-rpath
script. Here, host-patchelf was first built manually, but this should
be added as an automatic dependency.
This location is unfortunate, as it is run as an instrumentation hook.
If it fails, the package seems to be installed correctly (but with an
incorrect RPATH). Instead, HOST_FOO_POST_INSTALL_HOOKS is probably the
better place.

Perhaps, a configure option should be added that enables/disables this
feature, to avoid an unnecessary host-patchelf dependency for people
that don't need it.

I'm pasting my proof-of-concept below, just for reference. In no way
is this supposed to be code ready for inclusion.

               |sed -r -e '/.* \(R(UN)?PATH\) +Library r(un)?path:
\[(.+)\]$/!d' \


Looking forward to your feedback.

/Thomas

Comments

Thomas Petazzoni Dec. 11, 2015, 3:28 p.m. UTC | #1
Thomas,

On Fri, 11 Dec 2015 15:11:27 +0100, Thomas De Schampheleire wrote:

> I believe this is one step closer to having a relocatable host SDK for
> Buildroot. While I'm not claiming (nor have I tested) a completely
> relocatable host SDK, by changing the RPATH from an absolute path to
> '$ORIGIN/../lib'.
> 
> '$ORIGIN' here is a literal string encoded in the ELF binary and
> replaced at runtime by the location of the binary. So, a binary
> originally created in /home/foo/buildroot/output/host/usr/bin and run
> from there will expect its libraries in
> /home/foo/buildroot/output/host/usr/lib, but the same binary copied
> and run from /tmp/hostsdk/bin will expect its libraries in
> /tmp/hostsdk/lib.

Have you seen:

  http://patchwork.ozlabs.org/patch/494781/
  http://patchwork.ozlabs.org/patch/494782/
  http://patchwork.ozlabs.org/patch/494783/
  http://patchwork.ozlabs.org/patch/494784/
  http://patchwork.ozlabs.org/patch/494785/

Thomas
Thomas De Schampheleire Dec. 11, 2015, 3:55 p.m. UTC | #2
On Fri, Dec 11, 2015 at 4:28 PM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> Thomas,
>
> On Fri, 11 Dec 2015 15:11:27 +0100, Thomas De Schampheleire wrote:
>
>> I believe this is one step closer to having a relocatable host SDK for
>> Buildroot. While I'm not claiming (nor have I tested) a completely
>> relocatable host SDK, by changing the RPATH from an absolute path to
>> '$ORIGIN/../lib'.
>>
>> '$ORIGIN' here is a literal string encoded in the ELF binary and
>> replaced at runtime by the location of the binary. So, a binary
>> originally created in /home/foo/buildroot/output/host/usr/bin and run
>> from there will expect its libraries in
>> /home/foo/buildroot/output/host/usr/lib, but the same binary copied
>> and run from /tmp/hostsdk/bin will expect its libraries in
>> /tmp/hostsdk/lib.
>
> Have you seen:
>
>   http://patchwork.ozlabs.org/patch/494781/
>   http://patchwork.ozlabs.org/patch/494782/
>   http://patchwork.ozlabs.org/patch/494783/
>   http://patchwork.ozlabs.org/patch/494784/
>   http://patchwork.ozlabs.org/patch/494785/
>

Ah, clearly not! Thanks for the reference.
(I did search in my gmail, but had deleted old mails a while back)

Are there blocking issues with these patches? There are still some
open comments on v5 but no objections in that iteration.
(I haven't reviewed the code myself yet).

Samuel, could you also share your view on the status of these patches?
Are you planning a new iteration with Baruch's comments fixed?

Thanks,
Thomas
Thomas Petazzoni Dec. 11, 2015, 3:57 p.m. UTC | #3
Thomas,

On Fri, 11 Dec 2015 16:55:09 +0100, Thomas De Schampheleire wrote:

> Ah, clearly not! Thanks for the reference.
> (I did search in my gmail, but had deleted old mails a while back)
> 
> Are there blocking issues with these patches? There are still some
> open comments on v5 but no objections in that iteration.
> (I haven't reviewed the code myself yet).
> 
> Samuel, could you also share your view on the status of these patches?
> Are you planning a new iteration with Baruch's comments fixed?

My objection is that the host rpath checks is done in shell scripts,
and the target rpath mungling in python scripts. I'd like to use either
one or the other language, but not both.

I believe Samuel said he wanted to respin his series, to use python
scripts to also do the host rpath checks that were added by Yann.

Best regards,

Thomas
Yann E. MORIN Dec. 11, 2015, 5:28 p.m. UTC | #4
Thomas², Samuel, All,

On 2015-12-11 16:57 +0100, Thomas Petazzoni spake thusly:
> On Fri, 11 Dec 2015 16:55:09 +0100, Thomas De Schampheleire wrote:
> 
> > Ah, clearly not! Thanks for the reference.
> > (I did search in my gmail, but had deleted old mails a while back)
> > 
> > Are there blocking issues with these patches? There are still some
> > open comments on v5 but no objections in that iteration.
> > (I haven't reviewed the code myself yet).
> > 
> > Samuel, could you also share your view on the status of these patches?
> > Are you planning a new iteration with Baruch's comments fixed?
> 
> My objection is that the host rpath checks is done in shell scripts,
> and the target rpath mungling in python scripts. I'd like to use either
> one or the other language, but not both.

I agree that we should not have two scripts for that. Either we have a
shell script (enhance and expand the existing one), or we replace it
(e.g. with a Python script).

> I believe Samuel said he wanted to respin his series, to use python
> scripts to also do the host rpath checks that were added by Yann.

What I do not like in the current Python script from Samuel, is that it
basically just calls subprocess() to run patchelf. If that's all a
python script does, I believe it is better done in a shell script.

However, if we want to do it in Python (meh...), we should do it
natively in Python. There are Python modules to deal with ELF files:

    https://pypi.python.org/pypi/elflib/        (last uploaded 2009-04-13)
    https://github.com/tbursztyka/python-elf    (last changed 2013-10-17)
    https://github.com/eliben/pyelftools        (last changed 2015-12-11)

The first two are rather old, but the last has had recent activity
(today). There are probably tons of others.

If we can do somethhing like:

    for f in scan_dir(HOST_DIR):
        if not libelf.is_ELF(f):
            continue
        elf = libelf.open(f)
        if elf.has_rpath():
            elf.delete_rpath()
        elf.set_rpath("$ORIGIN/../lib)
        elf.save()
        elf.close()

Then I would be very happy! ;-)

Of course, maybe there is a better pythonic way to do that, and it
depends on the ELF python module API and what it can do. But you get the
idea...

Of course, it means we would need that as a host-package, because we
can't expect it to be installed on the user's system. However, that
would mean *always* building our host-python, which would be a shame...
:-(

(Note: Python may be a very good programming language, but I would not
want we do a Python script just for the sake of doing a Python script.
Let's use the best tool for the task, which would be a shell script if
all we do is call other utilities.)

Alos note that I have a (slighty-related) pending series to do
ELF-related cleanups (and also non ELF-related cleanups) in the
target/ directory:
    https://git.buildroot.org/~ymorin/git/buildroot/log/?h=yem/hooks

It would do:
  - remove symlinks to shared libs
  - rename shared libs to their SONAME
  - look for, or remove RPATH in ELF files
  - deduplicate files into hard-links
  - turn symlinks to hard-links

This series is quite old now, but I haven't yet had time to finalise it.
Maybe I can further work on it during the holidays season...

Regards,
Yann E. MORIN.
Thomas De Schampheleire Dec. 11, 2015, 8:25 p.m. UTC | #5
Thomas, Yann, Samuel, all,

On Fri, Dec 11, 2015 at 6:28 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
> Thomas², Samuel, All,
>
> On 2015-12-11 16:57 +0100, Thomas Petazzoni spake thusly:
>> On Fri, 11 Dec 2015 16:55:09 +0100, Thomas De Schampheleire wrote:
>>
>> > Ah, clearly not! Thanks for the reference.
>> > (I did search in my gmail, but had deleted old mails a while back)
>> >
>> > Are there blocking issues with these patches? There are still some
>> > open comments on v5 but no objections in that iteration.
>> > (I haven't reviewed the code myself yet).
>> >
>> > Samuel, could you also share your view on the status of these patches?
>> > Are you planning a new iteration with Baruch's comments fixed?
>>
>> My objection is that the host rpath checks is done in shell scripts,
>> and the target rpath mungling in python scripts. I'd like to use either
>> one or the other language, but not both.
>
> I agree that we should not have two scripts for that. Either we have a
> shell script (enhance and expand the existing one), or we replace it
> (e.g. with a Python script).
>
>> I believe Samuel said he wanted to respin his series, to use python
>> scripts to also do the host rpath checks that were added by Yann.
>
> What I do not like in the current Python script from Samuel, is that it
> basically just calls subprocess() to run patchelf. If that's all a
> python script does, I believe it is better done in a shell script.
>
> However, if we want to do it in Python (meh...), we should do it
> natively in Python. There are Python modules to deal with ELF files:
>
[..]
>
> Of course, it means we would need that as a host-package, because we
> can't expect it to be installed on the user's system. However, that
> would mean *always* building our host-python, which would be a shame...
> :-(
>
> (Note: Python may be a very good programming language, but I would not
> want we do a Python script just for the sake of doing a Python script.
> Let's use the best tool for the task, which would be a shell script if
> all we do is call other utilities.)


I think Python is a great language for certain scripting tasks. It
provides, among others, very easy list manipulation, things that can
be pretty cumbersome in shell scripts.
However, when the bulk of a script is to call another program, I too
prefer shell scripts. Having read the proposed Python script, the main
advantage of Python would be the manipulation of the (multiple)
RPATHs, their filtering, and combination into one string with :
delimiters. So, mainly this is the compute_rpath function. I'm pretty
sure this can also be achieved, with reasonable readability and
understandability, with a shell script and standard UNIX utilities.

>
> Alos note that I have a (slighty-related) pending series to do
> ELF-related cleanups (and also non ELF-related cleanups) in the
> target/ directory:
>     https://git.buildroot.org/~ymorin/git/buildroot/log/?h=yem/hooks
>
> It would do:
>   - remove symlinks to shared libs
>   - rename shared libs to their SONAME
>   - look for, or remove RPATH in ELF files
>   - deduplicate files into hard-links
>   - turn symlinks to hard-links
>
> This series is quite old now, but I haven't yet had time to finalise it.
> Maybe I can further work on it during the holidays season...

Now, that is just /too/ much shell scripting ;-)


/Thomas
Samuel Martin Dec. 13, 2015, 3:09 p.m. UTC | #6
Thomas², Yann, all,

On Fri, Dec 11, 2015 at 9:25 PM, Thomas De Schampheleire
<patrickdepinguin@gmail.com> wrote:
> Thomas, Yann, Samuel, all,
>
> On Fri, Dec 11, 2015 at 6:28 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
>> Thomas², Samuel, All,
>>
>> On 2015-12-11 16:57 +0100, Thomas Petazzoni spake thusly:
>>> On Fri, 11 Dec 2015 16:55:09 +0100, Thomas De Schampheleire wrote:
>>>
>>> > Ah, clearly not! Thanks for the reference.
>>> > (I did search in my gmail, but had deleted old mails a while back)
>>> >
>>> > Are there blocking issues with these patches? There are still some
>>> > open comments on v5 but no objections in that iteration.
>>> > (I haven't reviewed the code myself yet).
>>> >
>>> > Samuel, could you also share your view on the status of these patches?
These patches are still "alive" ;-), you can find the latest version
(still wip) here:
https://github.com/tSed/buildroot/tree/sma/host-pkg-rpath/wip

Since the v5, this branch includes:
- rebasing on top of master;
- tooling to track host-sytem leak in the SDK and target filesystems;
- reorganization of the files;
- a couple of fixes...

It still a WIP, code should be clean or almost, but not the branch
that clearly needs to be rebased!

>>> > Are you planning a new iteration with Baruch's comments fixed?
I answered them I think.

>>>
>>> My objection is that the host rpath checks is done in shell scripts,
>>> and the target rpath mungling in python scripts. I'd like to use either
>>> one or the other language, but not both.
>>
>> I agree that we should not have two scripts for that. Either we have a
>> shell script (enhance and expand the existing one), or we replace it
>> (e.g. with a Python script).
I second.

>>
>>> I believe Samuel said he wanted to respin his series, to use python
>>> scripts to also do the host rpath checks that were added by Yann.
>>
>> What I do not like in the current Python script from Samuel, is that it
>> basically just calls subprocess() to run patchelf. If that's all a
>> python script does, I believe it is better done in a shell script.
I agree...

>>
>> However, if we want to do it in Python (meh...), we should do it
>> natively in Python. There are Python modules to deal with ELF files:
>>
>>    https://pypi.python.org/pypi/elflib/        (last uploaded 2009-04-13)
>>    https://github.com/tbursztyka/python-elf    (last changed 2013-10-17)
>>    https://github.com/eliben/pyelftools        (last changed 2015-12-11)
I have already played a bit with pyelftools packages, but one of its
limitation is it can only read ELF files, not update some of their
sections :-/

>>
> [..]
>>
>> Of course, it means we would need that as a host-package, because we
>> can't expect it to be installed on the user's system. However, that
>> would mean *always* building our host-python, which would be a shame...
>> :-(
Well...True and false:
False: because python is already a pre-requirement for Buildroot, so
technically we could just install just python module as a generic
host-package and invoke the host system python interpreter; but true
because doing like this means totally by-pass teh Buildroot philosophy
and our way of doing thing :-/

Also, I have concerns about Buildroot depending or carrying many
external python packages for jobs belonging to the its global infra...

>>
>> (Note: Python may be a very good programming language, but I would not
>> want we do a Python script just for the sake of doing a Python script.
>> Let's use the best tool for the task, which would be a shell script if
>> all we do is call other utilities.)
>
>
> I think Python is a great language for certain scripting tasks. It
> provides, among others, very easy list manipulation, things that can
> be pretty cumbersome in shell scripts.
> However, when the bulk of a script is to call another program, I too
> prefer shell scripts. Having read the proposed Python script, the main
> advantage of Python would be the manipulation of the (multiple)
> RPATHs, their filtering, and combination into one string with :
> delimiters. So, mainly this is the compute_rpath function. I'm pretty
> sure this can also be achieved, with reasonable readability and
> understandability, with a shell script and standard UNIX utilities.
That's exactly the reason why I choose to do it in python,
manipulating strings/paths is easier (at least for me) in python than
in shell script... I can certainly improve skill on this ;-)

I can try rewrite it in shell script... Be prepared to review some...
not so pretty things ;-)

>
>>
>> Alos note that I have a (slighty-related) pending series to do
>> ELF-related cleanups (and also non ELF-related cleanups) in the
>> target/ directory:
>>     https://git.buildroot.org/~ymorin/git/buildroot/log/?h=yem/hooks
>>
>> It would do:
>>   - remove symlinks to shared libs
>>   - rename shared libs to their SONAME
>>   - look for, or remove RPATH in ELF files
>>   - deduplicate files into hard-links
>>   - turn symlinks to hard-links
>>
>> This series is quite old now, but I haven't yet had time to finalise it.
>> Maybe I can further work on it during the holidays season...
>
> Now, that is just /too/ much shell scripting ;-)
>
>
> /Thomas


Regards,
diff mbox

Patch

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -58,8 +58,9 @@  GLOBAL_INSTRUMENTATION_HOOKS += step_tim
 # This hook checks that host packages that need libraries that we build
 # have a proper DT_RPATH or DT_RUNPATH tag
 define check_host_rpath
+    $(if $(filter-out host-patchelf,$(3)),\
     $(if $(filter install-host,$(2)),\
-        $(if $(filter end,$(1)),support/scripts/check-host-rpath $(3)
$(HOST_DIR)))
+        $(if $(filter end,$(1)),support/scripts/check-host-rpath $(3)
$(HOST_DIR))))
 endef
 GLOBAL_INSTRUMENTATION_HOOKS += check_host_rpath

diff --git a/support/scripts/check-host-rpath b/support/scripts/check-host-rpath
--- a/support/scripts/check-host-rpath
+++ b/support/scripts/check-host-rpath
@@ -19,6 +19,7 @@  main() {
     ret=0
     while read file; do
         elf_needs_rpath "${file}" "${hostdir}" || continue
+        set_elf_rpath "${file}" "${hostdir}"
         check_elf_has_rpath "${file}" "${hostdir}" && continue
         if [ ${ret} -eq 0 ]; then
             ret=1
@@ -49,6 +50,12 @@  elf_needs_rpath() {
     return 1
 }

+set_elf_rpath() {
+    local file="${1}"
+    local hostdir="${2}"
+    ${hostdir}/usr/bin/patchelf --set-rpath '$ORIGIN/../lib' $file
+}
+
 check_elf_has_rpath() {
     local file="${1}"
     local hostdir="${2}"
@@ -58,7 +65,7 @@  check_elf_has_rpath() {
         for dir in ${rpath//:/ }; do
             # Remove duplicate and trailing '/' for proper match
             dir="$( sed -r -e 's:/+:/:g; s:/$::;' <<<"${dir}" )"
-            [ "${dir}" = "${hostdir}/usr/lib" ] && return 0
+            [ "${dir}" = '$ORIGIN/../lib' ] && return 0
         done
     done < <( readelf -d "${file}"
          \