Message ID | 1488550733-3956-3-git-send-email-wg@grandegger.com |
---|---|
State | Superseded |
Headers | show |
Hello, On Fri, 3 Mar 2017 15:18:46 +0100, Wolfgang Grandegger wrote: > From: Samuel Martin <s.martin49@gmail.com> > > This commit introduces a fix-rpath script able to scan a tree, > detect ELF files, check their RPATH and fix it in a proper way. > The RPATH fixup is done by the patchelf utility using the option > "--make-rpath-relative". > > Signed-off-by: Samuel Martin <s.martin49@gmail.com> > Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> > --- > support/scripts/fix-rpath | 106 ++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 106 insertions(+) > create mode 100755 support/scripts/fix-rpath Now that patchelf is much smarter, I am wondering if we still really need a utility script just to call patchelf. Have you tried bringing this logic into the make logic? Thanks, Thomas
Hello, Am 03.03.2017 um 15:48 schrieb Thomas Petazzoni: > Hello, > > On Fri, 3 Mar 2017 15:18:46 +0100, Wolfgang Grandegger wrote: >> From: Samuel Martin <s.martin49@gmail.com> >> >> This commit introduces a fix-rpath script able to scan a tree, >> detect ELF files, check their RPATH and fix it in a proper way. >> The RPATH fixup is done by the patchelf utility using the option >> "--make-rpath-relative". >> >> Signed-off-by: Samuel Martin <s.martin49@gmail.com> >> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> >> --- >> support/scripts/fix-rpath | 106 ++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 106 insertions(+) >> create mode 100755 support/scripts/fix-rpath > > Now that patchelf is much smarter, I am wondering if we still really > need a utility script just to call patchelf. Have you tried bringing > this logic into the make logic? Well, I think three magic "find" oneliners could do the job. I will try and see. Wolfgang.
Hi Wolfgang, I realize that this patch is going to be dropped because it is going to be done directly from the makefiles, but I have some hopefully useful comments still. First of all: make sure the original author is explicitly in Cc. git send-email normally does that automatically so I'm surprised Samuel isn't there. On 03-03-17 15:18, Wolfgang Grandegger wrote: > From: Samuel Martin <s.martin49@gmail.com> > > This commit introduces a fix-rpath script able to scan a tree, > detect ELF files, check their RPATH and fix it in a proper way. > The RPATH fixup is done by the patchelf utility using the option > "--make-rpath-relative". > > Signed-off-by: Samuel Martin <s.martin49@gmail.com> > Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> > --- > support/scripts/fix-rpath | 106 ++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 106 insertions(+) > create mode 100755 support/scripts/fix-rpath > > diff --git a/support/scripts/fix-rpath b/support/scripts/fix-rpath > new file mode 100755 > index 0000000..bde2c17 > --- /dev/null > +++ b/support/scripts/fix-rpath > @@ -0,0 +1,106 @@ > +#!/usr/bin/env bash > + > +# Copyright (C) 2016 Samuel Martin <s.martin49@gmail.com> > +# > +# This program is free software; you can redistribute it and/or modify > +# it under the terms of the GNU General Public License as published by > +# the Free Software Foundation; either version 2 of the License, or > +# (at your option) any later version. > +# > +# This program is distributed in the hope that it will be useful, > +# but WITHOUT ANY WARRANTY; without even the implied warranty of > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > +# General Public License for more details. > +# > +# You should have received a copy of the GNU General Public License > +# along with this program; if not, write to the Free Software > +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > + > +#set -e > +#set -v Don't include debugging cruft in the final patch. > + > +usage() { > + cat <<EOF >&2 > +Usage: ${0} TREE_KIND TREE_ROOT > + > +Description: > + > + This script scans a tree and sanitize ELF files' RPATH found in there. > + > + Sanitization behaves the same whatever the kindd of the processed tree, but > + the resulting RPATH differs. > + > + Sanitization action: > + - remove RPATH pointing outside of the tree > + - for RPATH pointing in the tree: > + - if they point to standard location (/lib, /usr/lib): remove them > + - otherwise: make them relative using \$ORIGIN > + > + For the target tree: > + - scan the whole tree for sanitization > + > + For the staging tree : > + - scan the whole tree for sanitization > + > + For the host tree: > + - skip the staging tree for sanitization > + - add \$HOST_DIR/{lib,usr/lib} to RPATH (as relative pathes) > + > +Arguments: > + > + TREE_KIND Kind of tree to be processed. > + Allowed values: host, target, staging > + > + TREE_ROOT Path to the root of the tree to be scaned > + > +Environment: > + > + PATCHELF patchelf program to use > + (default: patchelf) > +EOF > +} > + > +: ${PATCHELF:=patchelf} > + > +main() { > + local tree="${1}" > + local basedir="$(readlink -f "${2}")" > + > + local find_args=( "${basedir}" ) > + local sanitize_extra_args=( ) > + > + case "${tree}" in > + host) > + # do not process the sysroot (only contains target binaries) > + find_args+=( "-name" "sysroot" "-prune" "-o" ) I believe we should specify a full path here. If there is any other directory in the tree that happens to be called 'sysroot' for whatever reason, it shouldn't be pruned. When this is done directly from make, it's pretty easy: -path $(STAGING_DIR) -prune > + > + # do not process the external toolchain installation directory to > + # to avoid breaking it. > + find_args+=( "-path" "*/opt/ext-toolchain" "-prune" "-o" ) Same here: $(TOOLCHAIN_EXTERNAL_DOWNLOAD_INSTALL_DIR) which has the full path. Regards, Arnout > + > + ;; > + staging|target) > + # discard ${hostdir}/lib and ${hostdir}/usr/lib > + sanitize_extra_args+=( "--no-standard-lib-dirs" ) > + > + ;; > + *) > + usage > + exit 1 > + ;; > + esac > + > + find_args+=( "-type" "f" "-print" ) > + > + while read file ; do > + # some files are not writable > + chmod u+w $file > + # call patchelf for each regular file; error will silently be ignored. > + ${PATCHELF} --debug --make-rpath-relative "${basedir}" ${sanitize_extra_args[@]} "${file}" >> /dev/null 2>&1 > + done < <(find ${find_args[@]}) > + > + # ignore errors > + return 0 > +} > + > +main ${@} >
Hello Arnout, Am 04.03.2017 um 12:39 schrieb Arnout Vandecappelle: > Hi Wolfgang, > > I realize that this patch is going to be dropped because it is going to be done > directly from the makefiles, but I have some hopefully useful comments still. If the logic is not too complex. > > First of all: make sure the original author is explicitly in Cc. git send-email > normally does that automatically so I'm surprised Samuel isn't there. I think the mailing-list is stripping the CC. On the mail I got, Samuel is on CC. I did use "git send-email" with an "-cc" to Samuel. > On 03-03-17 15:18, Wolfgang Grandegger wrote: >> From: Samuel Martin <s.martin49@gmail.com> >> >> This commit introduces a fix-rpath script able to scan a tree, >> detect ELF files, check their RPATH and fix it in a proper way. >> The RPATH fixup is done by the patchelf utility using the option >> "--make-rpath-relative". >> >> Signed-off-by: Samuel Martin <s.martin49@gmail.com> >> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> >> --- >> support/scripts/fix-rpath | 106 ++++++++++++++++++++++++++++++++++++++++++++++ >> 1 file changed, 106 insertions(+) >> create mode 100755 support/scripts/fix-rpath >> >> diff --git a/support/scripts/fix-rpath b/support/scripts/fix-rpath >> new file mode 100755 >> index 0000000..bde2c17 >> --- /dev/null >> +++ b/support/scripts/fix-rpath >> @@ -0,0 +1,106 @@ >> +#!/usr/bin/env bash >> + >> +# Copyright (C) 2016 Samuel Martin <s.martin49@gmail.com> >> +# >> +# This program is free software; you can redistribute it and/or modify >> +# it under the terms of the GNU General Public License as published by >> +# the Free Software Foundation; either version 2 of the License, or >> +# (at your option) any later version. >> +# >> +# This program is distributed in the hope that it will be useful, >> +# but WITHOUT ANY WARRANTY; without even the implied warranty of >> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> +# General Public License for more details. >> +# >> +# You should have received a copy of the GNU General Public License >> +# along with this program; if not, write to the Free Software >> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA >> + >> +#set -e >> +#set -v > > Don't include debugging cruft in the final patch. I agree! >> + >> +usage() { >> + cat <<EOF >&2 >> +Usage: ${0} TREE_KIND TREE_ROOT >> + >> +Description: >> + >> + This script scans a tree and sanitize ELF files' RPATH found in there. >> + >> + Sanitization behaves the same whatever the kindd of the processed tree, but >> + the resulting RPATH differs. >> + >> + Sanitization action: >> + - remove RPATH pointing outside of the tree >> + - for RPATH pointing in the tree: >> + - if they point to standard location (/lib, /usr/lib): remove them >> + - otherwise: make them relative using \$ORIGIN >> + >> + For the target tree: >> + - scan the whole tree for sanitization >> + >> + For the staging tree : >> + - scan the whole tree for sanitization >> + >> + For the host tree: >> + - skip the staging tree for sanitization >> + - add \$HOST_DIR/{lib,usr/lib} to RPATH (as relative pathes) >> + >> +Arguments: >> + >> + TREE_KIND Kind of tree to be processed. >> + Allowed values: host, target, staging >> + >> + TREE_ROOT Path to the root of the tree to be scaned >> + >> +Environment: >> + >> + PATCHELF patchelf program to use >> + (default: patchelf) >> +EOF >> +} >> + >> +: ${PATCHELF:=patchelf} >> + >> +main() { >> + local tree="${1}" >> + local basedir="$(readlink -f "${2}")" >> + >> + local find_args=( "${basedir}" ) >> + local sanitize_extra_args=( ) >> + >> + case "${tree}" in >> + host) >> + # do not process the sysroot (only contains target binaries) >> + find_args+=( "-name" "sysroot" "-prune" "-o" ) > > I believe we should specify a full path here. If there is any other directory > in the tree that happens to be called 'sysroot' for whatever reason, it > shouldn't be pruned. When this is done directly from make, it's pretty easy: > -path $(STAGING_DIR) -prune OK. > >> + >> + # do not process the external toolchain installation directory to >> + # to avoid breaking it. >> + find_args+=( "-path" "*/opt/ext-toolchain" "-prune" "-o" ) > > Same here: $(TOOLCHAIN_EXTERNAL_DOWNLOAD_INSTALL_DIR) which has the full path. OK. > > Regards, > Arnout > >> + >> + ;; >> + staging|target) >> + # discard ${hostdir}/lib and ${hostdir}/usr/lib >> + sanitize_extra_args+=( "--no-standard-lib-dirs" ) >> + >> + ;; >> + *) >> + usage >> + exit 1 >> + ;; >> + esac >> + >> + find_args+=( "-type" "f" "-print" ) >> + >> + while read file ; do >> + # some files are not writable >> + chmod u+w $file This is just a hack because some elf files are not writable. To make all regular file writable is wrong. We may also want to reset the original permission. It needs more logic, unfortunately. >> + # call patchelf for each regular file; error will silently be ignored. >> + ${PATCHELF} --debug --make-rpath-relative "${basedir}" ${sanitize_extra_args[@]} "${file}" >> /dev/null 2>&1 >> + done < <(find ${find_args[@]}) >> + >> + # ignore errors Simply ignoring errors here is fast, but it also does not allow to print warnings. Wolfgang.
Hello, Am 03.03.2017 um 17:39 schrieb Wolfgang Grandegger: > Hello, > > Am 03.03.2017 um 15:48 schrieb Thomas Petazzoni: >> Hello, >> >> On Fri, 3 Mar 2017 15:18:46 +0100, Wolfgang Grandegger wrote: >>> From: Samuel Martin <s.martin49@gmail.com> >>> >>> This commit introduces a fix-rpath script able to scan a tree, >>> detect ELF files, check their RPATH and fix it in a proper way. >>> The RPATH fixup is done by the patchelf utility using the option >>> "--make-rpath-relative". >>> >>> Signed-off-by: Samuel Martin <s.martin49@gmail.com> >>> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> >>> --- >>> support/scripts/fix-rpath | 106 >>> ++++++++++++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 106 insertions(+) >>> create mode 100755 support/scripts/fix-rpath >> >> Now that patchelf is much smarter, I am wondering if we still really >> need a utility script just to call patchelf. Have you tried bringing >> this logic into the make logic? > > Well, I think three magic "find" oneliners could do the job. I will try > and see. Here are my current oneliners in the Makefile. It mainly shows what needs to be done. Unfortunately, the scanning for ELF files may take rather long: >>> Sanitizing RPATH in target and staging directory time find /opt/bdo/x86_host/target -type f -print0 | xargs -0 -I {} sh -c \ "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/target \ --no-standard-lib-dirs {}; fi" real 0m9.644s user 0m0.032s sys 0m0.360s time find /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot -type f -print0 | xargs -0 -I {} sh -c \ "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot \ --no-standard-lib-dirs {}; fi" real 0m46.433s user 0m0.240s sys 0m1.980s >>> Rendering the SDK relocatable cp /opt/bdo/x86_host/host/usr/bin/patchelf /opt/bdo/x86_host/host/usr/bin/patchelf.__copy__ time find /opt/bdo/x86_host/host -path /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot -prune -o \ -path /opt/bdo/x86_host/host/usr/bin/patchelf -prune -o -type f -print0 | xargs -0 -I {} sh -c \ "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/host {}; \ fi" mv /opt/bdo/dcu_host/host/usr/bin/patchelf.__copy__ /opt/bdo/dcu_host/host/usr/bin/patchelf real 0m23.154s user 0m0.144s sys 0m1.124s Using "file" to test if it's an ELF files is much slower. Using "patchelf --print-rpath {} >/dev/null 2>&1" is much smarter and allows to run the path sanitation without ignoring errors. Because some ELF files are not writeable, we need a chmod first. For the host tree, we also need special handling for patchelf to work around the "file in use" issue. "xargs" could be speedup using "-P NUM". Would that be an option? We could also introduce BR2_TOOLCHAIN_RELOCATABLE. Any other idea to speed up the scripts above? Wolfgang.
Am 07.03.2017 um 00:49 schrieb Arnout Vandecappelle: > > > On 06-03-17 10:07, Wolfgang Grandegger wrote: >> Hello, >> >> Am 03.03.2017 um 17:39 schrieb Wolfgang Grandegger: >>> Hello, >>> >>> Am 03.03.2017 um 15:48 schrieb Thomas Petazzoni: >>>> Hello, >>>> >>>> On Fri, 3 Mar 2017 15:18:46 +0100, Wolfgang Grandegger wrote: >>>>> From: Samuel Martin <s.martin49@gmail.com> >>>>> >>>>> This commit introduces a fix-rpath script able to scan a tree, >>>>> detect ELF files, check their RPATH and fix it in a proper way. >>>>> The RPATH fixup is done by the patchelf utility using the option >>>>> "--make-rpath-relative". >>>>> >>>>> Signed-off-by: Samuel Martin <s.martin49@gmail.com> >>>>> Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> >>>>> --- >>>>> support/scripts/fix-rpath | 106 >>>>> ++++++++++++++++++++++++++++++++++++++++++++++ >>>>> 1 file changed, 106 insertions(+) >>>>> create mode 100755 support/scripts/fix-rpath >>>> >>>> Now that patchelf is much smarter, I am wondering if we still really >>>> need a utility script just to call patchelf. Have you tried bringing >>>> this logic into the make logic? >>> >>> Well, I think three magic "find" oneliners could do the job. I will try >>> and see. >> >> Here are my current oneliners in the Makefile. It mainly shows what needs >> to be done. > > I think they're sufficiently complicated to warrant offloading to a script > after all. Especially since you need the same structure three times, so it would > have to be factored into a macro, which makes the makefile more complicated. > Alternatively, you could keep the find command in make, and use -exec > support/scripts/fix-rpath '{}' ';' - this avoids all the tricky xargs stuff. +1 >> Unfortunately, the scanning for ELF files may take rather >> long: >> >> >>> Sanitizing RPATH in target and staging directory >> time find /opt/bdo/x86_host/target -type f -print0 | xargs -0 -I {} sh -c \ >> "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ > > You can actually include this in the find command itself, it's almost twice as > fast: > > find /opt/bdo/x86_host/target -type f \ > -exec patchelf --print-rpath '{}' ';' \ > -exec support/scripts/fix-rpath '{}' ';' OK, it' two times faster, indeed >> /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/target \ >> --no-standard-lib-dirs {}; fi" >> >> real 0m9.644s > > Can you quantify which bit is taking the time here? Is it the find itself, or > the patchelf --print-rpath, or the final patchelf? The initial ELF file testing takes most of the time. The processing of the ELF file itself doesn't matter much: $ find target -type f | wc -l 3902 $ find host/usr/x86_64-buildroot-linux-gnu/sysroot -type f | wc -l 21413 $ find host -path host/usr/x86_64-buildroot-linux-gnu/sysroot -prune -o -type f | wc -l 10861 > Also, how does the time compare to the rest of the finalize step and creating a > tarball? The rest of "target-finalize" takes half a second and the tar: $ time tar cjf /tmp/target.tar.bz2 target real 0m21.742s But it depends on the compression, of course. >> user 0m0.032s >> sys 0m0.360s >> >> time find /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot -type f -print0 | xargs -0 -I {} sh -c \ > > We could optimise this a little by eliminating directories that certainly don't > contain interesting files, like /usr/include. Or alternatively, explicitly > select only "\( -path lib -o -path bin -o -path sbin \)". $ find host/usr/x86_64-buildroot-linux-gnu/sysroot/usr/include/ -type f | wc -l 6946 $ find host/usr/x86_64-buildroot-linux-gnu/sysroot/usr/share/ -type f | wc -l 9273 But there is one ELF file in ".../usr/share". > However, now I think of it: why do we do this for staging? Binaries in staging > are never executed... Is it just to eliminate all references to HOST_DIR from > the binaries? Good question! So far I followed Samuels proposal (now on CC). >> "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ >> /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot \ >> --no-standard-lib-dirs {}; fi" >> >> real 0m46.433s >> user 0m0.240s >> sys 0m1.980s >> >> >>> Rendering the SDK relocatable >> cp /opt/bdo/x86_host/host/usr/bin/patchelf /opt/bdo/x86_host/host/usr/bin/patchelf.__copy__ > > Why do you need this? To make sure patchelf itself is processed as well? Does > it contain an invalid rpath? If yes, isn't it easier to patch the patchelf build > system so it uses $ORIGIN already? We cannot update the "patchelf" binary while it's in use. There no need to touch it if it already uses a proper rpath, of course. Currently it uses: $ readelf -d host/usr/bin/patchelf ... 0x0000000000000001 (NEEDED) Gemeinsame Bibliothek [libstdc++.so.6] 0x0000000000000001 (NEEDED) Gemeinsame Bibliothek [libgcc_s.so.1] 0x0000000000000001 (NEEDED) Gemeinsame Bibliothek [libc.so.6] 0x000000000000000f (RPATH) Bibliothek rpath: [/opt/bdo/dcu_host/host/usr/lib] "patchelf --make-relative" will drop the rpath above, because the first two needed libs are not in the listed rpath but in "host/usr/x86_64-buildroot-linux-gnu/lib64". Running patchelf with "LD_DEBUG" tells me that it will take the libraries from the host (/usr/lib). Just wondering if that's correct!? > >> time find /opt/bdo/x86_host/host -path /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot -prune -o \ >> -path /opt/bdo/x86_host/host/usr/bin/patchelf -prune -o -type f -print0 | xargs -0 -I {} sh -c \ >> "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ >> /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/host {}; \ >> fi" >> mv /opt/bdo/dcu_host/host/usr/bin/patchelf.__copy__ /opt/bdo/dcu_host/host/usr/bin/patchelf >> >> real 0m23.154s >> user 0m0.144s >> sys 0m1.124s >> >> >> Using "file" to test if it's an ELF files is much slower. Using >> "patchelf --print-rpath {} >/dev/null 2>&1" is much smarter and allows >> to run the path sanitation without ignoring errors. > > readelf -h is still a little faster, but it also matches files without rpath. Yes, 38 vs. 44 seconds. > To just check if it's an ELF file, it's actually enough to do "cmp -n 4" with > another ELF file (e.g. patchelf itself). That gives even more false positives > (e.g. object files). So if one of those is chosen, a further check with patchelf > --print-rpath is still needed, or errors have to be ignored. Yep. >> >> Because some ELF files are not writeable, we need a chmod first. > > Shouldn't we restore the original permissions? Maybe! There are actually two libraries not being writeable. Can't tell if that's by purpose. >> For the >> host tree, we also need special handling for patchelf to work around the >> "file in use" issue. >> >> "xargs" could be speedup using "-P NUM". Would that be an option? > > Yes, we could use PARALLEL_JOBS. Using -P8 on my i7 quad-core laptop is almost 8 times faster. Wolfgang.
Am 07.03.2017 um 18:40 schrieb Arnout Vandecappelle: > > > On 07-03-17 10:13, Wolfgang Grandegger wrote: >> Am 07.03.2017 um 00:49 schrieb Arnout Vandecappelle: >>> >>> >>> On 06-03-17 10:07, Wolfgang Grandegger wrote: >>>> Hello, >>>> > [snip] >>>> >>> Sanitizing RPATH in target and staging directory >>>> time find /opt/bdo/x86_host/target -type f -print0 | xargs -0 -I {} sh -c \ >>>> "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ >>> >>> You can actually include this in the find command itself, it's almost twice as >>> fast: >>> >>> find /opt/bdo/x86_host/target -type f \ >>> -exec patchelf --print-rpath '{}' ';' \ >>> -exec support/scripts/fix-rpath '{}' ';' >> >> OK, it' two times faster, indeed > > Actually with such a construct, and assuming we don't bother with restoring the > permissions, a helper shell script isn't needed because we can just chain the > chmod and patchelf calls with further -exec calls. Of course with -exec, xargs > -P is not possible so you have to evaluate a little what is the best approach. "-P" is very effective... but that's fine tuning. >>>> /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/target \ >>>> --no-standard-lib-dirs {}; fi" >>>> >>>> real 0m9.644s >>> >>> Can you quantify which bit is taking the time here? Is it the find itself, or >>> the patchelf --print-rpath, or the final patchelf? >> >> The initial ELF file testing takes most of the time. The processing of the ELF >> file itself doesn't matter much: >> >> $ find target -type f | wc -l >> 3902 >> $ find host/usr/x86_64-buildroot-linux-gnu/sysroot -type f | wc -l >> 21413 >> $ find host -path host/usr/x86_64-buildroot-linux-gnu/sysroot -prune -o -type f | wc -l >> 10861 > > What kind of config is this? Many packages? Or just an internal toolchain? It's a config including QT5... also doubling the build time . $ find sysroot/usr/lib/qt/ -type f| wc -l 3355 $ find sysroot/usr/include/qt5/ -type f| wc -l 4168 >>> Also, how does the time compare to the rest of the finalize step and creating a >>> tarball? >> >> The rest of "target-finalize" takes half a second and the tar: >> >> $ time tar cjf /tmp/target.tar.bz2 target >> real 0m21.742s >> >> But it depends on the compression, of course. >> >>>> user 0m0.032s >>>> sys 0m0.360s >>>> >>>> time find /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot -type f -print0 | xargs -0 -I {} sh -c \ >>> >>> We could optimise this a little by eliminating directories that certainly don't >>> contain interesting files, like /usr/include. Or alternatively, explicitly >>> select only "\( -path lib -o -path bin -o -path sbin \)". >> >> >> $ find host/usr/x86_64-buildroot-linux-gnu/sysroot/usr/include/ -type f | wc -l >> 6946 >> $ find host/usr/x86_64-buildroot-linux-gnu/sysroot/usr/share/ -type f | wc -l >> 9273 >> >> But there is one ELF file in ".../usr/share". > > Ah yes, I also found one: > /usr/share/bash-completion/helpers/gst-completion-helper-1.0 > > So in that case we can't skip share. We certainly can skip include, I think, > though the benefit is perhaps limited. > > Of course, if the entire staging can be skipped it's even easier :-) Yep. >>> However, now I think of it: why do we do this for staging? Binaries in staging >>> are never executed... Is it just to eliminate all references to HOST_DIR from >>> the binaries? >> >> Good question! So far I followed Samuels proposal (now on CC). >> >>>> "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ >>>> /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot \ >>>> --no-standard-lib-dirs {}; fi" >>>> >>>> real 0m46.433s >>>> user 0m0.240s >>>> sys 0m1.980s >>>> >>>> >>> Rendering the SDK relocatable >>>> cp /opt/bdo/x86_host/host/usr/bin/patchelf /opt/bdo/x86_host/host/usr/bin/patchelf.__copy__ >>> >>> Why do you need this? To make sure patchelf itself is processed as well? Does >>> it contain an invalid rpath? If yes, isn't it easier to patch the patchelf build >>> system so it uses $ORIGIN already? >> >> We cannot update the "patchelf" binary while it's in use. > > Yes, but that's avoided with the -name patchelf -prune bit. Yep. >> There no >> need to touch it if it already uses a proper rpath, of course. >> Currently it uses: >> >> $ readelf -d host/usr/bin/patchelf >> ... >> 0x0000000000000001 (NEEDED) Gemeinsame Bibliothek [libstdc++.so.6] >> 0x0000000000000001 (NEEDED) Gemeinsame Bibliothek [libgcc_s.so.1] >> 0x0000000000000001 (NEEDED) Gemeinsame Bibliothek [libc.so.6] >> 0x000000000000000f (RPATH) Bibliothek rpath: [/opt/bdo/dcu_host/host/usr/lib] >> >> "patchelf --make-relative" will drop the rpath above, because the first >> two needed libs are not in the listed rpath but in "host/usr/x86_64-buildroot-linux-gnu/lib64". > > That's the staging dir - it should certainly NOT use anything from there. No, the staging dir is "host/usr/x86_64-buildroot-linux-gnu/sysroot/". >> Running patchelf with "LD_DEBUG" tells me that it will take the libraries >> from the host (/usr/lib). Just wondering if that's correct!? > > Yes it's correct, those 3 libraries are standard host libraries that can be > found in the standard paths. Hm, what are the libraries in "host/usr/x86_64-buildroot-linux-gnu/lib64" then good for? They work if I use "LD_LIBRARY_PATH" to run the executable. > So I've checked where this rpath comes from. Turns out it is added by > Buildroot, through HOST_LDFLAGS. This is in fact needed to make sure that an > executable that uses libraries from HOST_DIR works - see commit > 4fdecac9d692b8d6f071ba6ad938b6ad68b675fd. So we can either: Shouldn't "host/usr/x86_64-buildroot-linux-gnu/lib64" not treated in a similar way? Just wondering! I have attached the debug output of patchelf for the host tree. > - keep this __copy__ manipulation in the patchelf step; or > - override HOST_LDFLAGS in patchelf.mk. > > I'm cool either way, so since you already have this workaround, just keep it. > Perhaps a comment above it explaining why would be useful though. OK. >>>> time find /opt/bdo/x86_host/host -path /opt/bdo/x86_host/host/usr/x86_64-buildroot-linux-gnu/sysroot -prune -o \ >>>> -path /opt/bdo/x86_host/host/usr/bin/patchelf -prune -o -type f -print0 | xargs -0 -I {} sh -c \ >>>> "if patchelf --print-rpath {} >/dev/null 2>&1; then chmod +w {}; \ >>>> /opt/bdo/x86_host/host/usr/bin/patchelf --make-rpath-relative /opt/bdo/x86_host/host {}; \ >>>> fi" >>>> mv /opt/bdo/dcu_host/host/usr/bin/patchelf.__copy__ /opt/bdo/dcu_host/host/usr/bin/patchelf >>>> >>>> real 0m23.154s >>>> user 0m0.144s >>>> sys 0m1.124s >>>> >>>> >>>> Using "file" to test if it's an ELF files is much slower. Using >>>> "patchelf --print-rpath {} >/dev/null 2>&1" is much smarter and allows >>>> to run the path sanitation without ignoring errors. >>> >>> readelf -h is still a little faster, but it also matches files without rpath. >> >> Yes, 38 vs. 44 seconds. >> >>> To just check if it's an ELF file, it's actually enough to do "cmp -n 4" with >>> another ELF file (e.g. patchelf itself). That gives even more false positives >>> (e.g. object files). So if one of those is chosen, a further check with patchelf >>> --print-rpath is still needed, or errors have to be ignored. >> >> Yep. > > To be evaluated if the speedup from using cmp is worth the false positives. I wrote a little C program just checking the first 4 bytes of the file. The saving is 19 vs. 22 seconds. With "readelf" I have similar results. patchelf is written in C++... maybe that's the reason why it's slower. >>>> Because some ELF files are not writeable, we need a chmod first. >>> >>> Shouldn't we restore the original permissions? >> >> Maybe! There are actually two libraries not being writeable. Can't tell if >> that's by purpose. > > Perhaps as an easier way to restore permissions, we could do it in patchelf > itself: add something like --force that does a chmod if needed, similar to how > some editors do it. Good idea! > There are actually quite a few packages that install library read-only, e.g. > Python and openssl. I guess it is on purpose, but I'm not sure if it is important. I think restoring the permission is the correct solution. Wolfgang.
On 08-03-17 10:25, Wolfgang Grandegger wrote: > Am 07.03.2017 um 18:40 schrieb Arnout Vandecappelle: >> >> [snip] >> So in that case we can't skip share. We certainly can skip include, I think, >> though the benefit is perhaps limited. >> >> Of course, if the entire staging can be skipped it's even easier :-) > > Yep. I propose that you start with the simple option, but don't do it for staging for the time being. Until Samuel comes with an argument why staging is necessary :-) [snip] >>> "patchelf --make-relative" will drop the rpath above, because the first >>> two needed libs are not in the listed rpath but in >>> "host/usr/x86_64-buildroot-linux-gnu/lib64". >> >> That's the staging dir - it should certainly NOT use anything from there. > > No, the staging dir is "host/usr/x86_64-buildroot-linux-gnu/sysroot/". My bad. Still, it's a cross-directory. It contains libgcc and other stuff from the cross-compiler. So it is NOT meant to be used by the host binaries. It's easier to see when you do actual cross-compilation, then you'll see the binaries there are for the target, not the host. > >>> Running patchelf with "LD_DEBUG" tells me that it will take the libraries >>> from the host (/usr/lib). Just wondering if that's correct!? >> >> Yes it's correct, those 3 libraries are standard host libraries that can be >> found in the standard paths. > > Hm, what are the libraries in "host/usr/x86_64-buildroot-linux-gnu/lib64" then > good for? They work if I use "LD_LIBRARY_PATH" to run the executable. Since your target is x86_64, just like your host, running a target binary will just work as long as it can find the libraries (i.e. as long as you set LD_LIBRARY_PATH). [snip] >> To be evaluated if the speedup from using cmp is worth the false positives. > > I wrote a little C program just checking the first 4 bytes of the file. The > saving is 19 vs. 22 seconds. With "readelf" I have similar results. You mean 19 seconds for the C program vs 22 seconds for cmp? That's what I would expect indeed because cmp basically does the same. In my measurements, readelf still was significantly slower - I didn't write down the numbers but from memory it was like 25%. > patchelf is > written in C++... maybe that's the reason why it's slower. Well, it's rather because patchelf reads the entire file into a std::vector, while readelf will just seek to the bits that are requested. By the way, did you already post your patches to the patchelf list? Regards, Arnout [snip]
Am 12.03.2017 um 21:53 schrieb Arnout Vandecappelle: > > > On 08-03-17 10:25, Wolfgang Grandegger wrote: >> Am 07.03.2017 um 18:40 schrieb Arnout Vandecappelle: >>> >>> > [snip] >>> So in that case we can't skip share. We certainly can skip include, I think, >>> though the benefit is perhaps limited. >>> >>> Of course, if the entire staging can be skipped it's even easier :-) >> >> Yep. > > I propose that you start with the simple option, but don't do it for staging > for the time being. Until Samuel comes with an argument why staging is necessary :-) > > [snip] >>>> "patchelf --make-relative" will drop the rpath above, because the first >>>> two needed libs are not in the listed rpath but in >>>> "host/usr/x86_64-buildroot-linux-gnu/lib64". >>> >>> That's the staging dir - it should certainly NOT use anything from there. >> >> No, the staging dir is "host/usr/x86_64-buildroot-linux-gnu/sysroot/". > > My bad. Still, it's a cross-directory. It contains libgcc and other stuff from > the cross-compiler. So it is NOT meant to be used by the host binaries. > > It's easier to see when you do actual cross-compilation, then you'll see the > binaries there are for the target, not the host. > > >> >>>> Running patchelf with "LD_DEBUG" tells me that it will take the libraries >>>> from the host (/usr/lib). Just wondering if that's correct!? >>> >>> Yes it's correct, those 3 libraries are standard host libraries that can be >>> found in the standard paths. >> >> Hm, what are the libraries in "host/usr/x86_64-buildroot-linux-gnu/lib64" then >> good for? They work if I use "LD_LIBRARY_PATH" to run the executable. > > Since your target is x86_64, just like your host, running a target binary will > just work as long as it can find the libraries (i.e. as long as you set > LD_LIBRARY_PATH). > > [snip] >>> To be evaluated if the speedup from using cmp is worth the false positives. >> >> I wrote a little C program just checking the first 4 bytes of the file. The >> saving is 19 vs. 22 seconds. With "readelf" I have similar results. > > You mean 19 seconds for the C program vs 22 seconds for cmp? That's what I > would expect indeed because cmp basically does the same. In my measurements, > readelf still was significantly slower - I didn't write down the numbers but > from memory it was like 25%. No, 19 sec for the C program just checking the first 4 bytes. 22 with patchelf. And with readelf I think it was close to 19 sec as well. I will doublecheck, though. > >> patchelf is >> written in C++... maybe that's the reason why it's slower. > > Well, it's rather because patchelf reads the entire file into a std::vector, > while readelf will just seek to the bits that are requested. > > > > By the way, did you already post your patches to the patchelf list? I'm currently preparing a new patch series. I did not find a mailing list for the patchelf project. Maybe I have to use the GIT hub interface. Wolfgang.
On 12-03-17 22:10, Wolfgang Grandegger wrote: > Am 12.03.2017 um 21:53 schrieb Arnout Vandecappelle: >> >> [snip] >> You mean 19 seconds for the C program vs 22 seconds for cmp? That's what I >> would expect indeed because cmp basically does the same. In my measurements, >> readelf still was significantly slower - I didn't write down the numbers but >> from memory it was like 25%. > > No, 19 sec for the C program just checking the first 4 bytes. 22 with patchelf. > And with readelf I think it was close to 19 sec as well. I will doublecheck, > though. I think I had a bigger difference, but anyway even 20% isn't really worth it, probably. > >> >>> patchelf is >>> written in C++... maybe that's the reason why it's slower. >> >> Well, it's rather because patchelf reads the entire file into a std::vector, >> while readelf will just seek to the bits that are requested. >> >> >> >> By the way, did you already post your patches to the patchelf list? > > I'm currently preparing a new patch series. I did not find a mailing list for > the patchelf project. Maybe I have to use the GIT hub interface. Yes, github pull request is the way to go. Regards, Arnout
Hello Arnout, Am 13.03.2017 um 18:08 schrieb Arnout Vandecappelle: > > > On 12-03-17 22:10, Wolfgang Grandegger wrote: >> Am 12.03.2017 um 21:53 schrieb Arnout Vandecappelle: >>> >>> > [snip] >>> You mean 19 seconds for the C program vs 22 seconds for cmp? That's what I >>> would expect indeed because cmp basically does the same. In my measurements, >>> readelf still was significantly slower - I didn't write down the numbers but >>> from memory it was like 25%. >> >> No, 19 sec for the C program just checking the first 4 bytes. 22 with patchelf. >> And with readelf I think it was close to 19 sec as well. I will doublecheck, >> though. > > I think I had a bigger difference, but anyway even 20% isn't really worth it, > probably. Well, I just learned that it's very difficult to do reproducible measurements with my laptop due to frequency scaling, boost, temperature, etc. :(. >> >>> >>>> patchelf is >>>> written in C++... maybe that's the reason why it's slower. >>> >>> Well, it's rather because patchelf reads the entire file into a std::vector, >>> while readelf will just seek to the bits that are requested. >>> >>> >>> >>> By the way, did you already post your patches to the patchelf list? >> >> I'm currently preparing a new patch series. I did not find a mailing list for >> the patchelf project. Maybe I have to use the GIT hub interface. > > Yes, github pull request is the way to go. OK, will do with a CC to this list. More soon... Wolfgang.
diff --git a/support/scripts/fix-rpath b/support/scripts/fix-rpath new file mode 100755 index 0000000..bde2c17 --- /dev/null +++ b/support/scripts/fix-rpath @@ -0,0 +1,106 @@ +#!/usr/bin/env bash + +# Copyright (C) 2016 Samuel Martin <s.martin49@gmail.com> +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + +#set -e +#set -v + +usage() { + cat <<EOF >&2 +Usage: ${0} TREE_KIND TREE_ROOT + +Description: + + This script scans a tree and sanitize ELF files' RPATH found in there. + + Sanitization behaves the same whatever the kindd of the processed tree, but + the resulting RPATH differs. + + Sanitization action: + - remove RPATH pointing outside of the tree + - for RPATH pointing in the tree: + - if they point to standard location (/lib, /usr/lib): remove them + - otherwise: make them relative using \$ORIGIN + + For the target tree: + - scan the whole tree for sanitization + + For the staging tree : + - scan the whole tree for sanitization + + For the host tree: + - skip the staging tree for sanitization + - add \$HOST_DIR/{lib,usr/lib} to RPATH (as relative pathes) + +Arguments: + + TREE_KIND Kind of tree to be processed. + Allowed values: host, target, staging + + TREE_ROOT Path to the root of the tree to be scaned + +Environment: + + PATCHELF patchelf program to use + (default: patchelf) +EOF +} + +: ${PATCHELF:=patchelf} + +main() { + local tree="${1}" + local basedir="$(readlink -f "${2}")" + + local find_args=( "${basedir}" ) + local sanitize_extra_args=( ) + + case "${tree}" in + host) + # do not process the sysroot (only contains target binaries) + find_args+=( "-name" "sysroot" "-prune" "-o" ) + + # do not process the external toolchain installation directory to + # to avoid breaking it. + find_args+=( "-path" "*/opt/ext-toolchain" "-prune" "-o" ) + + ;; + staging|target) + # discard ${hostdir}/lib and ${hostdir}/usr/lib + sanitize_extra_args+=( "--no-standard-lib-dirs" ) + + ;; + *) + usage + exit 1 + ;; + esac + + find_args+=( "-type" "f" "-print" ) + + while read file ; do + # some files are not writable + chmod u+w $file + # call patchelf for each regular file; error will silently be ignored. + ${PATCHELF} --debug --make-rpath-relative "${basedir}" ${sanitize_extra_args[@]} "${file}" >> /dev/null 2>&1 + done < <(find ${find_args[@]}) + + # ignore errors + return 0 +} + +main ${@}