diff mbox series

[v7,3/8] core: implement per-package SDK and target

Message ID 20181228104335.22379-4-thomas.petazzoni@bootlin.com
State Superseded
Headers show
Series Top-level parallel build support | expand

Commit Message

Thomas Petazzoni Dec. 28, 2018, 10:43 a.m. UTC
This commit implements the core of the move to per-package SDK and
target directories. The main idea is that instead of having a global
output/host and output/target in which all packages install files, we
switch to per-package host and target directories, that only contain
their explicit dependencies.

There are two main benefits:

 - Packages will no longer discover dependencies that they do not
   explicitly indicate in their <pkg>_DEPENDENCIES variable.

 - We can support top-level parallel build properly, because a package
   only "sees" its own host directory and target directory, isolated
   from the build of other packages that can happen in parallel.

It works as follows:

 - A new output/per-package/ directory is created, which will contain
   one sub-directory per package, and inside it, a "host" directory
   and a "target" directory:

   output/per-package/busybox/target
   output/per-package/busybox/host
   output/per-package/host-fakeroot/target
   output/per-package/host-fakeroot/host

   This output/per-package/ directory is PER_PACKAGE_DIR.

 - The global TARGET_DIR and HOST_DIR variable now automatically point
   to the per-package directory when PKG is defined. So whenever a
   package references $(HOST_DIR) or $(TARGET_DIR) in its build
   process, it effectively references the per-package host/target
   directories. Note that STAGING_DIR is a sub-dir of HOST_DIR, so it
   is handled as well.

 - Of course, packages have dependencies, so those dependencies must
   be installed in the per-package host and target directories. To do
   so, we simply rsync (using hard links to save space and time) the
   host and target directories of the direct dependencies of the
   package to the current package host and target directories.

   We only need to take care of direct dependencies (and not
   recursively all dependencies), because we accumulate into those
   per-package host and target directories the files installed by the
   dependencies. Note that this only works because we make the
   assumption that one package does *not* overwrite files installed by
   another package.

   This is done for "extract dependencies" at the beginning of the
   extract step, and for "normal dependencies" at the beginning of the
   configure step.

This is basically enough to make per-package SDK and target work. The
only gotcha is that at the end of the build, output/target and
output/host are empty, which means that:

 - The filesystem image creation code cannot work.

 - We don't have a SDK to build code outside of Buildroot.

In order to fix this, this commit extends the target-finalize step so
that it starts by populating output/target and output/host by
rsync-ing into them the target and host directories of all packages
listed in the $(PACKAGES) variable. It is necessary to do this
sequentially in the target-finalize step and not in each
package. Doing it in package installation means that it can be done in
parallel. In that case, there is a chance that two rsyncs are creating
the same hardlink or directory at the same time, which makes one of
them fail.

This change to per-package directories has an impact on the RPATH
built into the host binaries, as those RPATH now point to various
per-package host directories, and no longer to the global host
directory. We do not try to rewrite such RPATHs during the build as
having such RPATHs is perfectly fine, but we still need to handle two
fallouts from this change:

 - The check-host-rpath script, which verifies at the end of each
   package installation that it has the appropriate RPATH, is modified
   to understand that a RPATH to $(PER_PACKAGE_DIR)/<pkg>/host/lib is
   a correct RPAT.

 - The fix-rpath script, which mungles the RPATH mainly for the SDK
   preparation, is modified to rewrite the RPATH to not point to
   per-package directories. Indeed the patchelf --make-rpath-relative
   call only works if the RPATH points to the ROOTDIR passed as
   argument, and this ROOTDIR is the global host directory. Rewriting
   the RPATH to not point to per-package host directories prior to
   this is an easy solution to this issue.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
 Config.in                        | 18 ++++++++++++++++++
 Makefile                         | 29 ++++++++++++++++++-----------
 package/pkg-generic.mk           |  7 ++++++-
 package/pkg-utils.mk             | 26 ++++++++++++++++++++++++++
 support/scripts/check-host-rpath | 27 ++++++++++++++++++++++-----
 support/scripts/fix-rpath        | 29 ++++++++++++++++++++++-------
 6 files changed, 112 insertions(+), 24 deletions(-)

Comments

Yann E. MORIN Dec. 30, 2018, 9:52 p.m. UTC | #1
Thomas, All,

I don;t have much to propose asa review, except very-minor issues.

On 2018-12-28 11:43 +0100, Thomas Petazzoni spake thusly:
> This commit implements the core of the move to per-package SDK and
> target directories. The main idea is that instead of having a global
> output/host and output/target in which all packages install files, we
> switch to per-package host and target directories, that only contain
> their explicit dependencies.
> 
> There are two main benefits:
> 
>  - Packages will no longer discover dependencies that they do not
>    explicitly indicate in their <pkg>_DEPENDENCIES variable.

Note that non-expressed dependencies may still be gathered, if they are
transitive dependencies.

[--SNIP--]
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
[--SNIP--]
> diff --git a/Makefile b/Makefile
> index 9de8e0c725..e01ec4c963 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -205,6 +205,7 @@ BR_GRAPH_OUT := $(or $(BR2_GRAPH_OUT),pdf)
>  BUILD_DIR := $(BASE_DIR)/build
>  BINARIES_DIR := $(BASE_DIR)/images
>  BASE_TARGET_DIR := $(BASE_DIR)/target
> +PER_PACKAGE_DIR := $(BASE_DIR)/per-package

Why don't you simply export this variable, like HOST_DIR and TARGET_DIR?
This would simplify calls to fix-rpath:

[--SNIP--]
> @@ -585,8 +587,8 @@ world: target-post-image
>  .PHONY: prepare-sdk
>  prepare-sdk: world
>  	@$(call MESSAGE,"Rendering the SDK relocatable")
> -	$(TOPDIR)/support/scripts/fix-rpath host
> -	$(TOPDIR)/support/scripts/fix-rpath staging
> +	PER_PACKAGE_DIR=$(PER_PACKAGE_DIR) $(TOPDIR)/support/scripts/fix-rpath host
> +	PER_PACKAGE_DIR=$(PER_PACKAGE_DIR) $(TOPDIR)/support/scripts/fix-rpath staging

... here.

[--SNIP--]
> @@ -973,7 +979,8 @@ savedefconfig: $(BUILD_DIR)/buildroot-config/conf prepare-kconfig
>  
>  # staging and target directories do NOT list these as
>  # dependencies anywhere else
> -$(BUILD_DIR) $(BASE_TARGET_DIR) $(HOST_DIR) $(BINARIES_DIR) $(LEGAL_INFO_DIR) $(REDIST_SOURCES_DIR_TARGET) $(REDIST_SOURCES_DIR_HOST):
> +$(BUILD_DIR) $(BASE_TARGET_DIR) $(HOST_DIR) $(BINARIES_DIR) $(LEGAL_INFO_DIR) \
> +	$(REDIST_SOURCES_DIR_TARGET) $(REDIST_SOURCES_DIR_HOST) $(PER_PACKAGE_DIR):

Don't ident this continuation line, it is misleading: the target of
the rule appear to be part of the commands.

[--SNIP--]
> diff --git a/support/scripts/check-host-rpath b/support/scripts/check-host-rpath
> index c8939569e2..d903a82958 100755
> --- a/support/scripts/check-host-rpath
> +++ b/support/scripts/check-host-rpath

As I was saying on IRC: when we are not using per-package directories,
then I was a bit surprised to see that this script (and fix_rpath,
below) still worked, when the support for PPD is not optional in it.

What confused me, was that PER_PACKAGE_DIR is always used, even when
BR2_PER_PACKAGE_DIRECTORIES is unset.

But as you pointed to on IRC, PER_PACKAGE_DIR is also always defined to
an non-empty string, that either points to a valid location when
BR2_PER_PACKAGE_DIRECTORIES=y, or points to a non-existent location
otherwise.

> @@ -77,6 +93,7 @@ check_elf_has_rpath() {
>              dir="$( sed -r -e 's:/+:/:g; s:/$::;' <<<"${dir}" )"
>              [ "${dir}" = "${hostdir}/lib" ] && return 0
>              [ "${dir}" = "\$ORIGIN/../lib" ] && return 0
> +            [[ ${dir} =~ ${perpackagedir}/[^/]*/host/lib ]] && return 0
                                            ^^^^^^^
That would also match the string '//' so maybe we want:

    ${perpackagedir}/([^/]+/)?host/lib

[--SNIP--]
> diff --git a/support/scripts/fix-rpath b/support/scripts/fix-rpath
> index fa138ca15a..926767a54f 100755
> --- a/support/scripts/fix-rpath
> +++ b/support/scripts/fix-rpath
> @@ -127,14 +127,29 @@ main() {
>  
>      while read file ; do
>          # check if it's an ELF file
> -        if ${PATCHELF} --print-rpath "${file}" > /dev/null 2>&1; then
> -            # make files writable if necessary
> -            changed=$(chmod -c u+w "${file}")
> -            # call patchelf to sanitize the rpath
> -            ${PATCHELF} --make-rpath-relative "${rootdir}" ${sanitize_extra_args[@]} "${file}"
> -            # restore the original permission
> -            test "${changed}" != "" && chmod u-w "${file}"
> +        rpath=$(${PATCHELF} --print-rpath "${file}" 2>&1)
> +        if test $? -ne 0 ; then
> +            continue
>          fi
> +
> +        # make files writable if necessary
> +        changed=$(chmod -c u+w "${file}")
> +
> +        # With per-package directory support, most RPATH of host
> +        # binaries will point to per-package directories. This won't
> +        # work with the --make-rpath-relative ${rootdir} invocation as
> +        # the per-package host directory is not within ${rootdir}. So,
> +        # we rewrite all RPATHs pointing to per-package directories so
> +        # that they point to the global host directry.
> +        changed_rpath=$(echo ${rpath} | sed "s@${PER_PACKAGE_DIR}/[^/]*/host@${HOST_DIR}@")

Ditto?

> +        if test "${rpath}" != "${changed_rpath}" ; then
> +            ${PATCHELF} --set-rpath ${changed_rpath} "${file}"

Can't you do that unconditioanlly? If it's changed, we need to set it;
if it's not changed, that set it to the initial value anyway...

> +
> +        # call patchelf to sanitize the rpath
> +        ${PATCHELF} --make-rpath-relative "${rootdir}" ${sanitize_extra_args[@]} "${file}"
> +        # restore the original permission
> +        test "${changed}" != "" && chmod u-w "${file}"
>      done < <(find "${rootdir}" ${find_args[@]})
>  
>      # Restore patched patchelf utility
> -- 
> 2.20.1
>
Thomas Petazzoni Dec. 31, 2018, 2:31 p.m. UTC | #2
Hello,

Thanks for the review!

On Sun, 30 Dec 2018 22:52:12 +0100, Yann E. MORIN wrote:

> I don;t have much to propose asa review, except very-minor issues.

Which is good :-) Hopefully I can get some Reviewed-by soon, and merge
this series.

> > There are two main benefits:
> > 
> >  - Packages will no longer discover dependencies that they do not
> >    explicitly indicate in their <pkg>_DEPENDENCIES variable.  
> 
> Note that non-expressed dependencies may still be gathered, if they are
> transitive dependencies.

Yes, of course, but what can we do about this ? It's pretty logical and
obvious that transitive dependencies will be copied, no ?

> > +PER_PACKAGE_DIR := $(BASE_DIR)/per-package  
> 
> Why don't you simply export this variable, like HOST_DIR and TARGET_DIR?
> This would simplify calls to fix-rpath:

Because I generally don't like those global exports. It pollutes the
namespace with some random variable that is really internal to
Buildroot. There is no reason for the build system of all packages to
even see this variable. In fact, I am personally not a big fan of
exporting TARGET_DIR, STAGING_DIR, etc. since they become visible in
packages, and people tend to use them in their package build system,
which is very wrong.

Of course, if the overall consensus is that PER_PACKAGE_DIR should be
exported, I'll do so because I don't want to hold this series just for
this detail.


> > -$(BUILD_DIR) $(BASE_TARGET_DIR) $(HOST_DIR) $(BINARIES_DIR) $(LEGAL_INFO_DIR) $(REDIST_SOURCES_DIR_TARGET) $(REDIST_SOURCES_DIR_HOST):
> > +$(BUILD_DIR) $(BASE_TARGET_DIR) $(HOST_DIR) $(BINARIES_DIR) $(LEGAL_INFO_DIR) \
> > +	$(REDIST_SOURCES_DIR_TARGET) $(REDIST_SOURCES_DIR_HOST) $(PER_PACKAGE_DIR):  
> 
> Don't ident this continuation line, it is misleading: the target of
> the rule appear to be part of the commands.

ACK.

> [--SNIP--]
> > diff --git a/support/scripts/check-host-rpath b/support/scripts/check-host-rpath
> > index c8939569e2..d903a82958 100755
> > --- a/support/scripts/check-host-rpath
> > +++ b/support/scripts/check-host-rpath  
> 
> As I was saying on IRC: when we are not using per-package directories,
> then I was a bit surprised to see that this script (and fix_rpath,
> below) still worked, when the support for PPD is not optional in it.
> 
> What confused me, was that PER_PACKAGE_DIR is always used, even when
> BR2_PER_PACKAGE_DIRECTORIES is unset.
> 
> But as you pointed to on IRC, PER_PACKAGE_DIR is also always defined to
> an non-empty string, that either points to a valid location when
> BR2_PER_PACKAGE_DIRECTORIES=y, or points to a non-existent location
> otherwise.

Yes, I'll add some comments about this.

> 
> > @@ -77,6 +93,7 @@ check_elf_has_rpath() {
> >              dir="$( sed -r -e 's:/+:/:g; s:/$::;' <<<"${dir}" )"
> >              [ "${dir}" = "${hostdir}/lib" ] && return 0
> >              [ "${dir}" = "\$ORIGIN/../lib" ] && return 0
> > +            [[ ${dir} =~ ${perpackagedir}/[^/]*/host/lib ]] && return 0  
>                                             ^^^^^^^
> That would also match the string '//' so maybe we want:
> 
>     ${perpackagedir}/([^/]+/)?host/lib

I'm not sure to understand the ? in ([^/]+/)?. We definitely want one
path component between $(PER_PACKAGE_DIR) and host/lib. So what about:

	${perpackagedir}/[^/]+/host/lib

> > +        if test "${rpath}" != "${changed_rpath}" ; then
> > +            ${PATCHELF} --set-rpath ${changed_rpath} "${file}"  
> 
> Can't you do that unconditioanlly? If it's changed, we need to set it;
> if it's not changed, that set it to the initial value anyway...

There is a reason to do it conditionally: patchelf is slow, you don't
want to run patchelf when you don't need to, and here it's trivial to
know whether it is useful or not to run it. For example, this condition
ensures that people not using per-package directory don't pay the cost
of running all those additional patchelf invocations.

Best regards,

Thomas
Yann E. MORIN Dec. 31, 2018, 2:45 p.m. UTC | #3
Thomas, All,

On 2018-12-31 15:31 +0100, Thomas Petazzoni spake thusly:
> On Sun, 30 Dec 2018 22:52:12 +0100, Yann E. MORIN wrote:
> > > There are two main benefits:
> > >  - Packages will no longer discover dependencies that they do not
> > >    explicitly indicate in their <pkg>_DEPENDENCIES variable.  
> > Note that non-expressed dependencies may still be gathered, if they are
> > transitive dependencies.
> Yes, of course, but what can we do about this ?

What I mean is, if

  - A needs B and C, but has a dependency only on B,
  - B depends on C,

then the dependency of A to C is fullfilled, even though it is missing
in the dependencies. I.e, it is a _hidden_ dependency. It is not
"explicitly indicate[d] in A_DEPENDENCIES" but still gathered.

And no, there is nothing we can do about it.

> It's pretty logical and
> obvious that transitive dependencies will be copied, no ?

Not as you wrote it.

So, you could rephrase as:

    Packages will now see only the dependencies they explicitly list in
    their <pkg>_DEPENDENCIES variable, and the recursive dependencies
    thereof.

> > > +PER_PACKAGE_DIR := $(BASE_DIR)/per-package  
> > 
> > Why don't you simply export this variable, like HOST_DIR and TARGET_DIR?
> > This would simplify calls to fix-rpath:
> 
> Because I generally don't like those global exports. It pollutes the
> namespace with some random variable that is really internal to
> Buildroot. There is no reason for the build system of all packages to
> even see this variable. In fact, I am personally not a big fan of
> exporting TARGET_DIR, STAGING_DIR, etc. since they become visible in
> packages, and people tend to use them in their package build system,
> which is very wrong.
> 
> Of course, if the overall consensus is that PER_PACKAGE_DIR should be
> exported, I'll do so because I don't want to hold this series just for
> this detail.

Oh, I would tend to agree with you.

I'm just pointing a discrepancy in the way those variables are handled,
and I think it is good to have some consistency, especially in this
difficult topic, even though said consistency's not very nice...

[--SNIP--]
> > > @@ -77,6 +93,7 @@ check_elf_has_rpath() {
> > >              dir="$( sed -r -e 's:/+:/:g; s:/$::;' <<<"${dir}" )"
> > >              [ "${dir}" = "${hostdir}/lib" ] && return 0
> > >              [ "${dir}" = "\$ORIGIN/../lib" ] && return 0
> > > +            [[ ${dir} =~ ${perpackagedir}/[^/]*/host/lib ]] && return 0  
> >                                             ^^^^^^^
> > That would also match the string '//' so maybe we want:
> > 
> >     ${perpackagedir}/([^/]+/)?host/lib
> 
> I'm not sure to understand the ? in ([^/]+/)?. We definitely want one
> path component between $(PER_PACKAGE_DIR) and host/lib. So what about:
> 
> 	${perpackagedir}/[^/]+/host/lib

Yes, that is it (and is what I eventually suggested in my review of
patch 5 (about fixing libtool.la files).

> > > +        if test "${rpath}" != "${changed_rpath}" ; then
> > > +            ${PATCHELF} --set-rpath ${changed_rpath} "${file}"  
> > 
> > Can't you do that unconditioanlly? If it's changed, we need to set it;
> > if it's not changed, that set it to the initial value anyway...
> 
> There is a reason to do it conditionally: patchelf is slow, you don't
> want to run patchelf when you don't need to, and here it's trivial to
> know whether it is useful or not to run it. For example, this condition
> ensures that people not using per-package directory don't pay the cost
> of running all those additional patchelf invocations.

OK.

Thanks! :-)

Regards,
Yann E. MORIN.
Jan Kundrát Jan. 8, 2019, 6:02 p.m. UTC | #4
On pátek 28. prosince 2018 11:43:30 CET, Thomas Petazzoni wrote:
> +ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
> +# rsync the contents of per-package directories
> +# $1: space-separated list of packages to rsync from
> +# $2: 'host' or 'target'
> +# $3: destination directory
> +define per-package-rsync
> +	mkdir -p $(3)
> +	$(foreach pkg,$(1),\
> +		rsync -a --link-dest=$(PER_PACKAGE_DIR)/$(pkg)/$(2)/ \
> +		$(PER_PACKAGE_DIR)/$(pkg)/$(2)/ \
> +		$(3)$(sep))
> +endef

Hi Thomas,
I gave this series (the ppsh-v7-work merged into master, actually, with a 
revert of 51395b14ed1a23858eef5d7f2bcf3a03cca6dfb3). The only immediate 
breakage I see so far on my config (ARM, systemd, glibc, linaro toolchain) 
is related to systemd-networkd and its /etc/resolv.conf symlink handling, 
but I am afraid that I see a bigger problem.

The basic skeleton defines a symlink (it's in 
system/skeleton/etc/resolv.conf) pointing to ../tmp/resolv.conf . This is 
overwritten in package/systemd/systemd.mk through 
SYSTEMD_INSTALL_RESOLVCONF_HOOK, and indeed it results in a correct symlink 
in the systemd's per-package target dir:

  per-package/systemd/target/etc/resolv.conf -> 
../run/systemd/resolve/resolv.conf

The problem is that at the rsync time, packages are processed in 
alphabetical order. If the very last package to be rsynced (in my case, 
this is zlib) does not transitively depend on systemd, then rsync will 
update the /etc/resolv.conf symlink back to one obtained from the default 
skeleton.

I think that this is -- potentially -- also a problem for any package "P2" 
which calls `ln -sf` from its *_INSTALL_TARGET_HOOKS to overwrite stuff 
which belongs to another package "P1". If any other package "P3" depends on 
"P1" and not on "P2", *and* if P3's name sorts after P2, then the P1's 
symlink gets preserved via P3.

This will not necessary be fixed by changing to do the rsync in a 
dependency order because "P3" can still be rsynced after "P2".

What is the cleanest fix here? Should this symlink overriding go to 
*_TARGET_FINALIZE_HOOKS? If the `ln -sf` was just in TARGET_FINALIZE_HOOKS, 
that would mean that the corresponding per-package/*/target would *not* 
contain these fixes which would be quite confusing, IMHO... OTOH, two hooks 
for overwriting would be ugly.

With kind regards,
Jan
Thomas Petazzoni Nov. 5, 2019, 4:38 p.m. UTC | #5
Hello Jan,

Yes, some feedback on an almost year old comment!

On Tue, 08 Jan 2019 19:02:56 +0100
Jan Kundrát <jan.kundrat@cesnet.cz> wrote:

> I gave this series (the ppsh-v7-work merged into master, actually, with a 
> revert of 51395b14ed1a23858eef5d7f2bcf3a03cca6dfb3). The only immediate 
> breakage I see so far on my config (ARM, systemd, glibc, linaro toolchain) 
> is related to systemd-networkd and its /etc/resolv.conf symlink handling, 
> but I am afraid that I see a bigger problem.
> 
> The basic skeleton defines a symlink (it's in 
> system/skeleton/etc/resolv.conf) pointing to ../tmp/resolv.conf . This is 
> overwritten in package/systemd/systemd.mk through 
> SYSTEMD_INSTALL_RESOLVCONF_HOOK, and indeed it results in a correct symlink 
> in the systemd's per-package target dir:
> 
>   per-package/systemd/target/etc/resolv.conf -> 
> ../run/systemd/resolve/resolv.conf
> 
> The problem is that at the rsync time, packages are processed in 
> alphabetical order. If the very last package to be rsynced (in my case, 
> this is zlib) does not transitively depend on systemd, then rsync will 
> update the /etc/resolv.conf symlink back to one obtained from the default 
> skeleton.

Indeed. Clearly, we have identified since quite a while that the
per-package directory mechanism only works if all packages install
distinct set of files, unless they have a dependency relationship.

> I think that this is -- potentially -- also a problem for any package "P2" 
> which calls `ln -sf` from its *_INSTALL_TARGET_HOOKS to overwrite stuff 
> which belongs to another package "P1". If any other package "P3" depends on 
> "P1" and not on "P2", *and* if P3's name sorts after P2, then the P1's 
> symlink gets preserved via P3.
> 
> This will not necessary be fixed by changing to do the rsync in a 
> dependency order because "P3" can still be rsynced after "P2".
> 
> What is the cleanest fix here? Should this symlink overriding go to 
> *_TARGET_FINALIZE_HOOKS? If the `ln -sf` was just in TARGET_FINALIZE_HOOKS, 
> that would mean that the corresponding per-package/*/target would *not* 
> contain these fixes which would be quite confusing, IMHO... OTOH, two hooks 
> for overwriting would be ugly.

For the case of resolv.conf, I think solving it through a
TARGET_FINALIZE_HOOKS is probably the most appropriate option for now.
We'll have to see later on if we have other cases like this, and if
there is a pattern that allows us to invent some useful bit of
additional infrastructure.

Or we could:

 (1) Move the /etc/resolv.conf symlink from system/skeleton/ to
     package/skeleton-init-sysv/skeleton/etc.

 (2) Tweak the systemd.mk so that it always creates resolv.conf, either
     pointing to ../run/systemd/resolve/resolv.conf when resolved is
     enabled, or to /tmp/resolv.conf otherwise.

This would avoid the overwriting entirely.

What do you think ?

Best regards,

Thomas
Carlos Santos Nov. 5, 2019, 7:05 p.m. UTC | #6
On Tue, Nov 5, 2019 at 1:38 PM Thomas Petazzoni
<thomas.petazzoni@bootlin.com> wrote:
>
> Hello Jan,
>
> Yes, some feedback on an almost year old comment!
>
> On Tue, 08 Jan 2019 19:02:56 +0100
> Jan Kundrát <jan.kundrat@cesnet.cz> wrote:
>
> > I gave this series (the ppsh-v7-work merged into master, actually, with a
> > revert of 51395b14ed1a23858eef5d7f2bcf3a03cca6dfb3). The only immediate
> > breakage I see so far on my config (ARM, systemd, glibc, linaro toolchain)
> > is related to systemd-networkd and its /etc/resolv.conf symlink handling,
> > but I am afraid that I see a bigger problem.
> >
> > The basic skeleton defines a symlink (it's in
> > system/skeleton/etc/resolv.conf) pointing to ../tmp/resolv.conf . This is
> > overwritten in package/systemd/systemd.mk through
> > SYSTEMD_INSTALL_RESOLVCONF_HOOK, and indeed it results in a correct symlink
> > in the systemd's per-package target dir:
> >
> >   per-package/systemd/target/etc/resolv.conf ->
> > ../run/systemd/resolve/resolv.conf
> >
> > The problem is that at the rsync time, packages are processed in
> > alphabetical order. If the very last package to be rsynced (in my case,
> > this is zlib) does not transitively depend on systemd, then rsync will
> > update the /etc/resolv.conf symlink back to one obtained from the default
> > skeleton.
>
> Indeed. Clearly, we have identified since quite a while that the
> per-package directory mechanism only works if all packages install
> distinct set of files, unless they have a dependency relationship.
>
> > I think that this is -- potentially -- also a problem for any package "P2"
> > which calls `ln -sf` from its *_INSTALL_TARGET_HOOKS to overwrite stuff
> > which belongs to another package "P1". If any other package "P3" depends on
> > "P1" and not on "P2", *and* if P3's name sorts after P2, then the P1's
> > symlink gets preserved via P3.
> >
> > This will not necessary be fixed by changing to do the rsync in a
> > dependency order because "P3" can still be rsynced after "P2".
> >
> > What is the cleanest fix here? Should this symlink overriding go to
> > *_TARGET_FINALIZE_HOOKS? If the `ln -sf` was just in TARGET_FINALIZE_HOOKS,
> > that would mean that the corresponding per-package/*/target would *not*
> > contain these fixes which would be quite confusing, IMHO... OTOH, two hooks
> > for overwriting would be ugly.
>
> For the case of resolv.conf, I think solving it through a
> TARGET_FINALIZE_HOOKS is probably the most appropriate option for now.
> We'll have to see later on if we have other cases like this, and if
> there is a pattern that allows us to invent some useful bit of
> additional infrastructure.
>
> Or we could:
>
>  (1) Move the /etc/resolv.conf symlink from system/skeleton/ to
>      package/skeleton-init-sysv/skeleton/etc.
>
>  (2) Tweak the systemd.mk so that it always creates resolv.conf, either
>      pointing to ../run/systemd/resolve/resolv.conf when resolved is
>      enabled, or to /tmp/resolv.conf otherwise.
>
> This would avoid the overwriting entirely.
>
> What do you think ?
>
> Best regards,
>
> Thomas

This would prevent NetworkManager from updating resolv.conf. See
https://bugs.busybox.net/show_bug.cgi?id=9881
Thomas Petazzoni Nov. 6, 2019, 7:57 a.m. UTC | #7
Hello,

On Tue, 5 Nov 2019 16:05:50 -0300
Carlos Santos <unixmania@gmail.com> wrote:

> > Or we could:
> >
> >  (1) Move the /etc/resolv.conf symlink from system/skeleton/ to
> >      package/skeleton-init-sysv/skeleton/etc.
> >
> >  (2) Tweak the systemd.mk so that it always creates resolv.conf, either
> >      pointing to ../run/systemd/resolve/resolv.conf when resolved is
> >      enabled, or to /tmp/resolv.conf otherwise.
> >
> > This would avoid the overwriting entirely.
> >
> > What do you think ?
> >
> > Best regards,
> >
> This would prevent NetworkManager from updating resolv.conf. See
> https://bugs.busybox.net/show_bug.cgi?id=9881

Then I guess the TARGET_FINALIZE_HOOKS is the only solution here.

But thinking more about this issue, it really means that no package
should overwrite any file installed by one of its dependencies: this
means that the check-uniq-files stuff that Yann recently removed was in
fact useful.

Indeed, if we have the following scenario:

 - Package A installs file /foo/bar

 - Package B depends on A, and overwrites /foo/bar with some contents

 - Package C depends on A

Then, at the end in the target root filesystem, we will have A's
version of /foo/bar, and not B's version. Just like Jan explained, the
final rsync will rsync the per-package target directories in this order
(due to alphabetic ordering):

 A, B, C

So, in the end, it is the version of /foo/bar in the per-package
directory of C that wins, and because C depends on A, the version of
/foo/bar in C's per-package directory is the one coming from package A.

So, it is really important for packages *not* to overwrite any file
installed by any other package, even if said package is in its
dependencies.

Am I missing something here ?

Best regards,

Thomas
Jan Kundrát Nov. 6, 2019, 8:13 a.m. UTC | #8
> Then I guess the TARGET_FINALIZE_HOOKS is the only solution here.

Yes, this sounds reasonable.

> But thinking more about this issue, it really means that no package
> should overwrite any file installed by one of its dependencies:

Agreed. Of course there are still scenarios where packages have to 
overwrite stuff which belongs to something else (think of packages 
"registering themselves" within a file managed by a package they depend 
on). These can be solved by explicitly configuring these packages so that 
this conflicting resource exists in just one location, i.e., outside of the 
per-package namespace.

I've been doing this with (still out-of-tree) sysrepo etc like this:

https://gerrit.cesnet.cz/plugins/gitiles/github/buildroot/buildroot/+/refs/heads/cesnet/2019-06-10/package/sysrepo/sysrepo.mk#41

How does merging the ppsh work look?

With kind regards,
Jan
diff mbox series

Patch

diff --git a/Config.in b/Config.in
index f965e9d6d8..2d8a07fda7 100644
--- a/Config.in
+++ b/Config.in
@@ -696,6 +696,24 @@  config BR2_REPRODUCIBLE
 	  This is labeled as an experimental feature, as not all
 	  packages behave properly to ensure reproducibility.
 
+config BR2_PER_PACKAGE_DIRECTORIES
+	bool "Use per-package directories (experimental)"
+	help
+	  This option will change the build process of Buildroot
+	  package to use per-package target and host directories.
+
+	  This is useful for two related purposes:
+
+	    - Cleanly isolate the build of each package, so that a
+	      given package only "sees" the dependencies it has
+	      explicitly expressed, and not other packages that may
+	      have by chance been built before.
+
+	    - Enable top-level parallel build.
+
+	  This is labeled as an experimental feature, as not all
+	  packages behave properly with per-package directories.
+
 endmenu
 
 comment "Security Hardening Options"
diff --git a/Makefile b/Makefile
index 9de8e0c725..e01ec4c963 100644
--- a/Makefile
+++ b/Makefile
@@ -205,6 +205,7 @@  BR_GRAPH_OUT := $(or $(BR2_GRAPH_OUT),pdf)
 BUILD_DIR := $(BASE_DIR)/build
 BINARIES_DIR := $(BASE_DIR)/images
 BASE_TARGET_DIR := $(BASE_DIR)/target
+PER_PACKAGE_DIR := $(BASE_DIR)/per-package
 # initial definition so that 'make clean' works for most users, even without
 # .config. HOST_DIR will be overwritten later when .config is included.
 HOST_DIR := $(BASE_DIR)/host
@@ -451,12 +452,13 @@  XZCAT := $(call qstrip,$(BR2_XZCAT))
 LZCAT := $(call qstrip,$(BR2_LZCAT))
 TAR_OPTIONS = $(call qstrip,$(BR2_TAR_OPTIONS)) -xf
 
-# packages compiled for the host go here
+ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
+HOST_DIR = $(if $(PKG),$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/host,$(call qstrip,$(BR2_HOST_DIR)))
+TARGET_DIR = $(if $(ROOTFS),$(ROOTFS_$(ROOTFS)_TARGET_DIR),$(if $(PKG),$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/target,$(BASE_TARGET_DIR)))
+else
 HOST_DIR := $(call qstrip,$(BR2_HOST_DIR))
-
-# The target directory is common to all packages,
-# but there is one that is specific to each filesystem.
 TARGET_DIR = $(if $(ROOTFS),$(ROOTFS_$(ROOTFS)_TARGET_DIR),$(BASE_TARGET_DIR))
+endif
 
 ifneq ($(HOST_DIR),$(BASE_DIR)/host)
 HOST_DIR_SYMLINK = $(BASE_DIR)/host
@@ -585,8 +587,8 @@  world: target-post-image
 .PHONY: prepare-sdk
 prepare-sdk: world
 	@$(call MESSAGE,"Rendering the SDK relocatable")
-	$(TOPDIR)/support/scripts/fix-rpath host
-	$(TOPDIR)/support/scripts/fix-rpath staging
+	PER_PACKAGE_DIR=$(PER_PACKAGE_DIR) $(TOPDIR)/support/scripts/fix-rpath host
+	PER_PACKAGE_DIR=$(PER_PACKAGE_DIR) $(TOPDIR)/support/scripts/fix-rpath staging
 	$(INSTALL) -m 755 $(TOPDIR)/support/misc/relocate-sdk.sh $(HOST_DIR)/relocate-sdk.sh
 	mkdir -p $(HOST_DIR)/share/buildroot
 	echo $(HOST_DIR) > $(HOST_DIR)/share/buildroot/sdk-location
@@ -702,15 +704,19 @@  $(TARGETS_ROOTFS): target-finalize
 # Avoid the rootfs name leaking down the dependency chain
 target-finalize: ROOTFS=
 
-host-finalize: $(HOST_DIR_SYMLINK)
+.PHONY: host-finalize
+host-finalize: $(PACKAGES) $(HOST_DIR) $(HOST_DIR_SYMLINK)
+	@$(call MESSAGE,"Finalizing host directory")
+	$(call per-package-rsync,$(sort $(PACKAGES)),host,$(HOST_DIR))
 
 .PHONY: staging-finalize
 staging-finalize:
 	@ln -snf $(STAGING_DIR) $(BASE_DIR)/staging
 
 .PHONY: target-finalize
-target-finalize: $(PACKAGES) host-finalize
+target-finalize: $(PACKAGES) $(TARGET_DIR) host-finalize
 	@$(call MESSAGE,"Finalizing target directory")
+	$(call per-package-rsync,$(sort $(PACKAGES)),target,$(TARGET_DIR))
 	# Check files that are touched by more than one package
 	./support/scripts/check-uniq-files -t target $(BUILD_DIR)/packages-file-list.txt
 	./support/scripts/check-uniq-files -t staging $(BUILD_DIR)/packages-file-list-staging.txt
@@ -765,7 +771,7 @@  endif
 	ln -sf ../usr/lib/os-release $(TARGET_DIR)/etc
 
 	@$(call MESSAGE,"Sanitizing RPATH in target tree")
-	$(TOPDIR)/support/scripts/fix-rpath target
+	PER_PACKAGE_DIR=$(PER_PACKAGE_DIR) $(TOPDIR)/support/scripts/fix-rpath target
 
 	@$(foreach d, $(call qstrip,$(BR2_ROOTFS_OVERLAY)), \
 		$(call MESSAGE,"Copying overlay $(d)"); \
@@ -973,7 +979,8 @@  savedefconfig: $(BUILD_DIR)/buildroot-config/conf prepare-kconfig
 
 # staging and target directories do NOT list these as
 # dependencies anywhere else
-$(BUILD_DIR) $(BASE_TARGET_DIR) $(HOST_DIR) $(BINARIES_DIR) $(LEGAL_INFO_DIR) $(REDIST_SOURCES_DIR_TARGET) $(REDIST_SOURCES_DIR_HOST):
+$(BUILD_DIR) $(BASE_TARGET_DIR) $(HOST_DIR) $(BINARIES_DIR) $(LEGAL_INFO_DIR) \
+	$(REDIST_SOURCES_DIR_TARGET) $(REDIST_SOURCES_DIR_HOST) $(PER_PACKAGE_DIR):
 	@mkdir -p $@
 
 # outputmakefile generates a Makefile in the output directory, if using a
@@ -1012,7 +1019,7 @@  printvars:
 clean:
 	rm -rf $(BASE_TARGET_DIR) $(BINARIES_DIR) $(HOST_DIR) $(HOST_DIR_SYMLINK) \
 		$(BUILD_DIR) $(BASE_DIR)/staging \
-		$(LEGAL_INFO_DIR) $(GRAPHS_DIR)
+		$(LEGAL_INFO_DIR) $(GRAPHS_DIR) $(PER_PACKAGE_DIR)
 
 .PHONY: distclean
 distclean: clean
diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index f5cab2b9c2..8ea86514d7 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -98,7 +98,7 @@  GLOBAL_INSTRUMENTATION_HOOKS += check_bin_arch
 # have a proper DT_RPATH or DT_RUNPATH tag
 define check_host_rpath
 	$(if $(filter install-host,$(2)),\
-		$(if $(filter end,$(1)),support/scripts/check-host-rpath $(3) $(HOST_DIR)))
+		$(if $(filter end,$(1)),support/scripts/check-host-rpath $(3) $(HOST_DIR) $(PER_PACKAGE_DIR)))
 endef
 GLOBAL_INSTRUMENTATION_HOOKS += check_host_rpath
 
@@ -133,6 +133,7 @@  endif
 # Retrieve the archive
 $(BUILD_DIR)/%/.stamp_downloaded:
 	@$(call step_start,download)
+	$(call prepare-per-package-directory,$($(PKG)_FINAL_DOWNLOAD_DEPENDENCIES))
 	$(foreach hook,$($(PKG)_PRE_DOWNLOAD_HOOKS),$(call $(hook))$(sep))
 # Only show the download message if it isn't already downloaded
 	$(Q)for p in $($(PKG)_ALL_DOWNLOADS); do \
@@ -159,6 +160,7 @@  $(BUILD_DIR)/%/.stamp_actual_downloaded:
 $(BUILD_DIR)/%/.stamp_extracted:
 	@$(call step_start,extract)
 	@$(call MESSAGE,"Extracting")
+	$(call prepare-per-package-directory,$($(PKG)_FINAL_EXTRACT_DEPENDENCIES))
 	$(foreach hook,$($(PKG)_PRE_EXTRACT_HOOKS),$(call $(hook))$(sep))
 	$(Q)mkdir -p $(@D)
 	$($(PKG)_EXTRACT_CMDS)
@@ -219,6 +221,7 @@  $(foreach dir,$(call qstrip,$(BR2_GLOBAL_PATCH_DIR)),\
 $(BUILD_DIR)/%/.stamp_configured:
 	@$(call step_start,configure)
 	@$(call MESSAGE,"Configuring")
+	$(call prepare-per-package-directory,$($(PKG)_FINAL_DEPENDENCIES))
 	$(foreach hook,$($(PKG)_PRE_CONFIGURE_HOOKS),$(call $(hook))$(sep))
 	$($(PKG)_CONFIGURE_CMDS)
 	$(foreach hook,$($(PKG)_POST_CONFIGURE_HOOKS),$(call $(hook))$(sep))
@@ -330,6 +333,7 @@  $(BUILD_DIR)/%/.stamp_target_installed:
 
 # Remove package sources
 $(BUILD_DIR)/%/.stamp_dircleaned:
+	$(if $(BR2_PER_PACKAGE_DIRECTORIES),rm -Rf $(PER_PACKAGE_DIR)/$(NAME))
 	rm -Rf $(@D)
 
 ################################################################################
@@ -886,6 +890,7 @@  $$($(2)_TARGET_SOURCE):			PKGDIR=$(pkgdir)
 $$($(2)_TARGET_ACTUAL_SOURCE):		PKG=$(2)
 $$($(2)_TARGET_ACTUAL_SOURCE):		PKGDIR=$(pkgdir)
 $$($(2)_TARGET_DIRCLEAN):		PKG=$(2)
+$$($(2)_TARGET_DIRCLEAN):		NAME=$(1)
 
 # Compute the name of the Kconfig option that correspond to the
 # package being enabled. We handle three cases: the special Linux
diff --git a/package/pkg-utils.mk b/package/pkg-utils.mk
index bffd79dfb0..34345a0d65 100644
--- a/package/pkg-utils.mk
+++ b/package/pkg-utils.mk
@@ -62,6 +62,32 @@  $$(error Package error: use $(2) instead of $(1). Please fix your .mk file)
 endif
 endef
 
+ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
+# rsync the contents of per-package directories
+# $1: space-separated list of packages to rsync from
+# $2: 'host' or 'target'
+# $3: destination directory
+define per-package-rsync
+	mkdir -p $(3)
+	$(foreach pkg,$(1),\
+		rsync -a --link-dest=$(PER_PACKAGE_DIR)/$(pkg)/$(2)/ \
+		$(PER_PACKAGE_DIR)/$(pkg)/$(2)/ \
+		$(3)$(sep))
+endef
+
+# prepares the per-package HOST_DIR and TARGET_DIR of the current
+# package, by rsync the host and target directories of the
+# dependencies of this package. The list of dependencies is passed as
+# argument, so that this function can be used to prepare with
+# different set of dependencies (download, extract, configure, etc.)
+#
+# $1: space-separated list of packages to rsync from
+define prepare-per-package-directory
+	$(call per-package-rsync,$(1),host,$(HOST_DIR))
+	$(call per-package-rsync,$(1),target,$(TARGET_DIR))
+endef
+endif
+
 #
 # legal-info helper functions
 #
diff --git a/support/scripts/check-host-rpath b/support/scripts/check-host-rpath
index c8939569e2..d903a82958 100755
--- a/support/scripts/check-host-rpath
+++ b/support/scripts/check-host-rpath
@@ -11,6 +11,7 @@  export LC_ALL=C
 main() {
     local pkg="${1}"
     local hostdir="${2}"
+    local perpackagedir="${3}"
     local file ret
 
     # Remove duplicate and trailing '/' for proper match
@@ -20,7 +21,7 @@  main() {
     while read file; do
         is_elf "${file}" || continue
         elf_needs_rpath "${file}" "${hostdir}" || continue
-        check_elf_has_rpath "${file}" "${hostdir}" && continue
+        check_elf_has_rpath "${file}" "${hostdir}" "${perpackagedir}" && continue
         if [ ${ret} -eq 0 ]; then
             ret=1
             printf "***\n"
@@ -44,6 +45,15 @@  is_elf() {
 # needs such an RPATH if at least of the libraries used by the ELF
 # executable is available in the host library directory. This function
 # returns 0 when a RPATH is needed, 1 otherwise.
+#
+# With per-package directory support, ${hostdir} will point to the
+# current package per-package host directory, and this is where this
+# function will check if the libraries needed by the executable are
+# located (or not). In practice, the ELF executable RPATH may point to
+# another package per-package host directory, but that is fine because
+# if such an executable is within the current package per-package host
+# directory, its libraries will also have been copied into the current
+# package per-package host directory.
 elf_needs_rpath() {
     local file="${1}"
     local hostdir="${2}"
@@ -62,13 +72,19 @@  elf_needs_rpath() {
 # This function checks whether at least one of the RPATH of the given
 # ELF executable (first argument) properly points to the host library
 # directory (second argument), either through an absolute RPATH or a
-# relative RPATH. Having such a RPATH will make sure the ELF
-# executable will find at runtime the shared libraries it depends
-# on. This function returns 0 when a proper RPATH was found, or 1
-# otherwise.
+# relative RPATH. In the context of per-package directory support,
+# ${hostdir} (second argument) points to the current package host
+# directory. However, it is perfectly valid for an ELF binary to have
+# a RPATH pointing to another package per-package host directory,
+# which is why such RPATH is also accepted (the per-package directory
+# gets passed as third argument). Having a RPATH pointing to the host
+# directory will make sure the ELF executable will find at runtime the
+# shared libraries it depends on. This function returns 0 when a
+# proper RPATH was found, or 1 otherwise.
 check_elf_has_rpath() {
     local file="${1}"
     local hostdir="${2}"
+    local perpackagedir="${3}"
     local rpath dir
 
     while read rpath; do
@@ -77,6 +93,7 @@  check_elf_has_rpath() {
             dir="$( sed -r -e 's:/+:/:g; s:/$::;' <<<"${dir}" )"
             [ "${dir}" = "${hostdir}/lib" ] && return 0
             [ "${dir}" = "\$ORIGIN/../lib" ] && return 0
+            [[ ${dir} =~ ${perpackagedir}/[^/]*/host/lib ]] && return 0
         done
     done < <( readelf -d "${file}"                                              \
               |sed -r -e '/.* \(R(UN)?PATH\) +Library r(un)?path: \[(.+)\]$/!d' \
diff --git a/support/scripts/fix-rpath b/support/scripts/fix-rpath
index fa138ca15a..926767a54f 100755
--- a/support/scripts/fix-rpath
+++ b/support/scripts/fix-rpath
@@ -127,14 +127,29 @@  main() {
 
     while read file ; do
         # check if it's an ELF file
-        if ${PATCHELF} --print-rpath "${file}" > /dev/null 2>&1; then
-            # make files writable if necessary
-            changed=$(chmod -c u+w "${file}")
-            # call patchelf to sanitize the rpath
-            ${PATCHELF} --make-rpath-relative "${rootdir}" ${sanitize_extra_args[@]} "${file}"
-            # restore the original permission
-            test "${changed}" != "" && chmod u-w "${file}"
+        rpath=$(${PATCHELF} --print-rpath "${file}" 2>&1)
+        if test $? -ne 0 ; then
+            continue
         fi
+
+        # make files writable if necessary
+        changed=$(chmod -c u+w "${file}")
+
+        # With per-package directory support, most RPATH of host
+        # binaries will point to per-package directories. This won't
+        # work with the --make-rpath-relative ${rootdir} invocation as
+        # the per-package host directory is not within ${rootdir}. So,
+        # we rewrite all RPATHs pointing to per-package directories so
+        # that they point to the global host directry.
+        changed_rpath=$(echo ${rpath} | sed "s@${PER_PACKAGE_DIR}/[^/]*/host@${HOST_DIR}@")
+        if test "${rpath}" != "${changed_rpath}" ; then
+            ${PATCHELF} --set-rpath ${changed_rpath} "${file}"
+        fi
+
+        # call patchelf to sanitize the rpath
+        ${PATCHELF} --make-rpath-relative "${rootdir}" ${sanitize_extra_args[@]} "${file}"
+        # restore the original permission
+        test "${changed}" != "" && chmod u-w "${file}"
     done < <(find "${rootdir}" ${find_args[@]})
 
     # Restore patched patchelf utility