diff mbox series

[v2,08/18] package/pkg-generic.mk: move python fixup to generic package infrastructure

Message ID 20210706142501.951345-9-herve.codina@bootlin.com
State Superseded
Headers show
Series Overwritten file detection and fixes, one more step to TLP build | expand

Commit Message

Herve Codina July 6, 2021, 2:24 p.m. UTC
Fixing _sysconfigdata*.{py,pyc} was previously done by python package
infrastructure. Some packages use python stuff without using python
package infrastructure.
These packages perform overwrites and need the specific python fixup
to fix them.

In order to be sure to fix all of these packages, the python fixup
is moved to the generic package infrastructure and applied to all
packages.
This follows the same principle as for the .la libtool files fixup.

Signed-off-by: Herve Codina <herve.codina@bootlin.com>
---
Changes v1 to v2:
 - Used '... -print0 | xargs -0 -r ...' construction.
 - Used '-deleted' to remove files.

 package/pkg-generic.mk | 17 +++++++++++++++++
 package/pkg-python.mk  | 12 ------------
 2 files changed, 17 insertions(+), 12 deletions(-)

Comments

Yann E. MORIN July 6, 2021, 7:50 p.m. UTC | #1
Hervé, All,

On 2021-07-06 16:24 +0200, Herve Codina spake thusly:
> Fixing _sysconfigdata*.{py,pyc} was previously done by python package
> infrastructure. Some packages use python stuff without using python
> package infrastructure.
> These packages perform overwrites and need the specific python fixup
> to fix them.
> 
> In order to be sure to fix all of these packages, the python fixup
> is moved to the generic package infrastructure and applied to all
> packages.
> This follows the same principle as for the .la libtool files fixup.
> 
> Signed-off-by: Herve Codina <herve.codina@bootlin.com>
> ---
> Changes v1 to v2:
>  - Used '... -print0 | xargs -0 -r ...' construction.
>  - Used '-deleted' to remove files.
> 
>  package/pkg-generic.mk | 17 +++++++++++++++++
>  package/pkg-python.mk  | 12 ------------
>  2 files changed, 17 insertions(+), 12 deletions(-)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index 6fbd4006da..8c19522fd8 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -103,6 +103,21 @@ define fixup-libtool-files
>  endef
>  endif
>  
> +# Make sure python _sysconfigdata*.py files only reference the current
> +# per-package directory.
> +
> +# $1: package name (lower case)
> +# $2: staging or host directory of the package
> +ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
> +define fixup-python-files
> +	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
> +		-name "_sysconfigdata*.py" -print0 | xargs -0 --no-run-if-empty \
> +		$(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$(1)/:g"
> +	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
> +		-name "_sysconfigdata*.pyc" -delete

Those two find commands are not equivalent to the original ones.

Before, we had:
    find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* [...]

After, we'd have, after replacing $(2):
    find $(HOST_DIR) \( -path '$(HOST_DIR)/lib/python*' -o -path '$(HOST_DIR)/usr/lib/python*' \) [...]
    find $(STAGING_DIR) \( -path '$(STAGING_DIR)/lib/python*' -o -path '$(STAGING_DIR)/usr/lib/python*' \) [...]

So this is slightly different... I don't get why you needed to duplicate
the calls, calling the macro twice... And even then, it would have still
been simpler to pass the two locations as starting points to find:

    find $(HOST_DIR)/lib/python* $(HOST_DIR)/usr/lib/python* [...]
    find $(STAGING_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* [...]

But really, $(HOST_DIR)/usr is just a symlink to . so there is no reason
to search it. Also, previously $(STAGING_DIR)/lib/ was not in the search
list, but now it is...

> +endef
> +endif
> +
>  # Functions to collect statistics about installed files
>  
>  # $(1): base directory to search in
> @@ -254,6 +269,8 @@ $(BUILD_DIR)/%/.stamp_configured:
>  	@$(call pkg_size_before,$(HOST_DIR),-host)
>  	$(call fixup-libtool-files,$(NAME),$(HOST_DIR))
>  	$(call fixup-libtool-files,$(NAME),$(STAGING_DIR))
> +	$(call fixup-python-files,$(NAME),$(HOST_DIR))
> +	$(call fixup-python-files,$(NAME),$(STAGING_DIR))
>  	$(foreach hook,$($(PKG)_POST_PREPARE_HOOKS),$(call $(hook))$(sep))

Even if the libtool fixups are not moved to the new _POST_PREPARE_HOOKS,
I still find it sad that we do not leverage those new hooks for the new
fixups... :-/

But since I said "OK-ish" in my previous review, I am going to stand by
it. A little reluctantly still... :-/

But then, re-reading the above about the two find commands: if you had
just moved the macro out of pkg-python.mk and into here, and just added:
    $(2)_POST_PREPARE_HOOKS += PKG_PYTHON_FIXUP_SYSCONFIGDATA

Then that would have been much simpler: 1) you would not have changed
the breadth of the search, and 2) you would have used the new hooks.

Oh, but you need the package name, that is passed as $(1)? No problem,
it is already available as $($(PKG)_NAME), like was done in the original
fixup macro (see below).

>  	$(foreach hook,$($(PKG)_PRE_CONFIGURE_HOOKS),$(call $(hook))$(sep))
>  	$($(PKG)_CONFIGURE_CMDS)
> diff --git a/package/pkg-python.mk b/package/pkg-python.mk
> index 1e4fd5ba33..e6b81bdfd3 100644
> --- a/package/pkg-python.mk
> +++ b/package/pkg-python.mk
> @@ -92,16 +92,6 @@ HOST_PKG_PYTHON_SETUPTOOLS_INSTALL_OPTS = \
>  	--root=/ \
>  	--single-version-externally-managed
>  
> -ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
> -define PKG_PYTHON_FIXUP_SYSCONFIGDATA
> -	find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* \
> -		-name "_sysconfigdata*.py" | xargs --no-run-if-empty \
> -		$(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g"
> -	find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* \
> -		-name "_sysconfigdata*.pyc" -delete
> -endef
> -endif
> -
>  ################################################################################
>  # inner-python-package -- defines how the configuration, compilation
>  # and installation of a Python package should be done, implements a
> @@ -246,8 +236,6 @@ $(2)_PYTHON_INTERPRETER = $$(HOST_DIR)/bin/$$($(2)_NEEDS_HOST_PYTHON)
>  endif
>  endif
>  
> -$(2)_PRE_CONFIGURE_HOOKS += PKG_PYTHON_FIXUP_SYSCONFIGDATA

See here: the fixup macro was already a hook! ;-)

Regards,
Yann E. MORIN.

>  #
>  # Build step. Only define it if not already defined by the package .mk
>  # file.
> -- 
> 2.31.1
>
Yann E. MORIN July 6, 2021, 9:22 p.m. UTC | #2
Hervé, All,

On 2021-07-06 21:50 +0200, Yann E. MORIN spake thusly:
> On 2021-07-06 16:24 +0200, Herve Codina spake thusly:
> > Fixing _sysconfigdata*.{py,pyc} was previously done by python package
> > infrastructure. Some packages use python stuff without using python
> > package infrastructure.
> > These packages perform overwrites and need the specific python fixup
> > to fix them.
> > 
> > In order to be sure to fix all of these packages, the python fixup
> > is moved to the generic package infrastructure and applied to all
> > packages.
[--SNIP--]
> But then, re-reading the above about the two find commands: if you had
> just moved the macro out of pkg-python.mk and into here, and just added:
>     $(2)_POST_PREPARE_HOOKS += PKG_PYTHON_FIXUP_SYSCONFIGDATA

And as usual, it would probably need another name, since it no longer is
in pkg-python.mk. This is a fixup, it's about python, so what about:

    $(2)_POST_PREPARE_HOOKS += FIXUP_PYTHON_SYSCONFIGDATA

Regards,
Yann E. MORIN.
Herve Codina July 7, 2021, 11:48 a.m. UTC | #3
Hi,

On Tue, 6 Jul 2021 21:50:48 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> > +# Make sure python _sysconfigdata*.py files only reference the current
> > +# per-package directory.
> > +
> > +# $1: package name (lower case)
> > +# $2: staging or host directory of the package
> > +ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
> > +define fixup-python-files
> > +	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
> > +		-name "_sysconfigdata*.py" -print0 | xargs -0 --no-run-if-empty \
> > +		$(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$(1)/:g"
> > +	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
> > +		-name "_sysconfigdata*.pyc" -delete  
> 
> Those two find commands are not equivalent to the original ones.
> 
> Before, we had:
>     find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* [...]
> 
> After, we'd have, after replacing $(2):
>     find $(HOST_DIR) \( -path '$(HOST_DIR)/lib/python*' -o -path '$(HOST_DIR)/usr/lib/python*' \) [...]
>     find $(STAGING_DIR) \( -path '$(STAGING_DIR)/lib/python*' -o -path '$(STAGING_DIR)/usr/lib/python*' \) [...]
> 
> So this is slightly different... I don't get why you needed to duplicate
> the calls, calling the macro twice... And even then, it would have still
> been simpler to pass the two locations as starting points to find:
> 
>     find $(HOST_DIR)/lib/python* $(HOST_DIR)/usr/lib/python* [...]
>     find $(STAGING_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* [...]
> 
> But really, $(HOST_DIR)/usr is just a symlink to . so there is no reason
> to search it. Also, previously $(STAGING_DIR)/lib/ was not in the search
> list, but now it is...
> 
> > +endef
> > +endif
> > +
> >  # Functions to collect statistics about installed files
> >  
> >  # $(1): base directory to search in
> > @@ -254,6 +269,8 @@ $(BUILD_DIR)/%/.stamp_configured:
> >  	@$(call pkg_size_before,$(HOST_DIR),-host)
> >  	$(call fixup-libtool-files,$(NAME),$(HOST_DIR))
> >  	$(call fixup-libtool-files,$(NAME),$(STAGING_DIR))
> > +	$(call fixup-python-files,$(NAME),$(HOST_DIR))
> > +	$(call fixup-python-files,$(NAME),$(STAGING_DIR))
> >  	$(foreach hook,$($(PKG)_POST_PREPARE_HOOKS),$(call $(hook))$(sep))  
> 
> Even if the libtool fixups are not moved to the new _POST_PREPARE_HOOKS,
> I still find it sad that we do not leverage those new hooks for the new
> fixups... :-/
> 
> But since I said "OK-ish" in my previous review, I am going to stand by
> it. A little reluctantly still... :-/
> 
> But then, re-reading the above about the two find commands: if you had
> just moved the macro out of pkg-python.mk and into here, and just added:
>     $(2)_POST_PREPARE_HOOKS += PKG_PYTHON_FIXUP_SYSCONFIGDATA
> 
> Then that would have been much simpler: 1) you would not have changed
> the breadth of the search, and 2) you would have used the new hooks.
> 
> Oh, but you need the package name, that is passed as $(1)? No problem,
> it is already available as $($(PKG)_NAME), like was done in the original
> fixup macro (see below).
> 


Well,

These finds are now called for all packages and .../lib/python* can be not
present. In this case, find fails.
Using '-path' avoid this failure if files/dirs are not present.

I wanted to mimic fixup-libtool-files.
And so performed 2 calls. One for HOST_DIR and the other for STAGING_DIR.

For HOST_DIR, the previous find searched in $(HOST_DIR)/lib,
for STAGING_DIR, it searched in $(STAGING_DIR)/usr/lib,

That's why I wrote the macro with 
-path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*'

We can simplify the call using:
  $(Q)find $(2) -path '$(2)/usr/lib/python*' -name ...
This assumes:
(1) python stuff will not install anything in $(STAGING_DIR)/lib/python*
(2) $(HOST_DIR)/usr will always be a symlink to $(HOST_DIR)/.
Arnout Vandecappelle July 7, 2021, 12:21 p.m. UTC | #4
On 07/07/2021 13:48, Herve Codina wrote:
> Hi,
> 
> On Tue, 6 Jul 2021 21:50:48 +0200
> "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
> 
>>> +# Make sure python _sysconfigdata*.py files only reference the current
>>> +# per-package directory.
>>> +
>>> +# $1: package name (lower case)
>>> +# $2: staging or host directory of the package
>>> +ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
>>> +define fixup-python-files
>>> +	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
>>> +		-name "_sysconfigdata*.py" -print0 | xargs -0 --no-run-if-empty \
>>> +		$(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$(1)/:g"
>>> +	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
>>> +		-name "_sysconfigdata*.pyc" -delete  
>>
>> Those two find commands are not equivalent to the original ones.
>>
>> Before, we had:
>>     find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* [...]
>>
>> After, we'd have, after replacing $(2):
>>     find $(HOST_DIR) \( -path '$(HOST_DIR)/lib/python*' -o -path '$(HOST_DIR)/usr/lib/python*' \) [...]
>>     find $(STAGING_DIR) \( -path '$(STAGING_DIR)/lib/python*' -o -path '$(STAGING_DIR)/usr/lib/python*' \) [...]
>>
>> So this is slightly different... I don't get why you needed to duplicate
>> the calls, calling the macro twice... And even then, it would have still
>> been simpler to pass the two locations as starting points to find:
>>
>>     find $(HOST_DIR)/lib/python* $(HOST_DIR)/usr/lib/python* [...]
>>     find $(STAGING_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* [...]
>>
>> But really, $(HOST_DIR)/usr is just a symlink to . so there is no reason
>> to search it. Also, previously $(STAGING_DIR)/lib/ was not in the search
>> list, but now it is...

[snip]
> 
> These finds are now called for all packages and .../lib/python* can be not
> present. In this case, find fails.

 A simpler way of doing that would be:

	$(if $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*),
		$(Q)find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*


> Using '-path' avoid this failure if files/dirs are not present.
> 
> I wanted to mimic fixup-libtool-files.
> And so performed 2 calls. One for HOST_DIR and the other for STAGING_DIR.
> 
> For HOST_DIR, the previous find searched in $(HOST_DIR)/lib,
> for STAGING_DIR, it searched in $(STAGING_DIR)/usr/lib,
> 
> That's why I wrote the macro with 
> -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*'
> 
> We can simplify the call using:
>   $(Q)find $(2) -path '$(2)/usr/lib/python*' -name ...
> This assumes:
> (1) python stuff will not install anything in $(STAGING_DIR)/lib/python*

 This you can safely assume.

> (2) $(HOST_DIR)/usr will always be a symlink to $(HOST_DIR)/.

 Although for now you can assume this, ideally we would like to have the
possibility to get rid of the usr->. symlink. So if you can avoid relying on it
it would be better.

 Regards,
 Arnout
Herve Codina July 7, 2021, 12:49 p.m. UTC | #5
Hi,

On Wed, 7 Jul 2021 14:21:16 +0200
Arnout Vandecappelle <arnout@mind.be> wrote:

> > 
> > These finds are now called for all packages and .../lib/python* can be not
> > present. In this case, find fails.  
> 
>  A simpler way of doing that would be:
> 
> 	$(if $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*),
> 		$(Q)find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*
> 

I think this does not solve the problem.
find fails if at least one of the 2 paths are not present.

Something like this (not tested) is needed.

	$(if $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*),
 		$(Q)find $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*)



> 
> > Using '-path' avoid this failure if files/dirs are not present.
> > 
> > I wanted to mimic fixup-libtool-files.
> > And so performed 2 calls. One for HOST_DIR and the other for STAGING_DIR.
> > 
> > For HOST_DIR, the previous find searched in $(HOST_DIR)/lib,
> > for STAGING_DIR, it searched in $(STAGING_DIR)/usr/lib,
> > 
> > That's why I wrote the macro with 
> > -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*'
> > 
> > We can simplify the call using:
> >   $(Q)find $(2) -path '$(2)/usr/lib/python*' -name ...
> > This assumes:
> > (1) python stuff will not install anything in $(STAGING_DIR)/lib/python*  
> 
>  This you can safely assume.

Ok.

> 
> > (2) $(HOST_DIR)/usr will always be a symlink to $(HOST_DIR)/.  
> 
>  Although for now you can assume this, ideally we would like to have the
> possibility to get rid of the usr->. symlink. So if you can avoid relying on it
> it would be better.

Right.
Assuming, the symlink not present, can we be sure that host python stuff will be
installed only in $(HOST_DIR)/lib/python* ?

Hervé
Arnout Vandecappelle July 7, 2021, 2:28 p.m. UTC | #6
On 07/07/2021 14:49, Herve Codina wrote:
> Hi,
> 
> On Wed, 7 Jul 2021 14:21:16 +0200
> Arnout Vandecappelle <arnout@mind.be> wrote:

[snip]

>>> (2) $(HOST_DIR)/usr will always be a symlink to $(HOST_DIR)/.  
>>
>>  Although for now you can assume this, ideally we would like to have the
>> possibility to get rid of the usr->. symlink. So if you can avoid relying on it
>> it would be better.
> 
> Right.
> Assuming, the symlink not present, can we be sure that host python stuff will be
> installed only in $(HOST_DIR)/lib/python* ?

 Yes - the usr symlink will only be removed if we're absolutely sure that
nothing installs in $(HOST_DIR)/usr (which means probably never...)

 Regards,
 Arnout
Yann E. MORIN July 7, 2021, 8:12 p.m. UTC | #7
Hervé, Arnout, All,

On 2021-07-07 14:49 +0200, Herve Codina spake thusly:
> On Wed, 7 Jul 2021 14:21:16 +0200
> Arnout Vandecappelle <arnout@mind.be> wrote:
> > > These finds are now called for all packages and .../lib/python* can be not
> > > present. In this case, find fails.  

OK, I see.

However, the commit log was lacking these explanations, because the
change is not simply "the python fixup is moved to the generic package
infrastructure", because the fixups had to be adapted.

> >  A simpler way of doing that would be:
> > 	$(if $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*),
> > 		$(Q)find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*
> I think this does not solve the problem.
> find fails if at least one of the 2 paths are not present.
> 
> Something like this (not tested) is needed.
> 
> 	$(if $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*),
>  		$(Q)find $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*)

Yes, this would keep the pre-existing behaviour.

However, I think the traditional way we usually do that would be with a
foreach:

    $(foreach dir, $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*), \
        $(Q)find $(dir) -name "_sysconfigdata*.py" -print0 \
        |xargs -0 --no-run-if-empty \
            $(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g"
        $(Q)find $(dir) -name "_sysconfigdata*.pyc" -delete
    )

And thus you're back to a basic, non-parameterised macro that can be
used as a POST_PREPARE_HOOK. Wee! :-)

Note: optimisation: we should be able to use a single find call to do
both, though:

    $(foreach dir, $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*), \
        $(Q)find $(dir) \( -name "_sysconfigdata*.pyc" -delete \) \
            -o \( -name "_sysconfigdata*.py" -print0 \) \
        |xargs -0 --no-run-if-empty \
            $(SED) 's:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g'
    )

Maybe the groupings '\( ... \)' are superfluous. Adapt appropriately...

> > > This assumes:
> > > (1) python stuff will not install anything in $(STAGING_DIR)/lib/python*  
> >  This you can safely assume.

Well, a package that would install its python bindings in /lib/python*
would anyway break at runtime, so this package would need to eb fixed to
install in /usr/lib/python*

So yes, nothing to look for in staging/lib/python*.

> > > (2) $(HOST_DIR)/usr will always be a symlink to $(HOST_DIR)/.  
> >  Although for now you can assume this, ideally we would like to have the
> > possibility to get rid of the usr->. symlink. So if you can avoid relying on it
> > it would be better.
> Right.
> Assuming, the symlink not present, can we be sure that host python stuff will be
> installed only in $(HOST_DIR)/lib/python* ?

I think, as Arnout said, that the usr symlink will always be present. So
we can also safely assume that there is nothing to look for in
host/usr/lib/python* either, because:

 1) that would have already been found in host/lib/python*

 2) that would also probably not work if usr was not a symlink, because
    our host python would not look there to find modules anyway

Regards,
Yann E. MORIN.
Herve Codina July 8, 2021, 3:38 p.m. UTC | #8
Hi,

On Wed, 7 Jul 2021 22:12:57 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> Hervé, Arnout, All,
> 
> On 2021-07-07 14:49 +0200, Herve Codina spake thusly:
> > On Wed, 7 Jul 2021 14:21:16 +0200
> > Arnout Vandecappelle <arnout@mind.be> wrote:  
> > > > These finds are now called for all packages and .../lib/python* can be not
> > > > present. In this case, find fails.    
> 
> OK, I see.
> 
> However, the commit log was lacking these explanations, because the
> change is not simply "the python fixup is moved to the generic package
> infrastructure", because the fixups had to be adapted.
> 
> > >  A simpler way of doing that would be:
> > > 	$(if $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*),
> > > 		$(Q)find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*  
> > I think this does not solve the problem.
> > find fails if at least one of the 2 paths are not present.
> > 
> > Something like this (not tested) is needed.
> > 
> > 	$(if $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*),
> >  		$(Q)find $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*)  
> 
> Yes, this would keep the pre-existing behaviour.
> 
> However, I think the traditional way we usually do that would be with a
> foreach:
> 
>     $(foreach dir, $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*), \
>         $(Q)find $(dir) -name "_sysconfigdata*.py" -print0 \
>         |xargs -0 --no-run-if-empty \
>             $(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g"
>         $(Q)find $(dir) -name "_sysconfigdata*.pyc" -delete
>     )
> 
> And thus you're back to a basic, non-parameterised macro that can be
> used as a POST_PREPARE_HOOK. Wee! :-)
> 
> Note: optimisation: we should be able to use a single find call to do
> both, though:
> 
>     $(foreach dir, $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*), \
>         $(Q)find $(dir) \( -name "_sysconfigdata*.pyc" -delete \) \
>             -o \( -name "_sysconfigdata*.py" -print0 \) \
>         |xargs -0 --no-run-if-empty \
>             $(SED) 's:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g'
>     )
> 
> Maybe the groupings '\( ... \)' are superfluous. Adapt appropriately...
> 

I tried this kind of things and I ran into trouble with $(wildcard ...)
Indeed, $(wildcard ...) does not see my directories.
A simple 'ls $(HOST_DIR)/lib/python*' just before $(foreach dir, $(wildcard ...), ...)
lists the directory.

Suppose this stupid package:
---- 8< -----
define TLP_PKGA_INSTALL_TARGET_CMDS
	echo "$$(date)" > $(TARGET_DIR)/pkga1
	echo "$$(date)" > $(TARGET_DIR)/pkga2
	mkdir -p $(TARGET_DIR)/foo/bla
	touch $(TARGET_DIR)/foo/bla/file1
	touch $(TARGET_DIR)/foo/bla/file2
endef

define TLP_PKGA_LIST_CMD
	echo "ls:" && ls $(TARGET_DIR)/foo/* || true
	echo "wildcard: $(wildcard $(TARGET_DIR)/foo/*)"
endef

TLP_PKGA_PRE_INSTALL_TARGET_HOOKS += TLP_PKGA_LIST_CMD
TLP_PKGA_POST_INSTALL_TARGET_HOOKS += TLP_PKGA_LIST_CMD


$(eval $(generic-package))
---- 8< -----

'make pkg' leads to:
	>>> pkg  Installing to target
	echo "ls:" && ls /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/* || true
	ls:
	ls: cannot access '/home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/*': No such file or directory
	echo "wildcard: "
	wildcard: 
	echo "$(date)" > /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/pkga1
	echo "$(date)" > /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/pkga2
	mkdir -p /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/bla
	touch /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/bla/file1
	touch /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/bla/file2
	echo "ls:" && ls /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/* || true
	ls:
	file1  file2
	echo "wildcard: "
	wildcard: 

$(wildcard ...) uses internal make caching.
Depending on when it is called related to make internal caching updates, it can
lead to unexpected results.

In the example provided, make cannot known that $(TARGET_DIR)/foo/bla was updated by
TLP_PKGA_INSTALL_TARGET_CMDS and so it didn't update its internal caching.
$(wildcard ...) does not see the new files.

Herve
Yann E. MORIN July 8, 2021, 8:26 p.m. UTC | #9
Hervé, All,

On 2021-07-08 17:38 +0200, Herve Codina spake thusly:
> On Wed, 7 Jul 2021 22:12:57 +0200
> "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
[--SNIP--]
> > Note: optimisation: we should be able to use a single find call to do
> > both, though:
> > 
> >     $(foreach dir, $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*), \
> >         $(Q)find $(dir) \( -name "_sysconfigdata*.pyc" -delete \) \
> >             -o \( -name "_sysconfigdata*.py" -print0 \) \
> >         |xargs -0 --no-run-if-empty \
> >             $(SED) 's:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g'
> >     )
[--SNIP--]
> I tried this kind of things and I ran into trouble with $(wildcard ...)
> Indeed, $(wildcard ...) does not see my directories.
> A simple 'ls $(HOST_DIR)/lib/python*' just before $(foreach dir, $(wildcard ...), ...)
> lists the directory.
> 
> Suppose this stupid package:
> ---- 8< -----
> define TLP_PKGA_INSTALL_TARGET_CMDS
> 	echo "$$(date)" > $(TARGET_DIR)/pkga1
> 	echo "$$(date)" > $(TARGET_DIR)/pkga2
> 	mkdir -p $(TARGET_DIR)/foo/bla
> 	touch $(TARGET_DIR)/foo/bla/file1
> 	touch $(TARGET_DIR)/foo/bla/file2
> endef
> 
> define TLP_PKGA_LIST_CMD
> 	echo "ls:" && ls $(TARGET_DIR)/foo/* || true
> 	echo "wildcard: $(wildcard $(TARGET_DIR)/foo/*)"
> endef
> 
> TLP_PKGA_PRE_INSTALL_TARGET_HOOKS += TLP_PKGA_LIST_CMD
> TLP_PKGA_POST_INSTALL_TARGET_HOOKS += TLP_PKGA_LIST_CMD
> 
> 
> $(eval $(generic-package))
> ---- 8< -----
> 
> 'make pkg' leads to:
> 	>>> pkg  Installing to target
> 	echo "ls:" && ls /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/* || true
> 	ls:
> 	ls: cannot access '/home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/*': No such file or directory
> 	echo "wildcard: "
> 	wildcard: 
> 	echo "$(date)" > /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/pkga1
> 	echo "$(date)" > /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/pkga2
> 	mkdir -p /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/bla
> 	touch /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/bla/file1
> 	touch /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/bla/file2
> 	echo "ls:" && ls /home/hcodina/project/nagra/buildroot/output/per-package/tlp_pkga/target/foo/* || true
> 	ls:
> 	file1  file2
> 	echo "wildcard: "
> 	wildcard: 
> 
> $(wildcard ...) uses internal make caching.

It is not caching. The issue is when variables (or calls to functions)
are evaluated.

In makefile, the evaluation is done for a full recipe at once, not for
each line to be executed.

HOOKS, pre or post, are part of the same recipe and the corresponding
CMDS, so all three (pre hooks, cmds, and post hooks) are evaluated at
the same time, even before any single line of the recipe if executed.

So given this trivial Makefile:

    $ cat Makefile
    all:
        @touch toto
        @echo '"toto is $(wildcard toto)"'

    $ make
    "toto is "
    $ make
    "toto is toto"

So, in the end, with all these back-n-forth, and combining all the
explanations, we sould be able to complete the macro with:

    # Can't use $(foreach d, $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*, ...)
    # because those directories may be created in the same recipe this
    # macro wil be expanded in.
    # Additionally, either or both may be missing, which would make find
    # whine and fail.
    # So we just use HOST_DIR as a starting point, and filter on the two
    # directories of interest.
    define FIXUP_PYTHON_SYSCONFIGDATA
        $(Q)find $(HOST_DIR) \
            \( -path '$(HOST_DIR)/lib/python*' -o -path '$(STAGING_DIR)/usr/lib/python*' \) \
            \(    \( -name "_sysconfigdata*.pyc" -delete \) \
               -o \( -name "_sysconfigdata*.py" -print0 \) \) \
        |xargs -0 --no-run-if-empty \
            $(SED) 's:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g'
    endef

And then, in inner-genenric-package:

    $(2)_POST_PREPARE_HOOKS += FIXUP_PYTHON_SYSCONFIGDATA

And then the commit log will need to have all the details why we use
this trick, and probably also a small comment above the macro defintion,
to explain it too.

Not that STAGING_DIR is not a starting point for find, because it is
below HOST_DIR, so we will eventually search STAGING_DIR.

Oh, and a final note: the existing hook is already sliently broken!
Indeed, if either $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*
is mising, find whines and fails. However, the breakage is ignored,
because find is on the left-hand side of a pipe, so its exit status is
ignored.

But of course, this is no longer the case with the new find that was
added by the previous commit, which really is the commit that introduces
the issues of find failing when a directory was missing, not this commit
specifically. I suppose you did not notice, because the prvious commit
was not tested on its own (and I can understand that, for I do that very
often too). But now autobuilders have caught it (and user have started
to notice too)...

Regards,
Yann E. MORIN.
Herve Codina July 9, 2021, 6:48 a.m. UTC | #10
On Thu, 8 Jul 2021 22:26:23 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> Hervé, All,
> 
> On 2021-07-08 17:38 +0200, Herve Codina spake thusly:
> > On Wed, 7 Jul 2021 22:12:57 +0200
> > "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:  
> [--SNIP--]
> > > Note: optimisation: we should be able to use a single find call to do
> > > both, though:
> > > 
> > >     $(foreach dir, $(wildcard $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*), \
> > >         $(Q)find $(dir) \( -name "_sysconfigdata*.pyc" -delete \) \
> > >             -o \( -name "_sysconfigdata*.py" -print0 \) \
> > >         |xargs -0 --no-run-if-empty \
> > >             $(SED) 's:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g'
> > >     )  
> [--SNIP--]
> > I tried this kind of things and I ran into trouble with $(wildcard ...)
> > Indeed, $(wildcard ...) does not see my directories.
> > A simple 'ls $(HOST_DIR)/lib/python*' just before $(foreach dir, $(wildcard ...), ...)
> > lists the directory.
[...]
> It is not caching. The issue is when variables (or calls to functions)
> are evaluated.
> 
> In makefile, the evaluation is done for a full recipe at once, not for
> each line to be executed.
> 
> HOOKS, pre or post, are part of the same recipe and the corresponding
> CMDS, so all three (pre hooks, cmds, and post hooks) are evaluated at
> the same time, even before any single line of the recipe if executed.
> 
> So given this trivial Makefile:
> 
>     $ cat Makefile
>     all:
>         @touch toto
>         @echo '"toto is $(wildcard toto)"'
> 
>     $ make
>     "toto is "
>     $ make
>     "toto is toto"
> 

Thanks for this explanation.

> So, in the end, with all these back-n-forth, and combining all the
> explanations, we sould be able to complete the macro with:
> 
>     # Can't use $(foreach d, $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*, ...)
>     # because those directories may be created in the same recipe this
>     # macro wil be expanded in.
>     # Additionally, either or both may be missing, which would make find
>     # whine and fail.
>     # So we just use HOST_DIR as a starting point, and filter on the two
>     # directories of interest.
>     define FIXUP_PYTHON_SYSCONFIGDATA
>         $(Q)find $(HOST_DIR) \
>             \( -path '$(HOST_DIR)/lib/python*' -o -path '$(STAGING_DIR)/usr/lib/python*' \) \
>             \(    \( -name "_sysconfigdata*.pyc" -delete \) \
>                -o \( -name "_sysconfigdata*.py" -print0 \) \) \
>         |xargs -0 --no-run-if-empty \
>             $(SED) 's:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g'
>     endef
> 
> And then, in inner-genenric-package:
> 
>     $(2)_POST_PREPARE_HOOKS += FIXUP_PYTHON_SYSCONFIGDATA
> 
> And then the commit log will need to have all the details why we use
> this trick, and probably also a small comment above the macro defintion,
> to explain it too.
> 
> Not that STAGING_DIR is not a starting point for find, because it is
> below HOST_DIR, so we will eventually search STAGING_DIR.

Right, I will try this one.

> 
> Oh, and a final note: the existing hook is already sliently broken!
> Indeed, if either $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python*
> is mising, find whines and fails. However, the breakage is ignored,
> because find is on the left-hand side of a pipe, so its exit status is
> ignored.
> 
> But of course, this is no longer the case with the new find that was
> added by the previous commit, which really is the commit that introduces
> the issues of find failing when a directory was missing, not this commit
> specifically. I suppose you did not notice, because the prvious commit
> was not tested on its own (and I can understand that, for I do that very
> often too). But now autobuilders have caught it (and user have started
> to notice too)...

You right, I tested my first version of the patch (with the pipe) and missed
the issue introduced by removing the pipe in v2.
Indeed, I tested the whole series not each patches.
diff mbox series

Patch

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index 6fbd4006da..8c19522fd8 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -103,6 +103,21 @@  define fixup-libtool-files
 endef
 endif
 
+# Make sure python _sysconfigdata*.py files only reference the current
+# per-package directory.
+
+# $1: package name (lower case)
+# $2: staging or host directory of the package
+ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
+define fixup-python-files
+	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
+		-name "_sysconfigdata*.py" -print0 | xargs -0 --no-run-if-empty \
+		$(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$(1)/:g"
+	$(Q)find $(2) \( -path '$(2)/lib/python*' -o -path '$(2)/usr/lib/python*' \) \
+		-name "_sysconfigdata*.pyc" -delete
+endef
+endif
+
 # Functions to collect statistics about installed files
 
 # $(1): base directory to search in
@@ -254,6 +269,8 @@  $(BUILD_DIR)/%/.stamp_configured:
 	@$(call pkg_size_before,$(HOST_DIR),-host)
 	$(call fixup-libtool-files,$(NAME),$(HOST_DIR))
 	$(call fixup-libtool-files,$(NAME),$(STAGING_DIR))
+	$(call fixup-python-files,$(NAME),$(HOST_DIR))
+	$(call fixup-python-files,$(NAME),$(STAGING_DIR))
 	$(foreach hook,$($(PKG)_POST_PREPARE_HOOKS),$(call $(hook))$(sep))
 	$(foreach hook,$($(PKG)_PRE_CONFIGURE_HOOKS),$(call $(hook))$(sep))
 	$($(PKG)_CONFIGURE_CMDS)
diff --git a/package/pkg-python.mk b/package/pkg-python.mk
index 1e4fd5ba33..e6b81bdfd3 100644
--- a/package/pkg-python.mk
+++ b/package/pkg-python.mk
@@ -92,16 +92,6 @@  HOST_PKG_PYTHON_SETUPTOOLS_INSTALL_OPTS = \
 	--root=/ \
 	--single-version-externally-managed
 
-ifeq ($(BR2_PER_PACKAGE_DIRECTORIES),y)
-define PKG_PYTHON_FIXUP_SYSCONFIGDATA
-	find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* \
-		-name "_sysconfigdata*.py" | xargs --no-run-if-empty \
-		$(SED) "s:$(PER_PACKAGE_DIR)/[^/]\+/:$(PER_PACKAGE_DIR)/$($(PKG)_NAME)/:g"
-	find $(HOST_DIR)/lib/python* $(STAGING_DIR)/usr/lib/python* \
-		-name "_sysconfigdata*.pyc" -delete
-endef
-endif
-
 ################################################################################
 # inner-python-package -- defines how the configuration, compilation
 # and installation of a Python package should be done, implements a
@@ -246,8 +236,6 @@  $(2)_PYTHON_INTERPRETER = $$(HOST_DIR)/bin/$$($(2)_NEEDS_HOST_PYTHON)
 endif
 endif
 
-$(2)_PRE_CONFIGURE_HOOKS += PKG_PYTHON_FIXUP_SYSCONFIGDATA
-
 #
 # Build step. Only define it if not already defined by the package .mk
 # file.