core/pkg-generic: only save latest package list

Message ID 20180429130714.14312-1-john@metanate.com
State Changes Requested

Commit Message

John Keeping April 29, 2018, 1:07 p.m. UTC
When rebuilding a package, simply appending the package's file list to
the global list means that the package list grows for every rebuild, as
does the time taken to check for files installed by multiple packages.
Furthermore, we get false positives where a file is reported as being
installed by multiple copies of the same package.

With this approach we may end up with orphaned files in the target
filesystem if a package that has been updated and rebuilt no longer
installs the same set of files, but we know that only a clean build will
produce reliable results.  In fact it may be helpful to identify these
orphaned files as evidence that the build is not clean.

Signed-off-by: John Keeping <john@metanate.com>
---
 package/pkg-generic.mk | 1 +
 1 file changed, 1 insertion(+)
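
The effect described above can be tried outside of Buildroot. The following is only a sketch under assumptions: it assumes GNU sed (for `-i`), uses a temporary file in place of packages-file-list.txt, and `record_files` is a hypothetical helper that does not exist in Buildroot.

```shell
#!/bin/sh
# Sketch only: emulates "delete old entries, then append new ones".
# Assumes GNU sed; $list stands in for packages-file-list.txt.
list=$(mktemp)
trap 'rm -f "$list"' EXIT

# Hypothetical helper: record the files installed by package $1.
record_files() {
    pkg=$1; shift
    # Drop the entries left behind by an earlier build of this package.
    sed -i -e "/^${pkg},/d" "$list"
    # Append the fresh entries in "<package>,<path>" form.
    for f in "$@"; do
        printf '%s,%s\n' "$pkg" "$f" >> "$list"
    done
}

record_files busybox ./bin/busybox ./bin/sh
record_files zlib ./usr/lib/libz.so
record_files busybox ./bin/busybox ./bin/sh   # simulated rebuild

cat "$list"
```

After the simulated rebuild the list still holds two busybox entries rather than four, so a duplicate check no longer sees the package conflicting with itself.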

Comments

Yann E. MORIN April 30, 2018, 4:47 p.m. UTC | #1
John, All,

On 2018-04-29 14:07 +0100, John Keeping spake thusly:
> When rebuilding a package, simply appending the package's file list to
> the global list means that the package list grows for every rebuild, as
> does the time taken to check for files installed by multiple packages.
> Furthermore, we get false positives where a file is reported as being
> installed by multiple copies of the same package.
> 
> With this approach we may end up with orphaned files in the target
> filesystem if a package that has been updated and rebuilt no longer
> installs the same set of files, but we know that only a clean build will
> produce reliable results.  In fact it may be helpful to identify these
> orphaned files as evidence that the build is not clean.
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  package/pkg-generic.mk | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> index 1c9dd1d734..edc2c9349c 100644
> --- a/package/pkg-generic.mk
> +++ b/package/pkg-generic.mk
> @@ -64,6 +64,7 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
>  # $(3): suffix of file  (optional)
>  define step_pkg_size_inner
>  	cd $(2); \
> +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt; \

Since BUILD_DIR is a fully-qualified path, I would have put it as the
first line of the macros, and thus it would not have required the
trailing '\':

    $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
    cd $(2); \
    find [...]

Otherwise:

Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>

Regards,
Yann E. MORIN.

>  	find . \( -type f -o -type l \) \
>  		-newer $($(PKG)_DIR)/.stamp_built \
>  		-exec printf '$(1),%s\n' {} + \
> -- 
> 2.17.0
>

John Keeping April 30, 2018, 4:56 p.m. UTC | #2
On Mon, 30 Apr 2018 18:47:28 +0200
"Yann E. MORIN" <yann.morin.1998@free.fr> wrote:

> John, All,
> 
> On 2018-04-29 14:07 +0100, John Keeping spake thusly:
> > When rebuilding a package, simply appending the package's file list
> > to the global list means that the package list grows for every
> > rebuild, as does the time taken to check for files installed by
> > multiple packages. Furthermore, we get false positives where a file
> > is reported as being installed by multiple copies of the same
> > package.
> > 
> > With this approach we may end up with orphaned files in the target
> > filesystem if a package that has been updated and rebuilt no longer
> > installs the same set of files, but we know that only a clean build
> > will produce reliable results.  In fact it may be helpful to
> > identify these orphaned files as evidence that the build is not
> > clean.
> > 
> > Signed-off-by: John Keeping <john@metanate.com>
> > ---
> >  package/pkg-generic.mk | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> > index 1c9dd1d734..edc2c9349c 100644
> > --- a/package/pkg-generic.mk
> > +++ b/package/pkg-generic.mk
> > @@ -64,6 +64,7 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
> >  # $(3): suffix of file  (optional)
> >  define step_pkg_size_inner
> >  	cd $(2); \
> > +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt; \
> 
> Since BUILD_DIR is a fully-qualified path, I would have put it as the
> first line of the macros, and thus it would not have required the
> trailing '\':
> 
>     $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
>     cd $(2); \
>     find [...]

This will cause a build error if packages-file-list$(3).txt doesn't
exist, which will be the case for the first package built.

We could use:

      -$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt

but then make will log that it is ignoring an error.
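
The failure mode can be checked in isolation. This is a sketch that assumes GNU sed and that $(SED) expands to something like `sed -i -e`; the temporary path merely stands in for a list that has not been created yet.

```shell
#!/bin/sh
# Sketch only; assumes GNU sed. $missing plays the role of
# packages-file-list.txt before the first package has been recorded.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
missing="$tmpdir/packages-file-list.txt"

# In-place editing needs an existing file, so this returns non-zero.
sed -i -e '/^busybox,/d' "$missing" 2>/dev/null
sed_status=$?

echo "sed exit status on a missing file: $sed_status"
```

A recipe line that exits non-zero stops make, unless the line is prefixed with `-`, in which case make carries on and reports that the error was ignored.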


Regards,
John

Yann E. MORIN April 30, 2018, 7:41 p.m. UTC | #3
John, All,

On 2018-04-30 17:56 +0100, John Keeping spake thusly:
> On Mon, 30 Apr 2018 18:47:28 +0200
> "Yann E. MORIN" <yann.morin.1998@free.fr> wrote:
> > On 2018-04-29 14:07 +0100, John Keeping spake thusly:
> > > When rebuilding a package, simply appending the package's file list
> > > to the global list means that the package list grows for every
> > > rebuild, as does the time taken to check for files installed by
> > > multiple packages. Furthermore, we get false positives where a file
> > > is reported as being installed by multiple copies of the same
> > > package.
> > > 
> > > With this approach we may end up with orphaned files in the target
> > > filesystem if a package that has been updated and rebuilt no longer
> > > installs the same set of files, but we know that only a clean build
> > > will produce reliable results.  In fact it may be helpful to
> > > identify these orphaned files as evidence that the build is not
> > > clean.
> > > 
> > > Signed-off-by: John Keeping <john@metanate.com>
> > > ---
> > >  package/pkg-generic.mk | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
> > > index 1c9dd1d734..edc2c9349c 100644
> > > --- a/package/pkg-generic.mk
> > > +++ b/package/pkg-generic.mk
> > > @@ -64,6 +64,7 @@ GLOBAL_INSTRUMENTATION_HOOKS += step_time
> > >  # $(3): suffix of file  (optional)
> > >  define step_pkg_size_inner
> > >  	cd $(2); \
> > > +	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt; \
> > 
> > Since BUILD_DIR is a fully-qualified path, I would have put it as the
> > first line of the macros, and thus it would not have required the
> > trailing '\':
> > 
> >     $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
> >     cd $(2); \
> >     find [...]
> 
> This will cause a build error if packages-file-list$(3).txt doesn't
> exist, which will be the case for the first package built.

Hmm, indeed. But that is the case with either your original solution or
my proposal, except in your case, the error is silently ignored.
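
To illustrate why the error goes unnoticed in the original form: when commands are joined with `;` in a single shell invocation, only the status of the last one is returned. A sketch, assuming GNU sed and using a subshell in place of the recipe line:

```shell
#!/bin/sh
# Sketch only; assumes GNU sed. The subshell has the same shape as the
# patched recipe: "cd ...; sed ...; find ...".
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
missing="$tmpdir/packages-file-list.txt"

(
    cd "$tmpdir"; \
    sed -i -e '/^busybox,/d' "$missing" 2>/dev/null; \
    find . -type f > /dev/null
)
chain_status=$?

# sed failed on the missing file, but find succeeded, so the chain "succeeds".
echo "status seen by the caller: $chain_status"
```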

> We could use:
>       -$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
> but then make will log that it is ignoring an error.

I don't like that much either. What about:

    @touch $(BUILD_DIR)/packages-file-list$(3).txt
    $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt
    cd $(2); \
    find [...]
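
A quick way to check that ordering outside of make, again only as a sketch assuming GNU sed and a temporary stand-in for the list file:

```shell
#!/bin/sh
# Sketch only; assumes GNU sed. $list stands in for packages-file-list.txt.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
list="$tmpdir/packages-file-list.txt"

touch "$list"                       # create the list if it does not exist yet
sed -i -e '/^busybox,/d' "$list"    # nothing to delete, but no failure either
first_status=$?

# Stand-in for the find/printf step that appends the new entries.
printf 'busybox,%s\n' ./bin/busybox >> "$list"

echo "sed exit status after touch: $first_status"
```

Since `touch` leaves an existing file's contents alone, running it on every invocation should be harmless for later packages.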

Regards,
Yann E. MORIN.
Patch

diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index 1c9dd1d734..edc2c9349c 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -64,6 +64,7 @@  GLOBAL_INSTRUMENTATION_HOOKS += step_time
 # $(3): suffix of file  (optional)
 define step_pkg_size_inner
 	cd $(2); \
+	$(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt; \
 	find . \( -type f -o -type l \) \
 		-newer $($(PKG)_DIR)/.stamp_built \
 		-exec printf '$(1),%s\n' {} + \