Message ID | 8c891d4245028f97585d0e55ab5962e6ea337659.1546898693.git.yann.morin.1998@free.fr |
---|---|
State | Changes Requested |
Headers | show |
Series | [01/19] infra/pkg-generic: display MESSAGE before running PRE_HOOKS | expand |
El lun., 7 ene. 2019 a las 23:05, Yann E. MORIN (<yann.morin.1998@free.fr>) escribió: > > Since 7fb6e782542f (core/instrumentation: shave minutes off the build > time), the built stampfile is used as a reference to detect files > installed by a package. > > However, this falls short during development, when a user may want to > re-install a built-early package without rebuilding it (i.e. make > foo-reinstall). In this case, the built stampfile is not touched, and is > still dated from way back when the package was first built. As such, > almost all files in target (or staging or host) are newer than that, and > so those files are all now accounted for that package, when in fact only > a minor subset may be accountable to it. > > We fix that by limiting the search for files that have been actually > touched during the install step, now that we have the proper timestamp > for it. > > Reported-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> > Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr> > Cc: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> > --- > package/pkg-generic.mk | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk > index 3de8a99675..42aebeb49d 100644 > --- a/package/pkg-generic.mk > +++ b/package/pkg-generic.mk > @@ -67,7 +67,7 @@ define step_pkg_size_inner > $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt > cd $(2); \ > find . \( -type f -o -type l \) \ > - -newer $($(PKG)_DIR)/.stamp_built \ > + -newer $@_before \ > -exec printf '$(1),%s\n' {} + \ > >> $(BUILD_DIR)/packages-file-list$(3).txt > endef While it is probably not important, just noting here that if package A installs a file with a modification time in the future, then package B installed after A would still get that future file in its list. In fact, every package built after pkg A would then get that file into 'its' list.
El mar., 8 ene. 2019 a las 14:07, Thomas De Schampheleire (<patrickdepinguin@gmail.com>) escribió: > > El lun., 7 ene. 2019 a las 23:05, Yann E. MORIN > (<yann.morin.1998@free.fr>) escribió: > > > > Since 7fb6e782542f (core/instrumentation: shave minutes off the build > > time), the built stampfile is used as a reference to detect files > > installed by a package. > > > > However, this falls short during development, when a user may want to > > re-install a built-early package without rebuilding it (i.e. make > > foo-reinstall). In this case, the built stampfile is not touched, and is > > still dated from way back when the package was first built. As such, > > almost all files in target (or staging or host) are newer than that, and > > so those files are all now accounted for that package, when in fact only > > a minor subset may be accountable to it. > > > > We fix that by limiting the search for files that have been actually > > touched during the install step, now that we have the proper timestamp > > for it. > > > > Reported-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> > > Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr> > > Cc: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> > > --- > > package/pkg-generic.mk | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk > > index 3de8a99675..42aebeb49d 100644 > > --- a/package/pkg-generic.mk > > +++ b/package/pkg-generic.mk > > @@ -67,7 +67,7 @@ define step_pkg_size_inner > > $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt > > cd $(2); \ > > find . \( -type f -o -type l \) \ > > - -newer $($(PKG)_DIR)/.stamp_built \ > > + -newer $@_before \ > > -exec printf '$(1),%s\n' {} + \ > > >> $(BUILD_DIR)/packages-file-list$(3).txt > > endef > > While it is probably not important, just noting here that if package A > installs a file with a modification time in the future, then package B > installed after A would still get that future file in its list. In > fact, every package built after pkg A would then get that file into > 'its' list. Possibly more realistically, a package installing a file with a fixed timestamp in the past, perhaps a timestamp of the build time preserved via 'cp -a'. For this reason I retract the 'probably not important' part.
Thomas DS, All, Thanks for the review! :-) On 2019-01-08 14:07 +0100, Thomas De Schampheleire spake thusly: > El lun., 7 ene. 2019 a las 23:05, Yann E. MORIN > (<yann.morin.1998@free.fr>) escribió: snip > > We fix that by limiting the search for files that have been actually > > touched during the install step, now that we have the proper timestamp > > for it. [--SNIP--] > > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk > > index 3de8a99675..42aebeb49d 100644 > > --- a/package/pkg-generic.mk > > +++ b/package/pkg-generic.mk > > @@ -67,7 +67,7 @@ define step_pkg_size_inner > > $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt > > cd $(2); \ > > find . \( -type f -o -type l \) \ > > - -newer $($(PKG)_DIR)/.stamp_built \ > > + -newer $@_before \ > While it is probably not important, just noting here that if package A > installs a file with a modification time in the future, then package B > installed after A would still get that future file in its list. In > fact, every package built after pkg A would then get that file into > 'its' list. How is this different from the current situation? Maybe in that case, it would make sense that we touch all files after a package installation (limited to that package's files, of course)? My position is that, if a package voluntarily install stuff in the future, it is borked. I.e. either it uses an absolute time, in which case that time will be in the past some time in the future, or it uses a delta, in which case it is not reproducible. Furthermore when we do an actual reproducible build, we actually touch all files at the end of the build, so a package that relies on file's timestamp is not reproducible. Regards, Yann E. MORIN.
El mar., 8 ene. 2019 a las 17:08, Yann E. MORIN (<yann.morin.1998@free.fr>) escribió: > > Thomas DS, All, > > Thanks for the review! :-) > > On 2019-01-08 14:07 +0100, Thomas De Schampheleire spake thusly: > > El lun., 7 ene. 2019 a las 23:05, Yann E. MORIN > > (<yann.morin.1998@free.fr>) escribió: > snip > > > We fix that by limiting the search for files that have been actually > > > touched during the install step, now that we have the proper timestamp > > > for it. > [--SNIP--] > > > diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk > > > index 3de8a99675..42aebeb49d 100644 > > > --- a/package/pkg-generic.mk > > > +++ b/package/pkg-generic.mk > > > @@ -67,7 +67,7 @@ define step_pkg_size_inner > > > $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt > > > cd $(2); \ > > > find . \( -type f -o -type l \) \ > > > - -newer $($(PKG)_DIR)/.stamp_built \ > > > + -newer $@_before \ > > While it is probably not important, just noting here that if package A > > installs a file with a modification time in the future, then package B > > installed after A would still get that future file in its list. In > > fact, every package built after pkg A would then get that file into > > 'its' list. > > How is this different from the current situation? Yes, it's the same as in the current situation, but not in 'my' current situation where I'm still relying on md5sum :-) Part of my interest in reviewing these changes is the hope that they could remove the need for my revert of the 'shave off' commit that switches to mtime instead of md5sum, and thus following upstream Buildroot in this respect. > > Maybe in that case, it would make sense that we touch all files after a > package installation (limited to that package's files, of course)? But it could be that the package has good reasons to expect a specific timestamp, or more probably a relative time order between certain files. I'm not sure if it is common, like I said I'd need to check in my own case. > > My position is that, if a package voluntarily install stuff in the > future, it is borked. I.e. either it uses an absolute time, in which > case that time will be in the past some time in the future, or it uses > a delta, in which case it is not reproducible. A time delta expected between two files is not irreproducible if both files come from the same package. > > Furthermore when we do an actual reproducible build, we actually touch > all files at the end of the build, so a package that relies on file's > timestamp is not reproducible. I was not aware of this. That changes things, of course. Can you point me to the exact place in the code? Thanks, Thomas
Thomas DS, All, On 2019-01-08 16:56 +0100, Thomas De Schampheleire spake thusly: > El mar., 8 ene. 2019 a las 14:07, Thomas De Schampheleire > (<patrickdepinguin@gmail.com>) escribió: [--SNIP--] > > > - -newer $($(PKG)_DIR)/.stamp_built \ > > > + -newer $@_before \ > > While it is probably not important, just noting here that if package A > > installs a file with a modification time in the future, then package B > > installed after A would still get that future file in its list. In > > fact, every package built after pkg A would then get that file into > > 'its' list. > Possibly more realistically, a package installing a file with a fixed > timestamp in the past, perhaps a timestamp of the build time preserved > via 'cp -a'. Actually, we already have this very issue with our skeleton: its files are copied and their timestamps are kept as-is, so the files are almost certainly older than the built timestamp, and thus are not accounted for againt the skeleton. I already identified this problem a few days ago while working on this series, and I had considered fixing it later, or the series would get even bigger... And yes, the problem already happens with the current situation. If we wanted to be exhaustive, we'd also list files that are not already in the list (and warn loudly). And since we introduce a new format, we could more easily add a "metadata" field that specifies why the entry was added (mtime, new-but-old, etc...) Regards, Yann E. MORIN.
Thomas DS, All, On 2019-01-08 17:55 +0100, Thomas De Schampheleire spake thusly: > El mar., 8 ene. 2019 a las 17:08, Yann E. MORIN > (<yann.morin.1998@free.fr>) escribió: > > On 2019-01-08 14:07 +0100, Thomas De Schampheleire spake thusly: > > > El lun., 7 ene. 2019 a las 23:05, Yann E. MORIN > > > (<yann.morin.1998@free.fr>) escribió: [--SNIP--] > > > > - -newer $($(PKG)_DIR)/.stamp_built \ > > > > + -newer $@_before \ > > > While it is probably not important, just noting here that if package A > > > installs a file with a modification time in the future, then package B > > > installed after A would still get that future file in its list. In > > > fact, every package built after pkg A would then get that file into > > > 'its' list. > > How is this different from the current situation? > > Yes, it's the same as in the current situation, but not in 'my' > current situation where I'm still relying on md5sum :-) > Part of my interest in reviewing these changes is the hope that they > could remove the need for my revert of the 'shave off' commit that > switches to mtime instead of md5sum, and thus following upstream > Buildroot in this respect. Yes, and I thank you a lot for this thorough review! :-) > > Maybe in that case, it would make sense that we touch all files after a > > package installation (limited to that package's files, of course)? > But it could be that the package has good reasons to expect a specific > timestamp, or more probably a relative time order between certain > files. > I'm not sure if it is common, like I said I'd need to check in my own case. The only case I'm aware of, is that systemd can detect the timestamp of /usr and decide to run upgrade scripts if /usr is more recent than a file in /etc, hence the reason we have: https://git.buildroot.org/buildroot/tree/Makefile#n786 But see below... > > My position is that, if a package voluntarily install stuff in the > > future, it is borked. I.e. either it uses an absolute time, in which > > case that time will be in the past some time in the future, or it uses > > a delta, in which case it is not reproducible. > > A time delta expected between two files is not irreproducible if both > files come from the same package. > > > > > Furthermore when we do an actual reproducible build, we actually touch > > all files at the end of the build, so a package that relies on file's > > timestamp is not reproducible. > > I was not aware of this. That changes things, of course. > Can you point me to the exact place in the code? https://git.buildroot.org/buildroot/tree/fs/common.mk#n39 And now I notice that the systemd stuff is borked in this case, because /usr must really be the time of build, not the time of the commit date. I now realise that there is a whole slew of incorrect assumptions we've made about SOURCE_DATE_EPOCH in the context of Buildroot, but I shall write about this in another thread. Regards, Yann E. MORIN.
diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk index 3de8a99675..42aebeb49d 100644 --- a/package/pkg-generic.mk +++ b/package/pkg-generic.mk @@ -67,7 +67,7 @@ define step_pkg_size_inner $(SED) '/^$(1),/d' $(BUILD_DIR)/packages-file-list$(3).txt cd $(2); \ find . \( -type f -o -type l \) \ - -newer $($(PKG)_DIR)/.stamp_built \ + -newer $@_before \ -exec printf '$(1),%s\n' {} + \ >> $(BUILD_DIR)/packages-file-list$(3).txt endef
Since 7fb6e782542f (core/instrumentation: shave minutes off the build time), the built stampfile is used as a reference to detect files installed by a package. However, this falls short during development, when a user may want to re-install a built-early package without rebuilding it (i.e. make foo-reinstall). In this case, the built stampfile is not touched, and is still dated from way back when the package was first built. As such, almost all files in target (or staging or host) are newer than that, and so those files are all now accounted for that package, when in fact only a minor subset may be accountable to it. We fix that by limiting the search for files that have been actually touched during the install step, now that we have the proper timestamp for it. Reported-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr> Cc: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> --- package/pkg-generic.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)