[04/34] reproducibility: make rootfs.tar reproducible

Message ID 1462002570-14706-4-git-send-email-gilles.chanteperdrix@xenomai.org
State Changes Requested

Commit Message

Gilles Chanteperdrix April 30, 2016, 7:49 a.m. UTC
Make rootfs.tar reproducible by generating the tarball with a
deterministic file order and a fixed modification date.
---
 fs/tar/tar.mk | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Thomas Petazzoni May 7, 2016, 1:23 p.m. UTC | #1
Hello,

On Sat, 30 Apr 2016 09:49:00 +0200, Gilles Chanteperdrix wrote:

> +define ROOTFS_TAR_CMD
> +	cd $(TARGET_DIR) && { \
> +		find . -\( -! -type d -o -empty -\) -print0 | \
> +		sort -z | \
> +		tar --null -T - -c$(TAR_OPTS)f $@ --mtime=@$(SOURCE_DATE_EPOCH) --numeric-owner; \
> +	}

We normally write such constructs as:

	(cd $(TARGET_DIR) && \
		foo ....)

However, this raises the question of what's needed for all the other
filesystem formats. Will they all have to implement a different
ROOTFS_<foo>_CMD variable ? Or will there be some commonalities that
should be factored out in the common rootfs image infrastructure ?

Thomas
Arnout Vandecappelle May 7, 2016, 7:51 p.m. UTC | #2
On 05/07/16 15:23, Thomas Petazzoni wrote:
> Hello,
>
> On Sat, 30 Apr 2016 09:49:00 +0200, Gilles Chanteperdrix wrote:
>
>> +define ROOTFS_TAR_CMD
>> +	cd $(TARGET_DIR) && { \
>> +		find . -\( -! -type d -o -empty -\) -print0 | \
>> +		sort -z | \
>> +		tar --null -T - -c$(TAR_OPTS)f $@ --mtime=@$(SOURCE_DATE_EPOCH) --numeric-owner; \
>> +	}
>
> We normally write such constructs as:
>
> 	(cd $(TARGET_DIR) && \
> 		foo ....)

  Actually, we don't AFAIK... In general, the parentheses are not needed so they
should be removed. So also in this case it should be

	cd $(TARGET_DIR);
		find ....
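
The three spellings discussed here differ in how they scope the cd and in
what happens when the cd fails. The following is a minimal sketch in plain
sh, not Buildroot code: the temporary directory stands in for $(TARGET_DIR),
and each `sh -c` imitates the separate shell that make starts for a recipe
line (the default behaviour without .ONESHELL).

```shell
#!/bin/sh
# Sketch: how the spellings treat the working directory and a failed cd.
start=$(pwd)
dir=$(mktemp -d)    # stand-in for $(TARGET_DIR), hypothetical

# 1. Subshell: the cd is undone when the parentheses close.
(cd "$dir" && ls >/dev/null)
after_subshell=$(pwd)

# 2. Brace group: runs in the current shell, so the cd persists.
{ cd "$dir" && ls >/dev/null; }
after_group=$(pwd)
cd "$start"

# 3. 'cd dir; cmd' still runs cmd when cd fails, 'cd dir && cmd' does not.
#    Each sh -c below plays the role of one recipe line's shell.
ran_semicolon=$(sh -c 'cd "$1" 2>/dev/null; echo yes' sh "$dir/missing")
ran_and=$(sh -c 'cd "$1" 2>/dev/null && echo yes || true' sh "$dir/missing")

echo "subshell: $after_subshell"
echo "group:    $after_group"
echo "';' ran:  ${ran_semicolon:-no}"
echo "'&&' ran: ${ran_and:-no}"
```

Since make already gives each recipe line its own shell, neither the
parentheses nor the braces are needed to contain the cd. The choice between
`;` and `&&` still matters, though: with `;` the find would run in whatever
directory the shell happened to be in if the cd failed.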

>
> However, this raises the question of what's needed for all the other
> filesystem formats. Will they all have to implement a different
> ROOTFS_<foo>_CMD variable ? Or will there be some commonalities that
> should be factored out in the common rootfs image infrastructure ?

  Yes, to me it makes more sense to do this in the actual target directory after 
the post-build scripts have been run.


  Regards,
  Arnout
Gilles Chanteperdrix May 8, 2016, 8:17 p.m. UTC | #3
On Sat, May 07, 2016 at 09:51:36PM +0200, Arnout Vandecappelle wrote:
> On 05/07/16 15:23, Thomas Petazzoni wrote:
> > Hello,
> >
> > On Sat, 30 Apr 2016 09:49:00 +0200, Gilles Chanteperdrix wrote:
> >
> >> +define ROOTFS_TAR_CMD
> >> +	cd $(TARGET_DIR) && { \
> >> +		find . -\( -! -type d -o -empty -\) -print0 | \
> >> +		sort -z | \
> >> +		tar --null -T - -c$(TAR_OPTS)f $@ --mtime=@$(SOURCE_DATE_EPOCH) --numeric-owner; \
> >> +	}
> >
> > We normally write such constructs as:
> >
> > 	(cd $(TARGET_DIR) && \
> > 		foo ....)
> 
>   Actually, we don't AFAIK... In general, the parentheses are not needed so they
> should be removed. So also in this case it should be
> 
> 	cd $(TARGET_DIR);
> 		find ....
> 
> >
> > However, this raises the question of what's needed for all the other
> > filesystem formats. Will they all have to implement a different
> > ROOTFS_<foo>_CMD variable ? Or will there be some commonalities that
> > should be factored out in the common rootfs image infrastructure ?
> 
>   Yes, to me it makes more sense to do this in the actual target directory after 
> the post-build scripts have been run.

I do not understand what you mean. The aim of the command is to sort
the list of files passed to tar, it does not operate on the file
system.

Sorting is indeed needed for most other outputs I have tested (cpio,
isofs).
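
To illustrate the effect of the sorted file list and the fixed date, here is
a minimal sketch that archives the same tree twice and compares the results.
It assumes GNU tar, find and sort. The make_tar helper, the sample tree and
the timestamp value are made up for the example, the find expression uses
the portable `\( ! ... \)` spelling, and LC_ALL=C is an addition that is not
in the patch (it keeps the sort order independent of the locale).

```shell
#!/bin/sh
# Sketch: archive the same tree twice and check the tarballs are identical.
set -e
work=$(mktemp -d)
mkdir -p "$work/root/etc" "$work/root/empty-dir"
echo hello > "$work/root/etc/motd"
echo world > "$work/root/b.txt"

# Hypothetical helper, not part of Buildroot.
# $1: tree to archive, $2: output tarball (absolute path), $3: epoch seconds
make_tar() {
	(cd "$1" && \
		find . \( ! -type d -o -empty \) -print0 | \
		LC_ALL=C sort -z | \
		tar --null -T - -cf "$2" --mtime="@$3" --numeric-owner)
}

make_tar "$work/root" "$work/a.tar" 1462002570
touch "$work/root/etc/motd"    # change a file date between the two runs
make_tar "$work/root" "$work/b.tar" 1462002570

sum_a=$(cksum < "$work/a.tar")
sum_b=$(cksum < "$work/b.tar")
echo "first:  $sum_a"
echo "second: $sum_b"
```

Only non-directories and empty directories are listed, so tar never recurses
into a populated directory in filesystem order; every member reaches it in
the order chosen by sort.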
Arnout Vandecappelle May 9, 2016, 11:29 p.m. UTC | #4
On 05/08/16 22:17, Gilles Chanteperdrix wrote:
> On Sat, May 07, 2016 at 09:51:36PM +0200, Arnout Vandecappelle wrote:
>> On 05/07/16 15:23, Thomas Petazzoni wrote:
>>> Hello,
>>>
>>> On Sat, 30 Apr 2016 09:49:00 +0200, Gilles Chanteperdrix wrote:
>>>
>>>> +define ROOTFS_TAR_CMD
>>>> +	cd $(TARGET_DIR) && { \
>>>> +		find . -\( -! -type d -o -empty -\) -print0 | \
>>>> +		sort -z | \
>>>> +		tar --null -T - -c$(TAR_OPTS)f $@ --mtime=@$(SOURCE_DATE_EPOCH) --numeric-owner; \
>>>> +	}
>>>
>>> We normally write such constructs as:
>>>
>>> 	(cd $(TARGET_DIR) && \
>>> 		foo ....)
>>
>>   Actually, we don't AFAIK... In general, the parentheses are not needed so they
>> should be removed. So also in this case it should be
>>
>> 	cd $(TARGET_DIR);
>> 		find ....
>>
>>>
>>> However, this raises the question of what's needed for all the other
>>> filesystem formats. Will they all have to implement a different
>>> ROOTFS_<foo>_CMD variable ? Or will there be some commonalities that
>>> should be factored out in the common rootfs image infrastructure ?
>>
>>   Yes, to me it makes more sense to do this in the actual target directory after
>> the post-build scripts have been run.
>
> I do not understand what you mean. The aim of the command is to sort
> the list of files passed to tar, it does not operate on the file
> system.

  Duh, sorry, I was just talking about the --mtime bit.
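
One possible reading of that suggestion is to clamp the timestamps in the
target directory itself once the post-build scripts have finished, so that
every filesystem format sees the same dates. A minimal sketch, assuming GNU
touch and stat; the tree and the epoch value are stand-ins, and this is not
code from the patch series:

```shell
#!/bin/sh
# Sketch only: set every mtime in a tree to one fixed timestamp.
tree=$(mktemp -d)     # stand-in for $(TARGET_DIR), hypothetical
epoch=1462002570      # stand-in for $(SOURCE_DATE_EPOCH), hypothetical
mkdir -p "$tree/etc"
echo hello > "$tree/etc/motd"
ln -s motd "$tree/etc/motd.link"

# -h makes touch act on symlinks themselves instead of their targets.
find "$tree" -exec touch -h -d "@$epoch" {} +

# stat without -L reports the symlink's own mtime.
motd_mtime=$(stat -c %Y "$tree/etc/motd")
link_mtime=$(stat -c %Y "$tree/etc/motd.link")
dir_mtime=$(stat -c %Y "$tree/etc")
echo "$motd_mtime $link_mtime $dir_mtime"
```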

  Regards,
  Arnout

>
> Sorting is indeed needed for most other outputs I have tested (cpio,
> isofs).
>
Patch

diff --git a/fs/tar/tar.mk b/fs/tar/tar.mk
index 28219cf..06a4d7c 100644
--- a/fs/tar/tar.mk
+++ b/fs/tar/tar.mk
@@ -6,8 +6,19 @@ 
 
 TAR_OPTS := $(call qstrip,$(BR2_TARGET_ROOTFS_TAR_OPTIONS))
 
+ifneq ($(BR2_REPRODUCIBLE),y)
 define ROOTFS_TAR_CMD
 	tar -c$(TAR_OPTS)f $@ --numeric-owner -C $(TARGET_DIR) .
 endef
+else
+define ROOTFS_TAR_CMD
+	cd $(TARGET_DIR) && { \
+		find . -\( -! -type d -o -empty -\) -print0 | \
+		sort -z | \
+		tar --null -T - -c$(TAR_OPTS)f $@ --mtime=@$(SOURCE_DATE_EPOCH) --numeric-owner; \
+	}
+endef
+endif
+
 
 $(eval $(call ROOTFS_TARGET,tar))