support/download/svn: generate reproducible svn archives

Message ID 20191222213148.13762-1-heiko.thiery@gmail.com
State Accepted

Commit Message

Heiko Thiery Dec. 22, 2019, 9:31 p.m. UTC
To generate a reproducible archive from an svn repository, largely the
same approach is used as for archives from a git repository.

Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
---
 support/download/svn | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

Comments

Thomas Petazzoni Dec. 22, 2019, 9:36 p.m. UTC | #1
On Sun, 22 Dec 2019 22:31:49 +0100
Heiko Thiery <heiko.thiery@gmail.com> wrote:

> To generate a reproducible archive from an svn repository, largely the
> same approach is used as for archives from a git repository.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

Thanks. Could you check whether the tarballs that are now produced by
this are identical to the tarballs we already have on
sources.buildroot.org ?

Thomas
Heiko Thiery Dec. 22, 2019, 9:40 p.m. UTC | #2
Hi,

Am So., 22. Dez. 2019 um 22:36 Uhr schrieb Thomas Petazzoni
<thomas.petazzoni@bootlin.com>:
>
> On Sun, 22 Dec 2019 22:31:49 +0100
> Heiko Thiery <heiko.thiery@gmail.com> wrote:
>
> > To generate a reproducible archive from an svn repository, largely the
> > same approach is used as for archives from a git repository.
> >
> > Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
>
> Thanks. Could you check whether the tarballs that are now produced by
> this are identical to the tarballs we already have on
> sources.buildroot.org ?

I checked this for the fis package and unfortunately this is not the
case. Do you expect to have the same hashes?

> Thomas
> --
> Thomas Petazzoni, CTO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
Heiko Thiery Dec. 22, 2019, 9:54 p.m. UTC | #3
Hi,

> > Thanks. Could you check whether the tarballs that are now produced by
> > this are identical to the tarballs we already have on
> > sources.buildroot.org ?
>
> I checked this for the fis package and unfortunately this is not the
> case. Do you expect to have the same hashes?

this is how the archive from sources.buildroot.org looks:

tar tvf fis-2892.tar.gz
drwxr-xr-x peko/peko         0 2014-02-09 12:26 fis-2892/
-rw-r--r-- peko/peko      1699 2007-01-18 17:42 fis-2892/endswap.cc
-rwxr-xr-x peko/peko      4229 2006-12-13 22:30 fis-2892/disklog
-rwxr-xr-x peko/peko       532 2004-07-13 22:56 fis-2892/cvstotbz2
-rwxr-xr-x peko/peko       451 2004-04-19 14:04 fis-2892/overwrite
-rwxr-xr-x peko/peko       196 2007-03-17 01:24 fis-2892/write_dvd
-rw-r--r-- peko/peko     15659 2007-05-02 22:42 fis-2892/fis.c
-rw-r--r-- peko/peko      6507 2007-01-28 14:55 fis-2892/apexctl.cc
-rw-r--r-- peko/peko       348 2007-03-17 01:24 fis-2892/Makefile
-rw-r--r-- peko/peko      1858 2007-01-23 18:42 fis-2892/sercommhdr.cc
-rwxr-xr-x peko/peko       534 2004-07-13 22:56 fis-2892/cvstotgz
-rwxr-xr-x peko/peko      1021 2006-12-13 22:30 fis-2892/report_disklog
-rw-r--r-- peko/peko      9956 2007-03-25 01:14 fis-2892/fis.cc
-rwxr-xr-x peko/peko       372 2004-01-25 16:04 fis-2892/cvsmv
-rwxr-xr-x peko/peko       402 2005-10-31 19:17 fis-2892/svntotbz2

and this is the generated one:

tar tvf fis-2892.tar.gz
-rw-r--r-- 0/0             348 2019-12-22 00:00 fis-2892/Makefile
-rw-r--r-- 0/0            6507 2019-12-22 00:00 fis-2892/apexctl.cc
-rwxr-xr-x 0/0             372 2019-12-22 00:00 fis-2892/cvsmv
-rwxr-xr-x 0/0             532 2019-12-22 00:00 fis-2892/cvstotbz2
-rwxr-xr-x 0/0             534 2019-12-22 00:00 fis-2892/cvstotgz
-rwxr-xr-x 0/0            4229 2019-12-22 00:00 fis-2892/disklog
-rw-r--r-- 0/0            1699 2019-12-22 00:00 fis-2892/endswap.cc
-rw-r--r-- 0/0           15659 2019-12-22 00:00 fis-2892/fis.c
-rw-r--r-- 0/0            9956 2019-12-22 00:00 fis-2892/fis.cc
-rwxr-xr-x 0/0             451 2019-12-22 00:00 fis-2892/overwrite
-rwxr-xr-x 0/0            1021 2019-12-22 00:00 fis-2892/report_disklog
-rw-r--r-- 0/0            1858 2019-12-22 00:00 fis-2892/sercommhdr.cc
-rwxr-xr-x 0/0             402 2019-12-22 00:00 fis-2892/svntotbz2
-rwxr-xr-x 0/0             196 2019-12-22 00:00 fis-2892/write_dvd

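The differences I see are:
- the s.b.o archive has an entry for the directory
- the s.b.o archive has the user/group owner peko while the generated one has 0/0
- the s.b.o archive keeps the original file mtimes while the generated one uses a single fixed date
- the file ordering is different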
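
Since the two listings differ only in metadata and ordering, one way to
confirm that the file contents themselves are unchanged is to extract both
archives and compare the trees. A minimal sketch, assuming GNU tar and a
diff that supports -r; the function name same_contents is hypothetical and
not part of Buildroot:

```shell
# Succeeds if two .tar.gz archives extract to identical file contents.
# Owner, group, mtime and member order are deliberately ignored.
same_contents() {
    sc_a="$(mktemp -d)"
    sc_b="$(mktemp -d)"
    # Extract each archive into its own scratch directory, then compare
    # recursively; diff -r looks at names and contents, not metadata.
    tar xzf "$1" -C "${sc_a}" &&
        tar xzf "$2" -C "${sc_b}" &&
        diff -r "${sc_a}" "${sc_b}" >/dev/null
    sc_rc=$?
    rm -rf "${sc_a}" "${sc_b}"
    return "${sc_rc}"
}
```

It would be called with the old and the new archive as its two arguments
(they have the same file name here, so they need to live in different
directories), and the exit status tells whether the contents match.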

--
Heiko
Thomas Petazzoni Dec. 22, 2019, 9:57 p.m. UTC | #4
On Sun, 22 Dec 2019 22:40:43 +0100
Heiko Thiery <heiko.thiery@gmail.com> wrote:

> > Thanks. Could you check whether the tarballs that are now produced by
> > this are identical to the tarballs we already have on
> > sources.buildroot.org ?  
> 
> I checked this for the fis package and unfortunately this is not the
> case. Do you expect to have the same hashes?

Probably not, because with your change we now generate the tarballs
differently.

Normally, it is annoying because it means the hash has changed,
breaking the build for older Buildroot users, if we update the tarballs
on sources.buildroot.org.

However, in this case, current Buildroot does not have any hash (as far
as I can see) for Subversion-fetched packages. So we could introduce
your change and update the tarballs on sources.buildroot.org at the
same time, and then introduce hashes in those packages.

Peter, Yann, what do you think ?

We have only very few Subversion-fetched packages, I think we should
keep it simple.

Thomas
Yann E. MORIN Dec. 23, 2019, 5:16 p.m. UTC | #5
Thomas, Heiko, All,

On 2019-12-22 22:57 +0100, Thomas Petazzoni spake thusly:
> On Sun, 22 Dec 2019 22:40:43 +0100
> Heiko Thiery <heiko.thiery@gmail.com> wrote:
> 
> > > Thanks. Could you check whether the tarballs that are now produced by
> > > this are identical to the tarballs we already have on
> > > sources.buildroot.org ?  
> > 
> > I checked this for the fis package and unfortunately this is not the
> > case. Do you expect to have the same hashes?
> 
> Probably not, because with your change we now generate the tarballs
> differently.
> 
> Normally, it is annoying because it means the hash has changed,
> breaking the build for older Buildroot users, if we update the tarballs
> on sources.buildroot.org.
> 
> However, in this case, current Buildroot does not have any hash (as far
> as I can see) for Subversion-fetched packages. So we could introduce
> your change and update the tarballs on sources.buildroot.org at the
> same time, and then introduce hashes in those packages.
> 
> Peter, Yann, what do you think ?

I am 100% in line with regenerating the tarballs so they are reproducible,
even if that means updating s.b.o.

> We have only very few Subversion-fetched packages, I think we should
> keep it simple.

In practice, we have only two: fis and open2300. The other packages that
may use svn are those where the user would set the version, so they
would anyway be excluded from the hash check.

Regards,
Yann E. MORIN.

> Thomas
> -- 
> Thomas Petazzoni, CTO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
Peter Korsgaard Dec. 23, 2019, 10:05 p.m. UTC | #6
>>>>> "Yann" == Yann E MORIN <yann.morin.1998@free.fr> writes:

Hi,

 >> Normally, it is annoying because it means the hash has changed,
 >> breaking the build for older Buildroot users, if we update the tarballs
 >> on sources.buildroot.org.
 >> 
 >> However, in this case, current Buildroot does not have any hash (as far
 >> as I can see) for Subversion-fetched packages. So we could introduce
 >> your change and update the tarballs on sources.buildroot.org at the
 >> same time, and then introduce hashes in those packages.
 >> 
 >> Peter, Yann, what do you think ?

 > I am 100% in line with regenerating the tarballs so they are reproducible,
 > even if that means updating s.b.o.

As long as it doesn't break anything for existing users, regenerating
the tarballs is also fine by me.


 >> We have only very few Subversion-fetched packages, I think we should
 >> keep it simple.

 > In practice, we have only two: fis and open2300. The other packages that
 > may use svn are those where the user would set the version, so they
 > would anyway be excluded from the hash check.

Talking about those two packages, weren't we going to remove them? They
are both very old and haven't been updated since they were added.
Thomas Petazzoni Dec. 23, 2019, 10:19 p.m. UTC | #7
Hello,

+Alex in Cc. Alex, there's an open2300 question for you below.

On Mon, 23 Dec 2019 23:05:34 +0100
Peter Korsgaard <peter@korsgaard.com> wrote:

>  > In practice, we have only two: fis and open2300. The other packages that
>  > may use svn are those where the user would set the version, so they
>  > would anyway be excluded from the hash check.  
> 
> Talking about those two packages, weren't we going to remove them? They
> are both very old and haven't been updated since they were added.

That's indeed another way. For the "fis" package, I think this is
reasonable, as even the use-case is very obsolete (RedBoot stuff). For
open2300, I'm not sure.

Alex: is open2300 still relevant today? You introduced this package in
Buildroot many years ago.

Thomas
Peter Korsgaard Dec. 23, 2019, 10:36 p.m. UTC | #8
>>>>> "Thomas" == Thomas Petazzoni <thomas.petazzoni@bootlin.com> writes:

 > Hello,
 > +Alex in Cc. Alex, there's an open2300 question for you below.

As far as I can see, you didn't actually put Alex in CC.

 > On Mon, 23 Dec 2019 23:05:34 +0100
 > Peter Korsgaard <peter@korsgaard.com> wrote:

 >> > In practice, we have only two: fis and open2300. The other packages that
 >> > may use svn are those where the user would set the version, so they
 >> > would anyway be excluded from the hash check.  
 >> 
 >> Talking about those two packages, weren't we going to remove them? They
 >> are both very old and haven't been updated since they were added.

 > That's indeed another way. For the "fis" package, I think this is
 > reasonable, as even the use-case is very obsolete (RedBoot stuff). For
 > open2300, I'm not sure.

 > Alex: is open2300 still relevant today? You introduced this package in
 > Buildroot many years ago.
Yann E. MORIN Dec. 30, 2019, 9:58 a.m. UTC | #9
Heiko, All,

On 2019-12-22 22:31 +0100, Heiko Thiery spake thusly:
> To generate a reproducible archive from an svn repository, largely the
> same approach is used as for archives from a git repository.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>
> ---
>  support/download/svn | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/support/download/svn b/support/download/svn
> index 542b25c0a2..505cdd58b1 100755
> --- a/support/download/svn
> +++ b/support/download/svn
> @@ -38,4 +38,19 @@ _svn() {
>  
>  _svn export ${verbose} "${@}" "'${uri}@${rev}'" "'${basename}'"
>  
> -tar czf "${output}" "${basename}"
> +# Generate the archive, sort with the C locale so that it is reproducible.
> +# We do not want the .svn dir; we keep other .svn files, in case they are the
> +# only files in their directory.
> +find "${basename}" -not -type d \
> +       -and -not -path "./.svn/*" >"${output}.list"
> +LC_ALL=C sort <"${output}.list" >"${output}.list.sorted"
> +
> +# Create GNU-format tarballs, since that's the format of the tarballs on
> +# sources.buildroot.org and used in the *.hash files
> +tar cf - --transform="s#^\./#${basename}/#" \
> +         --numeric-owner --owner=0 --group=0 --mtime="${date}" --format=gnu \

Where does "${date}" come from? Nothing is setting it...

So I've added some code to that effect, and pushed to master, thanks.
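
For illustration only, one plausible way to obtain such a date is to take
the "Last Changed Date" field that `svn info` prints for the requested
revision. The sketch below is an assumption about the general approach,
not the code that was actually pushed; the helper name last_changed_date
is hypothetical.

```shell
# Read `svn info` output on stdin and print only the value of its
# "Last Changed Date:" line; other lines are discarded.
last_changed_date() {
    sed -n 's/^Last Changed Date: //p'
}

# Hypothetical use inside the download helper (needs svn and network access):
#   date="$(svn info "${uri}@${rev}" | last_changed_date)"
```

Tying the mtime to the revision rather than to the download time is what
keeps the resulting archive stable across repeated downloads.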

Regards,
Yann E. MORIN.

> +         -T "${output}.list.sorted" >"${output}.tar"
> +gzip -6 -n <"${output}.tar" >"${output}"
> +
> +rm -f "${output}.list"
> +rm -f "${output}.list.sorted"
> -- 
> 2.20.1
>
Alexandre Belloni Jan. 6, 2020, 9:31 a.m. UTC | #10
Hi,

On 23/12/2019 23:36:50+0100, Peter Korsgaard wrote:
>  > Alex: is open2300 still relevant today? You introduced this package in
>  > Buildroot many years ago.
> 

I'm not personally using it at the moment as I now have an x86 box
connected to the weather station and I compiled open2300 on it directly.
I'm fine with the removal.
Yann E. MORIN May 22, 2020, 2:09 p.m. UTC | #11
Heiko, All,

On 2019-12-22 22:31 +0100, Heiko Thiery spake thusly:
> To generate a reproducible archive from an svn repository, largely the
> same approach is used as for archives from a git repository.
> 
> Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com>

That patch was applied long ago, but I just noticed that we forgot
to enable hash checks for svn tarballs:

    https://git.buildroot.org/buildroot/tree/package/pkg-generic.mk#n580

I'll take it upon myself to fix that, as I'm already working in this area...

Regards,
Yann E. MORIN.

> ---
>  support/download/svn | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/support/download/svn b/support/download/svn
> index 542b25c0a2..505cdd58b1 100755
> --- a/support/download/svn
> +++ b/support/download/svn
> @@ -38,4 +38,19 @@ _svn() {
>  
>  _svn export ${verbose} "${@}" "'${uri}@${rev}'" "'${basename}'"
>  
> -tar czf "${output}" "${basename}"
> +# Generate the archive, sort with the C locale so that it is reproducible.
> +# We do not want the .svn dir; we keep other .svn files, in case they are the
> +# only files in their directory.
> +find "${basename}" -not -type d \
> +       -and -not -path "./.svn/*" >"${output}.list"
> +LC_ALL=C sort <"${output}.list" >"${output}.list.sorted"
> +
> +# Create GNU-format tarballs, since that's the format of the tarballs on
> +# sources.buildroot.org and used in the *.hash files
> +tar cf - --transform="s#^\./#${basename}/#" \
> +         --numeric-owner --owner=0 --group=0 --mtime="${date}" --format=gnu \
> +         -T "${output}.list.sorted" >"${output}.tar"
> +gzip -6 -n <"${output}.tar" >"${output}"
> +
> +rm -f "${output}.list"
> +rm -f "${output}.list.sorted"
> -- 
> 2.20.1
>
Patch

diff --git a/support/download/svn b/support/download/svn
index 542b25c0a2..505cdd58b1 100755
--- a/support/download/svn
+++ b/support/download/svn
@@ -38,4 +38,19 @@  _svn() {
 
 _svn export ${verbose} "${@}" "'${uri}@${rev}'" "'${basename}'"
 
-tar czf "${output}" "${basename}"
+# Generate the archive, sort with the C locale so that it is reproducible.
+# We do not want the .svn dir; we keep other .svn files, in case they are the
+# only files in their directory.
+find "${basename}" -not -type d \
+       -and -not -path "./.svn/*" >"${output}.list"
+LC_ALL=C sort <"${output}.list" >"${output}.list.sorted"
+
+# Create GNU-format tarballs, since that's the format of the tarballs on
+# sources.buildroot.org and used in the *.hash files
+tar cf - --transform="s#^\./#${basename}/#" \
+         --numeric-owner --owner=0 --group=0 --mtime="${date}" --format=gnu \
+         -T "${output}.list.sorted" >"${output}.tar"
+gzip -6 -n <"${output}.tar" >"${output}"
+
+rm -f "${output}.list"
+rm -f "${output}.list.sorted"
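
As a rough illustration of the technique this patch relies on (a file list
sorted in the C locale, fixed owner and mtime, GNU tar format, and gzip
without a header timestamp), here is a minimal standalone sketch. It is not
the code from support/download/svn: the function name make_repro_tarball
and its arguments are hypothetical, the list is piped straight into tar
instead of going through temporary list files, and GNU tar is assumed.

```shell
# make_repro_tarball SRC_DIR OUT_TAR_GZ MTIME
# Archive SRC_DIR so that the output bytes depend only on file names,
# contents and modes, not on who ran the command or when.
make_repro_tarball() {
    mrt_parent="$(dirname "$1")"
    mrt_base="$(basename "$1")"
    (
        cd "${mrt_parent}" &&
            # Non-directory entries only, sorted bytewise so the member
            # order does not depend on the filesystem or the user's locale.
            find "${mrt_base}" -not -type d | LC_ALL=C sort |
            # Fixed numeric owner/group and one mtime for every member;
            # "-T -" makes tar read the member list from stdin.
            tar cf - --numeric-owner --owner=0 --group=0 \
                --mtime="$3" --format=gnu -T -
    ) | gzip -6 -n >"$2"   # -n: no file name or timestamp in the gzip header
}
```

Running it twice on the same tree with the same MTIME, even after touching
the files in between, should give archives that `cmp` reports as identical.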