Restore .git directory in git package downloads

Submitted by James Balean on April 19, 2017, 4 a.m.

Details

Message ID 1492574439-4617-1-git-send-email-james@balean.com.au
State Rejected
Headers show

Commit Message

James Balean April 19, 2017, 4 a.m.
Removing the .git directory during package download and caching prevents
Builroot users from making and testing changes to external software
packages using Buildroot. This patch enables users to commit locally,
make patches and contribute back to the projects they are using.

Signed-off-by: James Balean <james@balean.com.au>
Cc: Yann E. MORIN <yann.morin.1998@free.fr>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 support/download/git | 4 ----
 1 file changed, 4 deletions(-)

Comments

Baruch Siach April 19, 2017, 4:15 a.m.
Hi James,

On Tue, Apr 18, 2017 at 11:00:39PM -0500, James Balean wrote:
> Removing the .git directory during package download and caching prevents
> Builroot users from making and testing changes to external software
> packages using Buildroot. This patch enables users to commit locally,
> make patches and contribute back to the projects they are using.

The OVERRIDE_SRCDIR mechanism is meant to cover this use case. See section 
8.12.6 "Using Buildroot during development" in the manual[1]. Is there 
anything missing for your use case?

[1] http://nightly.buildroot.org/manual.html

baruch
Jan Kundrát April 19, 2017, 1:01 p.m.
On středa 19. dubna 2017 6:15:53 CEST, Baruch Siach wrote:
> Is there anything missing for your use case?

This is largely hypothetical (I don't know if any of these are packaged for 
Buldroot), but I know several projects which try to look at the output of 
`git describe` to create a pretty version string at build time (such as 
"v0.7-220-g270d875").

With kind regards,
Jan
Andreas Naumann April 19, 2017, 2:39 p.m.
Hi,

Am 19.04.2017 um 15:01 schrieb Jan Kundrát:
> On středa 19. dubna 2017 6:15:53 CEST, Baruch Siach wrote:
>> Is there anything missing for your use case?
>
> This is largely hypothetical (I don't know if any of these are packaged
> for Buldroot), but I know several projects which try to look at the
> output of `git describe` to create a pretty version string at build time
> (such as "v0.7-220-g270d875").

For me, a custom kernel from private git repo being is one such project. 
Right now I have a hack to append the shortened contents of 
BR2_LINUX_KERNEL_CUSTOM_REPO_VERSION via .scmversion. But of course this 
is not the same/as nice as what git describe or setlocalversion reports.

regards,
Andreas

>
> With kind regards,
> Jan
>
> _______________________________________________
> buildroot mailing list
> buildroot@busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot
Arnout Vandecappelle April 19, 2017, 8:11 p.m.
On 19-04-17 16:39, Andreas Naumann wrote:
> Hi,
> 
> Am 19.04.2017 um 15:01 schrieb Jan Kundrát:
>> On středa 19. dubna 2017 6:15:53 CEST, Baruch Siach wrote:
>>> Is there anything missing for your use case?
>>
>> This is largely hypothetical (I don't know if any of these are packaged
>> for Buldroot), but I know several projects which try to look at the
>> output of `git describe` to create a pretty version string at build time
>> (such as "v0.7-220-g270d875").
> 
> For me, a custom kernel from private git repo being is one such project. Right
> now I have a hack to append the shortened contents of
> BR2_LINUX_KERNEL_CUSTOM_REPO_VERSION via .scmversion. But of course this is not
> the same/as nice as what git describe or setlocalversion reports.

 Cfr. http://patchwork.ozlabs.org/patch/732304/ and subsequent discussion.

 Regards,
 Arnout
Ricardo Martincoski April 19, 2017, 11:37 p.m.
James,

[Either I or my mail client did something stupid and the e-mail did not reach
the mailing list and the patchwork. Resending to the list]

On Wed, Apr 19, 2017 at 01:00 AM, James Balean wrote:
> +++ b/support/download/git
> @@ -85,10 +85,6 @@ if [ ${recurse} -eq 1 ]; then
>      _git submodule update --init --recursive
>  fi
>  
> -# We do not need the .git dir; we keep other .git files, in case they
> -# are the only files in their directory.
> -rm -rf .git

This patch cannot be applied as-is because:

- it would make the tarballs for git packages not reproducible as the contents
  now depend on the state of the remote server (i.e. in some cases a full clone
  is needed; assume a tarball is generated, then a new commit is created in the
  remote server in any branch, a new full clone with the same reference as
  version would include that new commit and therefore the tarball is different);

- it would make the tarballs for git packages much larger (gigabytes for some
  linux trees) as they now include the .git directory;

Of course I am not against the discussion looking for another solution.

Regards,
Ricardo
Baruch Siach April 20, 2017, 3:40 a.m.
Hi Ricardo,

On Wed, Apr 19, 2017 at 08:22:45PM -0300, Ricardo Martincoski wrote:
> On Wed, Apr 19, 2017 at 01:00 AM, James Balean wrote:
> > +++ b/support/download/git
> > @@ -85,10 +85,6 @@ if [ ${recurse} -eq 1 ]; then
> >      _git submodule update --init --recursive
> >  fi
> >  
> > -# We do not need the .git dir; we keep other .git files, in case they
> > -# are the only files in their directory.
> > -rm -rf .git
> 
> This patch cannot be applied as-is because:
> 
> - it would make the tarballs for git packages not reproducible as the contents
>   now depend on the state of the remote server (i.e. in some cases a full clone
>   is needed; assume a tarball is generated, then a new commit is created in the
>   remote server in any branch, a new full clone with the same reference as
>   version would include that new commit and therefore the tarball is different);

Interesting. I think it is worth a mention in the comment. Removing .git is 
not just an optimization as the comment implies. It is necessary in order to 
verify the generated tarball.

baruch
James Balean April 26, 2017, 3:52 a.m.
Thank you for your comments.

On 19 April 2017 at 14:15, Baruch Siach <baruch@tkos.co.il> wrote:
> The OVERRIDE_SRCDIR mechanism is meant to cover this use case. See
> section 8.12.6 "Using Buildroot during development" in the manual[1].
> Is there anything missing for your use case?

I wasn't aware of the OVERRIDE_SRCDIR flag - thank you for this
suggestion. However, I can't help but feel that keeping the .git
directory might result in a more integrated way of enabling package
development without having to override the build process, and facilitate
the use cases mentioned by Jan and Andreas in their comments?


On 20 April 2017 at 09:22, Ricardo Martincoski <ricardo.martincoski@gmail.com> wrote:
> - it would make the tarballs for git packages not reproducible as the
> contents now depend on the state of the remote server (i.e. in some
> cases a full clone is needed; assume a tarball is generated, then a
> new commit is created in the remote server in any branch, a new full
> clone with the same reference as version would include that new commit
> and therefore the tarball is different);

Isn't it already the case that git tarballs are not reproducible? For
example, after a tarball is generated from a clone of 'master' or a tag,
Buildroot doesn't check and re-clone when changes are made to the branch
or tag on subsequent builds. Additionally, the same tarball filename is
also used for a full clone if a shallow clone fails. So tarballs can
already differ markedly between users.

> - it would make the tarballs for git packages much larger (gigabytes
> for some linux trees) as they now include the .git directory;

That's true. I don't know what people's appetite is for increasing the
size of tarballs to enable integrated development, though it is worth
noting that by default we only shallow clone with a depth of 1 rather
than full clone, so file sizes should be minimized 'in theory'.


On 20 April 2017 at 13:40, Baruch Siach <baruch@tkos.co.il> wrote:
> Interesting. I think it is worth a mention in the comment. Removing
> .git is not just an optimization as the comment implies. It is
> necessary in order to verify the generated tarball.

Not sure the tarball is verifiable without the .git directory, for the
reasons mentioned in response to Ricardo above.

--
Thanks again,
James
Baruch Siach April 26, 2017, 4:10 a.m.
Hi James,

On Tue, Apr 25, 2017 at 10:52:19PM -0500, James Balean wrote:
> On 20 April 2017 at 09:22, Ricardo Martincoski 
>   <ricardo.martincoski@gmail.com> wrote:
> > - it would make the tarballs for git packages not reproducible as the
> > contents now depend on the state of the remote server (i.e. in some
> > cases a full clone is needed; assume a tarball is generated, then a
> > new commit is created in the remote server in any branch, a new full
> > clone with the same reference as version would include that new commit
> > and therefore the tarball is different);
> 
> Isn't it already the case that git tarballs are not reproducible? For
> example, after a tarball is generated from a clone of 'master' or a tag,
> Buildroot doesn't check and re-clone when changes are made to the branch
> or tag on subsequent builds.

For this reason we only use immutable git references (either commits ids or 
tags) for git downloaded <PKG>_VERSION.

> Additionally, the same tarball filename is
> also used for a full clone if a shallow clone fails. So tarballs can
> already differ markedly between users.

The content of the source tree should be the same for a given git reference, 
regardless of the clone type.

Buildroot has been verifying git generated tarballs using hashes in 
packages/<pkg>/<pkg>.hash files for some time now without an issue, AFAIK.

baruch
Ricardo Martincoski April 27, 2017, 5:52 p.m.
Baruch, James,

On Wednesday, April 26, 2017 1:10:48 AM, Baruch Siach wrote:
> Hi James,
> 
> On Tue, Apr 25, 2017 at 10:52:19PM -0500, James Balean wrote:
>> On 20 April 2017 at 09:22, Ricardo Martincoski
>>   <ricardo.martincoski@gmail.com> wrote:
>> > - it would make the tarballs for git packages not reproducible as the
>> > contents now depend on the state of the remote server (i.e. in some
>> > cases a full clone is needed; assume a tarball is generated, then a
>> > new commit is created in the remote server in any branch, a new full
>> > clone with the same reference as version would include that new commit
>> > and therefore the tarball is different);
>> 
>> Isn't it already the case that git tarballs are not reproducible? For
>> example, after a tarball is generated from a clone of 'master' or a tag,
>> Buildroot doesn't check and re-clone when changes are made to the branch
>> or tag on subsequent builds.
> 
> For this reason we only use immutable git references (either commits ids or
> tags) for git downloaded <PKG>_VERSION.
> 
>> Additionally, the same tarball filename is
>> also used for a full clone if a shallow clone fails. So tarballs can
>> already differ markedly between users.
> 
> The content of the source tree should be the same for a given git reference,
> regardless of the clone type.

Indeed. Tarballs for git packages are already reproducible.

> Buildroot has been verifying git generated tarballs using hashes in
> packages/<pkg>/<pkg>.hash files for some time now without an issue, AFAIK.

Not yet, but we are almost there, see
http://patchwork.ozlabs.org/patch/741360/

Regards,
Ricardo

Patch hide | download patch | download mbox

diff --git a/support/download/git b/support/download/git
index 056057c..a7f307c 100755
--- a/support/download/git
+++ b/support/download/git
@@ -85,10 +85,6 @@  if [ ${recurse} -eq 1 ]; then
     _git submodule update --init --recursive
 fi
 
-# We do not need the .git dir; we keep other .git files, in case they
-# are the only files in their directory.
-rm -rf .git
-
 popd >/dev/null
 
 # Generate the archive, sort with the C locale so that it is reproducible