diff mbox series

[3/3] support/download: detect and abort when using a git branch by name

Message ID d1cdffd986d2f2128396aa785d542fdd3d2c90e0.1533400367.git.yann.morin.1998@free.fr
State Changes Requested
Headers show
Series [1/3] support/download: remove help from wrapper | expand

Commit Message

Yann E. MORIN Aug. 4, 2018, 4:33 p.m. UTC
Using a git branch by its name does not work as people expect:

 1. due to local caching, Buildroot will not re-fetch the repository,
    so people that expect to be able to follow the remote repository
    will be quite surprised and disapointed;

 2. if the user removes the local cache, then the build is no longer
    reproducible, because the remote repository may change any time
    between two builds, and people will be quite surprised and
    disapointed.

In either case, users are surprised and disapointed, which is a sad
state of matters. :-(

So, detect if the changeset requested is a branch by name, and abort
in that case.

Note that this only applies to using a branch by name. Any other mean
of using the branch (tag, sha1) is still supported, of course.

Note also that the download wrapper still first tries from the local
cache, then from the primary site (if set), and falls back to trying
the mirror eventually (if set). This is not a problem per-se, because
a malicious user that is capable of pre-seeding either locations with
a matching archive already gamed the system, and there is nothing we
can do to prevent that...

Signed-off-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
---
 support/download/git | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Ricardo Martincoski Aug. 6, 2018, 3:14 a.m. UTC | #1
Hello,

On Sat, Aug 04, 2018 at 01:33 PM, Yann E. MORIN wrote:

[snip]
> +# Check if the changeset is a branch name.
> +if _git show-ref "${cset}" |grep -qv refs/tags; then

I didn't had time to test it yet.
Special refs still works after this?

Regards,
Ricardo
Yann E. MORIN Aug. 6, 2018, 6:36 p.m. UTC | #2
Ricardo, All,

On 2018-08-06 00:14 -0300, Ricardo Martincoski spake thusly:
> On Sat, Aug 04, 2018 at 01:33 PM, Yann E. MORIN wrote:
> > +# Check if the changeset is a branch name.
> > +if _git show-ref "${cset}" |grep -qv refs/tags; then
> 
> I didn't had time to test it yet.
> Special refs still works after this?

It is not supposed to be broken, as those special refs are (AFAIU)
exposed as tags, not heads. But as I don't have access to a server
serving those special refs (support for which I find ugly, btw), I
could not test.

Regards,
Yann E. MORIN.
Ricardo Martincoski Aug. 7, 2018, 12:39 a.m. UTC | #3
Hello,

On Mon, Aug 06, 2018 at 03:36 PM, Yann E. MORIN wrote:

> On 2018-08-06 00:14 -0300, Ricardo Martincoski spake thusly:
>> On Sat, Aug 04, 2018 at 01:33 PM, Yann E. MORIN wrote:
>> > +# Check if the changeset is a branch name.
>> > +if _git show-ref "${cset}" |grep -qv refs/tags; then
>> 
>> I didn't had time to test it yet.
>> Special refs still works after this?
> 
> It is not supposed to be broken, as those special refs are (AFAIU)
> exposed as tags, not heads. But as I don't have access to a server
> serving those special refs (support for which I find ugly, btw), I
> could not test.

One way to test special refs is by using a github package with some dirty
hacks (I picked a special ref at random by inspecting the output of ls-remote):
$ make defconfig
$ ./utils/config --set-str BR2_BACKUP_SITE ""
$ sed -e '/^TMUX_SITE_METHOD/d' \
      -e 's,^TMUX_VERSION =.*,TMUX_VERSION = refs/pull/1014/head,g' \
      -e 's,^TMUX_SITE.*,TMUX_SITE = https://github.com/tmux/tmux.git\nTMUX_SITE_METHOD = git,g' \
      -i package/tmux/tmux.mk
$ rm -f package/tmux/tmux.hash
$ BR2_DL_DIR=$(mktemp -d) make tmux-dirclean tmux-source
...
Commit 'refs/pull/1014/head' is a branch name.
Using a branch name is not supported.


The other way is by using this series:
http://patchwork.ozlabs.org/project/buildroot/list/?series=44009
current master:
https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301317
current master + this patch:
https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301358
It doesn't get to show that special-ref is broken because the tests run
sequentially (to avoid overpopulate the gitlab CI), but it shows (in the
-build.log) that also the download of a sha1 tip of a branch (search for
"git-sha1-branch-head" in the log) would be broken with this patch.
This can be reproduced locally:
$ make defconfig
$ ./utils/config --set-str BR2_BACKUP_SITE ""
$ BR2_DL_DIR=$(mktemp -d) make tremor-dirclean tremor-source
...
Commit '7c30a66346199f3f09017a09567c6c8a3a0eedc8' is a branch name.
Using a branch name is not supported.


So NACK for this patch as-is.
Of course, I am not against the concept of detecting the name of a branch being
used as package version and aborting the download with a nice message.
Unfortunately I currently don't have suggestions on how to fix the code.

Regards,
Ricardo
Yann E. MORIN Aug. 12, 2018, 8:25 p.m. UTC | #4
Ricardo, All,

On 2018-08-06 21:39 -0300, Ricardo Martincoski spake thusly:
> On Mon, Aug 06, 2018 at 03:36 PM, Yann E. MORIN wrote:
> > On 2018-08-06 00:14 -0300, Ricardo Martincoski spake thusly:
> >> On Sat, Aug 04, 2018 at 01:33 PM, Yann E. MORIN wrote:
> >> > +# Check if the changeset is a branch name.
> >> > +if _git show-ref "${cset}" |grep -qv refs/tags; then
> >> 
> >> I didn't had time to test it yet.
> >> Special refs still works after this?
> > 
> > It is not supposed to be broken, as those special refs are (AFAIU)
> > exposed as tags, not heads. But as I don't have access to a server
> > serving those special refs (support for which I find ugly, btw), I
> > could not test.
> 
> One way to test special refs is by using a github package with some dirty
> hacks (I picked a special ref at random by inspecting the output of ls-remote):
> $ make defconfig
> $ ./utils/config --set-str BR2_BACKUP_SITE ""
> $ sed -e '/^TMUX_SITE_METHOD/d' \
>       -e 's,^TMUX_VERSION =.*,TMUX_VERSION = refs/pull/1014/head,g' \
>       -e 's,^TMUX_SITE.*,TMUX_SITE = https://github.com/tmux/tmux.git\nTMUX_SITE_METHOD = git,g' \
>       -i package/tmux/tmux.mk
> $ rm -f package/tmux/tmux.hash
> $ BR2_DL_DIR=$(mktemp -d) make tmux-dirclean tmux-source
> ...
> Commit 'refs/pull/1014/head' is a branch name.
> Using a branch name is not supported.

But then, the special refs also suffer from the same problems as
branches do: they can't work for the very same reasons as explained in
the previous patch.

So, I would be of the opinion that we should just drop support for those
special refs altogether, based on the same explanations.

Now, I do understand that one will want to be able to test github MRs,
or gerrit reviews or whatnots... But in that case, we already civer this
by way of local.mk and overide-srcdir.

So, I would argue (as I always had since we introduced this special refs
support) that this should better be handled by an upper layer.

Sure, you'd argue that an automated build job could do the build. But
you anyway have to write some scripting for that automated job anyway.
Just have it prepare a git clone of the affected package, checkout the
correct commit, and prepare a local.mk with the correct override-srcdir
befor attempting the buildroot build.

Regards,
Yann E. MORIN.

> The other way is by using this series:
> http://patchwork.ozlabs.org/project/buildroot/list/?series=44009
> current master:
> https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301317
> current master + this patch:
> https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301358
> It doesn't get to show that special-ref is broken because the tests run
> sequentially (to avoid overpopulate the gitlab CI), but it shows (in the
> -build.log) that also the download of a sha1 tip of a branch (search for
> "git-sha1-branch-head" in the log) would be broken with this patch.
> This can be reproduced locally:
> $ make defconfig
> $ ./utils/config --set-str BR2_BACKUP_SITE ""
> $ BR2_DL_DIR=$(mktemp -d) make tremor-dirclean tremor-source
> ...
> Commit '7c30a66346199f3f09017a09567c6c8a3a0eedc8' is a branch name.
> Using a branch name is not supported.
> 
> 
> So NACK for this patch as-is.
> Of course, I am not against the concept of detecting the name of a branch being
> used as package version and aborting the download with a nice message.
> Unfortunately I currently don't have suggestions on how to fix the code.
> 
> Regards,
> Ricardo
Thomas Petazzoni Aug. 12, 2018, 8:41 p.m. UTC | #5
Yann, Ricardo,

On Sun, 12 Aug 2018 22:25:43 +0200, Yann E. MORIN wrote:

> > Commit 'refs/pull/1014/head' is a branch name.
> > Using a branch name is not supported.  
> 
> But then, the special refs also suffer from the same problems as
> branches do: they can't work for the very same reasons as explained in
> the previous patch.
> 
> So, I would be of the opinion that we should just drop support for those
> special refs altogether, based on the same explanations.
> 
> Now, I do understand that one will want to be able to test github MRs,
> or gerrit reviews or whatnots... But in that case, we already civer this
> by way of local.mk and overide-srcdir.
> 
> So, I would argue (as I always had since we introduced this special refs
> support) that this should better be handled by an upper layer.
> 
> Sure, you'd argue that an automated build job could do the build. But
> you anyway have to write some scripting for that automated job anyway.
> Just have it prepare a git clone of the affected package, checkout the
> correct commit, and prepare a local.mk with the correct override-srcdir
> befor attempting the buildroot build.

I don't remember all the discussion about special refs, but if I
understand correctly, it's for example a Github pull request. And if I
understood how Github works, you can push a new version of a pull
request as the same pull request number. This means that what you fetch
from a pull request is not stable. So it suffers from the same
reproducibility issue as regular branches.

Therefore, if we decide to officially not support branches, then we
should also not support special refs. Unless of course I misunderstood
the use case of special refs.

> > The other way is by using this series:
> > http://patchwork.ozlabs.org/project/buildroot/list/?series=44009
> > current master:
> > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301317
> > current master + this patch:
> > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301358
> > It doesn't get to show that special-ref is broken because the tests run
> > sequentially (to avoid overpopulate the gitlab CI), but it shows (in the
> > -build.log) that also the download of a sha1 tip of a branch (search for
> > "git-sha1-branch-head" in the log) would be broken with this patch.
> > This can be reproduced locally:
> > $ make defconfig
> > $ ./utils/config --set-str BR2_BACKUP_SITE ""
> > $ BR2_DL_DIR=$(mktemp -d) make tremor-dirclean tremor-source
> > ...
> > Commit '7c30a66346199f3f09017a09567c6c8a3a0eedc8' is a branch name.
> > Using a branch name is not supported.

Yann: this one I don't understand.
7c30a66346199f3f09017a09567c6c8a3a0eedc8 is a regular commit SHA1, so
it should not be considered as a branch. Why do we have this failure
for a commit SHA1 ?

Best regards,

Thomas
Yann E. MORIN Aug. 12, 2018, 8:48 p.m. UTC | #6
Thomas, All,

On 2018-08-12 22:41 +0200, Thomas Petazzoni spake thusly:
> > > The other way is by using this series:
> > > http://patchwork.ozlabs.org/project/buildroot/list/?series=44009
> > > current master:
> > > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301317
> > > current master + this patch:
> > > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301358
> > > It doesn't get to show that special-ref is broken because the tests run
> > > sequentially (to avoid overpopulate the gitlab CI), but it shows (in the
> > > -build.log) that also the download of a sha1 tip of a branch (search for
> > > "git-sha1-branch-head" in the log) would be broken with this patch.
> > > This can be reproduced locally:
> > > $ make defconfig
> > > $ ./utils/config --set-str BR2_BACKUP_SITE ""
> > > $ BR2_DL_DIR=$(mktemp -d) make tremor-dirclean tremor-source
> > > ...
> > > Commit '7c30a66346199f3f09017a09567c6c8a3a0eedc8' is a branch name.
> > > Using a branch name is not supported.
> 
> Yann: this one I don't understand.
> 7c30a66346199f3f09017a09567c6c8a3a0eedc8 is a regular commit SHA1, so
> it should not be considered as a branch. Why do we have this failure
> for a commit SHA1 ?

That's because of our special refs support, for which we do [0]:

    git fetch origin "${cset}:${cset}"

This creates a local brnach named after the sha1, and git even whines:

    warning: refname '7c30a66346199f3f09017a09567c6c8a3a0eedc8' is ambiguous.
    Git normally never creates a ref that ends with 40 hex characters
    because it will be ignored when you just specify 40-hex. These refs
    may be created by mistake. For example,

      git checkout -b $br $(git rev-parse ...)

    where "$br" is somehow empty and a 40-hex ref is created. Please
    examine these refs and maybe delete them. Turn this message off by
    running "git config advice.objectNameWarning false"

[0] https://git.buildroot.org/buildroot/tree/support/download/git#n118

Regards,
Yann E. MORIN.
Ricardo Martincoski Aug. 13, 2018, 2:13 p.m. UTC | #7
Hello,

On Sun, Aug 12, 2018 at 05:48 PM, Yann E. MORIN wrote:

> On 2018-08-12 22:41 +0200, Thomas Petazzoni spake thusly:
>> > > The other way is by using this series:
>> > > http://patchwork.ozlabs.org/project/buildroot/list/?series=44009
>> > > current master:
>> > > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301317
>> > > current master + this patch:
>> > > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301358
>> > > It doesn't get to show that special-ref is broken because the tests run
>> > > sequentially (to avoid overpopulate the gitlab CI), but it shows (in the
>> > > -build.log) that also the download of a sha1 tip of a branch (search for
>> > > "git-sha1-branch-head" in the log) would be broken with this patch.
>> > > This can be reproduced locally:
>> > > $ make defconfig
>> > > $ ./utils/config --set-str BR2_BACKUP_SITE ""
>> > > $ BR2_DL_DIR=$(mktemp -d) make tremor-dirclean tremor-source
>> > > ...
>> > > Commit '7c30a66346199f3f09017a09567c6c8a3a0eedc8' is a branch name.
>> > > Using a branch name is not supported.
>> 
>> Yann: this one I don't understand.
>> 7c30a66346199f3f09017a09567c6c8a3a0eedc8 is a regular commit SHA1, so
>> it should not be considered as a branch. Why do we have this failure
>> for a commit SHA1 ?
> 
> That's because of our special refs support, for which we do [0]:
> 
>     git fetch origin "${cset}:${cset}"

But only removing this code will not make 'git show-ref' to work as expected.
There are git caches out there with the mentioned branch.
If for any reason the tarball is not there (like we do test in autobuilder) the
download will fail for the wrong reason, and it would require a manual fix.

And IMO we should not start removing all refs from the git caches otherwise we
can decrease the cache hit ratio (due to git gc), specially for linux trees.

So the best approach IMO is to ask "given we want a named ref, is that ref a
branch in the remote?". That obviously lead to using git ls-remote and check
the second column:
$ git ls-remote https://git.xiph.org/tremor.git
7c30a66346199f3f09017a09567c6c8a3a0eedc8        HEAD
293fd1c04f9d4489be6d4b2b1ca8698f2f902e8e        refs/heads/lowmem
7c30a66346199f3f09017a09567c6c8a3a0eedc8        refs/heads/master
f946f9acbf7aced75d3810a52c878e47680a18f6        refs/heads/nobyte
3175ff964c7d85d070df823189997f6b753cfef7        refs/heads/tremolo
0bcded806a0327c86d9246703724b45037d1bbaa        refs/heads/tremolo_mips

And the corner case is when there is a branch and a tag with the same name.
We must choose whether that case fails or the tag is used. I would use the tag.

Regards,
Ricardo
Ricardo Martincoski Aug. 13, 2018, 2:33 p.m. UTC | #8
Hello,

On Sun, Aug 12, 2018 at 05:41 PM, Thomas Petazzoni wrote:

> On Sun, 12 Aug 2018 22:25:43 +0200, Yann E. MORIN wrote:
> 
>> > Commit 'refs/pull/1014/head' is a branch name.
>> > Using a branch name is not supported.  
>> 
>> But then, the special refs also suffer from the same problems as
>> branches do: they can't work for the very same reasons as explained in
>> the previous patch.
>> 
>> So, I would be of the opinion that we should just drop support for those
>> special refs altogether, based on the same explanations.

Indeed using the name of a special ref as version is not guaranteed to be
reproducible for all special refs providers.

>> 
>> Now, I do understand that one will want to be able to test github MRs,
>> or gerrit reviews or whatnots... But in that case, we already civer this
>> by way of local.mk and overide-srcdir.
>> 
>> So, I would argue (as I always had since we introduced this special refs
>> support) that this should better be handled by an upper layer.
>> 
>> Sure, you'd argue that an automated build job could do the build. But
>> you anyway have to write some scripting for that automated job anyway.
>> Just have it prepare a git clone of the affected package, checkout the
>> correct commit, and prepare a local.mk with the correct override-srcdir
>> befor attempting the buildroot build.

It seems a lot of scripting for something that git supports natively and that
could be added to the backend to support the right way: using sha1 for
reproducibility.

> 
> I don't remember all the discussion about special refs, but if I

Most can be seen on [0].

> understand correctly, it's for example a Github pull request. And if I
> understood how Github works, you can push a new version of a pull
> request as the same pull request number. This means that what you fetch
> from a pull request is not stable. So it suffers from the same
> reproducibility issue as regular branches.

Gerrit Changes (when the server has the 'Draft' mode disabled) do not suffer
from the same. Each new revision ('Patch Set') has its own sha1. The refs for
many revisions of Change 1001 look like this:
refs/changes/01/1001/1
refs/changes/01/1001/2
refs/changes/01/1001/3
...

Aside: if someone wants to give a try, a test server can be run in minutes:
docker run -ti -p 8080:8080 -p 29418:29418 gerritcodereview/gerrit
No. It's not for Buildroot as it doesn't fit well for e-mail oriented reviews.
But the diff between any pair of revisions in the web interface is its awesome
feature IMO.
No. I have nothing to do with the Gerrit project, I am just a happy user.

> 
> Therefore, if we decide to officially not support branches, then we
> should also not support special refs. Unless of course I misunderstood
> the use case of special refs.

Fair enough. Let's officially don't support name of branches and name of
special refs as package version as they are not guaranteed to be reproducible,
taking in account all special ref providers.

What I think we should do is:
 - to re-add support to download of the sha1 of any ref (branch, tag, special
   ref).
 - add automated tests to avoid going back and forth, supporting some refs,
   then dropping support for them and then supporting them.
 - drop the support to the name of a special ref as it is not guaranteed to be
   reproducible.

At DATACOM we never used the name of special-ref as version.
The only occasion was when we updated the buildroot version to 2016.08 and the
support to sha1 of special-ref was removed. Then we used the name for a while as
immediate workaround, but it brings double effort to the developers.
Then I created a patch, submitted to the mailing list, with tests, and we are
maintaining inhouse ever since. We did not updated to the new download infra.

I can send a patch series for the first 2. An early version is at [1].
But due to the lack of time (testing linux trees takes time) I so far only
tested it with all packages that use the git download method in the tree. More
tests are needed for git versions (specially git 1.7.1 for RHEL6 and the brand
new one at the time) and also changing among linux trees. It also lacks a
nice commit message and probably some patch splitting. Notice in the series I
already assumed the support for name of special ref as package version will be
dropped.
This series certainly will not be applied to 2018.08 as we are so close to the
release.

But maybe even this patch "detect and abort" should not be applied to 2018.08
either as it is not so trivial, demonstrated by this review.
Just to be clear: please do not apply v1 of this patch. It will break things.

[0] http://patchwork.ozlabs.org/patch/654418/
[1] https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27791462

Regards,
Ricardo
Yann E. MORIN Aug. 13, 2018, 4:06 p.m. UTC | #9
Ricardo, All,

On 2018-08-13 11:13 -0300, ricardo.martincoski@gmail.com spake thusly:
> On Sun, Aug 12, 2018 at 05:48 PM, Yann E. MORIN wrote:
> > On 2018-08-12 22:41 +0200, Thomas Petazzoni spake thusly:
> >> > > The other way is by using this series:
> >> > > http://patchwork.ozlabs.org/project/buildroot/list/?series=44009
> >> > > current master:
> >> > > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301317
> >> > > current master + this patch:
> >> > > https://gitlab.com/RicardoMartincoski/buildroot/pipelines/27301358
> >> > > It doesn't get to show that special-ref is broken because the tests run
> >> > > sequentially (to avoid overpopulate the gitlab CI), but it shows (in the
> >> > > -build.log) that also the download of a sha1 tip of a branch (search for
> >> > > "git-sha1-branch-head" in the log) would be broken with this patch.
> >> > > This can be reproduced locally:
> >> > > $ make defconfig
> >> > > $ ./utils/config --set-str BR2_BACKUP_SITE ""
> >> > > $ BR2_DL_DIR=$(mktemp -d) make tremor-dirclean tremor-source
> >> > > ...
> >> > > Commit '7c30a66346199f3f09017a09567c6c8a3a0eedc8' is a branch name.
> >> > > Using a branch name is not supported.
> >> 
> >> Yann: this one I don't understand.
> >> 7c30a66346199f3f09017a09567c6c8a3a0eedc8 is a regular commit SHA1, so
> >> it should not be considered as a branch. Why do we have this failure
> >> for a commit SHA1 ?
> > 
> > That's because of our special refs support, for which we do [0]:
> > 
> >     git fetch origin "${cset}:${cset}"
> 
> But only removing this code will not make 'git show-ref' to work as expected.
> There are git caches out there with the mentioned branch.
> If for any reason the tarball is not there (like we do test in autobuilder) the
> download will fail for the wrong reason, and it would require a manual fix.

Rightfully right.

So, we don't care about the ref being a local branch. Instead, we must
check whether it is a branch on the remote. I.e.:

    git show-ref "origin/${cset}" |grep -qv refs/tags

Thoughts?

> And IMO we should not start removing all refs from the git caches otherwise we
> can decrease the cache hit ratio (due to git gc), specially for linux trees.

Well, we can still remove local branches.

> So the best approach IMO is to ask "given we want a named ref, is that ref a
> branch in the remote?". That obviously lead to using git ls-remote and check
> the second column:

I would prefer to avoid another round-trip to the remote, when we have
the necessary information locally. What about the git-show-ref snippet I
pasted above instead?

> $ git ls-remote https://git.xiph.org/tremor.git
> 7c30a66346199f3f09017a09567c6c8a3a0eedc8        HEAD
> 293fd1c04f9d4489be6d4b2b1ca8698f2f902e8e        refs/heads/lowmem
> 7c30a66346199f3f09017a09567c6c8a3a0eedc8        refs/heads/master
> f946f9acbf7aced75d3810a52c878e47680a18f6        refs/heads/nobyte
> 3175ff964c7d85d070df823189997f6b753cfef7        refs/heads/tremolo
> 0bcded806a0327c86d9246703724b45037d1bbaa        refs/heads/tremolo_mips
> 
> And the corner case is when there is a branch and a tag with the same name.
> We must choose whether that case fails or the tag is used. I would use the tag.

Arggg... Is it really possible to have a tag and a branch with the same
name? Indeed it is... Sigh... :-(

OK, so some more hacking is needed...

Regards,
Yann E. MORIN.
Yann E. MORIN Aug. 13, 2018, 4:19 p.m. UTC | #10
Ricardo, All,

On 2018-08-13 11:33 -0300, Ricardo Martincoski spake thusly:
> On Sun, Aug 12, 2018 at 05:41 PM, Thomas Petazzoni wrote:
> > On Sun, 12 Aug 2018 22:25:43 +0200, Yann E. MORIN wrote:
> >> > Commit 'refs/pull/1014/head' is a branch name.
> >> > Using a branch name is not supported.  
[--SNIP--]
> >> Sure, you'd argue that an automated build job could do the build. But
> >> you anyway have to write some scripting for that automated job anyway.
> >> Just have it prepare a git clone of the affected package, checkout the
> >> correct commit, and prepare a local.mk with the correct override-srcdir
> >> befor attempting the buildroot build.
> It seems a lot of scripting for something that git supports natively and that

Not true: they are not even fetched by default at all. If they were,
then we would not be having this discussion. As it stands, we currently
have to resort to unstable trickery to use those.

> could be added to the backend to support the right way: using sha1 for
> reproducibility.

I also thought about it and tried to replace the special ref by its
expanded sha1 and name the local archive by the sha1.

But that does not work, because locating the tarball is completely
decorelated from generating it.

I.e. the code that extracts the tarball has no way to know that the
version string has been changed!

[--SNIP--]
> Gerrit Changes (when the server has the 'Draft' mode disabled) do not suffer
> from the same. Each new revision ('Patch Set') has its own sha1. The refs for
> many revisions of Change 1001 look like this:
> refs/changes/01/1001/1
> refs/changes/01/1001/2
> refs/changes/01/1001/3

So, those would be stable, it looks like, indeed.

But then it means we're maybe currently doing it wrong anyway, as we
create a local branch for something that is more akin to a tag...

[--SNIP--]
> What I think we should do is:
>  - to re-add support to download of the sha1 of any ref (branch, tag, special
>    ref).

That is not supposed to be broken, and should work again if we check if
the remote has the ref as we both suggested in previous replies in this
thread (but you and I with different solutions).

>  - add automated tests to avoid going back and forth, supporting some refs,
>    then dropping support for them and then supporting them.

Yes, your series doing that is still pending... TBH, it is still marked
as unread here, because I still want to have a look at it...

>  - drop the support to the name of a special ref as it is not guaranteed to be
>    reproducible.

This should be done before we add tests, I think.

[--SNIP--]
> But maybe even this patch "detect and abort" should not be applied to 2018.08
> either as it is not so trivial, demonstrated by this review.
> Just to be clear: please do not apply v1 of this patch. It will break things.

Agreed. It is material for next.

Thanks! :-)

Regards,
Yann E. MORIN.
Ricardo Martincoski Aug. 16, 2018, 1:04 a.m. UTC | #11
Hello,

On Mon, Aug 13, 2018 at 01:06 PM, Yann E. MORIN wrote:

> On 2018-08-13 11:13 -0300, ricardo.martincoski@gmail.com spake thusly:
>> On Sun, Aug 12, 2018 at 05:48 PM, Yann E. MORIN wrote:
[snip]
>> >     git fetch origin "${cset}:${cset}"
>> 
>> But only removing this code will not make 'git show-ref' to work as expected.
>> There are git caches out there with the mentioned branch.
>> If for any reason the tarball is not there (like we do test in autobuilder) the
>> download will fail for the wrong reason, and it would require a manual fix.
> 
> Rightfully right.
> 
> So, we don't care about the ref being a local branch. Instead, we must
> check whether it is a branch on the remote. I.e.:
> 
>     git show-ref "origin/${cset}" |grep -qv refs/tags
> 
> Thoughts?

I didn't tested it, but it looks promising.

> 
>> And IMO we should not start removing all refs from the git caches otherwise we
>> can decrease the cache hit ratio (due to git gc), specially for linux trees.
> 
> Well, we can still remove local branches.

OK.

> 
>> So the best approach IMO is to ask "given we want a named ref, is that ref a
>> branch in the remote?". That obviously lead to using git ls-remote and check
>> the second column:
> 
> I would prefer to avoid another round-trip to the remote, when we have
> the necessary information locally. What about the git-show-ref snippet I
> pasted above instead?

In the other hand with ls-remote we could fail faster, before downloading any
refs.

Regards,
Ricardo
Ricardo Martincoski Aug. 16, 2018, 3:13 a.m. UTC | #12
Hello,

On Mon, Aug 13, 2018 at 01:19 PM, Yann E. MORIN wrote:
> On 2018-08-13 11:33 -0300, Ricardo Martincoski spake thusly:
>> On Sun, Aug 12, 2018 at 05:41 PM, Thomas Petazzoni wrote:
>> > On Sun, 12 Aug 2018 22:25:43 +0200, Yann E. MORIN wrote:
>> >> > Commit 'refs/pull/1014/head' is a branch name.
>> >> > Using a branch name is not supported.  
> [--SNIP--]
>> >> Sure, you'd argue that an automated build job could do the build. But
>> >> you anyway have to write some scripting for that automated job anyway.
>> >> Just have it prepare a git clone of the affected package, checkout the
>> >> correct commit, and prepare a local.mk with the correct override-srcdir
>> >> befor attempting the buildroot build.
>> It seems a lot of scripting for something that git supports natively and that
> 
> Not true: they are not even fetched by default at all. If they were,
> then we would not be having this discussion. As it stands, we currently
> have to resort to unstable trickery to use those.

OK. The porcelain commands do support them but they are not fetched using the
default options.

> 
>> could be added to the backend to support the right way: using sha1 for
>> reproducibility.
> 
> I also thought about it and tried to replace the special ref by its
> expanded sha1 and name the local archive by the sha1.

No. Sorry. I meant the other way around:
To always use (for special refs) a sha1 as version of the package and get the
download infra to fetch it using the name expanded from the sha1 (by git
ls-remote), as it is reproducible.
It also works for branches tips and tags and provides some performance
improvement, nothing so dramatic as --depth 1 but without all the corner cases
that come with shallow clones, because it is not shallow, we just download more
like... on-demand.

[snip]
>> What I think we should do is:
>>  - to re-add support to download of the sha1 of any ref (branch, tag, special
>>    ref).
> 
> That is not supposed to be broken, and should work again if we check if
> the remote has the ref as we both suggested in previous replies in this
> thread (but you and I with different solutions).

Sorry. I failed to express what I wanted to. Let me try again:
  - to re-add support to download of the sha1 of *any* ref (sha1 of branch tip
    and sha1 of tag currently do work, sha1 of special ref worked in the past
    but don't work anymore).

> 
>>  - add automated tests to avoid going back and forth, supporting some refs,
>>    then dropping support for them and then supporting them.
> 
> Yes, your series doing that is still pending... TBH, it is still marked
> as unread here, because I still want to have a look at it...

Great! Please do review when you find some time.
They use br2-external, hash-check and git repos.

> 
>>  - drop the support to the name of a special ref as it is not guaranteed to be
>>    reproducible.
> 
> This should be done before we add tests, I think.

*should*: no. We can always adapt the tests when the behavior needs to change.
If we had already the test case for the name of a special ref as package
version, we would keep it, and just change it (in the same commit removing the
code from the backend) to ensure the download using the name of a special ref
always fails instead of always succeeding.

But yes. It *can* be done before.


Regards,
Ricardo
Arnout Vandecappelle Aug. 21, 2018, 8:22 p.m. UTC | #13
On 16/08/2018 03:04, Ricardo Martincoski wrote:
> Hello,
> 
> On Mon, Aug 13, 2018 at 01:06 PM, Yann E. MORIN wrote:
> 
>> On 2018-08-13 11:13 -0300, ricardo.martincoski@gmail.com spake thusly:
>>> On Sun, Aug 12, 2018 at 05:48 PM, Yann E. MORIN wrote:
[snip]
>>> So the best approach IMO is to ask "given we want a named ref, is that ref a
>>> branch in the remote?". That obviously lead to using git ls-remote and check
>>> the second column:
>>
>> I would prefer to avoid another round-trip to the remote, when we have
>> the necessary information locally. What about the git-show-ref snippet I
>> pasted above instead?
> 
> In the other hand with ls-remote we could fail faster, before downloading any
> refs.

 A failure in this case is because the developer specified a branch name instead
of a tag or sha. He in the end still wants to download that specific ref. So
downloading it before the check does make sense, because that way the cache is
already populated and the next attempt will go faster.

 Regards,
 Arnout
Ricardo Martincoski Aug. 21, 2018, 11:45 p.m. UTC | #14
Hello,

On Tue, Aug 21, 2018 at 05:22 PM, Arnout Vandecappelle wrote:

> On 16/08/2018 03:04, Ricardo Martincoski wrote:
>> 
>> On Mon, Aug 13, 2018 at 01:06 PM, Yann E. MORIN wrote:
>> 
>>> On 2018-08-13 11:13 -0300, ricardo.martincoski@gmail.com spake thusly:
>>>> On Sun, Aug 12, 2018 at 05:48 PM, Yann E. MORIN wrote:
> [snip]
>>>> So the best approach IMO is to ask "given we want a named ref, is that ref a
>>>> branch in the remote?". That obviously lead to using git ls-remote and check
>>>> the second column:
>>>
>>> I would prefer to avoid another round-trip to the remote, when we have
>>> the necessary information locally. What about the git-show-ref snippet I
>>> pasted above instead?
>> 
>> In the other hand with ls-remote we could fail faster, before downloading any
>> refs.
> 
>  A failure in this case is because the developer specified a branch name instead
> of a tag or sha. He in the end still wants to download that specific ref. So
> downloading it before the check does make sense, because that way the cache is
> already populated and the next attempt will go faster.

... because in this case we 'exit 1' instead of ditching the cache.

OK. Now I got why v2 should use show-ref. Thanks.


Regards,
Ricardo
diff mbox series

Patch

diff --git a/support/download/git b/support/download/git
index 11bb52c1e1..28391b908b 100755
--- a/support/download/git
+++ b/support/download/git
@@ -134,6 +134,13 @@  if ! _git rev-parse --quiet --verify "'${cset}^{commit}'" >/dev/null 2>&1; then
     exit 1
 fi
 
+# Check if the changeset is a branch name.
+if _git show-ref "${cset}" |grep -qv refs/tags; then
+    printf "Commit '%s' is a branch name.\n" "${cset}"
+    printf "Using a branch name is not supported.\n"
+    exit 1
+fi
+
 # The new cset we want to checkout might have different submodules, or
 # have sub-dirs converted to/from a submodule. So we would need to
 # deregister _current_ submodules before we checkout.