Patchwork [git,patches] libata updates, GPG signed (but see admin notes)

Submitter Linus Torvalds
Date Oct. 31, 2011, 3:53 p.m.
Message ID <CA+55aFz3=cbciRfTYodNhdEetXYxTARGTfpP9GL9RZK222XmKQ@mail.gmail.com>
Permalink /patch/122876/
State Not Applicable
Delegated to: David Miller

Comments

Linus Torvalds - Oct. 31, 2011, 3:53 p.m.
On Mon, Oct 31, 2011 at 1:19 AM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
>> That said, even the "BEGIN PGP SIGNED MESSAGE" things are a massive
>> pain in the butt. We need to automate this some sane way, both for the
>> sender and for the recipient.
>
> But this doesn't help with what practise you want us to follow.  Do you
> want us to send full signed email using pgp encapsulation for pull
> requests in spite of the mangling it does to attached patches and the
> amount of extra pain it causes you?

No. I don't want the *whole* email signed, because that is quite
inconvenient: it means that I can't just cut-and-paste some signature,
I have to save the email and verify it etc etc.

So my preferred thing would literally be to make the signed part as
small as possible with no odd characters or whitespace (top commit and
probably repository name), so that I can cut-and-paste it and just
have a terminal window open with "gpg --verify + paste + ^D" and I'm
done.

For the people who use "git request-pull", I'm attaching a trivial
patch to make it add this kind of signature if you give it the "-s"
flag. It basically just adds a hunk like the appended crazy example to
the pull request, and it's small enough and simple enough that it
makes verification simple too with just the above kind of trivial
cut-and-paste thing.

(Junio cc'd, I think he had something more complicated in mind)

Now, admittedly it would be *even nicer* if this gpg-signed block was
instead uploaded as a signed tag automatically, and "git pull" would
notice such a signed tag (tagname the same as the branch name + date
or something) and would download and verify the tag as I pull. Then I
wouldn't even need to actually do the cut-and-paste at all. But this
is the *really* simple approach that gets us 95% of the way there.

And the attached patch is so trivial that if you aren't actually using
"git request-pull" but instead have some home-cooked script to do the
same, then you can just look at this patch and trivially change your
script to do something very similar.

                 Linus

[ Example gpg-signed small block that the attached patch adds to the
pull request: ]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Commit be3fa9125e708348c7baf04ebe9507a72a9d1800
from git.kernel.org/pub/git
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)

iQEcBAEBAgAGBQJOrsILAAoJEHm+PkMAQRiGxZcH/31e0RrBitXUPKxHJajD58yh
SIEe/7i6E2RUSFva3KybEuFslcR8p8DYzDQTPLejStvnkO8v0lXu9s9R53tvjLMF
aaQXLOgrOC2RqvzP4F27O972h32YpLBkwIdWQGAhYcUOdKYDZ9RfgEgtdJwSYuL+
oJ7TjLrtkcILaFmr9nYZC+0Fh7z+84R8kR53v0iBHJQOFfssuMjUWCoj9aEY12t+
pywXuVk2FsuYvhniCAcyU6Y1K9aXaf6w5iOY2hx/ysXtUBnv92F7lcathxQkvgjO
fA7/TXEcummOv5KQFc9vckd5Z1gN2ync5jhfnmlT2uiobE6mNdCbOVlCOpsKQkU=
=l5PG
-----END PGP SIGNATURE-----
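
The signed block above is small enough to handle mechanically. A minimal sketch, assuming the payload always has the two-line "Commit <sha>" / "from <repository>" shape shown in the example; the function name parse_pull_block is hypothetical, and the code only extracts the signed text. It does not check the signature, which still needs "gpg --verify".

```python
import re

BEGIN_MSG = "-----BEGIN PGP SIGNED MESSAGE-----"
BEGIN_SIG = "-----BEGIN PGP SIGNATURE-----"


def parse_pull_block(text):
    """Extract (commit, repository) from a clear-signed pull-request block.

    Hypothetical helper: it parses the signed payload only and does NOT
    verify the signature.
    """
    lines = text.splitlines()
    start = lines.index(BEGIN_MSG)
    end = lines.index(BEGIN_SIG)
    # Armor headers such as "Hash: SHA1" run up to the first blank line.
    body_start = lines.index("", start) + 1
    payload = lines[body_start:end]

    commit = repo = None
    for line in payload:
        match = re.fullmatch(r"Commit ([0-9a-f]{40})", line)
        if match:
            commit = match.group(1)
        elif line.startswith("from "):
            repo = line[len("from "):]
    if commit is None or repo is None:
        raise ValueError("payload does not name a commit and a repository")
    return commit, repo


# The example block from the message, with the signature body elided.
example = """\
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Commit be3fa9125e708348c7baf04ebe9507a72a9d1800
from git.kernel.org/pub/git
-----BEGIN PGP SIGNATURE-----
(signature lines elided)
-----END PGP SIGNATURE-----
"""

commit, repo = parse_pull_block(example)
print(commit, repo)
```

The extracted commit can then be compared against the tip that was actually pulled, once gpg has confirmed the signature itself.
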
Junio C Hamano - Oct. 31, 2011, 6:23 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> For the people who use "git request-pull", I'm attaching a trivial
> patch to make it add this kind of signature if you give it the "-s"
> flag. It basically just adds a hunk like the appended crazy example to
> the pull request, and it's small enough and simple enough that it
> makes verification simple too with just the above kind of trivial
> cut-and-paste thing.
>
> (Junio cc'd, I think he had something more complicated in mind)

You have misread me this time.

I think the minimalistic "paste this line to your 'git pull' command line
and expect to get history leading to this commit" like you did in your
patch would be the solution that is the least painful and still useful,
which is an important criterion for wide adoption.

> Now, admittedly it would be *even nicer* if this gpg-signed block was
> instead uploaded as a signed tag automatically, and "git pull" would
> notice such a signed tag (tagname the same as the branch name + date
> or something) and would download and verify the tag as I pull. Then I
> wouldn't even need to actually do the cut-and-paste at all. But this
> is the *really* simple approach that gets us 95% of the way there.

I however have a small trouble with "lieutenants use signed tags in order
to prove who they are to Linus", depending on the details.

It certainly lets you run "git tag --verify" after you pulled and will
give you assurance that you pulled the right thing from the right person,
but what do you plan to do to the tag from your lieutenants after you
fetched and verified?  I count 379 merges by you between 3.0 (2011-07-21)
and 3.1 (2011-10-24), which would mean you would see 4-5 tags per day on
average.  Will these tags be pushed out to your public history?

On one hand, we (not just you but the consumers of "Linus kernel") can
consider these tags to be of an ephemeral nature. Once they are used for _you_
to verify the authenticity, they are not needed anymore. The consumers of
"Linus kernel" by definition trust what you publish, so as long as they
have a way to verify the tip commit you push out, they _should_ be happy.
If you take this stance, you would not push these tags out so that you do
not have to contaminate the tags namespace with them, and you might even
choose to discard them once you pulled and verified the lieutenants' tips
to avoid contamination of your own refs namespace.

On the other hand, the consumers of "Linus kernel" may want to say that
they trust your tree and your tags because they can verify them with your
GPG signature, but also they can independently verify the lieutenants'
trees you pulled from are genuine. Keeping signed tags and publishing them
is one way to make it possible, but 400 extra tags in 3 months feels like
an approach with too much downside (i.e. noise) for that benefit.
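
The per-day figure can be checked from the numbers given above (379 merges, 3.0 on 2011-07-21, 3.1 on 2011-10-24). A small sketch, assuming the average is taken over calendar days rather than working days:

```python
from datetime import date

merges = 379                  # merge count quoted in the message
v3_0 = date(2011, 7, 21)      # 3.0 release date quoted in the message
v3_1 = date(2011, 10, 24)     # 3.1 release date quoted in the message

days = (v3_1 - v3_0).days     # length of the release cycle in calendar days
per_day = merges / days       # average merges (and so signed tags) per day

print(days, round(per_day, 2))   # prints: 95 3.99
```

So the cycle is 95 days and the average is about 4 per calendar day, the low end of the "4-5" estimate; "400 extra tags in 3 months" is the same count rounded up.
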

On Git mailing list, we have been toying with a couple of ideas. The
simplest one (cooking in next) is to allow committers to add gpg signature
in an additional header of the commit objects. "git show" and friends are
taught how to verify these signatures when asked.

This might have a potential downside on the lieutenants' workflow; after
integrating the work by their sub-lieutenants and by themselves, they
would test and review the result to convince themselves that it is worth
asking you to pull, and then they have to either

    (1) "commit --amend --gpg-signature" the tip; or

    (2) "commit --allow-empty --gpg-signature" to add an empty commit
        whose sole purpose is to hold the signature (and avoid amending
        the tip)

before pushing it out, asking you to pull.

An alternative we have discussed was to store gpg signature for the commit
("push certificate") somewhere in notes tree and push that out, certifying
that the commit indeed came from the pusher, but that would:

 (1) require upstreams to fetch (and possibly suffer from merge conflicts
     in notes tree) push certificate whenever they pull from their
     lieutenants; and

 (2) require downstreams to also fetch the notes tree for "push
     certificates" (especially when the central repository is shared among
     multiple people) before adding their own signature and then push it
     back (and possibly suffer from "non-fast-forward" in notes tree).

both of which are downsides coming from "notes" being not a very good
match for what these signatures are trying to achieve.

Namely, the current "notes" mechanism is designed to keep track of history
of changes made to notes attached to commits, but for the signature
application, we do not care about the order that signatures came to two
separate commits. "Non-fast-forward" conflicts while pushing, or having to
fetch and merge before adding one's own signature, are unwanted burden
imposed only by choosing to use "notes" for storing and conveying the
signature.

Also the "notes" approach would end up mixing "push certificates" for
different branches (this won't be an issue in your repository where there
is only one branch) into a single "notes" tree. We would want to use
something that behaves more like the "auto-following" semantics of tag
objects. You would want to fetch only signatures that are attached to the
commits you are fetching. Use of signed tags, or commit objects that can
be signed in-place, have this property, but storing signature in notes
tree does not give it to us.

I think further discussions on this should continue on the git mailing
list.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds - Oct. 31, 2011, 10:18 p.m.
On Mon, Oct 31, 2011 at 11:23 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> It certainly lets you run "git tag --verify" after you pulled and will
> give you assurance that you pulled the right thing from the right person,
> but what do you plan to do to the tag from your lieutenants after you
> fetched and verified?  I count 379 merges by you between 3.0 (2011-07-21)
> and 3.1 (2011-10-24), which would mean you would see 4-5 tags per day on
> average.  Will these tags be pushed out to your public history?

No, you misunderstand.

I can do that kind of "crazy manual check of a tag" today. And it's
too painful to be useful in the long run (or even the short run - I'd
much prefer the pgp signature in the email which is easier to check
and more visible anyway). Fetching a tag by name and saving it as a
tag is indeed pointless.

But what would be nice is that "git pull" would fetch the tag (based
on name) *automatically*, and not actually create a tag in my
repository at all. Instead, it would use the tag to check the
signature, and - if we do this right - also use the tag contents to
populate the merge commit message.

In other words, no actual tag would ever be left around as a turd, it
would simply be used as an automatic communication channel between the
"git push -s" of the submitter and my subsequent "git pull". Neither
side would have to do anything special, and the tag would never show
up in any relevant tree (it could even be in a totally separate
namespace like "refs/pullmarker/<branchname>" or something).

                                 Linus
H. Peter Anvin - Oct. 31, 2011, 10:20 p.m.
On 10/31/2011 03:18 PM, Linus Torvalds wrote:
> On Mon, Oct 31, 2011 at 11:23 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> It certainly lets you run "git tag --verify" after you pulled and will
>> give you assurance that you pulled the right thing from the right person,
>> but what do you plan to do to the tag from your lieutenants after you
>> fetched and verified?  I count 379 merges by you between 3.0 (2011-07-21)
>> and 3.1 (2011-10-24), which would mean you would see 4-5 tags per day on
>> average.  Will these tags be pushed out to your public history?
> 
> No, you misunderstand.
> 
> I can do that kind of "crazy manual check of a tag" today. And it's
> too painful to be useful in the long run (or even the short run - I'd
> much prefer the pgp signature in the email which is easier to check
> and more visible anyway). Fetching a tag by name and saving it as a
> tag is indeed pointless.
> 
> But what would be nice is that "git pull" would fetch the tag (based
> on name) *automatically*, and not actually create a tag in my
> repository at all. Instead, it would use the tag to check the
> signature, and - if we do this right - also use the tag contents to
> populate the merge commit message.
> 
> In other words, no actual tag would ever be left around as a turd, it
> would simply be used as an automatic communication channel between the
> "git push -s" of the submitter and my subsequent "git pull". Neither
> side would have to do anything special, and the tag would never show
> up in any relevant tree (it could even be in a totally separate
> namespace like "refs/pullmarker/<branchname>" or something).
> 

Perhaps we should introduce the notion of a "private tag" or something
along those lines?  (I guess that would still have to be possible to
push it, but not pull it by default...)

	-hpa

Linus Torvalds - Oct. 31, 2011, 10:30 p.m.
On Mon, Oct 31, 2011 at 3:20 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> Perhaps we should introduce the notion of a "private tag" or something
> along those lines?  (I guess that would still have to be possible to
> push it, but not pull it by default...)

All tags are private by default.

We actually *only* fetch tags if somebody explicitly asks for them
(--tags), or when fetching from a named remote (and even then it will
only fetch tags that point to objects you fetched by default iirc -
you have to mark the remote specially to get *all* tags).

But if you do the normal "git pull git://git.kernel.org/name/of/repo"
- which is how things happen as a result of a pull request - you won't
get tags at all - you have to ask for them by name or use "--tags" to
get them all.

                   Linus
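
The rules described in this message can be written down as a toy model. This is a sketch of the behaviour as stated above, not of git's actual implementation; the function tags_fetched and its parameters are hypothetical names.

```python
def tags_fetched(remote_tags, fetched_objects, *, named_remote,
                 all_tags=False, requested=()):
    """Toy model of which tags a fetch brings in, per the message above.

    remote_tags:     dict mapping tag name -> object the tag points to
    fetched_objects: set of objects the fetch brings in anyway
    named_remote:    True for a configured remote, False for a bare URL
    all_tags:        True models an explicit "--tags"
    requested:       tag names asked for explicitly
    """
    if all_tags:
        return set(remote_tags)          # "--tags": everything
    picked = {t for t in requested if t in remote_tags}
    if named_remote:
        # Only tags that point at objects which were fetched anyway.
        picked |= {t for t, obj in remote_tags.items()
                   if obj in fetched_objects}
    return picked


remote_tags = {"v3.1": "c1", "for-linus": "c2", "old-release": "c0"}
fetched = {"c1", "c2"}

url_pull = tags_fetched(remote_tags, fetched, named_remote=False)
named_pull = tags_fetched(remote_tags, fetched, named_remote=True)
by_name = tags_fetched(remote_tags, fetched, named_remote=False,
                       requested=["for-linus"])
everything = tags_fetched(remote_tags, fetched, named_remote=False,
                          all_tags=True)
```

In this model a pull from a plain URL, which is the pull-request case, yields no tags unless one is named or all are requested.
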
Jiri Kosina - Oct. 31, 2011, 10:33 p.m.
On Mon, 31 Oct 2011, H. Peter Anvin wrote:

> Perhaps we should introduce the notion of a "private tag" or something
> along those lines?  (I guess that would still have to be possible to
> push it, but not pull it by default...)

That's exactly what git does now, right? (unless you pull from a very 
specific remote).
H. Peter Anvin - Oct. 31, 2011, 10:33 p.m.
On 10/31/2011 03:30 PM, Linus Torvalds wrote:
> 
> But if you do the normal "git pull git://git.kernel.org/name/of/repo"
> - which is how things happen as a result of a pull request - you won't
> get tags at all - you have to ask for them by name or use "--tags" to
> get them all.
> 

Didn't realize that... I guess I'm too used to named remotes.

If so, just using a tag should be fine, no?

	-hpa

Linus Torvalds - Oct. 31, 2011, 10:38 p.m.
On Mon, Oct 31, 2011 at 3:33 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> Didn't realize that... I guess I'm too used to named remotes.
>
> If so, just using a tag should be fine, no?

Yes, that's what I think. But the argument for using a separate
namespace is that
 (a) you never get confused
 (b) it would make it easier to make the 1:1 relationship between
branch names and these "pull request signature tags" without limiting
the naming of *normal* tags in any way
 (c) they do have separate lifetimes from "real" tags.

But seriously, I don't care about the *implementation* all that much.
If people want to use the crazy git "notes" capability, you can do
that too, although quite frankly, I don't see the point. What actually
matters is that "git push" and "git pull" would JustWork(tm), and
check the signature if one exists, without having to cut-and-paste
data that simply shouldn't be visible to the user.

I abhor the interface Ingo suggested, for example. Why would we have
stupid command line options that we should cut-and-paste? Automation
is for computers, not for people.

                          Linus
Junio C Hamano - Oct. 31, 2011, 10:44 p.m.
"H. Peter Anvin" <hpa@zytor.com> writes:

> On 10/31/2011 03:30 PM, Linus Torvalds wrote:
>> 
>> But if you do the normal "git pull git://git.kernel.org/name/of/repo"
>> - which is how things happen as a result of a pull request - you won't
>> get tags at all - you have to ask for them by name or use "--tags" to
>> get them all.
>> 
>
> Didn't realize that... I guess I'm too used to named remotes.
>
> If so, just using a tag should be fine, no?

So nobody is worried about this (quoting from my earlier message)?

   On the other hand, the consumers of "Linus kernel" may want to say that
   they trust your tree and your tags because they can verify them with your
   GPG signature, but also they can independently verify the lieutenants'
   trees you pulled from are genuine.

A signed ephemeral tag is usable as means to verify authenticity in a
hop-by-hop fashion, but that does not leave a permanent trail that can be
used for auditing.

H. Peter Anvin - Oct. 31, 2011, 10:47 p.m.
On 10/31/2011 03:44 PM, Junio C Hamano wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
>> On 10/31/2011 03:30 PM, Linus Torvalds wrote:
>>>
>>> But if you do the normal "git pull git://git.kernel.org/name/of/repo"
>>> - which is how things happen as a result of a pull request - you won't
>>> get tags at all - you have to ask for them by name or use "--tags" to
>>> get them all.
>>>
>>
>> Didn't realize that... I guess I'm too used to named remotes.
>>
>> If so, just using a tag should be fine, no?
> 
> So nobody is worried about this (quoting from my earlier message)?
> 
>    On the other hand, the consumers of "Linus kernel" may want to say that
>    they trust your tree and your tags because they can verify them with your
>    GPG signature, but also they can independently verify the lieutenants'
>    trees you pulled from are genuine.
> 
> A signed ephemeral tag is usable as means to verify authenticity in a
> hop-by-hop fashion, but that does not leave a permanent trail that can be
> used for auditing.
> 

Well, the permanent trail is in the maintainer's tree, but that might
still be suboptimal.  The problem with Linus pulling those tags, I assume,
is that it makes the tree too noisy?

	-hpa

Theodore Ts'o - Oct. 31, 2011, 10:49 p.m.
On Mon, Oct 31, 2011 at 03:44:25PM -0700, Junio C Hamano wrote:
> So nobody is worried about this (quoting from my earlier message)?
> 
>    On the other hand, the consumers of "Linus kernel" may want to say that
>    they trust your tree and your tags because they can verify them with your
>    GPG signature, but also they can independently verify the lieutenants'
>    trees you pulled from are genuine.
> 
> A signed ephemeral tag is usable as means to verify authenticity in a
> hop-by-hop fashion, but that does not leave a permanent trail that can be
> used for auditing.

Oh, there are definitely people who worry about this.  They tend to be
security people, though, so the goal is how do we leave the permanent
trail in a way that doesn't generate too much noise or otherwise makes
life difficult for developers who don't care.

							- Ted
Junio C Hamano - Oct. 31, 2011, 10:51 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> But seriously, I don't care about the *implementation* all that much.
> If people want to use the crazy git "notes" capability, you can do
> that too, although quite frankly, I don't see the point.

As I already said, I do not think notes is a good match as a tool to do
this.

> matters is that "git push" and "git pull" would JustWork(tm), and
> check the signature if one exists, without having to cut-and-paste
> data that simply shouldn't be visible to the user.
>
> I abhor the interface Ingo suggested, for example....

Some cut-and-paste (or piping the e-mail to a command) would be necessary
evil, though, as you would have GPG keys from more than one trusted person
in your keyring, and when you are responding to a pull-request from person
A, finding a valid commit signed by person B should not be a success, but
at least should raise a warning.
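
The concern here is that "a valid signature from someone in the keyring" is weaker than "a valid signature from the person who asked for the pull". A minimal sketch of that distinction, assuming the signer identity has already been obtained from a successful gpg verification; check_signer and the identity strings are hypothetical.

```python
def check_signer(requester, signer, keyring):
    """Classify a verified signature against the sender of a pull request.

    requester: identity the pull request came from
    signer:    identity reported for a valid signature, or None if unsigned
    keyring:   set of identities whose keys are trusted
    """
    if signer is None:
        return "unsigned"
    if signer not in keyring:
        return "untrusted"
    if signer != requester:
        # Valid and trusted, but not the person who asked: warn, don't pass.
        return "warning: signed by %s, requested by %s" % (signer, requester)
    return "ok"


keyring = {"person-a", "person-b"}
same = check_signer("person-a", "person-a", keyring)
other = check_signer("person-a", "person-b", keyring)
```

Whether the expected requester is supplied by cut-and-paste or read off the printed signer by a human is exactly the interface question under discussion.
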
H. Peter Anvin - Oct. 31, 2011, 10:51 p.m.
On 10/31/2011 03:49 PM, Ted Ts'o wrote:
> On Mon, Oct 31, 2011 at 03:44:25PM -0700, Junio C Hamano wrote:
>> So nobody is worried about this (quoting from my earlier message)?
>>
>>    On the other hand, the consumers of "Linus kernel" may want to say that
>>    they trust your tree and your tags because they can verify them with your
>>    GPG signature, but also they can independently verify the lieutenants'
>>    trees you pulled from are genuine.
>>
>> A signed ephemeral tag is usable as means to verify authenticity in a
>> hop-by-hop fashion, but that does not leave a permanent trail that can be
>> used for auditing.
> 
> Oh, there are definitely people who worry about this.  They tend to be
> security people, though, so the goal is how do we leave the permanent
> trail in a way that doesn't generate too much noise or otherwise makes
> life difficult for developers who don't care.
> 

Could we introduce a tag namespace that doesn't show up in gitweb by
default, and perhaps doesn't resolve in abbreviated form?

This is basically what Linus suggested, as far as I understand:
something like refs/pulls/hpa/tip-123-456 which is otherwise a normal
tag object?

	-hpa


Linus Torvalds - Oct. 31, 2011, 10:52 p.m.
On Mon, Oct 31, 2011 at 3:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> So nobody is worried about this (quoting from my earlier message)?

No, because you haven't been reading what we write.

The tag is useless.

The information *in* the tag is not. But it shouldn't be saved in the
tag (or note, or whatever). Because that's just an annoying place for
it to be, with no upside.

Save it in the commit we generate. BAM! Useful, readable, permanent,
and independently verifiable.

And the advantage is that we can make that same mechanism add
"maintainer notes" to the merge message too. Right now some
maintainers write good notes about what the merge will bring in, but
they are basically lost, because git is so good at merging and doesn't
even stop to ask people to edit the merge message.

                    Linus
H. Peter Anvin - Oct. 31, 2011, 10:54 p.m.
On 10/31/2011 03:52 PM, Linus Torvalds wrote:
> 
> Save it in the commit we generate. BAM! Useful, readable, permanent,
> and independently verifiable.
> 

Note: this means creating a commit even for a fast-forward merge.  Not
that there is any technical problem with that, of course.

	-hpa

Linus Torvalds - Oct. 31, 2011, 10:56 p.m.
On Mon, Oct 31, 2011 at 3:51 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Some cut-and-paste (or piping the e-mail to a command) would be necessary
> evil, though, as you would have GPG keys from more than one trusted person
> in your keyring, and when you are responding to a pull-request from person
> A, finding a valid commit signed by person B should not be a success, but
> at least should raise a warning.

Why?

The signer of the message needs to be printed out *anyway*. I can
match that up with the pull request, the same way I already match up
diffstat information.

So any extra cut-and-paste is (a) stupid, (b) unnecessary and (c) annoying.

It's also "bad user interface". The whole point is that we should make
the user interface *good*. Which means that the pushing side should
only need to add a "-s" to ask for signing, have to type his
passphrase (and even that would go away when using gpg-agent or
something), and perhaps a message (which would not be about the
signing, but about something that could be added to the merge commit).

And the receiving side would just do the "git pull" and automatically
just get notified that "Yes, this push has been signed by key Xyz
Abcdef"

                     Linus
Linus Torvalds - Oct. 31, 2011, 11:03 p.m.
On Mon, Oct 31, 2011 at 3:54 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 10/31/2011 03:52 PM, Linus Torvalds wrote:
>>
>> Save it in the commit we generate. BAM! Useful, readable, permanent,
>> and independently verifiable.
>>
>
> Note: this means creating a commit even for a fast-forward merge.  Not
> that there is any technical problem with that, of course.

Well, only for the signed case, but yes. And for that case it's likely
a good thing.

In fact, even without signing, some projects always use --no-ff,
because they want the merge messages with the nice summary in them.
I've played around with it too, but haven't generally found it to be
worth it, and tend to think that it aggrandizes the merger too much.

It generates nice merge summaries, and it can look nice, but if the
*only* upside is the merge summary I think it's borderline worth it.
But with a signature, it would suddenly actually contain real
information, and I think that changes the equation.

                           Linus
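
The rule being proposed in this exchange, written out as a sketch: a signed pull always records a merge commit, so the signature has somewhere to live, even when a fast-forward would otherwise do. The function name and parameters are hypothetical; this models the proposal, not existing git behaviour.

```python
def creates_merge_commit(fast_forward_possible, signed, no_ff=False):
    """Return True if the pull should record a merge commit.

    fast_forward_possible: the pulled tip is a descendant of the current tip
    signed:                the pull carries a signature to be recorded
    no_ff:                 the project always wants a merge commit (--no-ff)
    """
    if not fast_forward_possible:
        return True              # a true merge always creates a commit
    return signed or no_ff       # otherwise only when something forces it


plain_ff = creates_merge_commit(True, signed=False)
signed_ff = creates_merge_commit(True, signed=True)
```
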
Jeff Garzik - Oct. 31, 2011, 11:55 p.m.
On 10/31/2011 06:44 PM, Junio C Hamano wrote:
> "H. Peter Anvin"<hpa@zytor.com>  writes:
>
>> On 10/31/2011 03:30 PM, Linus Torvalds wrote:
>>>
>>> But if you do the normal "git pull git://git.kernel.org/name/of/repo"
>>> - which is how things happen as a result of a pull request - you won't
>>> get tags at all - you have to ask for them by name or use "--tags" to
>>> get them all.
>>>
>>
>> Didn't realize that... I guess I'm too used to named remotes.
>>
>> If so, just using a tag should be fine, no?
>
> So nobody is worried about this (quoting from my earlier message)?
>
>     On the other hand, the consumers of "Linus kernel" may want to say that
>     they trust your tree and your tags because they can verify them with your
>     GPG signature, but also they can independently verify the lieutenants'
>     trees you pulled from are genuine.
>
> A signed ephemeral tag is usable as means to verify authenticity in a
> hop-by-hop fashion, but that does not leave a permanent trail that can be
> used for auditing.

The main worry is Linus ($human_who_pulls) gets 
cryptographically-verified data at the time he pulls.  Once Linus 
republishes his tree (git push), there will be few, if any, wanting to 
verify Jeff Garzik's signature.

So no, I don't see that as a _driving_ need in the kernel's case.

And IMO the kernel will be a mix of signed and unsigned content for a 
while, possibly forever.


And Linus wrote:
> [ Example gpg-signed small block that the attached patch adds to the
> pull request: ]
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Commit be3fa9125e708348c7baf04ebe9507a72a9d1800
> from git.kernel.org/pub/git
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.18 (GNU/Linux)
>
> iQEcBAEBAgAGBQJOrsILAAoJEHm+PkMAQRiGxZcH/31e0RrBitXUPKxHJajD58yh
> SIEe/7i6E2RUSFva3KybEuFslcR8p8DYzDQTPLejStvnkO8v0lXu9s9R53tvjLMF
> aaQXLOgrOC2RqvzP4F27O972h32YpLBkwIdWQGAhYcUOdKYDZ9RfgEgtdJwSYuL+
> oJ7TjLrtkcILaFmr9nYZC+0Fh7z+84R8kR53v0iBHJQOFfssuMjUWCoj9aEY12t+
> pywXuVk2FsuYvhniCAcyU6Y1K9aXaf6w5iOY2hx/ysXtUBnv92F7lcathxQkvgjO
> fA7/TXEcummOv5KQFc9vckd5Z1gN2ync5jhfnmlT2uiobE6mNdCbOVlCOpsKQkU=
> =l5PG
> -----END PGP SIGNATURE-----


This is my preference for kernel pull requests at the moment.  That has 
one advantage over Junio's "git pull --require-signature" and signed 
commits, notably, the URL is signed.

But in general signed commits would be nice, too.  pull-generated merge 
requests would need to be signed, potentially introducing an additional 
interactive step (GPG passphrase request) into an automated process.

	Jeff


H. Peter Anvin - Nov. 1, 2011, 12:42 a.m.
> 
> The main worry is Linus ($human_who_pulls) gets
> cryptographically-verified data at the time he pulls.  Once Linus
> republishes his tree (git push), there will be few, if any, wanting to
> verify Jeff Garzik's signature.
> 
> So no, I don't see that as a _driving_ need in the kernel's case.
> 
> And IMO the kernel will be a mix of signed and unsigned content for a
> while, possibly forever.
> 

I think the desire is to be able to deconstruct things if things were to
go wrong.

	-hpa
James Bottomley - Nov. 1, 2011, 5:39 a.m.
On Mon, 2011-10-31 at 15:52 -0700, Linus Torvalds wrote:
> On Mon, Oct 31, 2011 at 3:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >
> > So nobody is worried about this (quoting from my earlier message)?
> 
> No, because you haven't been reading what we write.
> 
> The tag is useless.

It's not useless to people who want to verify the tree after it's been
released by you (say for forensics or something).  As Peter said, we can
put it in a normally invisible namespace, but having a flag to make it
visible allows tools like git describe --contains to tell me which
signed tag was used to send a particular commit.

> The information *in* the tag is not. But it shouldn't be saved in the
> tag (or note, or whatever). Because that's just an annoying place for
> it to be, with no upside.
> 
> Save it in the commit we generate. BAM! Useful, readable, permanent,
> and independently verifiable.
> 
> And the advantage is that we can make that same mechanism add
> "maintainer notes" to the merge message too. Right now some
> maintainers write good notes about what the merge will bring in, but
> they are basically lost, because git is so good at merging and doesn't
> even stop to ask people to edit the merge message.

A signed empty commit containing the merge message as a comment also
looks fine to me.  We'd need extra tooling to say which signed merge
corresponds to this patch, but I'd say it's workable.  The only slightly
counter intuitive thing is that for a non-trivial merge, my signed merge
description will have to be the next commit below rather than in the
actual merge you do (because we can't alter a cryptographically signed
commit).

James


Junio C Hamano - Nov. 1, 2011, 7:47 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> But what would be nice is that "git pull" would fetch the tag (based on
> name) *automatically*, and not actually create a tag in my repository at
> all. Instead, it would use the tag to check the signature, and - if we
> do this right - also use the tag contents to populate the merge commit
> message.
>
> In other words, no actual tag would ever be left around as a turd, it
> would simply be used as an automatic communication channel between the
> "git push -s" of the submitter and my subsequent "git pull". Neither
> side would have to do anything special, and the tag would never show
> up in any relevant tree (it could even be in a totally separate
> namespace like "refs/pullmarker/<branchname>" or something).

While I like the "an ephemeral tag is used only for hop-to-hop
communication to carry information to be recorded in the resulting
history" approach, I see a few downsides.

 * The ephemeral tag needs to stay somewhere under refs/ hierarchy of the
   lieutenant's tree until you pick it up, even if they are out of the way
   in refs/pullmarker/$branchname. The next time the same lieutenant makes
   a pull request, either it will be overwritten or multiple versions of
   them refs/pullmarker/$branchname/$serial need to be kept.

   - If the former, this makes forking of the project harder. Suppose a
     pull request is made, you fetch and reject it. The lieutenant reworks
     and makes another pull request. At this point the earlier signature
     is gone. If somebody disagreed with your rejection and wanted to run
     his tree with the initial version you rejected, his tree will not
     carry the signature from the lieutenant.

   - If the latter, then there needs to be a way to expire these pull
     markers when they no longer are useful (i.e. the signature in it is
     transcribed to a merge commit you create) [*1*]. But the party who
     has power to clean them (i.e. the lieutenant who owns the repository)
     is different from the party whose action determines when they no
     longer are necessary (i.e. you). In practice this would lead to these
     pull markers not cleaned at all [*2*].

 * To verify the commit C that was taken from the tip of lieutenant's tree
   some time ago, one has to find the merge commit that has C as a parent,
   and look at the merge commit.  For example "git log --show-signature"
   would either show or not show the authenticity of C depending on where
   the traversal comes from. You certainly can implement it that way, but
   "some child describes an aspect of its parent, but not necessarily all
   children do so" feels philosophically less correct than "the commit has
   data to describe itself".
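
The separate ref namespace discussed above can be tried with stock git, since push accepts arbitrary destinations under refs/. This is only a sketch under assumptions: the paths and identity are made up, "git push -s" is not used (it is a proposal in this thread), and a plain ref stands in for what would really be a signed tag object.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/public.git"
git init -q "$work/dev"
cd "$work/dev"
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "work worth sending"
branch=$(git symbolic-ref --short HEAD)
tip=$(git rev-parse HEAD)

# Publish the branch plus a marker ref outside refs/heads and refs/tags.
git push -q "$work/public.git" \
    "HEAD:refs/heads/$branch" "HEAD:refs/pullmarker/$branch"

# The marker is advertised by the remote, but not as a tag.
marker=$(git ls-remote "$work/public.git" "refs/pullmarker/$branch" | cut -f1)
tags=$(git ls-remote --tags "$work/public.git")
echo "marker points at: $marker"
echo "tags advertised: ${tags:-none}"
```

The marker stays in the publishing repository until its owner removes it, which is the cleanup problem raised in the list above.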

In your "ephemeral tag", the workflow for a developer (D) and his
integrator (U) would look like this, I think.

 D$ until have something worth sending; do work; done
 D$ git push -s
 Enter passphrase: ...
        - "push" internally creates a pull marker that signs the commit
          object name this is pushing, among other things, and sends it
          along with the primary payload
 D$ git pull-request; mail linus

 U$ git pull
        - "pull" notices the pull marker and fetches it as well;
        - "pull" GPG validates the pull marker;
        - When preparing a merge commit message, the contents of the
          pull marker are included in .git/MERGE_MSG
The "in-commit signature" would give you 100% and your contributors 98% of
that, I think.

 D$ until have something worth sending; do work; done
        - The final round of reworking is concluded with "commit -S",
          which would GPG sign the tip commit itself
 D$ git push
        - Nothing needs to change in the protocol nor "push" itself
 D$ git pull-request; mail linus

 U$ git pull
        - "pull" GPG validates the tip commit
        - Nothing unusual needs to happen to the resulting "merge" commit

And as a bonus, the code is already there ;-).


[Footnote]

*1* The common ancestor discovery in fetch uses as many refs as it can to
reduce the amount of data that needs to be transferred, and it is known to
hurt performance of the initial advertisement exchange when there are too
many useless refs.

*2* Do casual git users even know how to remove refs in a
remote/publishing repository?
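
For reference, removing a ref from a publishing repository is a push with an empty source side. A minimal sketch, assuming a made-up marker name refs/pullmarker/topic/1 and throwaway repositories:

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/public.git"
git init -q "$work/dev"
cd "$work/dev"
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "already merged upstream"

# Publish a stale marker, then list it.
git push -q "$work/public.git" "HEAD:refs/pullmarker/topic/1"
before=$(git ls-remote "$work/public.git" "refs/pullmarker/topic/1" | cut -f2)

# ":<dst>" (nothing before the colon) deletes <dst> on the remote.
git push -q "$work/public.git" ":refs/pullmarker/topic/1"
after=$(git ls-remote "$work/public.git" "refs/pullmarker/topic/1" | cut -f2)

echo "before: ${before:-none}"
echo "after:  ${after:-none}"
```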
Linus Torvalds - Nov. 1, 2011, 9:21 p.m.
On Tue, Nov 1, 2011 at 12:47 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> While I like the "an ephemeral tag is used only for hop-to-hop
> communication to carry information to be recorded in the resulting
> history" approach, I see a few downsides.

So I do agree.

I'd actually be *happier* with a generic multi-line "branch
description" thing that involves no git objects at all, just a nice
description of what the branch is.

The fact that you could also hide a signed version of the
top-of-branch there would be kind of a side effect, and wouldn't be a
requirement.

I hate how anonymous our branches are. Sure, we can use good names for
them, but it was a mistake to think we should describe the repository
(for gitweb), rather than the branch.

Ok, "hate" is a strong word. I don't "hate" it. I don't even think
it's a major design issue. But I do think that it would have been
nicer if we had had some branch description model.

The only reason I suggest a tag is really because it would fit with
existing tooling - especially the git transport protocol. So it's not
that I actually think that a tag is the right way to describe (and
sign) the branch, it's just that it's the way that wouldn't require
any changes other than in "git push -s" and "git pull".

>  * To verify the commit C that was taken from the tip of lieutenant's tree
>   some time ago, one has to find the merge commit that has C as a parent,
>   and look at the merge commit.  For example "git log --show-signature"
>   would either show or not show the authenticity of C depending on where
>   the traversal comes from. You certainly can implement it that way, but
>   "some child describes an aspect of its parent, but not necessarily all
>   children do so" feels philosophically less correct than "the commit has
>   data to describe itself".

Yeah.

Having thought about it, I'm also not convinced I really want to
pollute the "git log" output with information that realistically
almost nobody cares about. The primary use is just for the person who
pulls things to verify it, after that the information is largely stale
and almost certain to never be interesting to anybody ever again. It's
*theoretically* useful if somebody wants to go back and re-verify, but
at the same time that really isn't expected to be the common case.

So I'm wondering if we want to save it at all. It's quite possible
that realistically speaking "google the mailing list archives" is the
*right* way to look up the signature if it is ever needed later.

Maybe just verifying the email message (with the suggested kind of
change to "git request-pull") is actually the right approach. And what
I should do is to just wrap my "git pull" in some script that I can
just cut-and-paste the gpg-signed thing into, and which just does the
"gpg --verify" on it, and then does the "git pull" after that.

Because in many ways, "git request-pull" is when you do want to sign
stuff. A developer might well want to push out his stuff for some
random internal testing (linux-next, for example), and then only later
decide "Ok, it was all good, now I want to make it 'official' and ask
Linus to pull it", and sign it at *that* time, rather than when
actually pushing it out.

And I suspect signing the pull request fits better into people's
existing workflow anyway - sending out the email to ask the maintainer
to pull really is the "special event", rather than pushing out the
code itself.
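
The wrapper idea described above (paste the signed block, run "gpg --verify", then pull) could be sketched roughly as follows. This is an assumption-laden sketch, not an existing tool: the function name verified_pull and the GPG_VERIFY / GIT_PULL override variables are hypothetical, the latter two existing only so the flow can be dry-run without a key.

```shell
# verified_pull REPO BRANCH
# Paste the signed block on stdin (end with ^D); pull only if gpg accepts it.
# GPG_VERIFY and GIT_PULL are hypothetical override hooks for dry runs.
verified_pull() {
    if ${GPG_VERIFY:-gpg --verify}; then
        ${GIT_PULL:-git pull} "$1" "$2"
    else
        echo "signature check failed; not pulling" >&2
        return 1
    fi
}
```

Note that this only checks that the pasted block carries a good signature. It does not compare the commit named in the signed text with what the pull actually fetches, so that comparison would still be done by eye or by a further step.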

                      Linus
Junio C Hamano - Nov. 1, 2011, 9:56 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> Having thought about it, I'm also not convinced I really want to
> pollute the "git log" output with information that realistically
> almost nobody cares about. The primary use is just for the person who
> pulls things to verify it, after that the information is largely stale
> and almost certain to never be interesting to anybody ever again. It's
> *theoretically* useful if somebody wants to go back and re-verify, but
> at the same time that really isn't expected to be the common case.
> ...
> So I'm wondering if we want to save it at all. It's quite possible
> that realistically speaking "google the mailing list archives" is the
> *right* way to look up the signature if it is ever needed later.

I'd rather want to hear opinions from people who base their work on public
kernels (e.g. distros, and companies who roll their own prod kernels), on
that.

But my gut feeling is that "usually hidden not to disturb normal users,
but is cast in stone in the history and cannot be lost" strikes the right
balance. Both your "next merge commit records the signature together with
the largely useless merge summary cruft but everybody learned to ignore it
with 'log --no-merges' anyway so it does not hurt to have it there" and
the commit signature topic from the next branch [*1*] that puts the
signature in the object header and teaches '--show-signature' option to
the log family to show it share this property.

> Maybe just verifying the email message (with the suggested kind of
> change to "git request-pull") is actually the right approach. And what
> I should do is to just wrap my "git pull" in some script that I can
> just cut-and-paste the gpg-signed thing into, and which just does the
> "gpg --verify" on it, and then does the "git pull" after that.
>
> Because in many ways, "git request-pull" is when you do want to sign
> stuff. A developer might well want to push out his stuff for some
> random internal testing (linux-next, for example), and then only later
> decide "Ok, it was all good, now I want to make it 'official' and ask
> Linus to pull it", and sign it at *that* time, rather than when
> actually pushing it out.
>
> And I suspect signing the pull request fits better into people's
> existing workflow anyway - sending out the email to ask the maintainer
> to pull really is the "special event", rather than pushing out the
> code itself.

"I can silently push and re-push or even rewind-and-then-push until I
officially send pull-request out" fits well with the "defer the decision
as much as possible" model Git takes in general, and I find certain
attractiveness in it.

But on the other hand, in many ways, publishing your commit to the outside
world, not necessarily for getting pulled into the final destination
(i.e. your tree) but merely for other people to try it out, is the point
of no return (aka "don't rewind or rebase once you publish").  "pushing
out" might be less special than "please pull", but it still is special.

Also there is nothing lost if you sign commits whenever you push them
out.


[Footnote]

*1* Here are three examples on the same commit that is signed for
illustration.

------------------------------------------------
$ git show -s pu
commit c9d870fceac787fdb1c1c43b136c1a94ab2ab005
Merge: 8367c51 71f45ee
Author: Junio C Hamano <gitster@pobox.com>
Date:   Mon Oct 31 20:06:58 2011 -0700

    Merge branch 'jc/stream-to-pack' into pu
    
    * jc/stream-to-pack:
      Bulk check-in
      finish_tmp_packfile(): a helper function
      create_tmp_packfile(): a helper function
      write_pack_header(): a helper function
------------------------------------------------
$ git show -s --show-signature pu
commit c9d870fceac787fdb1c1c43b136c1a94ab2ab005
gpg: Signature made Mon 31 Oct 2011 08:07:04 PM PDT using RSA key ID 96AFE6CB
gpg: Good signature from "Junio C Hamano <gitster@pobox.com>"
gpg:                 aka "Junio C Hamano <junio@pobox.com>"
gpg:                 aka "Junio C Hamano <jch@google.com>"
Merge: 8367c51 71f45ee
Author: Junio C Hamano <gitster@pobox.com>
Date:   Mon Oct 31 20:06:58 2011 -0700

    Merge branch 'jc/stream-to-pack' into pu
    
    * jc/stream-to-pack:
      Bulk check-in
      finish_tmp_packfile(): a helper function
      create_tmp_packfile(): a helper function
      write_pack_header(): a helper function
------------------------------------------------
$ git cat-file commit pu
tree 9add290d468800c3c51ff68fedfb3d16427872ff
parent 8367c51becc5a225b9a192348b7d7c615fb6d250
parent 71f45eeb8278670257bea83620f7d3eac174eee7
author Junio C Hamano <gitster@pobox.com> 1320116818 -0700
committer Junio C Hamano <gitster@pobox.com> 1320116824 -0700
gpgsig -----BEGIN PGP SIGNATURE-----
gpgsig Version: GnuPG v1.4.10 (GNU/Linux)
gpgsig 
gpgsig ...
gpgsig =c62U
gpgsig -----END PGP SIGNATURE-----

Merge branch 'jc/stream-to-pack' into pu

* jc/stream-to-pack:
  Bulk check-in
  finish_tmp_packfile(): a helper function
  create_tmp_packfile(): a helper function
  write_pack_header(): a helper function
------------------------------------------------
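
To check whether a given commit object carries such a header, the header block (everything up to the first blank line) can be filtered. A small sketch on an unsigned throwaway commit, with a made-up identity; note that released versions of git store the signature as a single "gpgsig" header with space-indented continuation lines, rather than repeating the key on every line as in the prototype output above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "unsigned example commit"

# Header lines run up to the first blank line; the message follows it.
headers=$(git cat-file commit HEAD | sed '/^$/q')

# Keep only signature header lines; empty for an unsigned commit.
sig=$(printf '%s\n' "$headers" | sed -n '/^gpgsig/p')

printf '%s\n' "$headers"
echo "signature header: ${sig:-absent}"
```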

Theodore Ts'o - Nov. 1, 2011, 10:39 p.m.
On Tue, Nov 01, 2011 at 02:21:59PM -0700, Linus Torvalds wrote:
> So I'm wondering if we want to save it at all. It's quite possible
> that realistically speaking "google the mailing list archives" is the
> *right* way to look up the signature if it is ever needed later.

Given the number of trees that you merge in every merge window (never
mind over an entire release), I don't think "google the mailing list
archives" is going to scale.  Finding some way to keep it along with
the merge window seems the right thing.  I agree that it should be hidden
normally, but that's a UI display issue.  Heck, we could just hide it
after the terminating NUL in the commit description, per a discussion
on the git list 2-3 weeks ago.  :-)

> Because in many ways, "git request-pull" is when you do want to sign
> stuff. A developer might well want to push out his stuff for some
> random internal testing (linux-next, for example), and then only later
> decide "Ok, it was all good, now I want to make it 'official' and ask
> Linus to pull it", and sign it at *that* time, rather than when
> actually pushing it out.

Sure, the signed content should be buried in the commit that it
describes.  Whether we carry it in an ephemeral tag or in the git
request-pull is not really important from a security perspective.  The
tag is nicer simply because the person doing the pull won't need to
cut and paste the signature information.

One approach which might work is if git request-pull sends the e-mail
message with the git shortlog and diffstat, *and* an MIME attachment
that contained all of the necessary information.  The maintainer would
then save the attachment, and feed it to git, which will display the
git shortlog and diffstat, ask for confirmation, and then embed the
digital signature into the merge commit.

The only problem with that is (a) you'd have to get over your hatred
of attachments (but if you're using Gmail hopefully that's relatively
convenient :-), and (b) LKML list filter would have to be taught to
tolerate git-generated attachments.

						- Ted
Ingo Molnar - Nov. 2, 2011, 9:11 a.m.
* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> And the receiving side would just do the "git pull" and 
> automatically just get notified that "Yes, this push has been 
> signed by key Xyz Abcdef"

If this approach is used then it would be nice to have a .gitconfig 
switch to require trusted pulls by default: to not allow doing 
non-signed or untrusted pulls accidentally, or for Git to warn in a 
visible, hard to miss way if there's a non-signed pull.

This adds social uncertainty (and an element of a silent alarm) to a 
realistic attack: the attacker wouldn't know exactly how the puller 
checks signed pull requests, it's kept private.
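
For what it is worth, later git releases grew a setting close to the switch asked for above: "git merge --verify-signatures" and the merge.verifySignatures configuration key, which checks the tip commit being merged (it is not the pull-marker scheme from this thread). A sketch on a throwaway repository, with made-up branch name and identity, showing an unsigned tip being refused:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "base"

git checkout -q -b topic
git commit -q --allow-empty -m "unsigned work from a lieutenant"
git checkout -q -

# Refuse merges whose tip commit lacks a valid signature.
git config merge.verifySignatures true

if git merge -q topic >/dev/null 2>&1; then
    result=merged
else
    result=refused
fi
echo "$result"
```

Set with "git config --global", the same key would apply to every repository of the person doing the pulls.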

Thanks,

	Ingo
Michael J Gruber - Nov. 2, 2011, 10:53 a.m.
Junio C Hamano venit, vidit, dixit 01.11.2011 20:47:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
>> But what would be nice is that "git pull" would fetch the tag (based on
>> name) *automatically*, and not actually create a tag in my repository at
>> all. Instead, it would use the tag to check the signature, and - if we
>> do this right - also use the tag contents to populate the merge commit
>> message.
>>
>> In other words, no actual tag would ever be left around as a turd, it
>> would simply be used as an automatic communication channel between the
>> "git push -s" of the submitter and my subsequent "git pull". Neither
>> side would have to do anything special, and the tag would never show
>> up in any relevant tree (it could even be in a totally separate
>> namespace like "refs/pullmarker/<branchname>" or something).
> 
> While I like the "an ephemeral tag is used only for hop-to-hop
> communication to carry information to be recorded in the resulting
> history" approach, I see a few downsides.
> 
>  * The ephemeral tag needs to stay somewhere under refs/ hierarchy of the
>    lieutenant's tree until you pick it up, even if they are out of the way
>    in refs/pullmarker/$branchname. The next time the same lieutenant makes
>    a pull request, either it will be overwritten or multiple versions of
>    them refs/pullmarker/$branchname/$serial need to be kept.

If we are interested in commit sigs, the easiest tag-based approach is
to name the sig carrying tag by the commit's sha1. Just like the sig is
tied (in)to a commit in Junio's approach, it would be indexed by it. We
can do that now:

git config --global alias.sign '!f() { c=$(git rev-parse "$1") || exit;
shift; git tag -s "$@" "sigs/$c" "$c"; }; f'

But a different place rather than refs/tags/sigs/<sha1> will be more
appropriate, so that we don't pollute the tag namespace. (Yes, this is
similar to storing them in notes.) tags have a message etc.

With an appropriate refspec, these sigs can be pushed out automatically
(by the lieutenant).

pull-request as in next will list the expected <sha1> at tip.

git pull needs to learn to (fetch and) use refs/<whatever>/<sha1> to
verify that the tip is signed.

git log --show-signature can do the same tricks as with in-commit sigs.
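
The naming scheme above makes the lookup a plain ref resolution. A key-free sketch, assuming a throwaway repository and using an annotated tag (-a) where the alias would use a signed one (-s):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "tip to be vouched for"
c=$(git rev-parse HEAD)

# Name the tag after the commit it vouches for.
git tag -a -m "signature carrier for $c" "sigs/$c" "$c"

# Given only the commit id, find its tag and peel it back to the commit.
target=$(git rev-parse --verify -q "refs/tags/sigs/$c^{commit}")
echo "sigs/$c -> $target"
```

With a real signed tag, "git tag -v sigs/<sha1>" would be the verification step.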

Some things to decide in this approach:
- Should git-pull (pull sigs and) verify by default?
- Should we worry about overwriting existing sigs? We have union-merge
for notes already, and that would be appropriate for sigs. (Yes, our
tags code does verify multiple concatenated sigs.)

The advantage of tags is that they can be added without rewriting the
commit, of course.

Michael
Jochen Striepe - Nov. 2, 2011, 11:20 a.m.
Hi,

On Wed, Nov 02, 2011 at 10:11:26AM +0100, Ingo Molnar wrote:
> If this approach is used then it would be nice to have a .gitconfig 
> switch to require trusted pulls by default: to not allow doing 
> non-signed or untrusted pulls accidentally, or for Git to warn in a 
> visible, hard to miss way if there's a non-signed pull.
> 
> This adds social uncertainty (and an element of a silent alarm) to a 
> realistic attack: the attacker wouldn't know exactly how the puller 
> checks signed pull requests, it's kept private.

But that way you get a false sense of alarm when someone sent a
perfectly trustable pull request, e.g. by signed email.


Another question: If we store the actual pgp/gpg signatures in the git tree,
how do you handle signatures by keys which were valid by the time the
signature was made but expired when checking some time afterwards? AFAICT,
gpg will only tell you the key is expired _now_, and will make no statement
regarding the time the actual signature was made.


Thanks,
Jochen.
Junio C Hamano - Nov. 2, 2011, 6:58 p.m.
Michael J Gruber <git@drmicha.warpmail.net> writes:

> The advantage of tags is that they can be added without rewriting the
> commit, of course.

And you did neither think about the downsides of tags, nor read what
others already explained for you?
Linus Torvalds - Nov. 2, 2011, 8:04 p.m.
On Tue, Nov 1, 2011 at 2:56 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> But on the other hand, in many ways, publishing your commit to the outside
> world, not necessarily for getting pulled into the final destination
> (i.e. your tree) but merely for other people to try it out, is the point
> of no return (aka "don't rewind or rebase once you publish").  "pushing
> out" might be less special than "please pull", but it still is special.

So I really think that signing the top commit itself is fundamentally wrong.

That commit may not even be *yours*. You may have pulled it from a
sub-lieutenant as a fast-forward, or similar. Amending it later would
be actively very very *wrong*.

So quite frankly, I think the stuff in pu (or next?) is completely
mis-designed. Doing it in the commit is wrong for fundamental reasons,
which all boil down to a simple issue:

 - you absolutely *need* to add the signature later. You *cannot* do
it at "git commit" time.

That's a fundamental issue both from a "workflow model" issue (ie you
want to sign stuff after it has passed testing etc, but you may need
to commit it in order to *get* testing), as well as from a
"fundamental git datastructures" issue (ie you would want to sign
commits that aren't yours).

"git commit --amend" is not the answer - that destroys the fundamental
concept of history being immutable, and while it works for your local
commits, it doesn't work for anybody else's commits, or for stuff you
already pushed out.

And "add a fake empty commit just for the signature" is not the answer
either - because that is clearly inferior to the tags we already had.

I dunno. Did I miss something? As far as I can tell, the signed tags
that we've had since day one are *clearly* much better in very
fundamental ways.

                             Linus
Michael J Gruber - Nov. 2, 2011, 9:05 p.m.
Junio C Hamano venit, vidit, dixit 02.11.2011 19:58:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> The advantage of tags is that they can be added without rewriting the
>> commit, of course.
> 
> And you did neither think about the downsides of tags, nor read what
> others already explained for you?

We're just weighing things differently here, and no accusations of
"misinformation" or "not thinking" will change this.
Junio C Hamano - Nov. 2, 2011, 9:13 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> And "add a fake empty commit just for the signature" is not the answer
> either - because that is clearly inferior to the tags we already had.
>
> I dunno. Did I miss something? As far as I can tell, the signed tags
> that we've had since day one are *clearly* much better in very
> fundamental ways.

Ok, back to the drawing board (which is not a loss as I wasn't expecting
this to be in the official release in upcoming 1.7.8 anyway).

Junio C Hamano - Nov. 2, 2011, 11:34 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> I hate how anonymous our branches are. Sure, we can use good names for
> them, but it was a mistake to think we should describe the repository
> (for gitweb), rather than the branch.
>
> Ok, "hate" is a strong word. I don't "hate" it. I don't even think
> it's a major design issue. But I do think that it would have been
> nicer if we had had some branch description model.
> ...
> Maybe just verifying the email message (with the suggested kind of
> change to "git request-pull") is actually the right approach. And what
> I should do is to just wrap my "git pull" in some script that I can
> just cut-and-paste the gpg-signed thing into, and which just does the
> "gpg --verify" on it, and then does the "git pull" after that.
>
> Because in many ways, "git request-pull" is when you do want to sign
> stuff. A developer might well want to push out his stuff for some
> random internal testing (linux-next, for example), and then only later
> decide "Ok, it was all good, now I want to make it 'official' and ask
> Linus to pull it", and sign it at *that* time, rather than when
> actually pushing it out.

You keep saying cut-and-paste, but do you mind feeding the e-mail text
itself to a tool, instead of cut-and-paste?

The reason I am wondering about this is because in another topic (also in
'next') cooking there is an extended support for topic description for the
branch that states what the purpose of the topic is and why the requestor
wants you to have it (this information can be set and updated with "git
branch --edit-description").
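
As a small illustration of the feature mentioned above: "git branch --edit-description" opens an editor and stores the text under the branch.<name>.description configuration key, so the same value can be written and read non-interactively. The description text and identity below are made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "start of topic"
branch=$(git symbolic-ref --short HEAD)

# Same storage location that "git branch --edit-description" writes to.
git config "branch.$branch.description" "Fixes for the example driver"

desc=$(git config --get "branch.$branch.description")
echo "$branch: $desc"
```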

A respond-to-request-pull wrapper you would use could be:

 - Get the e-mail from the standard input;
 - Pick up the signed bits and validate the signature;
 - Perform the requested fetch; and
 - Record the merge (or prepare .git/MERGE_MSG) with both the signed bits.

and the "signed bits" could include:

   - the repository and the branch you were expected to pull;
   - the topic description.

among other things the requestor can edit when request-pull message is
prepared.

That would get us back to your "the lieutenant tip is not so special, but
the merge commit the integrator makes using that tip has the signature for
this particular pull" model.
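
The "pick up the signed bits" step of the wrapper outlined above could start as a plain filter over the mail text. A sketch only: the function name extract_signed_block is hypothetical, the sample mail and its signature body are placeholders rather than real data, and mail that arrives with CRLF line endings or quoted-printable encoding would need decoding first.

```shell
# Print only the clearsigned part of a mail read from stdin.
extract_signed_block() {
    sed -n '/^-----BEGIN PGP SIGNED MESSAGE-----$/,/^-----END PGP SIGNATURE-----$/p'
}

# Made-up mail for illustration; the signature is a placeholder, not real.
mail='From: dev@example.com
Subject: [GIT PULL] example

Please pull.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

placeholder signed text
-----BEGIN PGP SIGNATURE-----

(placeholder)
-----END PGP SIGNATURE-----

diffstat and shortlog would follow here'

block=$(printf '%s\n' "$mail" | extract_signed_block)
printf '%s\n' "$block"
```

The extracted block is what would then be handed to "gpg --verify" before fetching.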
david@lang.hm - Nov. 2, 2011, 11:41 p.m.
On Wed, 2 Nov 2011, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> I hate how anonymous our branches are. Sure, we can use good names for
>> them, but it was a mistake to think we should describe the repository
>> (for gitweb), rather than the branch.
>>
>> Ok, "hate" is a strong word. I don't "hate" it. I don't even think
>> it's a major design issue. But I do think that it would have been
>> nicer if we had had some branch description model.
>> ...
>> Maybe just verifying the email message (with the suggested kind of
>> change to "git request-pull") is actually the right approach. And what
>> I should do is to just wrap my "git pull" in some script that I can
>> just cut-and-paste the gpg-signed thing into, and which just does the
>> "gpg --verify" on it, and then does the "git pull" after that.
>>
>> Because in many ways, "git request-pull" is when you do want to sign
>> stuff. A developer might well want to push out his stuff for some
>> random internal testing (linux-next, for example), and then only later
>> decide "Ok, it was all good, now I want to make it 'official' and ask
>> Linus to pull it", and sign it at *that* time, rather than when
>> actually pushing it out.
>
> You keep saying cut-and-paste, but do you mind feeding the e-mail text
> itself to a tool, instead of cut-and-paste?

think webmail (i.e. gmail), to feed the e-mail itself to a tool you either 
need to cut-n-paste the entire e-mail or you have to first save the mail 
to a text file. both of which are significantly harder than doing a 
cut-n-paste of a portion of the message.

David Lang

> The reason I am wondering about this is because in another topic (also in
> 'next') cooking there is an extended support for topic description for the
> branch that states what the purpose of the topic is and why the requestor
> wants you to have it (this information can be set and updated with "git
> branch --edit-description").
>
> A respond-to-request-pull wrapper you would use could be:
>
> - Get the e-mail from the standard input;
> - Pick up the signed bits and validate the signature;
> - Perform the requested fetch; and
> - Record the merge (or prepare .git/MERGE_MSG) with both the signed bits.
>
> and the "signed bits" could include:
>
>   - the repository and the branch you were expected to pull;
>   - the topic description.
>
> among other things the requestor can edit when request-pull message is
> prepared.
>
> That would get us back to your "the lieutenant tip is not so special, but
> the merge commit the integrator makes using that tip has the signature for
> this particular pull" model.
Linus Torvalds - Nov. 2, 2011, 11:42 p.m.
On Wed, Nov 2, 2011 at 4:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> You keep saying cut-and-paste, but do you mind feeding the e-mail text
> itself to a tool, instead of cut-and-paste?

Feeding the email to a tool is actually a fair amount of extra work.
It would have worked well in the days when I used text-based email
clients that just had a "pipe email to command" model, but that's long
gone.

In contrast, cut-and-paste to another program is easy - but then you
really can't depend on whitespace or headers or other subtle things.

> A respond-to-request-pull wrapper you would use could be:
>
>  - Get the e-mail from the standard input;
>  - Pick up the signed bits and validate the signature;
>  - Perform the requested fetch; and
>  - Record the merge (or prepare .git/MERGE_MSG) with both the signed bits.

So is there any reason this couldn't be cut-and-paste? Make the signed
part small (*not* including diffstat and shortlog), and make it
whitespace-safe, and I wouldn't mind a tool at all.

If it *can* take the whole email, that would probably be a good design
(so that a "pipe email to command"  model would still work), but it
would be much better if it doesn't require it.

> and the "signed bits" could include:
>
>   - the repository and the branch you were expected to pull;
>   - the topic description.
>
> among other things the requestor can edit when request-pull message is
> prepared.

One thing I'd like is that it would also fire up an editor for the
merge, even if it gets the topic description from the email or
cut-and-paste. I often want to fix up people's grammar etc. That's a
separate argument for trying to keep the signed part minimal - because
I really don't want to have to maintain spelin errors just because
they are part of what was signed..

                  Linus
Shawn Pearce - Nov. 3, 2011, 1:02 a.m.
On Wed, Nov 2, 2011 at 13:04, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Nov 1, 2011 at 2:56 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> But on the other hand, in many ways, publishing your commit to the outside
>> world, not necessarily for getting pulled into the final destination
>> (i.e. your tree) but merely for other people to try it out, is the point
>> of no return (aka "don't rewind or rebase once you publish").  "pushing
>> out" might be less special than "please pull", but it still is special.
>
> So I really think that signing the top commit itself is fundamentally wrong.

I really disagree. I like the signed commit approach. It allows for a
lot more workflows than just providing a way for you to validate a
pull from a trusted lieutenant. Debian/Gentoo folks want a way to sign
every commit in their workflow. Just because you don't want that and
think it's crazy doesn't mean it's not a valid workflow for that
community, or that Git shouldn't support it. I never use `git
stash`. I hate the damn command. Yet it's still there. I just choose
not to use it. Junio's gpgsig header on each commit is also optional,
and communities/contributors can choose to use (or ignore) the feature
as they need to.

> That commit may not even be *yours*. You may have pulled it from a
> sub-lieutenant as a fast-forward, or similar. Amending it later would
> be actively very very *wrong*.

Obviously you shouldn't amend a commit that would otherwise be a
fast-forward. But why not write a new empty signed commit on top, and
teach `git log` without the verify signatures flag to skip over
commits that have a gpgsig header line, have exactly one parent, and
whose parent tree matches the commit's own tree? This removes these
commits from the normal `git log` revision output, but yet the flow of
changes is still very visible within the history.
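
The skip rule described above can be sketched on a toy in-memory history. This is only an illustration: the `Commit` class, `is_signature_only` and `visible_log` are hypothetical names, and real `git log` has no such filter in this discussion.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Commit:
    tree: str                                   # id of the tree object
    parents: List[str] = field(default_factory=list)
    gpgsig: Optional[str] = None                # None when unsigned


def is_signature_only(commit_id, commits):
    """True for a signed commit with one parent and an unchanged tree."""
    c = commits[commit_id]
    if c.gpgsig is None or len(c.parents) != 1:
        return False
    return commits[c.parents[0]].tree == c.tree


def visible_log(tip, commits, show_signatures=False):
    """Walk first-parent history from `tip`, hiding signature-only
    commits unless `show_signatures` is set."""
    out = []
    cur = tip
    while cur is not None:
        if show_signatures or not is_signature_only(cur, commits):
            out.append(cur)
        parents = commits[cur].parents
        cur = parents[0] if parents else None
    return out
```

A signed commit that also changes the tree is not hidden, since it carries real content in addition to the signature.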

As I understand it, the point of multiple Signed-off-by lines in
commit message bodies is to show the flow of a change, who reviewed
and applied a given commit, until it finally lands in a tree where its
commit SHA-1 is frozen in stone and you can later pull it. The empty
signed commit on top of a fast-forward provides that same flow of a
change, readily visible with standard `git log` tools, but doesn't
have to clutter up history if we teach log how to skip this particular
type. Similar to the --no-merges way to skip merges. :-)

> So quite frankly, I think the stuff in pu (or next?) is completely
> mis-designed. Doing it in the commit is wrong for fundamental reasons,
> which all boil down to a simple issue:

Totally disagree. I'm really in favor of embedding these into the
commit headers the way Junio has done.

>  - you absolutely *need* to add the signature later. You *cannot* do
> it at "git commit" time.

Why can't you add it at commit time? What is stopping me from running
`git commit -S` every time I make a commit? Is it that my fingers will
wear out more quickly because I have to type my pass-phrase too often?

What is wrong with making a signed commit on a commit I have a high
level of confidence in, but not signing the others? In my own workflow
I make a lot of commit --amends  / rebases until I am pretty confident
in the code being written and organized the way I think it should be
for distribution to others. But at some point in that workflow I'm
doing an --amend or a rebase to make that last final touch, and during
that commit I can add -S to make it signed, because I'm pretty certain
it's ready to go. At that point, barring some horrific bug or reviewer
comments, I am unlikely to change the commit. I know at the time I
make that commit that I am pretty confident in the commit, so I take
the extra few key strokes to sign it.

> That's a fundamental issue both from a "workflow model" issue (ie you
> want to sign stuff after it has passed testing etc,

Why do I have to wait until it's tested to sign it? The gpgsig
signature isn't any more special than the Signed-off-by line I put
into my commit message to agree to the developer's certificate of
origin, nor is it any more special than the committer line in the
commit header. It's just a statement on the commit that I have a
reasonable enough confidence in the value of this particular commit
and its ancestors that I should take the time to unlock my GPG key and
sign the content in case I do distribute this to others.

If you are going to spend time testing a commit, it's probably going to
take longer to perform that testing than it is to perform the GPG key
unlock and signature. So why are you complaining about the time it
takes to sign something you think is worthy of testing?  If the tests
fail, you'll need to rewind/amend/whatever to address the breakage. If
the tests pass, the commit is already signed and ready for
distribution. If you are spending a lot of time signing commits that
are highly likely to fail tests, well, maybe you should look at other
ways to improve your workflow so that you have a higher level of
confidence in the code you record and assume will be a permanent part
of the project's history.

> but you may need
> to commit it in order to *get* testing),

Maybe consider allowing a ".dirty" suffix like git-core does on
builds? Or if you are submitting the code to a remote test cluster
that auto-compiles the code for you (and that is why you need a
commit), it sounds like the time it takes for that to push, compile,
test, and report back is way higher than the time it takes to make the
signature. So you probably should only be submitting something that
you had a reasonable level of confidence in. So you should go ahead
and sign it before sending it for testing, in case the tests do pass
and you want to publish that commit.

> as well as from a
> "fundamental git datastructures" issue (ie you would want to sign
> commits that aren't yours.

Sure. But this is why you can make an empty commit and sign that.

> "git commit --amend" is not the answer - that destroys the fundamental
> concept of history being immutable, and while it works for your local
> commits, it doesn't work for anybody elses commits, or for stuff you
> already pushed out.

Nobody said you had to amend everything. You can add an empty commit.

> And "add a fake empty commit just for the signature" is not the answer
> either - because that is clearly inferior to the tags we already had.

Really? I disagree. The commit DAG scales quite well. The tag
namespace does not. A refs/signatures/$COMMIT_SHA1 namespace also does
not scale well.

An empty commit with a gpgsig header has about the same object cost as
an annotated tag once packed. But it has the advantage that the damn
thing doesn't clog up the reference space, the reference handling
code, or the advertisements in the native protocol. As history goes
on, older signatures are less relevant, and automatically are
avoided/skipped/bypassed by the normal DAG walking code. Tags don't do
this well because they have no relationship to the project history.

The only downside to an empty commit with the gpgsig header is I
cannot grab an arbitrarily deep ancestor and say "Who has signed a
commit that depends on this"? Today we already have this with git
describe --contains (aka git name-rev) for annotated tags. It's a new
feature we have to teach to some part of the log machinery, but the
algorithm will be easier because it doesn't have to mess with the
mapping table of tag objects. It just has to start digging from roots,
remembering each commit that has a gpgsig on any given branch path,
and then outputting the matches when it finds the commit in question.
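
One way to sketch that "who has signed something that depends on this commit" query, on a toy DAG. This is an assumption-laden illustration rather than git's log machinery: `signed_descendants` is a hypothetical name, `parents` maps each commit id to its parent ids, and `signed` is the set of ids carrying a gpgsig header. Instead of remembering signatures per branch path, it inverts the parent links and walks upward from the commit in question.

```python
from collections import defaultdict, deque


def signed_descendants(target, tips, parents, signed):
    """Return the signed commits that have `target` as an ancestor
    (including `target` itself if it is signed)."""
    # Step 1: discover everything reachable from the tips and record
    # the reverse (parent -> children) edges along the way.
    children = defaultdict(set)
    seen = set()
    stack = list(tips)
    while stack:
        c = stack.pop()
        if c in seen:
            continue
        seen.add(c)
        for p in parents.get(c, ()):
            children[p].add(c)
            stack.append(p)
    if target not in seen:
        return []

    # Step 2: walk from the target toward the tips, collecting every
    # signed commit met on the way.
    found = []
    visited = {target}
    queue = deque([target])
    while queue:
        c = queue.popleft()
        if c in signed:
            found.append(c)
        for child in sorted(children[c]):
            if child not in visited:
                visited.add(child)
                queue.append(child)
    return found
```

No tag lookup table is involved; the only inputs are the commit graph and the knowledge of which commits are signed.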

The commit approach also has the advantage that your tree
automatically carries any lieutenant's signatures, by virtue of them
already being frozen in the commits.  This allows anyone downstream of
you to verify the same signatures, and check them against their own
keyring contents. If the signatures are all detached in some transient
annotated tag space, it's impossible for anyone other than you to
verify pull requests. I would hate to say we have this nice
distributed version control system, but only Linus can prove the pull
requests in his repository are what they claim, and we have to then
implicitly trust you to re-sign that data without the original
signatures being present. $DAY_JOB would feel a lot better about the
integrity of the Linux kernel repository if _ANYONE_ can validate pull
requests offline after they have happened.

> I dunno. Did I miss something? As far as I can tell, the signed tags
> that we've had since day one are *clearly* much better in very
> fundamental ways.

Completely disagree. :-)
Linus Torvalds - Nov. 3, 2011, 1:19 a.m.
On Wed, Nov 2, 2011 at 6:02 PM, Shawn Pearce <spearce@spearce.org> wrote:
>>
>> So I really think that signing the top commit itself is fundamentally wrong.
>
> I really disagree. I like the signed commit approach.

If you like it so much, go ahead and use them.

But stop with the crazy excuses for the downsides. I explained exactly
why amending is stupid and wrong, and why empty commits are f*cking
moronic. But even apart from the *technical* problems with the stupid
mis-designed feature, I explained why it was fundamentally broken from
a workflow standpoint too.

I'm not saying that you shouldn't use them - go ahead and use the
feature if you like it. But please spare me your excuses for stupid
workarounds that come from the fact that they aren't a good match for
sane workflows.

                       Linus
Linus Torvalds - Nov. 3, 2011, 1:45 a.m.
On Wed, Nov 2, 2011 at 6:19 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I'm not saying that you shouldn't use them - go ahead and use the
> feature if you like it. But please spare me your excuses for stupid
> workarounds that come from the fact that they aren't a good match for
> sane workflows.

Btw, having now done odd things with signed tags (because we've used
them as a side-band verification mechanism), I can certainly also say
that the signed tags have their set of problems too.

So signed tags aren't perfect. They were designed for making releases,
and that shows very clearly in how git works with them. The default
choices that git makes are very awkward indeed when you use signed
tags as "security tokens".

But unlike the "sign the commit" approach, those are implementation
and UI issues, not "fundamentally broken design" issues.

For example, fetching a single signed tag with git is surprisingly
hard. It *shouldn't* be hard - and there's no underlying technical or
design reason why it would be hard, but it is. Why? Because all the
git actions when it comes to tags are all geared towards one
particular use, that is *not* about the signature checking aspect of
them.

Here's an example: Rusty Russell now makes nice signed tags for the
things he asks me to pull, and then states them in the pull message.
So he will mention that he has a tag named

   rusty@rustcorp.com.au-v3.1-8068-g5087a50

in his git repository at

   git://github.com/rustyrussell/linux.git

and while I don't think his tag names are all that wonderful, it makes
sense from an automated script kind of standpoint.

Now, let's try to get that tag:

  [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git rusty@rustcorp.com.au-v3.1-8068-g5087a50
  fatal: Couldn't find remote ref rusty@rustcorp.com.au-v3.1-8068-g5087a50

oops. Ok, so his tag naming is *really* awkward. Whatever. Let's try again:

   [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50
   From git://github.com/rustyrussell/linux
    * tag               rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> FETCH_HEAD

Ahh, success!

Oops. Nope. It turns out that git will *peel* the tag when you fetch
it, so FETCH_HEAD actually doesn't contain the tag object at all, but
the commit object that the tag pointed to. MAJOR FAIL.
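
To make "peeling" concrete, here is a toy model of it, with a made-up object store (a dict keyed by hypothetical ids) rather than git's real one. The signature lives in the tag object, so once only the peeled id is recorded there is nothing left to verify.

```python
def peel(object_id, objects):
    """Follow tag objects until a non-tag object is reached."""
    while objects[object_id]["type"] == "tag":
        object_id = objects[object_id]["object"]
    return object_id


# Toy object store: a signed tag pointing at a commit.
objects = {
    "tag1": {"type": "tag", "object": "commit1", "signature": "PGP..."},
    "commit1": {"type": "commit"},
}

recorded = peel("tag1", objects)
# `recorded` names the commit, whose entry carries no signature field.
has_signature = "signature" in objects[recorded]
```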

Quite frankly, I think that's a git bug, but it's a git bug because
"git fetch" was designed to get the commit to merge. Fair enough.
Let's work around it, and rename the tag at the same time:

   [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50:refs/tags/rusty
   From git://github.com/rustyrussell/linux
    * [new tag]         rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> rusty
    * [new tag]         rusty@rustcorp.com.au-v3.1-2-gb1e4d20 -> rusty@rustcorp.com.au-v3.1-2-gb1e4d20
    * [new tag]         rusty@rustcorp.com.au-v3.1-4896-g0acf000 -> rusty@rustcorp.com.au-v3.1-4896-g0acf000
    * [new tag]         rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> rusty@rustcorp.com.au-v3.1-8068-g5087a50

WTF? Now we finally *did* get the tag, and we can do

   git verify-tag rusty

and that will work. But what the hell happened? We got three other
tags too that we didn't even ask for!

So we have actual git bugs here, that relate to the fact that we've
treated signed tags specially, and have magic code to basically say
"if there's a signed tag that is reachable from the thing you pull,
and you're not just doing a temporary pull into FETCH_HEAD, we'll
fetch that signed tag too".

Again - not a fundamental design mistake in the data structures, and
it actually made sense from a "signed tags are important release
points" standpoint, but it makes it *really* inconvenient to use
signed tags for signature verification.

Also, the fact that the signed tag gets peeled when we fetch into
FETCH_HEAD means that we can't actually save the signature in
the resulting merge commit. The merge, instead of being able to
perhaps save the information that we merged a nice trusted signed
point, only has the commit.

But practically, all of these issues should be pretty easily solvable.
So it should be quite easy to make

    git pull <repo> <tag-name>

just do the right thing - including verifying the tag, and adding the
information in the tag into the merge commit message.

So signed tags are not mis-designed from a conceptual standpoint -
they just work really really awkwardly right now for what the kernel
would like to do with them.

With a few UI fixes, I think the signed tag thing would "just work".

That said, I do think that the "signature in the pull request" should
also "just work", and I'm not entirely sure which one is better. It
might be more convenient to get the signature data from the pull
request. So I'm not at all married to the notion of using signed tags
for this.

                       Linus
Shawn Pearce - Nov. 3, 2011, 2:14 a.m.
On Wed, Nov 2, 2011 at 18:45, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Nov 2, 2011 at 6:19 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> I'm not saying that you shouldn't use them - go ahead and use the
>> feature if you like it. But please spare me your excuses for stupid
>> workarounds that come from the fact that they aren't a good match for
>> sane workflows.

We often disagree. :-)

> Btw, having now done odd things with signed tags (because we've used
> them as a side-band verification mechanism), I can certainly also say
> that the signed tags have their set of problems too.
...
> But practically, all of these issues should be pretty easily solvable.
> So it should be quite easy to make
>
>    git pull <repo> <tag-name>
>
> just do the right thing - including verifying the tag, and adding the
> information in the tag into the merge commit message.

Uhm, sure.

Quoting you 2 days ago:

On Mon, Oct 31, 2011 at 15:52, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Oct 31, 2011 at 3:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> So nobody is worried about this (quoting from my earlier message)?
>
> No, because you haven't been reading what we write.
>
> The tag is useless.
>
> The information *in* the tag is not. But it shouldn't be saved in the
> tag (or note, or whatever). Because that's just an annoying place for
> it to be, with no upside.
>
> Save it in the commit we generate. BAM! Useful, readable, permanent,
> and independently verifiable.

So you propose we put the tag contents into the merge commit message
so it can be verified after the fact? So merges are now going to be
something much more horrific to read, because it will end with Git
object tag cruft, the tag message, and the PGP signature spew that no
human can decode in the head?

Oh, right, tags are almost good enough. Elsewhere in this thread you
also stated we have to redo the way tags are signed so that the tag
message body itself is not part of the signature, allowing you to fix
spelin errors so you are not stuck with them in your commit history.
But I assume we will have to keep the more typical headers of object /
type / tag / tagger fields, as that is the key information the
signature needs to be over to be of any value. So now there will be
two different ways in which a Git annotated tag object will have its
signature created, as certainly you don't mean to remove the tag
message body from the PGP signature content for release tags.

I fail to see how shoving Git object data fields and a complete PGP
signature block into a merge commit message body, which will show by
default in all git log type tools, and exist in cherry-picks or
rebases that might make that data less valuable, is somehow better
than the gpgsig header that neatly tucks it away until requested. I
also fail to see how scraping the message body for the proper fields
in order to implement automated verification of the signature (because
no human can do it themselves and copy-paste sucks) is a good idea.
Everywhere else in Git that we have machine-readable formats, they are
very well structured so that no guessing is required.

> So signed tags are not mis-designed from a conceptual standpoint -
> they just work really really awkwardly right now for what the kernel
> would like to do with them.
>
> With a few UI fixes, I think the signed tag thing would "just work".

Well, UI fixes, protocol changes, improvements to manage a large
reference space which we have previously said is an insane and stupid
workflow, etc. One reason you picked up all of those extra tags was
the include-tag capability kicking on and picking up older tag
history. We now have to disable it in certain cases.

It's not just a few UI fixes. And there is a lot more work to write a
verifier for the tag contents+signature that appears in the body of a
merge commit message. Not to mention we now have to do that verify
logic twice: once for the signed pull request
"tag-like but not quite a tag but uses a tag" thing you are
advocating, and again for the merge commit message body that contains
the tag object data that we don't normally show to an end user, but
will now be in every merge commit you make.

Go ahead and call me stupid, but this already is a bigger amount of
surgery to the git-core code, not to mention worse user experience for
the average `git log` reading human, than having a hidden by default
gpgsig header that might ask a contributor to take 2 extra seconds
before making a commit to consider the useful lifespan of that commit.
Or $DEITY forbid, write a new empty commit to record the equivalent of
their Signed-off-by.

Oh, and while I am on that subject...


<rant>
I have never grasped why sometimes a Signed-off-by is added to a
patch, and why sometimes it's not. It seems to be this weird function
of "If the commit SHA-1 is already stable DON'T FUCKING TOUCH IT BY
ADDING SIGNED-OFF-BY IT RUINS THE HISTORY", but if you are too far
down the food chain to be fortunate enough for your commit SHA-1 to
remain frozen, the Signed-off-by has to be added to assert that the
code can be contributed. It sounds like the workflow developed around
where it wasn't acceptable to force history rewriting, you suffer by
not having the SOB, but whenever possible you force a history rewrite
on the contributor just so you can add a SOB and feel good about the
fact that the SOB is added to the commit message.

Get over it. Add the fucking empty commit to show the flow of a
change. Stop forcing every fucking contributor to rebase/rewrite his
commits just so someone higher up in the food chain can wank with
their SOB line.

Everyone I talk to that contributes code to the kernel who isn't Linus
or Ted Tso complains about this, and then asks me to fucking fix it.
They want stable SHA-1s so they know their change arrived into Linus'
tree unmolested. Unfortunately, despite their volume of changes, they
aren't high enough in the food chain to be this lucky. Nope, someone
has to wank their SOB in first. And maybe fix a spelin error.
</rant>
Linus Torvalds - Nov. 3, 2011, 2:25 a.m.
On Wed, Nov 2, 2011 at 7:14 PM, Shawn Pearce <spearce@spearce.org> wrote:
>
> So you propose we put the tag contents into the merge commit message
> so it can be verified after the fact? So merges are now going to be
> something much more horrific to read, because it will end with Git
> object tag cruft, the tag message, and the PGP signature spew that no
> human can decode in the head?

Actually, I wanted to just drop the damn thing.

To me, the point of the tag is so that the person doing the merge can
verify that he merges something trusted.

However, everybody else seems to disagree, and wants that stupid
signature to live along in the repository. And I can live with that,
although I do agree with you that it's not exactly pretty. I can live
with "ugly signature that I don't care for" way more than "stupid
design".

Because unlike your crazy empty commit, it at least fits the workflow,
and it certainly isn't any uglier than that extraneous pointless commit.

You can disagree. You obviously do. I simply don't care. Because I'm right.

(And your claim that it's big UI fixes and protocol changes is pure
and utter garbage. I just sent a patch that cleans the code up,
removes a line that improperly drops information and gets rid of the
biggest problem with our current handling of tags. No protocol changes
involved, no big UI fixup).

                        Linus
Linus Torvalds - Nov. 3, 2011, 2:31 a.m.
On Wed, Nov 2, 2011 at 7:14 PM, Shawn Pearce <spearce@spearce.org> wrote:
>
> <rant>

I'm answering this separately, because it's a separate rant.

It's also totally bogus, but whatever.

> Get over it. Add the fucking empty commit to show the flow of a
> change. Stop forcing every fucking contributor to rebase/rewrite his
> commits just so someone higher up in the food chain can wank with
> their SOB line.

Shawn, stop using whatever drugs you are using.

NOBODY EVER REBASES ANYTHING FOR SIGNED-OFF-BY.

If they do, they are doing things very very wrong.

Signed-off-by: is *purely* for sending patches by email. No git
operations involved. None. Nada. Zilch. No rebasing involved, because
there's not even a git repository involved, for chissake!

Once something is in git, it's not signed off on - there should be a
sign-off-chain from the author to the committer, and that's it.
Anything else would be crazy.

So stop the crazy rants. Stop with the bad drugs. Seriously. You're
acting crazy.

                          Linus
Jeff King - Nov. 3, 2011, 2:55 a.m.
On Wed, Nov 02, 2011 at 06:02:37PM -0700, Shawn O. Pearce wrote:

> > So I really think that signing the top commit itself is fundamentally wrong.
> 
> I really disagree. I like the signed commit approach. It allows for a
> lot more workflows than just providing a way for you to validate a
> pull from a trusted lieutenant. Debian/Gentoo folks want a way to sign
> every commit in their workflow. Just because you don't want that and
> think its crazy doesn't mean its not a valid workflow for that
> community and is something Git shouldn't support. I never use `git
> stash`. I hate the damn command. Yet its still there. I just choose
> not to use it. Junio's gpgsig header on each commit is also optional,
> and communities/contributors can choose to use (or ignore) the feature
> as they need to.

Stop for a minute and think about what it _means_ to sign a commit. Is
it saying "I wrote this commit?" Or "I think this commit is good?" Or "I
think all of the history leading to this is good?" It's obviously going
to be a per-project thing, but it's very constricting.  Leaving aside
all of the workflow issues Linus brought up (but which I do agree with),
think about what it would mean for Linus to fetch a commit from a
lieutenant and then sign it. Whatever it means, it can really only be
_one_ thing.

But big projects that are interested in signatures probably want to say
more. They want to say "this developer really wrote this commit". They
want to say "QA passed this commit". They want to say "the history up to
here looks good". And so on.

But they can't say those things without binding some data to the commit
(i.e., making a certificate saying "this commit passed QA").  Data which
might only make sense to assert much later than the commit is written.

So you're going to need to support detached commit signatures in some
form anyway to make everybody happy. Which isn't to say in-commit
signatures are wrong, but they are just one tool in a toolbox.

Personally, I think the only thing that makes sense to assert inside a
commit itself is that you are the author, and the author line of the commit
should match the email UID of the signing key. And then anything you
want to say about _other_ people's commits (or even your own commits,
but later) should come in the form of detached signatures with some
content.

That's how signed tags work. It's not just Linus signing a commit. It's
Linus signing a binding between a commit and the statement "this is
v2.6.28". The only thing wrong with the signed tag model for more
general use is that you need some way of naming and organizing large
numbers of tags (e.g., several per commit if you have things like QA
signatures).

-Peff
Robin H. Johnson - Nov. 3, 2011, 3:16 a.m.
On Wed, Nov 02, 2011 at 10:55:32PM -0400,  Jeff King wrote:
> But big projects that are interested in signatures probably want to say
> more. They want to say "this developer really wrote this commit". They
> want to say "QA passed this commit". They want to say "the history up to
> here looks good". And so on.
On the Gentoo side, we've also pondered the question of:
author != committer != pusher
And how to preserve many signatures from sources.

We're on a central repo model, with some ~250 committers.

I was originally primarily after the push certificates/signed-push, and
recording that data in the notes, but that still has the problems of
third-party verification as mentioned in the thread.

If we require that the tip of every push is a signed commit via a hook,
we get knowledge of the pushers. Either your real commit itself is
signed, or you have a signed merge commit on top, or you have a signed
empty commit. In all of the cases, I can verify your signature at the
recv hook. Having signed push in this case has a benefit that you could
ship the data as a bundle, or async from the signing.

Multiple signatures per commit are also valuable for QA, to
assert SOB WITHOUT altering the commit. I see spearce's rant and the
retort, and really think there needs to be a middle ground - some of
the commits that are coming from pulls, and not getting additional SOB,
could really benefit from them being recorded (I see them on mailing
lists, but not introduced since that would break 'stable' IDs).

> But they can't say those things without binding some data to the commit
> (i.e., making a certificate saying "this commit passed QA").  Data which
> might only make sense to assert much later than the commit is written.
> 
> So you're going to need to support detached commit signatures in some
> form anyway to make everybody happy. Which isn't to say in-commit
> signatures are wrong, but they are just one tool in a toolbox.
I was proposing that Git supports _all_ of these models:
- signed commits
- signed pushes (via certs)
- whatever signed lightweight tag idea happens
- existing annotated tags

Choices. Each with their own costs and advantages.
Jochen Striepe - Nov. 3, 2011, 3:22 a.m.
Hi,

On Wed, Nov 02, 2011 at 07:25:17PM -0700, Linus Torvalds wrote:
> To me, the point of the tag is so that the person doing the merge can
> verify that he merges something trusted.
> 
> However, everybody else seems to disagree, and wants that stupid
> signature to live along in the repository.

It seems quite useless, and likely to lead to false conclusions, in cases
where the merger's gpg output differs from that of someone checking later on,
e.g. when

 - the signing key has been revoked in the mean time (for whatever
   reasons)
 - the signing key has expired
 - the public part of the signing key is not available for the general
   public.

AFAIK gpg just gives you an error code and a message like e.g. "Key has
expired" without stating if the key was valid _when signing the commit_.

How do you plan to handle this when keeping the signature in the
repository? Or am I overlooking something?
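
The distinction being asked about, "valid now" versus "valid when the signature was made", can be written down as a small check. This is a sketch under assumptions: `key_valid_at_signing` is a hypothetical helper, not a gpg or git feature, and the timestamps would have to be extracted from the signature and the key by some other means. Note also that the signing time is asserted by the signer, so it proves little once a key has been compromised.

```python
from datetime import datetime, timezone


def key_valid_at_signing(signed_at, key_created,
                         key_expires=None, key_revoked=None):
    """True if the key was neither expired nor revoked (and already
    existed) at the time the signature claims to have been made."""
    if signed_at < key_created:
        return False
    if key_expires is not None and signed_at >= key_expires:
        return False
    if key_revoked is not None and signed_at >= key_revoked:
        return False
    return True


# Example: a key that expired after the signature was made.
utc = timezone.utc
created = datetime(2010, 1, 1, tzinfo=utc)
expires = datetime(2012, 1, 1, tzinfo=utc)
signed = datetime(2011, 11, 3, tzinfo=utc)
ok_then = key_valid_at_signing(signed, created, key_expires=expires)
```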


Thanks,
Jochen.
Linus Torvalds - Nov. 3, 2011, 4:13 a.m.
On Wed, Nov 2, 2011 at 8:22 PM, Jochen Striepe <jochen@tolot.escape.de> wrote:
>
> It seems quite useless and leading to false conclusions in several cases
> where the merger's gpg output differs from someone's checking later on,
> e.g. when
>
>  - the signing key has been revoked in the mean time (for whatever
>   reasons)
>  - the signing key has expired
>  - the public part of the signing key is not available for the general
>   public.

So I don't think those are *big* issues. Sure, you'd want the public
key to be public for it to make any real sense to save, but on the
other hand, they *are* generally public. Yes, yes, you might have keys
that are only used - and only made public - within some particular
organization, but in that case the source code that gets signed with
those keys would tend to be private to that organization too, so..

And yes, keys get revoked or they expire, but that's still a pretty
rare event, so it doesn't really invalidate the argument that making
the original signed content available can quite often be useful - even
if it's not guaranteed to *always* be useful.

No, my main objection to saving the data is that it's ugly and it's
redundant. Sure, in practice you can check the signatures later fine
(with the rare exceptions you mention), but even when you can do it,
what's the big upside?

And there are much bigger real downsides, imho.

For example, let's say that we do eventually end up switching from
SHA1 to SHA256 in git, and we do a full re-import of the tree. Guess
what? All those signatures are now just so much garbage. Sure, you can
recreate them (create some trusted script that you agree does a 1:1
transform, and re-sign everything), but in practice you can't ever
really do that - because all those things are tied to the tree, so you
need to have *everybody's* private keys in one place to do so. And the
people who signed things initially would have to be insane to allow
that.

So I'm actually of the opinion that "internal signatures" are bad
design at a rather fundamental level.

In contrast, the "external signed tags" are fine: it's not just that
there are much fewer of them, it's that they are *independent*. So you
can easily re-generate the signed tags, because each signer can
*individually* decide to validate the newly converted tree, and sign
off on the fact that the conversion was done identically using new
external tags with signatures.
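
The difference between the two designs can be sketched in a few lines of Python. This is only an illustration under loud assumptions: HMAC stands in for a GPG signature (it is not public-key crypto), `object_id` is a stand-in for git's object naming, and the key and content are made up.

```python
import hashlib
import hmac

def object_id(content: bytes, algo: str) -> str:
    """Name an object by the hash of its content (stand-in for git object naming)."""
    return hashlib.new(algo, content).hexdigest()

def sign(key: bytes, payload: bytes) -> str:
    """HMAC used only as a stand-in for a GPG signature."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify(key: bytes, payload: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(key, payload), signature)

signer_key = b"hypothetical-private-key"   # assumption: one signer's key
tree = b"the source tree contents"         # assumption: converted 1:1

# External signed tag: the signature covers the object name and lives
# outside the history it signs.
old_id = object_id(tree, "sha1")
old_tag_sig = sign(signer_key, old_id.encode())

# After a hypothetical conversion the same content gets a new name.
new_id = object_id(tree, "sha256")

# The old signature does not cover the new name...
old_sig_still_valid = verify(signer_key, new_id.encode(), old_tag_sig)

# ...but the signer alone can check the conversion and issue a fresh tag.
new_tag_sig = sign(signer_key, new_id.encode())
new_sig_valid = verify(signer_key, new_id.encode(), new_tag_sig)

# Internal signature: the signature is part of the hashed content, so
# replacing it changes the object's name (and every descendant's).
commit_old = b"message\nsig " + old_tag_sig.encode()
commit_new = b"message\nsig " + new_tag_sig.encode()
internal_resign_changes_id = (
    object_id(commit_old, "sha256") != object_id(commit_new, "sha256")
)
```

In this model the external tag can be regenerated by each signer on their own, while re-signing an embedded signature rewrites the object it lives in.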

This was one of the reasons I made the signed tags work the way they
do. And it wasn't because I was extremely far-sighted and thought of
all the problems that internal signatures have - it's because monotone
had their internal signatures, and every other email on the monotone
list was about all the problems it caused.

> AFAIK gpg just gives you an error code and a message like e.g. "Key has
> expired" without stating if the key was valid _when signing the commit_.
>
> How do you plan to handle this when keeping the signature in the
> repository? Or am I overlooking something?

So see above - I just wouldn't worry about it. The possible few cases
where it would occur are dwarfed by the cases where it *doesn't*
occur, and those are the ones I'd concentrate on. They are the ones
that need to be important enough that it's even worth carrying the
random noise around.

Are they?

So I do think that there are real upsides at the *process* level where
you can use the signatures to verify that what is pulled is pulled
from the person you thought it was. I don't think anybody disputes
those advantages. But outside of that I think it gets very gray, and
there are real disadvantages.

That said, I don't care *that* much. I don't mind polluting the merge
commits with information that I don't think is really worth it. So I'd
be willing to carry the signature information around, although I'd
hope to minimize it and have some sane way to hide it.

            Linus
Junio C Hamano - Nov. 3, 2011, 6:16 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

>   [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git rusty@rustcorp.com.au-v3.1-8068-g5087a50
>   fatal: Couldn't find remote ref rusty@rustcorp.com.au-v3.1-8068-g5087a50
>
> oops. Ok, so his tag naming is *really* awkward. Whatever.

It is not "Whatever".

 $ git fetch git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v3.0
 fatal: Couldn't find remote ref v3.0

I do not think we ever DWIMmed fetch refspecs to prefix refs/tags/, so it
is not the naming but fetching tags without saying "git fetch tag v3.0"
(which IIRC was your invention long time ago). 

If we changed this "git fetch $there v3.0" to fetch tag, it would help the
final step in your illustration, and I do not think it would be a huge
regression---the only case it becomes fuzzy is when they have v3.0 branch
at the same time, but the owner of such a repository is already playing
with fire.

>    [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50
>    From git://github.com/rustyrussell/linux
>     * tag rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> FETCH_HEAD
>
> Ahh, success!
>
> Oops. Nope. It turns out that git will *peel* the tag when you fetch
> it, so FETCH_HEAD actually doesn't contain the tag object at all, but
> the commit object that the tag pointed to. MAJOR FAIL.
>
> Quite frankly, I think that's a git bug, but it's a git bug because
> "git fetch" was designed to get the commit to merge. Fair enough.

And because FETCH_HEAD started as (and probably still is) an internal
implementation detail of communication between fetch and merge inside
pull. So I do not have any issue in changing it to store tags unpeeled
there.

>    [torvalds@i5 linux]$ git fetch git://github.com/rustyrussell/linux.git refs/tags/rusty@rustcorp.com.au-v3.1-8068-g5087a50:refs/tags/rusty
>    From git://github.com/rustyrussell/linux
>     * [new tag] rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> rusty
>     * [new tag] rusty@rustcorp.com.au-v3.1-2-gb1e4d20 -> rusty@rustcorp.com.au-v3.1-2-gb1e4d20
>     * [new tag] rusty@rustcorp.com.au-v3.1-4896-g0acf000 -> rusty@rustcorp.com.au-v3.1-4896-g0acf000
>     * [new tag] rusty@rustcorp.com.au-v3.1-8068-g5087a50 -> rusty@rustcorp.com.au-v3.1-8068-g5087a50
>
> WTF?

This is not WTF but "fetching a history to store the tip of it in your
refs/ namespace causes tags pointing into the history line followed
automatically", and it exactly is what you want to happen if rusty asked
you to fetch his for-linus branch (which the tag may point at) instead.
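
As a rough model of that rule (a simplified sketch, not git's actual implementation; the commit names and tags are hypothetical), auto-follow picks the remote tags whose targets lie in the history being fetched:

```python
from typing import Dict, List, Set

def reachable(tip: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return every commit reachable from tip by following parent links."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen

def tags_to_follow(tip: str,
                   parents: Dict[str, List[str]],
                   remote_tags: Dict[str, str]) -> List[str]:
    """Names of remote tags that point into the history of tip."""
    history = reachable(tip, parents)
    return sorted(name for name, target in remote_tags.items()
                  if target in history)

# Hypothetical remote: a three-commit line plus one unrelated commit.
parents = {"c3": ["c2"], "c2": ["c1"], "c1": [], "x1": []}
remote_tags = {"v1": "c1", "v2": "c3", "other": "x1"}

followed = tags_to_follow("c3", parents, remote_tags)
```

Here `followed` contains `v1` and `v2` but not `other`, which is why asking for one tag into refs/ can bring along every other tag on the same line of history.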

> We got three other
> tags too that we didn't even ask for!

We could change the rule to read "fetching a history to store the tip of it
in your refs/heads namespace causes autofollow". I am not sure if that is
what we really want, though.

> Again - not a fundamental design mistake in the data structures, and
> it actually made sense from a "signed tags are important release
> points" standpoint, but it makes it *really* inconvenient to use
> signed tags for signature verification.

We could update three things:

 - DWIM $name in "git fetch $there $name" to refs/tags/$name when it makes
   sense;
 - FETCH_HEAD stores unpeeled object names; and
 - "git pull" learns --verify option.

Then

 $ git pull --verify rusty rusty@rustcorp.com.au-v3.1-8068-g5087a50

could integrate the history leading to that tag to your current branch
while running verify-tag on it.

For this, disabling the tag-auto-following is not necessary, as you are
not storing the retrieved tag anywhere.

That is a longwinded way to say I agree with what you said below.

> So signed tags are not mis-designed from a conceptual standpoint -
> they just work really really awkwardly right now for what the kernel
> would like to do with them.
>
> With a few UI fixes, I think the signed tag thing would "just work".
>
> That said, I do think that the "signature in the pull request" should
> also "just work", and I'm not entirely sure which one is better.

I do not think it is necessarily either/or choice.

Either way does not solve anything other than validating the last hop
from the last lieutenant to the integrator without having a way to give
the verification material to third parties.

Your earlier "pull request signature could be copied into the message of
the merge that integrates the pulled history" solves 90% of the "third
party validation" issue.

With the signed tags approach, you could push out these signed tags you
get from lieutenants, but there are quite a few things that need to happen
for it to be usable:

 - You or your lieutenants do not want to keep these tags in your working
   repository, to be listed in "git tag -l". They are ephemeral to you and
   your lieutenant, even though they have to be permanent for third
   party auditors.

 - Normal users of your project do not want to see them in "git tag -l"
   either.

 - Responses to "git fetch" and "git ls-remote" produced by "git
   upload-pack" do need to (optionally) include them to allow third party
   auditors to ask for them.

I wonder if an approach like the following, in addition to the three
things I listed above, may give us a workable solution:

 * "git fetch linus v3.0" called by "git pull --verify linus v3.0" fetches
   the v3.0 unpeeled into FETCH_HEAD, GPG verifies it, creates
   refs/audit/$u, before running "git merge". $u is derived from v3.0
   (given tag), the identity of the GPG signer, and perhaps timestamp to
   make it both identifiable and unique under refs/audit/ hierarchy.

 * You "git push origin". This causes refs/audit/* refs that point at
   commits in the transferred history to auto-follow, just like the
   current "git fetch $there $src:$dst" causes refs/tags/* auto-follow.
   The refs/audit/* hierarchy in your public repository will be populated
   by lieutenant signatures.

 * (Optional) You may have signed "git tag -s 'Linux v3.2' v3.2 master"
   before you push origin out, or you may have not. Currently, you do have
   to "git push origin v3.2" separately if you did. The above auto-follow
   could be extended to push refs/tags/* hierarchy to eliminate this step
   as well.

Note that because of the way "upload-pack" protocol is structured, the
first response from "upload-pack" after it gets connection is the
advertisement of refs, and there is no way for "fetch-pack" to ask for
customized refs advertisement to it. So for this to work without incurring
undue overhead for normal users, we would need to exclude refs/audit/*
from the normal ref advertisement (i.e. "ls-remote" does not see it) so
that "git fetch" by casual users will not have to wait for megabytes of
ref advertisements before issuing its first "want" request. Probably we
can change "upload-pack" to advertise only refs/heads/*, refs/tags/*, and
HEAD by default, and a protocol extension could be added to ask for other
hierarchies for specialized needs like third party auditors.
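
A minimal sketch of that default-advertisement idea, in Python. The function name and the `extra_prefixes` parameter are hypothetical; they only model the proposed protocol extension and are not part of upload-pack.

```python
from typing import Iterable, List

# Hierarchies advertised to everybody under the proposal above.
DEFAULT_PREFIXES = ("refs/heads/", "refs/tags/")

def advertised_refs(all_refs: Iterable[str],
                    extra_prefixes: Iterable[str] = ()) -> List[str]:
    """Return HEAD plus refs under the default or explicitly requested prefixes."""
    prefixes = DEFAULT_PREFIXES + tuple(extra_prefixes)
    return [ref for ref in all_refs
            if ref == "HEAD" or ref.startswith(prefixes)]

# Hypothetical repository contents.
refs = [
    "HEAD",
    "refs/heads/master",
    "refs/tags/v3.0",
    "refs/audit/v3.0-signer-20111103",
]

casual_view = advertised_refs(refs)
auditor_view = advertised_refs(refs, extra_prefixes=("refs/audit/",))
```

A casual fetch sees only the first three entries; an auditor who asks for `refs/audit/` gets all four.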

BUT.

This does not allow third party auditors to audit how sub-subsystem
histories came into your lieutenants' history unless you also fetch from
your lieutenants in "auditor" mode to retrieve their refs/audit/* refs to
be propagated to your public repository, which all of us involved in this
thread know you wouldn't bother if it is an additional manual step (and I
personally do not think I would bother if I were you).

So the audit trail will end at one level unless we have even more complex
arrangements. The auditors know the history up to some point in the past
came from you (your last signed tag at release time, which some people may
feel a bit too sparse for auditing purposes when a security incident like
that one happens in between releases), and they know subhistories of what
you merged came from your direct lieutenants (the refs/audit/* tags the
above change allowed you to forward automatically when you published), but
they have to take the word of your direct lieutenants at face value.

I do not know if that is acceptable for $DAYJOB types, though.
Junio C Hamano - Nov. 3, 2011, 6:29 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Tue, Nov 1, 2011 at 2:56 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> But on the other hand, in many ways, publishing your commit to the outside
>> world, not necessarily for getting pulled into the final destination
>> (i.e. your tree) but merely for other people to try it out, is the point
>> of no return (aka "don't rewind or rebase once you publish"). "pushing
>> out" might be less special than "please pull", but it still is special.
>
> So I really think that signing the top commit itself is fundamentally wrong.

It merely is a stronger form of the "committer" line in the commit
object. A random repository at GitHub, where anybody can create a
repository, can serve you a random commit with any random name on the
"committer" line, and the new gpgsig header is a way to let the committer
certify it genuinely is from the committer.

I do not think for that purpose, in-commit signature is fundamentally
wrong. I was hoping it would be more useful than it turned out to be, but
I agree that it just is not suitable as a vehicle to convey "I made that
commit some time ago, and now I want you to pull it for such and such
reasons" in a larger workflow.

The "now I want you to pull it for such and such reasons" part is the pull
request, and if we are to protect them with GPG signatures, and perhaps
copy the signed part in the resulting merge, don't we have a reasonable
solution, without all the downsides the signed tag approach would cause if
we wanted to allow third party auditors to have access to the signatures
for independent auditing purposes (described in a separate message)?

Perhaps what is causing the problem is the desire to allow third party
auditors finer grained audit trail, but after having heard that $DAYJOB
folks went through each and every commit after known release points with
fine-toothed comb, I am not brave/rude/blunt enough to dismiss it as
unimportant.
Junio C Hamano - Nov. 3, 2011, 6:52 p.m.
Junio C Hamano <gitster@pobox.com> writes:

> BUT.

Ahh, sorry for the noise. I realize that we already have a winner, namely,
the proposal outlined in your message I was responding to.

It just didn't click to me that you were replacing "signed material from
pull request copied into the merge" with "contents of signed tag copied
into the merge".

So forget everything I said in the later parts of my response that talks
about refs/audit/*, and the other message except for gpgsig header being a
stronger form of existing committer line.



Linus Torvalds - Nov. 3, 2011, 7:06 p.m.
On Thu, Nov 3, 2011 at 11:16 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> It is not "Whatever".
>
>  $ git fetch git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v3.0
>  fatal: Couldn't find remote ref v3.0
>
> I do not think we ever DWIMmed fetch refspecs to prefix refs/tags/, so it
> is not the naming but fetching tags without saying "git fetch tag v3.0"
> (which IIRC was your invention long time ago).

Ahh. Yeah, and not DWIM'ing tags is probably ok. I'd completely
forgotten about the special "tag" shortcut.

Which probably means it was a bad ui decision to begin with. But once
more, the UI is clearly designed for fetching the tags into your own
tag-space (ie it does "refs/tags/<tag>:refs/tags/<tag>") rather than
fetching the tag just for verification.

> If we changed this "git fetch $there v3.0" to fetch tag, it would help the
> final step in your illustration, and I do not think it would be a huge
> regression---the only case it becomes fuzzy is when they have v3.0 branch
> at the same time, but the owner of such a repository is already playing
> with fire.

Yeah, extending DWIM for remote repos to do the same thing it does for
local repositories is probably the right thing regardless of any other
issues.

We already have the "tag and branch with the same name" issue for
local repositories, and we have perfectly good disambiguation rules
for when disambiguation is necessary. Making the DWIM rules be the
same for a remote case sounds sane.
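
For reference, the local lookup order documented in gitrevisions(7) can be modelled like this. Applying the same order to a remote's ref list is the assumption being discussed here, not something git fetch does today, and `dwim_ref` is a hypothetical helper name.

```python
from typing import Iterable, Optional

# Lookup order for short names, as documented in gitrevisions(7).
REF_PATTERNS = (
    "{name}",
    "refs/{name}",
    "refs/tags/{name}",
    "refs/heads/{name}",
    "refs/remotes/{name}",
    "refs/remotes/{name}/HEAD",
)

def dwim_ref(name: str, available: Iterable[str]) -> Optional[str]:
    """Return the first full ref name that matches the short name, if any."""
    refs = set(available)
    for pattern in REF_PATTERNS:
        candidate = pattern.format(name=name)
        if candidate in refs:
            return candidate
    return None

# Hypothetical remote with a tag and a branch that share a name.
remote_refs = ["refs/heads/master", "refs/heads/v3.0", "refs/tags/v3.0"]

resolved = dwim_ref("v3.0", remote_refs)
missing = dwim_ref("no-such-ref", remote_refs)
```

With these rules the tag wins over the branch of the same name, which is the "playing with fire" case mentioned above.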

That said, I don't think it's a big deal either. I was just confused
by the expansion being different, but having to have the refs/tags/
there isn't a dealbreaker by any means.

>> Quite frankly, I think that's a git bug, but it's a git bug because
>> "git fetch" was designed to get the commit to merge. Fair enough.
>
> And because FETCH_HEAD started as (and probably still is) an internal
> implementation detail of communication between fetch and merge inside
> pull.

Well, I certainly don't consider it to be just "an implementation
detail" personally. I use FETCH_HEAD all the time (the same way I use
ORIG_HEAD and just plain HEAD). It's very useful for "fetch and check
what they have", when you want to look at something but you don't want
all the remote tags and crud. So I consider it an honest-to-goodness
real user feature.

> So I do not have any issue in changing it to store tags unpeeled there.

In fact, storing the peeled commit was really surprising to me, especially
since it actually *says* "tag" in the .git/FETCH_HEAD file. So the
.git/FETCH_HEAD file really currently ends up being actively wrong and
misleading for tags we fetch: it looks something like

  <sha-of-commit>  tag '<tagname>'  of <reponame>

and says it is a tag, but the SHA1 is of the peeled commit. That's
just crazy, and actually made me think the other end (Rusty, in this
case) had done something wrong initially (ie I quite reasonably - I
thought - blamed it on Rusty using a non-signed tag).
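
A small sketch of the mismatch, assuming only the line layout quoted above (real FETCH_HEAD files may carry other fields). The object-type table is supplied by hand here, and the repository URL is made up.

```python
import re
from typing import Dict, NamedTuple, Optional

class FetchHeadEntry(NamedTuple):
    sha: str
    kind: str
    name: str
    repo: str

# Matches only the layout shown above: <sha>  tag '<name>'  of <repo>
LINE_RE = re.compile(
    r"^([0-9a-f]{40})\s+(tag|branch)\s+'([^']+)'\s+of\s+(\S+)$"
)

def parse_line(line: str) -> Optional[FetchHeadEntry]:
    """Parse one line, or return None if it does not fit the layout."""
    match = LINE_RE.match(line.strip())
    return FetchHeadEntry(*match.groups()) if match else None

def is_peeled_tag(entry: FetchHeadEntry, object_types: Dict[str, str]) -> bool:
    """True when the line says 'tag' but the recorded object is not a tag."""
    return entry.kind == "tag" and object_types.get(entry.sha) != "tag"

# Hypothetical line and object database.
sha = "a" * 40
line = sha + "  tag 'rusty'  of git://example.invalid/linux"
entry = parse_line(line)

peeled = is_peeled_tag(entry, {sha: "commit"})    # what happens today
unpeeled = is_peeled_tag(entry, {sha: "tag"})     # what the line implies
```

The first check is the surprising case: the label reads "tag" while the object behind the SHA1 is a commit.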

>> WTF?
>
> This is not WTF but "fetching a history to store the tip of it in your
> refs/ namespace causes tags pointing into the history line followed
> automatically", and it exactly is what you want to happen if rusty asked
> you to fetch his for-linus branch (which the tag may point at) instead.

Well, yes and no. But mostly no.

If I just fetch his for-linus branch, I don't get (and I don't want)
his tags. It's only because I fetched it into my ref-space.

And I only fetched it into my ref-space, because otherwise the crazy
git peeling happened if I don't do that.

So I didn't want those other tags, and I really normally wouldn't have
gotten them. Only because I had to do that odd work-around to avoid
the peeling did I get it, because then the totally unrelated logic of
"ok, get the tags too" triggered.

So it's a WTF, because this work-around ends up having the special
side effects - and they make sense when you *really* fetch his branch
and make it part of your name-space, but not when you only did the
"part of my namespace" as a workaround for another git issue.

Obviously, you can use "-n" (--no-tags) to fetch the tag, and that
actually fixes the issue, but that is its own kind of WTF too: in
order to fetch just *one* tag, you have to specify that you don't want
tags? Not exactly a greatly intuitive use case ;)

Anyway, the one-line patch I sent basically avoids all these WTF
moments, by just making "git fetch <repo> <tagname>" work (apart from
the DWIMmery on the tag-name, but that's a totally independent small
detail that doesn't really matter)

>> We got three other
>> tags too that we didn't even ask for!
>
> We could change the rule to read "fetching a history to store the tip of it
> in your refs/heads namespace causes autofollow". I am not sure if that is
> what we really want, though.

No, I think the current "follow tags" rule is fine. It's just that it
didn't really mesh well with "damn, I have to work around this other
git issue".

> We could update three things:
>
>  - DWIM $name in "git fetch $there $name" to refs/tags/$name when it makes
>   sense;
>  - FETCH_HEAD stores unpeeled object names; and
>  - "git pull" learns --verify option.

Yes. I think that would indeed solve everything.

> Then
>
>  $ git pull --verify rusty rusty@rustcorp.com.au-v3.1-8068-g5087a50
>
> could integrate the history leading to that tag to your current branch
> while running verify-tag on it.

Agreed. The only remaining issue then would be how that "yes, I
verified the tag" part would be actually saved for posterity. My
suggestion would be to just punt that question, and let the user
decide, by simply:

 - start the editor by default with "--verify"

 - output the "gpg --verify" result into the end of the commit file,
along with the tag content (which has the original pgp signature, of
course).

 - let the user decide what part of it he wants to use.

In particular, the "gpg --verify" result may well be something that
the user wants to *act* on - maybe the key didn't exist in the key
ring, or maybe it does exist but doesn't have quite enough trust and
gpg complains about that etc etc. But that's all something that "start
the editor and show the user what is up" would let the user decide on.

> For this, disabling the tag-auto-following is not necessary, as you are
> not storing the retrieved tag anywhere.

Exactly.

>> That said, I do think that the "signature in the pull request" should
>> also "just work", and I'm not entirely sure which one is better.
>
> I do not think it is necessarily either/or choice.

No, I think we can do both, and it actually ends up being just a
matter of convenience which one a particular project ends up using (or
even use both depending on preferences of particular sub-lieutenants
within the project).

> I wonder if an approach like the following, in addition to the three
> things I listed above, may give us a workable solution:
>
>  * "git fetch linus v3.0" called by "git pull --verify linus v3.0" fetches
>   the v3.0 unpeeled into FETCH_HEAD, GPG verifies it, creates
>   refs/audit/$u, before running "git merge". $u is derived from v3.0
>   (given tag), the identity of the GPG signer, and perhaps timestamp to
>   make it both identifiable and unique under refs/audit/ hierarchy.

So far so good, but see above: it may turn out that the user will
*re-verify* the key after having done some gpg action. So..

>  * You "git push origin". This causes refs/audit/* refs that point at
>   commits in the transferred history to auto-follow, just like the
>   current "git fetch $there $src:$dst" causes refs/tags/* auto-follow.
>   The refs/audit/* hierarchy in your public repository will be populated
>   by lieutenant signatures.

So I don't think auto-follow is good here.

I could *easily* see various companies using this for their own
internal audit, without really wanting to expose things outside of the
company. So auto-following sounds like the wrong approach. Make it an
explicit "expose audit checks" thing.

>  * (Optional) You may have signed "git tag -s 'Linux v3.2' v3.2 master"
>   before you push origin out, or you may have not. Currently, you do have
>   to "git push origin v3.2" separately if you did. The above auto-follow
>   could be extended to push refs/tags/* hierarchy to eliminate this step
>   as well.

So far I haven't really had any issues with having to do a "git push
--tags" to push things out.

That said, maybe the auto-push could just be a per-repo option, and
then you can have it both ways.

> Note that because of the way "upload-pack" protocol is structured, the
> first response from "upload-pack" after it gets connection is the
> advertisement of refs, and there is no way for "fetch-pack" to ask for
> customized refs advertisement to it. So for this to work without incurring
> undue overhead for normal users, we would need to exclude refs/audit/*
> from the normal ref advertisement (i.e. "ls-remote" does not see it) so
> that "git fetch" by casual users will not have to wait for megabytes of
> ref advertisements before issuing its first "want" request.

I think that would be a good thing, and make it much more palatable.
After all, the likelihood is that *nobody* will ever care about the
audit cases at all. They are very much a "..but what if xyz happens"
kind of safety net for the extreme badness, not anything you'd expect
to use.

                         Linus
Linus Torvalds - Nov. 3, 2011, 7:09 p.m.
On Thu, Nov 3, 2011 at 11:52 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> Ahh, sorry for the noise. I realize that we already have a winner, namely,
> the proposal outlined in your message I was responding to.

No, no, don't consider my "put in the merge message" a winner at all.

I personally dislike it, and don't really think it's a wonderful thing
at all. It really does have real downsides:

 - internal signatures really *are* a disaster for maintenance. You
can never fix them if they need fixing (and "need fixing" may well be
"you want to re-sign things after a repository format change")

 - they are ugly as heck, and you really don't want to see them in
99.999% of all cases.

So putting those things in the merge commit message may have some
upsides, but it has tons of downsides too.

I think your refs/audit/ idea should be given real thought, because
maybe that's the right idea.

                           Linus
Theodore Ts'o - Nov. 4, 2011, 2:59 p.m.
On Thu, Nov 03, 2011 at 12:09:55PM -0700, Linus Torvalds wrote:
> I personally dislike it, and don't really think it's a wonderful thing
> at all. It really does have real downsides:
> 
>  - internal signatures really *are* a disaster for maintenance. You
> can never fix them if they need fixing (and "need fixing" may well be
> "you want to re-sign things after a repository format change")

Note that a repository format change will break a bunch of other
things as well, including references in commit descriptions ("This
fixes a regression introduced in commit 42DEADBEEF"). So if SHA-1 is in
danger of failing in a way that would threaten git's use of it (highly
unlikely), we'd probably be well advised to find a way to add a new
crypto checksum (i.e., SHA-256) in parallel, but keep the original
SHA-1 checksum for UI purposes.

>  - they are ugly as heck, and you really don't want to see them in
> 99.999% of all cases.

So we can make them be hidden from "git log" and "gitk" by default.
That bit is a bit gross, I agree, but 3rd party verification really is
a good thing, which I'm hoping can be added in a relatively clean
fashion.

						- Ted
Linus Torvalds - Nov. 4, 2011, 3:14 p.m.
On Fri, Nov 4, 2011 at 7:59 AM, Ted Ts'o <tytso@mit.edu> wrote:
>
> Note that a repository format change will break a bunch of other
> things as well, including references in commit descriptions ("This
> fixes a regression introduced in commit 42DEADBEEF")

No they won't. Not if you do it right. It's easy enough to
automatically replace the SHA1's in the description, the same way we
replace everything else.

Really.  It's *trivial*.

Maybe some current tools don't do it, but if I were to convert the
kernel tree, I'd absolutely *require* the conversion to be done right.
And "right" means "don't just get the parent SHA1's right, but the
ones hiding in the description too".

Any conversion tool has to keep track of the translation from "old
SHA1 to new SHA1" *anyway* because of all the other issues (ie exactly
things like parent pointers etc), so conversion tools by definition
have the information to do things like this right.
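
A hedged sketch of what "doing it right" for descriptions could look like. The mapping table is assumed to come from the conversion tool; the function name, the 7-character minimum, and the choice to expand abbreviations to the full new name are all illustrative assumptions.

```python
import re
from typing import Dict

# Candidate object names: 7 to 40 hex digits standing alone as a word.
HEX_RE = re.compile(r"\b[0-9a-fA-F]{7,40}\b")

def rewrite_description(text: str, old_to_new: Dict[str, str]) -> str:
    """Replace old object names in free text using the conversion map.

    Abbreviated names are replaced only when they match exactly one old
    name; anything unknown or ambiguous is left untouched.
    """
    def replace(match: "re.Match[str]") -> str:
        token = match.group(0).lower()
        hits = [new for old, new in old_to_new.items()
                if old.startswith(token)]
        return hits[0] if len(hits) == 1 else match.group(0)

    return HEX_RE.sub(replace, text)

# Hypothetical mapping with a single converted commit.
old_name = "42deadbeef" + "0" * 30
new_name = "ab" * 32
mapping = {old_name: new_name}

before = "This fixes a regression introduced in commit 42DEADBEEF"
after = rewrite_description(before, mapping)
```

The same table that fixes up parent pointers is enough for this step; nothing here needs anybody's private key, which is the contrast with embedded signatures.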

But "internal cryptographic signatures" are fundamentally different. A
conversion tool *cannot* convert them, since it won't have access to
the private keys in question, and thus cannot fix up the signature.

Sure, if I do the conversion, I could make *my* signatures match. And
that is true for every signer out there - individually. But only
individually, never collectively. Sure, we could all meet in one place
and synchronously re-sign things on our private machines with some
"distributed conversion tool", but realistically that really really
doesn't work.

It's a fundamental problem. And it really isn't a theoretical one -
it's one we know will happen *some* day.

I haven't worried about SHA1, exactly because I know it's not a real
problem - we can always convert. But internal signatures very
fundamentally change that.

And it really is about *internal* signatures. The kinds of signed tags
we have now are not a problem. Those can trivially be converted in a
distributed manner, exactly because they are "detached" from what
they sign. We carry them along with the git repo, but they don't mess
up history, and they can be re-created individually without changing
anything else.

And yes, this was actually a design issue for me, which is why I feel
so strongly about it. I actually *thought* about issues like this
five+ years ago: I wanted to have cryptographic security, but I very
much on purpose wanted it to be "outside" the repo.

(Ok, so the git tag objects can sign other git tag objects
recursively, and in that case you have an ordering issue where a
conversion would first have to get somebody to re-sign their "inner"
tag before the "outer" signature can be re-created, but even if that
were to happen - and I don't think anybody does it - it's a trivial
problem with no real complexity issues).

>>  - they are ugly as heck, and you really don't want to see them in
>> 99.999% of all cases.
>
> So we can make them be hidden from "git log" and "gitk" by default.
> That bit is a bit gross, I agree, but 3rd party verification really is
> a good thing, which I'm hoping can be added in a relatively clean
> fashion.

I agree that we can hide them - that's after all what the gpgsig thing
does in the "internal commit signature" that git has in pu/next. That
one hides it even more specifically, by putting it in the headers of
the commit, but that's just a random implementation detail.

But I really think that "internal signatures" that actually affect the
SHA1 of the object and its history have fundamental design problems.
They may not be "insurmountably bad", but they are definitely real.

                        Linus
Junio C Hamano - Nov. 5, 2011, 6:36 a.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Thu, Nov 3, 2011 at 11:52 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Ahh, sorry for the noise. I realize that we already have a winner, namely,
>> the proposal outlined in your message I was responding to.
>
> No, no, don't consider my "put in the merge message" a winner at all.
>
> I personally dislike it, and don't really think it's a wonderful thing
> at all. It really does have real downsides:
>
>  - internal signatures really *are* a disaster for maintenance. You
> can never fix them if they need fixing (and "need fixing" may well be
> "you want to re-sign things after a repository format change")
>
>  - they are ugly as heck, and you really don't want to see them in
> 99.999% of all cases.
>
> So putting those things in the merge commit message may have some
> upsides, but it has tons of downsides too.
>
> I think your refs/audit/ idea should be given real thought, because
> maybe that's the right idea.

While I agree that re-signing is a problem, I do not see it as a huge
issue. In your "SHA-1 to SHA-256 transition" scenario, the conversion is a
flag day event in the hopefully fairly distant (in the git timescale)
future, and I am reasonably sure that by that time we would already have
infrastructure updates necessary to support a huge number of refs, including
the "lazily scan only the refs necessary" and the "some refs are optional
in advertisement" topics that are useful for other purposes.

In the worst case, even if we used your "merge commit records the merged
tag as the record of requested pull" design today, we could choose not to
rewrite these in-merge-commit signatures when the conversion becomes
necessary. Instead, the conversion procedure can prepare a mapping table
between the old SHA-1 and the rewritten SHA-256, and contributors can
prepare detached signatures for the mappings of their own commits after
verifying that the conversion produced what they are happy with. And then
we store a concatenation of these detached signatures in a blob to help
future third party auditors to audit these (by-then) historical commits.
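
As a rough illustration of the mapping-table idea above, here is a minimal
Python sketch. It assumes the conversion has already produced a table from old
to new object names, and that a contributor signs a plain-text manifest that
covers only their own commits. The function name, the one-line-per-commit
manifest format and the toy object names are all hypothetical, not anything
git defines.

```python
def mapping_manifest(mapping, own_commits):
    """Return a deterministic text manifest of old -> new object names.

    mapping: dict of old (SHA-1) name -> new (SHA-256) name.
    own_commits: old names of the commits this contributor vouches for.
    """
    lines = []
    # Sort and de-duplicate so the same input always yields the same bytes,
    # which matters because the bytes are what would get signed.
    for old in sorted(set(own_commits)):
        if old not in mapping:
            raise KeyError(f"no converted name recorded for {old}")
        lines.append(f"{old} {mapping[old]}\n")
    return "".join(lines)


# Toy data: shortened, made-up object names for illustration only.
table = {"aaa1": "n-aaa1", "bbb2": "n-bbb2", "ccc3": "n-ccc3"}
manifest = mapping_manifest(table, ["ccc3", "aaa1"])
print(manifest, end="")
```

The resulting text could then be fed to something like "gpg --detach-sign",
and the detached signatures from all contributors concatenated into one blob.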

About the ugliness of the merge commit log messages, you have already
learned to ignore them with "log --no-merges" ;-) and the material the
patch series I sent out adds is at the end, so "/^commit.*$" in less
would hopefully work well enough in "log --no-merges" as well.

Because the refs/audit/ approach requires too much infrastructure we still
do not have today, and workflow elements are not fully worked out
(e.g. propagating audit trails fully from sub-sub-sub-...-lieutenants
upwards is tricky as I outlined in the other message), I think we should
start from a design that we can see how it would work now.

With the posted series, the workflow would become something like this:

  contributor$ work work work

  contributor$ git tag -s -m 'Signed pull

  This series is to allow the integrator to pull from contributors
  by specifying a signed tag, not the tip of the branch, and verify
  the authenticity of the series while merging' for-linus

  contributor$ git push public for-linus

  contributor$ git request-pull origin \
          $(git config remote.public.url) for-linus >msg
  contributor$ edit msg
  contributor$ mail torvalds@...

  integrator$ mail ;# read the pull request
  integrator$ git pull git://github.com/contributor/linux.git for-linus
   ... editor opens with the usual merge message, but with
   ... the contents of the tag and the "GPG verify" result at
   ... the end.

It might make sense to also teach the "git tag" part to somehow use the branch
description of the tip of the branch being tagged to prime the tag message.
Linus Torvalds - Nov. 5, 2011, 4:41 p.m.
On Fri, Nov 4, 2011 at 11:36 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
> About the ugliness of the merge commit log messages, you have already
> learned to ignore them with "log --no-merges" ;-)

Absolutely not. I look at merges all the time. I never use
"--no-merges" except when I'm doing certain statistics (ie "How many
real changes do we have") or when I do release files.

But I actually think it's important that people write *good* merge
messages. I've berated some people for it when they just have

    Merge branch 'origin'

in their commit message, because I think a merge commit should say why
it happened or what it brought in.

> and the material the
> patch series I sent out adds is at the end, so "/^commit.*$" in less
> would hopefully work well enough in "log --no-merges" as well.

I agree that being at the end helps, but I do a lot of "git log
ORIG_HEAD.." etc, and I don't do a lot of "/^commit" searching.

The "/commit" thing I do tends to be because I do "git log -p" to see
patches, but at the same time am not going to read through
everything..

So I'd really like some way to not see it.

Ted suggested a NUL character in the commit message in front of the
"hidden content". What do you think?

                Linus
Junio C Hamano - Nov. 5, 2011, 11:49 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> So I'd really like some way to not see it.
>
> Ted suggested a NUL character in the commit message in front of the
> "hidden content". What do you think?

You do not have to resort to NUL; we could just stuff whatever you do not
need to see but needs to be left *intact* in the new header fields just
like the embedded GPG signatures are stored in signed commits.

By the time the integrator is presented the merge commit template, we
would have:

 1. The merge title (e.g. "Merge tag for-linus of git://.../rusty.git/");

 2. Payload of the signed tag (or just "annotated tag"), which is used to
    convey meaningful topic description from the lieutenant;

 3. The signature in the tag, if the tag is not just merely annotated, but
    is signed;

 4. The output from GPG verification of the above (only when 3. is
    available); and

 5. The traditional "merge summary", if merge.log is enabled.

The 10-patch series I sent earlier appends 2 and 3 with "tag:" prefix and
4 with "# " prefix in the commit log template, but it does not have to be
that way. We could arrange things so that we put only 1, 2, 4 (still with
"# " prefix because this is meant to help you verify the authenticity, not
for later third-party audit, and to be stripped away with stripspace
before the commit is made) and 5 in the commit log template, and the
original signed tag contents (only when the tag is signed, not merely
annotated) in a separate file MERGE_SIG in $GIT_DIR/ next to MERGE_MSG,
and teach "git commit" to pick it up and stuff it in a new header field.

That way, the integrator can use the message 2 for the commit log message
and is free to typofix it, without breaking later third-party audit which
would use what is taken literally from the signed tag and stored in the
new header field, because the integrator's editor would never touch the
latter.
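
To make the "new header field" idea a bit more concrete, here is a small
Python sketch that splits the raw text of a commit (as "git cat-file commit"
would print it) into header fields and the log message. It assumes multi-line
header values are stored as continuation lines that begin with a single
space; the sample commit text and the parse_commit() helper are made up for
illustration.

```python
def parse_commit(raw):
    """Split raw commit text into (headers, message).

    headers is a list of (key, value) pairs. A header line that starts with
    a space is treated as a continuation of the previous header's value.
    """
    head, _, message = raw.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line.startswith(" ") and headers:
            key, value = headers[-1]
            headers[-1] = (key, value + "\n" + line[1:])
        else:
            key, _, value = line.partition(" ")
            headers.append((key, value))
    return headers, message


# Made-up commit text; the object names are placeholders, not real hashes.
raw = (
    "tree 1111\n"
    "parent 2222\n"
    "parent 3333\n"
    "author A U Thor <author@example.com> 0 +0000\n"
    "committer C O Mitter <committer@example.com> 0 +0000\n"
    "mergetag object 3333\n"
    " type commit\n"
    " tag for-linus\n"
    " \n"
    " Signed pull\n"
    "\n"
    "Merge tag 'for-linus'\n"
)

headers, message = parse_commit(raw)
mergetag = dict(headers)["mergetag"]
print(mergetag)
```

The point of the split is that the integrator only ever edits the message
part, while the tag contents carried in the header stay byte-for-byte intact
for a later audit.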

Linus Torvalds - Nov. 6, 2011, 12:53 a.m.
On Sat, Nov 5, 2011 at 4:49 PM, Junio C Hamano <junio@pobox.com> wrote:
>
> You do not have to resort to NUL; we could just stuff whatever you do not
> need to see but needs to be left *intact* in the new header fields just
> like the embedded GPG signatures are stored in signed commits.

Agreed, [ details removed ] that sounds perfect. And makes it easy to
get at if you want to with just "git cat-file commit" - without ever
really being visible to people who don't care. And having it visible
in the editor with '#' means that the user who does the merge gets to
see what actually ended up being put in there, along with the fact
that yes, it verified correctly.

So I think I really like that approach - it seems to solve all problems.

               Linus
Valdis.Kletnieks@vt.edu - Nov. 7, 2011, 7:52 a.m.
On Fri, 04 Nov 2011 08:14:52 PDT, Linus Torvalds said:
> On Fri, Nov 4, 2011 at 7:59 AM, Ted Ts'o <tytso@mit.edu> wrote:
> > Note that a repository format change will break a bunch of other
> > things as well, including references in commit descriptions ("This
> > fixes a regression introduced in commit 42DEADBEEF")

> No they won't. Not if you do it right. It's easy enough to
> automatically replace the SHA1's in the description, the same way we
> replace everything else.

OK.. I'll bite.  How do you disambiguate a '42deadbeef' in the changelog part
of a commit as being a commit ID, as opposed to being an address in a traceback
or something similar? Yes, I know you only change the ones that actually map to
a commit ID, but I'd not be surprised if by now we've got enough commits and
stack tracebacks in the git history that we'll birthday-paradox ourselves into
a false-positive in an automatic replacement.

(And it's OK to say "the 3 stack tracebacks in changelogs we just mangled can
just go jump", but it does need at least a few seconds consideration..)
Linus Torvalds - Nov. 7, 2011, 4:24 p.m.
On Sun, Nov 6, 2011 at 11:52 PM,  <Valdis.Kletnieks@vt.edu> wrote:
>
> OK.. I'll bite.  How do you disambiguate a '42deadbeef' in the changelog part
> of a commit as being a commit ID, as opposed to being an address in a traceback
> or something similar? Yes, I know you only change the ones that actually map to
> a commit ID, but I'd not be surprised if by now we've got enough commits and
> stack tracebacks in the git history that we'll birthday-paradox ourselves into
> a false-positive in an automatic replacement.

I don't think we are quite there yet. And (sadly) most of the commit
ID's in the history are 7 hex characters, because that used to be the
default git abbreviation. So there is unlikely to be any real
conflicts.

If we do miss one or two, that will be sad and embarrassing, but is
not a real problem in practice.

We probably could add various heuristics (the SHA1 values are *often*
preceded by the string "commit"), and a really good import would also
have somebody at least visually inspecting ones that other heuristics
say might be debatable (for example - because they have 8 hex digits
and there are other numbers around them that were *not* converted),
but in the end perfection is the enemy of good. It's not really worth
the headache to worry about *all* the cases, if you can cheaply and
simply get 99+% right.

And I think the 99% is almost trivial. While the last 1% may or may
not be worth worrying about.
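
A minimal Python sketch of the "cheap 99%" rewrite described above, under
stated assumptions: the conversion provides a table from full old commit
names to full new ones, a hex run is only replaced when it is a prefix of
exactly one old name, and anything else is left alone and reported for a
human to look at. The rewrite_ids() helper, the choice to keep the author's
abbreviation length, and the sample data are all hypothetical.

```python
import re

# Lower-case hex runs of 7 to 40 characters, delimited by word boundaries.
HEX_RUN = re.compile(r"\b[0-9a-f]{7,40}\b")


def rewrite_ids(message, mapping):
    """Replace hex runs that uniquely prefix a known old commit name.

    mapping: dict of full old name -> full new name.
    Returns (new_message, doubtful), where doubtful lists the hex runs that
    were left untouched because they matched no commit or more than one.
    """
    doubtful = []

    def replace(match):
        token = match.group(0)
        hits = [old for old in mapping if old.startswith(token)]
        if len(hits) == 1:
            # Keep the abbreviation length the author originally used.
            return mapping[hits[0]][:len(token)]
        doubtful.append(token)
        return token

    return HEX_RUN.sub(replace, message), doubtful


# Made-up names: one 40-character old name and its 64-character new name.
mapping = {"42deadbeef" + "0" * 30: "9f8e7d6c5b" + "1" * 54}
message = (
    "This fixes a regression introduced in commit 42deadbeef.\n"
    "Call trace: [<c01234ab>] panic\n"
)
new_message, doubtful = rewrite_ids(message, mapping)
print(new_message, end="")
print(doubtful)
```

Here the commit reference is rewritten, while the address in the traceback
matches no commit and only ends up in the list for visual inspection.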

               Linus
Junio C Hamano - Nov. 9, 2011, 5:26 p.m.
Linus Torvalds <torvalds@linux-foundation.org> writes:

> No, no, don't consider my "put in the merge message" a winner at all.
>
> I personally dislike it, and don't really think it's a wonderful thing
> at all. It really does have real downsides:
>
>  - internal signatures really *are* a disaster for maintenance. You
> can never fix them if they need fixing (and "need fixing" may well be
> "you want to re-sign things after a repository format change")
>
>  - they are ugly as heck, and you really don't want to see them in
> 99.999% of all cases.
>
> So putting those things in the merge commit message may have some
> upsides, but it has tons of downsides too.
>
> I think your refs/audit/ idea should be given real thought, because
> maybe that's the right idea.

With the latest round of touch-ups, modulo a few bugs I will be fixing
before the 1.7.8 final, I think what we have is more or less OK in the
shorter term and should be ready for general consumption. The ugliness is
gone, but the issue around internal signatures may remain to be solved in
the longer term. At least, by storing the full contents of the tag today
in an extended header, when we figure out how a detached signature should
really work, we could convert by extracting them from the history.

In a separate message earlier in the thread, you raised another issue.

> I hate how anonymous our branches are. Sure, we can use good names for
> them, but it was a mistake to think we should describe the repository
> (for gitweb), rather than the branch.
> 
> Ok, "hate" is a strong word. I don't "hate" it. I don't even think
> it's a major design issue. But I do think that it would have been
> nicer if we had had some branch description model.

At first glance, our branch model is indeed peculiar in that a branch
does not have a global identity. The scope of its name is local to the
repository, and it is just a pointer into the history. A "note" [*1*] that
can annotate a commit long after the commit is made is not a good way to
describe what a branch is about, because the tip of the branch can advance
beyond the commit that is annotated by such a note. A commit on a branch
does not serve as a good anchoring point to describe the branch.

However, a commit that merges the history of a branch, whether the merged
branch is from a local repository or from a remote one, does serve as a
good anchoring point. The work on a branch is finished as complete as
possible at the time of the merge, and the committer who merges the branch
agrees with both the objective and the implementation of the work done on
the branch, and that is why the merge is made [*2*]. Describing what the
history of the side branch was about in the resulting merge is a perfectly
sensible way to explain the branch. So in that sense, I am very happy with
the way the merge message template uses the pull request tag to let the
lieutenant explain and defend the history behind the tag used for the pull
request. Such an explanation does not have to be keyed with anybody's
local branch name (e.g. "for-linus" would mean different things for
different pull requests even from the same person), but keying it with the
resulting merge commit is a sensible way to leave the record in the
history.

After justifying with the above two paragraphs that it is perfectly
sensible to record the annotations on commits and not on "branch names", I
do agree that we would eventually want to be able to have such annotations
on commits after the fact. Neither "tags" nor "notes" is necessarily a
very good mechanism, however, for the purpose of "signed pull requests"
and "signed commits" [*3*]. Here are some pros and cons:

 - tags must be named, but the only thing we need is to be able to look
   the contents (with signature if signed) up given a commit object.
   Unlike the usual "I want to check out v3.0 release" look-up that goes
   from tag names to the commits, annotation look-ups go the other way, do
   not have to have a tagname, and having tagname does not help our
   look-up in any way. If we want to use tags to annotate various commits
   by various people and keep them around, we would need a global namespace
   that would not cause them to clash (we can work this around by using
   the object name of the tag, e.g. renaming 'for-linus' tag to $(git
   rev-parse tags/for-linus), but that is merely a workaround of having to
   name things that do not have to be named in the first place). As a
   local storage machinery for annotations, tags hanging below refs/tags/
   (or refs/audit for that matter) hierarchy with their own names is an
   inappropriate model.

 + tags can auto-follow the commits when object transfer happens (at least
   in the fetch direction), and for the purpose of "signed pull requests"
   and "signed commits", this is a desirable property. When a repository
   gains a commit, the annotations attached to the commit that are missing
   from the receiving repository are automatically transferred from the
   place the commit comes from. Annotations given to other commits that
   are not transferred into the repository do not come to the repository.

 - "git notes" is represented as a commit that records a tree that holds
   the entire mapping from commit to its annotations, and the only way to
   transfer it is to send it together with its history as a whole. It
   does not have the nice auto-following property that transfers only the
   relevant annotations.

 + "git notes" maps the commits to its annotations in the right direction;
   the object name of an annotated object to its annotation.

In the longer term, I think we would need to extend the system in the
following way:

 - Introduce a mapping mechanism that can be locally used to map names of
   the objects being annotated to names of other objects (most likely
   blobs but there is nothing that fundamentally prevents you from
   annotating a commit with a tree). The current "git notes" might be a
   perfectly suitable representation of this, or it may turn out to be
   lacking (I haven't thought things through), but the important point is
   that this "mapping store" is _local_. fsck, repack and prune need to be
   told that objects that store the annotation are reachable from the
   annotated objects.

 - Introduce a protocol extension to transfer this mapping information for
   objects being transferred in an efficient way. When "rev-list --objects
   have..want" tells us that the receiving end (in either fetch/push
   direction) would have an object at the end of the primary transfer
   (note that I did not say "an object will be sent in this transfer
   transaction"; "have" does not come into the picture), we make sure that
   missing annotations attached to the object are also transferred, and the new
   mapping is registered at the receiving end.

The detailed design for the latter needs more thought. The auto-following
of tags works even if nothing is being fetched in the primary transfer
(i.e. "git fetch" && "git fetch" back to back to update our origin/master
with the master at the origin) when a new tag is added to an ancient part of
the history that leads to the master at the origin, but this is exactly
because the sending end advertises all the available tags and the objects
they point at so that we can tell what new tags added to an old object are
missing from the receiving end. This obviously would not scale well when
we have tens of thousands of objects to annotate. Perhaps an entry in the
"mapping store" would record:

 - The object name of the object being annotated;

 - The object name of the annotation;

 - The "timestamp", i.e. when the association between the above two was
   made--this can be local to the repository and a simple counter would
   do.

and also maintain the last "timestamp" this repository sent annotations to
the remote (one timestamp per remote repository). When we push, we would
send annotations pertaining to the objects reachable from what we are
pushing (not limited by what they already have, as the whole point of this
exercise is to allow us to transfer annotations added to an object long
after the object was created and sent to the remote) that are newer than
that "timestamp". Similarly, when fetching, we would send the "timestamp"
this repository last fetched annotations from the other end (which means
we would need one such "timestamp" per remote repository) and let the
remote side decide the set of new annotations they added since we last
synched that are on objects reachable from what we "want".

Or something like that.
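
The push side of the bookkeeping above could be sketched roughly like this in
Python. It is only an illustration under stated assumptions: the "timestamp"
is a simple local counter, the set of reachable objects is handed in by the
caller rather than computed from history, and the MappingStore class with its
method names is hypothetical.

```python
class MappingStore:
    """Local store of (annotated object, annotation) pairs with a counter."""

    def __init__(self):
        self.counter = 0     # local "timestamp"; just counts associations
        self.entries = []    # list of (stamp, annotated, annotation)
        self.last_sent = {}  # remote name -> stamp of the last transfer

    def annotate(self, annotated, annotation):
        """Record that `annotation` was attached to `annotated` just now."""
        self.counter += 1
        self.entries.append((self.counter, annotated, annotation))
        return self.counter

    def pending_for(self, remote, reachable):
        """Annotations newer than the last transfer to `remote` that sit on
        objects in `reachable` (the objects reachable from what is pushed)."""
        since = self.last_sent.get(remote, 0)
        return [(obj, note) for stamp, obj, note in self.entries
                if stamp > since and obj in reachable]

    def mark_sent(self, remote):
        """Remember that `remote` is up to date as of the current counter."""
        self.last_sent[remote] = self.counter


store = MappingStore()
store.annotate("c1", "sig1")
store.annotate("c2", "sig2")
first = store.pending_for("origin", {"c1", "c2"})
store.mark_sent("origin")

store.annotate("c1", "sig3")   # added long after c1 was pushed
store.annotate("c9", "sigX")   # c9 is not part of what we push
second = store.pending_for("origin", {"c1", "c2"})
print(first, second)
```

One thing the sketch makes visible: with a single stamp per remote, an
annotation that was skipped because its object was not reachable at the time
(c9 here) would not be picked up by a later push once mark_sent() has moved
the stamp past it, so a real design would need to handle that case.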

[Footnote]

*1* By this word, I do not necessarily mean what the "git notes" command
manipulates. A tag that points at a commit is also equally a good vehicle
to annotate a commit after the fact.

*2* For this reason, it may make sense to "commit -S" such a merge
commit. The "mergetag" asserts the authenticity of the pull request from
the lieutenant whose history is being integrated, and the "gpgsig" asserts
the authenticity of the merge itself--the fact that it was made by the
integrator.

*3* I do not mean what "git commit -S" parked in 'pu' produces, which is
to store the signature in the commit. Adding "Signed-off-by:" after the
fact to an existing commit by many people is a more appropriate example.

Johan Herland - Nov. 10, 2011, 8:02 a.m.
On Wed, Nov 9, 2011 at 18:26, Junio C Hamano <gitster@pobox.com> wrote:
>  - "git notes" is represented as a commit that records a tree that holds
>   the entire mapping from commit to its annotations, and the only way to
>   transfer it is to send it together with its history as a whole. It
>   does not have the nice auto-following property that transfers only the
>   relevant annotations.

True. However, consider these mitigating factors:

 - The annotations in question (the "signing" of commits) are all intended to
   be merged eventually (i.e. there is no reason for a developer to (after the
   fact) sign a commit that will never end up in the public record). Therefore,
   most or all of the notes in the notes tree are already relevant, or will
   become relevant in the near future (when the associated commits are merged).

 - Additionally, you could organize these notes into two (or more) notes trees,
   one for merged/official annotations, and one for unmerged/pending
   annotations. Then make the relevant tools (e.g. "git merge") transfer notes
   from one tree
   to the other, thereby making sure that the "official" record only contains
   notes that are relevant to the merged history.

 - Finally, there's always "git notes prune" to purge annotations for commits
   that ended up never being merged.

My point is that although "notes" might end up transferring more annotations
than strictly necessary, I believe that in practice all the notes being
transferred are already (or will soon become) relevant.

>  + "git notes" maps the commits to its annotations in the right direction;
>   the object name of an annotated object to its annotation.
>
> In the longer term, I think we would need to extend the system in the
> following way:
>
>  - Introduce a mapping mechanism that can be locally used to map names of
>   the objects being annotated to names of other objects (most likely
>   blobs but there is nothing that fundamentally prevents you from
>   annotating a commit with a tree). The current "git notes" might be a
>   perfectly suitable representation of this, or it may turn out to be
>   lacking (I haven't thought things through), but the important point is
>   that this "mapping store" is _local_. fsck, repack and prune need to be
>   told that objects that store the annotation are reachable from the
>   annotated objects.

IMHO this is precisely what "git notes" does today.

>  - Introduce a protocol extension to transfer this mapping information for
>   objects being transferred in an efficient way. When "rev-list --objects
>   have..want" tells us that the receiving end (in either fetch/push
>   direction) would have an object at the end of the primary transfer
>   (note that I did not say "an object will be sent in this transfer
>   transaction"; "have" does not come into the picture), we make sure that
>   missing annotations attached to the object is also transferred, and new
>   mapping is registered at the receiving end.
>
> The detailed design for the latter needs more thought. The auto-following
> of tags works even if nothing is being fetched in the primary transfer
> (i.e. "git fetch" && "git fetch" back to back to update our origin/master
> with the master at the origin) when a new tag is added to ancient part of
> the history that leads to the master at the origin, but this is exactly
> because the sending end advertises all the available tags and the objects
> they point at so that we can tell what new tags added to an old object is
> missing from the receiving end. This obviously would not scale well when
> we have tens of thousands of objects to annotate. Perhaps an entry in the
> "mapping store" would record:
>
>  - The object name of the object being annotated;
>
>  - The object name of the annotation;
>
>  - The "timestamp", i.e. when the association between the above two was
>   made--this can be local to the repository and a simple counter would
>   do.
>
> and also maintain the last "timestamp" this repository sent annotations to
> the remote (one timestamp per remote repository). When we push, we would
> send annotations pertaining to the object reachable from what we are
> pushing (not limited by what they already have, as the whole point of this
> exercise is to allow us to transfer annotations added to an object long
> after the object was created and sent to the remote) that is newer than
> that "timestamp". Similarly, when fetching, we would send the "timestamp"
> this repository last fetched annotations from the other end (which means
> we would need one such "timestamp" per remote repository) and let the
> remote side decide the set of new annotations they added since we last
> synched that are on objects reachable from what we "want".
>
> Or something like that.

You would also have to keep track of deleted annotations, to enable the local
side to delete an annotation corresponding to an already-deleted annotation
on the remote side.

Pretty soon, you end up having to record something similar to a DAG,
describing the history of manipulating these annotations. At that point, your
"timestamp" calculation starts to look very similar to the "have..want"
calculation already done when transferring "regular" refs. At which point you
have a system that is very similar to what "git notes" does today...


...Johan
David Woodhouse - Nov. 10, 2011, 1:51 p.m.
On Wed, 2011-11-02 at 21:13 -0700, Linus Torvalds wrote:
> No, my main objection to saving the data is that it's ugly and it's
> redundant. Sure, in practice you can check the signatures later fine
> (with the rare exceptions you mention), but even when you can do it,
> what's the big upside? 

Another objection (although it may not be insurmountable) is that it's
not necessarily *entirely* clear what's being signed.

In the simple case where I clone your tree, make a few commits with my
Signed-off-by:, sign a tag and then ask you to pull, that's easy enough.
I'm vouching for what I committed, and not for everything that was in
your tree beforehand.

But what if I'm working on top of someone else's published git tree?
Does a signed tag at the top of *my* work imply that I'm vouching for
all of theirs too?

In the case where the signature is ephemeral and only used for you to
trust my pull request, the answer is simple: If that other work wasn't
in your tree yet at the time I send my pull request, I'd damn well
better be vouching for it when I ask you to pull it. Nothing new there.

But if we're keeping signatures around for auditing purposes, we'd
better have a coherent answer to that question. One that isn't "a
signature covers everything since the last commit with torvalds@ as the
committer", if we want it to be useful for the general case.
David Woodhouse - Nov. 10, 2011, 1:52 p.m.
On Tue, 2011-11-01 at 14:21 -0700, Linus Torvalds wrote:
> I hate how anonymous our branches are. Sure, we can use good names for
> them, but it was a mistake to think we should describe the repository
> (for gitweb), rather than the branch.
> 
> Ok, "hate" is a strong word. I don't "hate" it. I don't even think
> it's a major design issue. But I do think that it would have been
> nicer if we had had some branch description model. 

I actually quite like it. I take it as a hint: if the contents of a
branch are *so* wildly different from the main repository that they need
a different description, perhaps I should be using a separate repository
instead of just a branch.
Junio C Hamano - Nov. 10, 2011, 3:15 p.m.
Johan Herland <johan@herland.net> writes:

> On Wed, Nov 9, 2011 at 18:26, Junio C Hamano <gitster@pobox.com> wrote:
>>  - "git notes" is represented as a commit that records a tree that holds
>>   the entire mapping from commit to its annotations, and the only way to
>>   transfer it is to send it together with its history as a whole. It
>>   does not have the nice auto-following property that transfers only the
>>   relevant annotations.
>
> True. However, consider these mitigating factors:
> ...
>
> My point is that although "notes" might end up transferring more
> annotations than strictly necessary, I believe that in practice all the
> notes being transferred are already (or will soon become) relevant.

Sorry, but I do not think you are considering what would happen when you
have many branches with different purposes, whose commits near tips will
never get merged with each other. "automatic following" semantics like
what "git fetch" does for signed tags is absolutely necessary in such a
case, and the above are not mitigating factors at all in that context.


Marc Branchaud - Nov. 10, 2011, 3:23 p.m.
On 11-11-10 08:51 AM, David Woodhouse wrote:
> On Wed, 2011-11-02 at 21:13 -0700, Linus Torvalds wrote:
>> No, my main objection to saving the data is that it's ugly and it's
>> redundant. Sure, in practice you can check the signatures later fine
>> (with the rare exceptions you mention), but even when you can do it,
>> what's the big upside? 
> 
> Another objection (although it may not be insurmountable) is that it's
> not necessarily *entirely* clear what's being signed.

I think this is a non-issue as far as the implementation is concerned.  That
is, the question exists regardless of what actual bits get (hashed and)
encrypted by a private key.  Furthermore, the answer will depend on who's
using the signatures and in what context, and it's not appropriate for the
git tool to make assumptions about those things.

> In the simple case where I clone your tree, make a few commits with my
> Signed-off-by:, sign a tag and then ask you to pull, that's easy enough.
> I'm vouching for what I committed, and not for everything that was in
> your tree beforehand.
> 
> But what if I'm working on top of someone else's published git tree?
> Does a signed tag at the top of *my* work imply that I'm vouching for
> all of theirs too?

<philosophy>

It all depends on what you mean by "vouch for".

You obviously thought that the 3rd-party repo was good for something,
otherwise why did you base your work on it in the first place?  So maybe
you're just vouching for the 3rd-party repo being good enough for what you're
trying to do.

Or, maybe you've done a thorough analysis of the 3rd-party code and are ready
to certify it as completely memory-leak-free or something.

Or or, maybe you're only making a statement about the commits that you've
authored yourself.  (You probably want to individually sign each of those
commits in this case.)

These sorts of issues have been debated on PKI mailing lists ad nauseam.  I
think the best approach is that if you want your signature to have a
particular meaning, then put that into some text that's part of what's being
signed.  Let other humans read that text and make their own decisions.

</philosophy>

And whatever the case, the software that makes and validates the signatures
shouldn't make any assertions about how to interpret good or bad signatures.
 (Yes, other software could interpret meanings according to some criteria,
and that software could exist alongside or be incorporated into the basic
digital signature software, but the interpretation software is doing a
different job.)

		M.
Johan Herland - Nov. 10, 2011, 4:03 p.m.
On Thu, Nov 10, 2011 at 16:15, Junio C Hamano <junio@pobox.com> wrote:
> Johan Herland <johan@herland.net> writes:
>> On Wed, Nov 9, 2011 at 18:26, Junio C Hamano <gitster@pobox.com> wrote:
>>>  - "git notes" is represented as a commit that records a tree that holds
>>>   the entire mapping from commit to its annotations, and the only way to
>>>   transfer it is to send it together with its history as a whole. It
>>>   does not have the nice auto-following property that transfers only the
>>>   relevant annotations.
>>
>> True. However, consider these mitigating factors:
>> ...
>>
>> My point is that although "notes" might end up transferring more
>> annotations than strictly necessary, I believe that in practice all the
>> notes being transferred are already (or will soon become) relevant.
>
> Sorry, but I do not think you are considering what would happen when you
> have many branches with different purposes, whose commits near tips will
> never get merged with each other. "automatic following" semantics like
> what "git fetch" does for signed tags is absolutely necessary in such a
> case, and the above are not mitigating factors at all in that context.

What about having one notes ref per branch? If/when the branch is merged,
the associated notes ref containing the annotations for the commits on that
branch would be merged as well (using "git notes merge").

Sure, using one notes ref per branch is more expensive than a single notes
ref, but it's still cheaper than one ref per signed commit (which is what we
get when using annotated tags). And it prevents the added code and
complexity of the timestamped mapping approach.
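
The per-branch arrangement described above could be driven with existing
"git notes" commands. A minimal sketch, assuming a naming convention of
refs/notes/branches/<branch> for the per-branch notes ref (that name and the
helper merge_branch_notes are made up for illustration; git defines neither):

```shell
#!/bin/sh
# Hypothetical convention: annotations for commits on <branch> are kept in
# refs/notes/branches/<branch> rather than in the default notes ref.

# merge_branch_notes <branch>
# Fold the per-branch notes ref into refs/notes/commits. Meant to be run
# after the branch itself has been merged.
merge_branch_notes () {
	git notes --ref=refs/notes/commits merge "refs/notes/branches/$1"
}
```

For example, after "git merge topic", running "merge_branch_notes topic"
would bring along the notes recorded earlier with
"git notes --ref=refs/notes/branches/topic add". This is a separate step the
user has to remember; nothing in "git merge" triggers it.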


...Johan
Junio C Hamano - Nov. 10, 2011, 5:18 p.m.
Johan Herland <johan@herland.net> writes:

> What about having one notes ref per branch? If/when the branch is merged,
> the associated notes ref containing the annotations for the commits on that
> branch would be merged as well (using "git notes merge").

That is a crude workaround that you could (with help from users) make
work, but it does not change the fact that the current mechanism to
transfer and integrate notes across repositories is a bad match for what
the "signed commit" type annotations want to achieve. In fact, the need
for such a workaround is an illustration of how bad a match the mechanism
is.

When you merge a history that has commit A into another history that did
not have that commit, the act of creating a merge commit itself should be
enough to make the resulting history contain that commit. The commit
DAG already expresses it, and if a parallel "notes" mechanism needs to be
futzed with to match that DAG, and a command like "merge" needs to be told
to help that process, that is a shortcoming of the "notes" mechanism.




Johan Herland - Nov. 11, 2011, 1:17 a.m.
On Thu, Nov 10, 2011 at 18:18, Junio C Hamano <junio@pobox.com> wrote:
> Johan Herland <johan@herland.net> writes:
>
>> What about having one notes ref per branch? If/when the branch is merged,
>> the associated notes ref containing the annotations for the commits on that
>> branch would be merged as well (using "git notes merge").
>
> That is a crude workaround that you could (with help from users) make
> work, but it does not change the fact that the current mechanism to
> transfer and integrate notes across repositories is a bad match for what
> the "signed commit" type annotations want to achieve. In fact, the need
> for such a workaround is an illustration of how bad a match the mechanism
> is.
>
> When you merge a history that has commit A into another history that did
> not have that commit, the act of creating a merge commit itself should be
> enough to make the resulting history contain that commit. The commit
> DAG already expresses it, and if a parallel "notes" mechanism needs to be
> futzed with to match that DAG, and a command like "merge" needs to be told
> to help that process, that is a shortcoming of the "notes" mechanism.

[ ...and from elsewhere in this thread: ]

> Note that in this thread, I am not saying that "git notes" mechanism is
> not good for anything. A tree whose node names encode an object name is a
> valid way to store the mapping from that object to a set of other objects,
> and we already agreed that as the "local" storage mechanism, "git notes"
> may be used as-is for the purpose of this thread.
>
> But the transfer and merge semantics that the "git notes" mechanism
> offers treat the entire set of "notes" in _one_ repository as a unit and
> merge that set into the entire "notes" in another repository, which is
> not a good match for the purpose of this thread.

Ok. Point taken.

Given that we need an alternative way to transfer annotations between
repos (using auto-follow to select the relevant set of annotations, and
then transferring only those annotations): Can we leverage existing
functionality in "notes" where useful (e.g. using existing notes merge
strategies to deal with colliding annotations), while at the same time
extending the current "notes" feature with this alternative transfer
mechanism? FWIW, I expect there are other "notes" use cases that
would also prefer the auto-follow only-relevant transfer behavior.

So, how can we use "notes" to better support the transfer semantics you
suggest? The mapping from the object being annotated to the annotation
object is already contained in the notes tree, but the "timestamp" you
describe (needed to efficiently calculate the set of annotations to
auto-follow) is not [1]. However, we could easily enough add a sorted
list of (timestamp, annotated object name) pairs, to allow fast lookup
of annotations created after a given timestamp. We could even store this
list in a blob or tree object referenced directly from the notes tree [2].
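
A minimal sketch of the lookup that such a list would allow. The on-disk
format (one "<timestamp> <object name>" pair per line, sorted by timestamp)
and the helper name annotations_since are assumptions for illustration only;
the notes code defines neither.

```shell
#!/bin/sh
# Hypothetical format: one "<unix-timestamp> <annotated-object-name>" pair
# per line, sorted by ascending timestamp.

# annotations_since <list-file> <timestamp>
# Print the names of objects whose annotation timestamp is strictly newer
# than <timestamp>, in list order.
annotations_since () {
	awk -v since="$2" '$1 + 0 > since + 0 { print $2 }' "$1"
}
```

awk scans the whole file here, which keeps the sketch short; because the
list is sorted, a real implementation could binary-search for the first
entry newer than the timestamp instead.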


Have fun! :)

...Johan


[1]: Although I did at some point experiment with using timestamps in the
internal organization of the notes tree (see for example
http://article.gmane.org/gmane.comp.version-control.git/127966 ), I ended
up using only the annotated object name (with flexible fanout). I don't
think that reintroducing timestamps in the notes tree organization will
pay off, because we need both lookup by annotated SHA1 and lookup by
newer-than-given-timestamp to be fast, and there's AFAIK no way to get
both from a single notes tree organization.

[2]: E.g. accessible with "git cat-file -p refs/notes/foo:timestamps". When
a notes tree contains an entry that is obviously not an object name (SHA1),
the notes code will leave it alone/untouched in the tree (see "struct
non_note" and associated code in notes.c for further details).
Junio C Hamano - Nov. 11, 2011, 5:26 a.m.
Johan Herland <johan@herland.net> writes:

> Given that we need an alternative way to transfer annotations between
> repos (using auto-follow to select the relevant set of annotations, and
> then transferring only those annotations): Can we leverage existing
> functionality in "notes" where useful (e.g. using existing notes merge
> strategies to deal with colliding annotations), while at the same time
> extending the current "notes" feature with this alternative transfer
> mechanism? FWIW, I expect there are other "notes" use cases that
> would also prefer the auto-follow only-relevant transfer behavior.
>
> So, how can we use "notes" to better support the transfer semantics you
> suggest? The mapping from the object being annotated to the annotation
> object is already contained in the notes tree, but the "timestamp" you
> describe (needed to efficiently calculate the set of annotations to
> auto-follow) is not [1].

Please do not take the "timestamp" part too seriously.

I am starting to think that what we want in this context actually is very
close to annotated tags. I said we want a mapping from an annotated object
to "a set of other objects" that annotate it, but it was an unnecessary
and premature generalization. There is no reason that these annotations
have to be structured "Git" objects such as blobs and trees.

A set of annotated tags that have the same value on their "object" field
is a perfect match for "a set of annotations attached to a given object".

We already know that using the real tags has its own problems coming from
having to give each and every one of them unique names somewhere in the
refs hierarchy (be it refs/tags/ or refs/audit/), but imagine if we
somehow had a way to:

 - keep these annotated tags in the object store;

 - keep them from getting pruned even if they are not referenced from
   anywhere in refs/ hierarchy;

 - given an object, efficiently enumerate such annotated tags that refer to
   the object.

And then imagine that we are pushing history leading to a commit from one
repository to another. Both repositories store these "anonymous" (that is
what they are---they do not have a name in the refs/ hierarchy) tags.

The two repositories can individually enumerate all these "anonymous" tags
that annotate commits in the history that is being exchanged, and run a
set reconciliation algorithm (e.g. [*1*]) to find out the anonymous tags
that are missing from the recipient repository.
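
As a naive stand-in for that step, the sketch below computes a plain set
difference over two complete lists of anonymous tag names. This is not the
algorithm from the cited paper, which is designed to avoid exchanging full
lists; the helper name missing_on_recipient and the one-name-per-line input
files are assumptions for illustration.

```shell
#!/bin/sh
# missing_on_recipient <sender-list> <recipient-list>
# Each file holds one anonymous-tag object name per line, in any order.
# Print the names present on the sender but absent from the recipient.
missing_on_recipient () {
	sender_sorted=$(mktemp) &&
	recipient_sorted=$(mktemp) &&
	LC_ALL=C sort -u "$1" >"$sender_sorted" &&
	LC_ALL=C sort -u "$2" >"$recipient_sorted" &&
	LC_ALL=C comm -23 "$sender_sorted" "$recipient_sorted"
	status=$?
	# Clean up the temporary sorted copies before reporting the result.
	rm -f "$sender_sorted" "$recipient_sorted"
	return $status
}
```

The output is the set of tags the sender would still have to transfer. The
cost of this version grows with the total number of tags on both sides, not
with the size of the difference.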

Such an approach does not require any timestamp.

My point is _not_ that the alternative in this message is superior to the
handwaving in my other message, but that it may not be the best approach
to think about what needs to be added to "notes" to make it applicable to
the problem we are solving.

Rather, I think we should design what the overall system should look like
(i.e. what properties the resulting system should have) and then find out
what is necessary in each part of the resulting solution (i.e. the list of
"somehow had a way to..." above, plus "efficient set reconciliation").


[Footnote]

*1* What's the Difference? Efficient Set Reconciliation without Prior
Context http://cseweb.ucsd.edu/~fuyeda/papers/sigcomm2011.pdf

Patch

 git-request-pull.sh |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/git-request-pull.sh b/git-request-pull.sh
index fc080cc5e45d..22b51930959f 100755
--- a/git-request-pull.sh
+++ b/git-request-pull.sh
@@ -20,11 +20,14 @@  GIT_PAGER=
 export GIT_PAGER
 
 patch=
+sign=
 while	case "$#" in 0) break ;; esac
 do
 	case "$1" in
 	-p)
 		patch=-p ;;
+	-s)
+		sign=-s ;;
 	--)
 		shift; break ;;
 	-*)
@@ -73,6 +76,12 @@  are available in the git repository at:' $baserev &&
 echo "  $url $branch" &&
 echo &&
 
+if test -n "$sign"
+then
+	printf 'Commit %s\nfrom %s\n' "$headrev" "$url" | gpg --clearsign
+	echo
+fi &&
+
 git shortlog ^$baserev $headrev &&
 git diff -M --stat --summary $patch $merge_base..$headrev || exit
 exit $status