diff mbox series

[1/1] scripts: create kernel configuration upgrade script

Message ID 41e9cda304814eca2066a87cf2477c83933358f7.1707269880.git.ehem+openwrt@m5p.com
State Superseded
Delegated to: Petr Štetiar

Commit Message

Elliott Mitchell Feb. 7, 2024, 1:16 a.m. UTC
Create a script for automating kernel version changes.  This
generates a pair of commits which cause history to remain attached
to all versioned configuration files.

Signed-off-by: Elliott Mitchell <ehem+openwrt@m5p.com>
---
 scripts/kernel_upgrade.pl | 191 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)
 create mode 100755 scripts/kernel_upgrade.pl

Comments

Jonas Gorski Feb. 7, 2024, 10:53 a.m. UTC | #1
On Wed, 7 Feb 2024 at 02:48, Elliott Mitchell <ehem+openwrt@m5p.com> wrote:
>
> Create a script for automating kernel version changes.  This
> generates a pair of commits which cause history to remain attached
> to all versioned configuration files.

Why is this script needed? What exactly does it do? Does it preserve
bisectability? How would you use it? I see neither a help message nor
any usage examples.

Please provide more detailed explanation in the commit message,
especially since perl isn't the most common or easy to read language.

Regards,
Jonas
Felix Baumann Feb. 7, 2024, 11:57 a.m. UTC | #2
On 7 February 2024 at 11:53:55 CET, Jonas Gorski <jonas.gorski@gmail.com> wrote:
>On Wed, 7 Feb 2024 at 02:48, Elliott Mitchell <ehem+openwrt@m5p.com> wrote:
>>
>> Create a script for automating kernel version changes.  This
>> generates a pair of commits which cause history to remain attached
>> to all versioned configuration files.
>
>Why is this script needed? What exactly does it do? Does it preserve
>bisectability? How would you use it? I see neither a help message nor
>any usage examples.
>
>Please provide more detailed explanation in the commit message,
>especially since perl isn't the most common or easy to read language.
>
>Regards,
>Jonas

This might be of help
<https://lists.openwrt.org/pipermail/openwrt-devel/2023-October/041672.html>
Elliott linked it in his previous mail.

It explains the problem fairly well.

Short version:
Right now, every new kernel version gets a new kernel config file that is created as a copy of the old one. The copy doesn't preserve the git history, which hinders the use of git blame.

git bisect should still work fairly well even without the change.

Regards
Felix
Jonas Gorski Feb. 7, 2024, 12:33 p.m. UTC | #3
On Wed, 7 Feb 2024 at 12:58, Felix Baumann via openwrt-devel
<openwrt-devel@lists.openwrt.org> wrote:
> On 7 February 2024 at 11:53:55 CET, Jonas Gorski <jonas.gorski@gmail.com> wrote:
> >On Wed, 7 Feb 2024 at 02:48, Elliott Mitchell <ehem+openwrt@m5p.com> wrote:
> >>
> >> Create a script for automating kernel version changes.  This
> >> generates a pair of commits which cause history to remain attached
> >> to all versioned configuration files.
> >
> >Why is this script needed? What exactly does it do? Does it preserve
> >bisectability? How would you use it? I see neither a help message nor
> >any usage examples.
> >
> >Please provide more detailed explanation in the commit message,
> >especially since perl isn't the most common or easy to read language.
> >
> >Regards,
> >Jonas
>
> This might be of help
> <https://lists.openwrt.org/pipermail/openwrt-devel/2023-October/041672.html>
> Elliott linked it in his previous mail.
>
> It explains the problem fairly well.
>
> Short version:
> Right now, every new kernel version gets a new kernel config file that is created as a copy of the old one. The copy doesn't preserve the git history, which hinders the use of git blame.

I'm aware of the discussion; my point is that the information /
context should be in the commit description, not in an external
place with no reference to it. If someone not aware of the email
thread looks at the commit, I doubt they will be able to understand the
issue this is trying to solve, or how it is solving it. And since
neither the issue, the solution, nor the script itself is trivial, all
three need some sort of explanation in the commit message.

FWIW, git blame also supports a "find copies harder" option (-C) that can
detect copies, not just delete/add pairs as renames (it can be passed
multiple times for even harder finding). Unfortunately it's rather slow and
does not work as well as other git commands' find-copies-harder. Not
sure why. Might be worth investigating.
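
A minimal sketch of what the `-C` options change, run in a scratch repository. The file names and config contents are made up for illustration; `git blame -C` and `--porcelain` are the only git features relied on. Passing `-C` twice asks blame to also look for copies from other files in the commit that creates the file, which is the situation a copied kernel config is in:

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository; every name here is illustrative.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .

# A config file with some history behind it.
printf 'CONFIG_ALPHA_SUPPORT=y\nCONFIG_BRAVO_SUPPORT=y\nCONFIG_CHARLIE_SUPPORT=m\nCONFIG_DELTA_SUPPORT=y\n' > config-6.1
git add config-6.1
git commit -q -m 'add config-6.1'

# The practice described in the thread: the new config starts as a plain copy.
cp config-6.1 config-6.6
git add config-6.6
git commit -q -m 'copy config-6.1 to config-6.6'

# Reduce porcelain blame output to the file name(s) the lines are credited to.
origin_of() {
	sed -n 's/^filename //p' | sort -u
}

plain=$(git blame --porcelain config-6.6 | origin_of)
harder=$(git blame -C -C --porcelain config-6.6 | origin_of)

echo "plain blame credits:  $plain"
echo "blame -C -C credits:  $harder"
```

The two printed names show whether blame stopped at the copy or followed the lines back to the original file. On a tree the size of OpenWrt the second form is the one reported as slow.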

> git bisect should still work fairly well even without the change.

What does "fairly well" mean? Either this produces commits that work,
or it produces commits that break the build. There is no in between.

Regards,
Jonas
Elliott Mitchell Feb. 7, 2024, 3:39 p.m. UTC | #4
On Wed, Feb 07, 2024 at 11:53:55AM +0100, Jonas Gorski wrote:
> On Wed, 7 Feb 2024 at 02:48, Elliott Mitchell <ehem+openwrt@m5p.com> wrote:
> >
> > Create a script for automating kernel version changes.  This
> > generates a pair of commits which cause history to remain attached
> > to all versioned configuration files.
> 
> Why is this script needed? What exactly does it do? Does it preserve
> bisectability? How would you use it? I see neither a help message nor
> any usage examples.
> 
> Please provide more detailed explanation in the commit message,
> especially since perl isn't the most common or easy to read language.

Hmm, true.  This might be closer to PoC/WIP status since I was under
major time pressure.  I rather urgently wanted to get this out *before*
the 6.6 configs were added.

Use is supposed to be fairly simple:

```
scripts/kernel_upgrade.pl 6.1 6.6

git merge --ff-only <sha1 given by "Result is commit xxxx" line>
```

The script is *meant* to take care of that second step, but git has so
far been refusing to do that in an automated fashion.  People wanting to
test could instead create a temporary branch at the specified commit and
examine whether they like the result.
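
For anyone who wants to see the intended outcome without running the script, the pair of commits can be approximated by hand in a scratch repository. This is only a sketch of the rename-then-merge idea using ordinary git commands, not the script's `git fast-import` path; the file names, contents, and the `dup` branch name are assumptions made for the demo:

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository; every name here is illustrative.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Original versioned config with some history.
printf 'CONFIG_ALPHA_SUPPORT=y\nCONFIG_BRAVO_SUPPORT=y\nCONFIG_CHARLIE_SUPPORT=m\n' > config-6.1
git add config-6.1
git commit -q -m 'add config-6.1'
first=$(git rev-parse HEAD)

# Commit 1 of the pair: rename on a side branch. config-6.1 is gone here,
# which is why this intermediate commit is the unbuildable one.
git checkout -q -b dup
git mv config-6.1 config-6.6
git commit -q -m 'rename config-6.1 to config-6.6'

# Commit 2 of the pair: merge the rename back and restore the old name,
# so both files exist and both reach the original history.
git checkout -q -
git merge -q --no-ff --no-commit dup
git checkout -q HEAD -- config-6.1
git commit -q -m 'merge: keep config-6.1 and config-6.6'

# Which commit and file does plain blame (no -C) credit for each config?
blamed_commit=$(git blame --porcelain config-6.6 | sed -n '1s/ .*//p')
origin_new=$(git blame --porcelain config-6.6 | sed -n 's/^filename //p' | sort -u)
origin_old=$(git blame --porcelain config-6.1 | sed -n 's/^filename //p' | sort -u)

echo "config-6.6 lines credited to $blamed_commit ($origin_new)"
echo "config-6.1 lines credited to $origin_old"
```

Because the first commit is a rename and the second a merge, plain `git blame` can follow both names back to the original commit without any copy detection.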



On Wed, Feb 07, 2024 at 01:33:11PM +0100, Jonas Gorski wrote:
> On Wed, 7 Feb 2024 at 12:58, Felix Baumann via openwrt-devel
> <openwrt-devel@lists.openwrt.org> wrote:
> >
> > This might be of help
> > <https://lists.openwrt.org/pipermail/openwrt-devel/2023-October/041672.html>
> > Elliott linked it in his previous mail.
> >
> > It explains the problem fairly well.
> >
> > Short version:
> > Right now every new kernel version creates a new kernel config file as a copy of the old one which doesn't preserve the git history which hinders the usage of git blame.
> 
> I'm aware of the discussion; my point is that the information /
> context should be in the commit description, not in an external
> place with no reference to it. If someone not aware of the email
> thread looks at the commit, I doubt they will be able to understand the
> issue this is trying to solve, or how it is solving it. And since
> neither the issue, the solution, nor the script itself is trivial, all
> three need some sort of explanation in the commit message.

I'll see about some updating.

> FWIW, git blame also supports a "find copies harder" option (-C) that can
> detect copies, not just delete/add pairs as renames (it can be passed
> multiple times for even harder finding). Unfortunately it's rather slow and
> does not work as well as other git commands' find-copies-harder. Not
> sure why. Might be worth investigating.

Rather slow is an understatement: it is more than an order of magnitude
slower.  That is the difference between a command you can freely run
whenever you are curious about something you notice, and one you only
start when you urgently need to examine a major oddity, then go off for
a coffee, tea, or other break while waiting for the result.

Right now `git blame` is too slow to be useful in many places where it
should be used.

> > git bisect should still work fairly well even without the change.
> 
> What does "fairly well" mean? Either this produces commits that work,
> or it produces commits that break the build. There is no in between.

Due to the method, a single non-buildable commit is inherently created.
If `git bisect` lands on that, you will need to run `git bisect skip`
to continue the bisect process.  The commit message could readily be
modified to state this.
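
For automated bisecting, `git bisect run` treats exit status 125 from the test command as "skip this commit". A rough sketch of a helper built on that follows; the function names are hypothetical, and the marker string is the line the patch puts in its generated commit messages. Since both generated commits carry the marker, this skips slightly more than strictly necessary:

```shell
#!/bin/sh
# Marker line the script under review writes into its generated commits.
MARKER='This is a special tool-generated commit.'

# Succeeds when the commit message read from stdin contains the marker.
is_tool_generated() {
	grep -qF "$MARKER"
}

# One bisect step: skip tool-generated commits, otherwise run the given
# build/test command and pass its exit status through to git bisect.
bisect_step() {
	if git log -1 --format=%B HEAD | is_tool_generated; then
		return 125    # tells "git bisect run" to skip this commit
	fi
	"$@"
}
```

A wrapper script that sources these functions and ends with `bisect_step "$@"` could then be passed to `git bisect run` together with the usual build command.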

I conservatively estimate OpenWrt is generating around 3000 commits per
year.  If the kernel version changes once per year, then there is a
0.03% chance of landing on the commit.  If your typical bisect session
touches 12 commits (this seems excessive) then about 0.4% of your bisect
sessions will hit these.

While infinitely greater than zero, that is a worthy trade for making
`git blame` functional on configuration files.
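
The arithmetic behind this estimate can be checked with a few lines of shell. This is a back-of-the-envelope sketch that assumes every tested commit is equally likely to be the special one and that there is one such commit per 3000; the function name is made up for the example:

```shell
#!/bin/sh
# Percent chance that testing "tested" commits out of "commits" total
# lands on one specific commit, under a uniform assumption.
hit_percent() {
	LC_ALL=C awk -v commits="$1" -v tested="$2" \
		'BEGIN { printf "%.2f\n", 100 * tested / commits }'
}

per_commit=$(hit_percent 3000 1)
per_session=$(hit_percent 3000 12)

echo "single tested commit:      ${per_commit}%"
echo "12-commit bisect session:  ${per_session}%"
```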

Patch

diff --git a/scripts/kernel_upgrade.pl b/scripts/kernel_upgrade.pl
new file mode 100755
index 0000000000..6cebaec201
--- /dev/null
+++ b/scripts/kernel_upgrade.pl
@@ -0,0 +1,191 @@ 
+#!/usr/bin/env perl
+# 
+# Copyright (C) 2024 Elliott Mitchell <ehem+openwrt@m5p.com>
+#
+# This is free software, licensed under the GNU General Public License v3.
+# See /LICENSE for more information.
+#
+
+use warnings;
+use strict;
+
+use feature 'state';
+
+die("wrong number of arguments") if(@ARGV!=2);
+
+my $start;
+
+my ($from, $to)=@ARGV;
+
+sub load()
+{
+	my $ret=[];
+
+	open(my $fd, '-|', 'git rev-parse HEAD')||die("failed to run git rev-parse");
+	$start=<$fd>;
+	chop($start);
+
+	local $/="\0";
+	open($fd, '-| :raw :bytes', "git ls-tree -trz --full-name --name-only HEAD -- target/linux")||die("failed to read git tree");
+
+	while(<$fd>) {
+		chop($_);
+		push(@$ret, substr($_, 0, -length($from))) if(substr($_, -length($from)) eq $from);
+	}
+
+	@$ret=sort({length($b)-length($a)} @$ret);
+
+	return $ret;
+}
+
+my $gitpid;
+my $gitfds=[undef, undef];
+
+sub startgit()
+{
+	my $child=[];
+	(pipe($child->[0], $gitfds->[0])&&pipe($gitfds->[1], $child->[1])) ||
+		die("pipe() failed");
+	binmode($gitfds->[0]);
+	binmode($gitfds->[1]);
+
+	$gitpid=fork();
+	if($gitpid) {
+		close($child->[0]);
+		close($child->[1]);
+		$gitfds->[0]->autoflush(1);
+	} elsif(defined($gitpid)) {
+		close($gitfds->[0]);
+		close($gitfds->[1]);
+
+		open(STDIN, '<&', $child->[0]);
+		close($child->[0]);
+
+		open(STDOUT, '>&', $child->[1]);
+		close($child->[1]);
+
+		exec('git', 'fast-import', '--done');
+		die('exec() of git failed');
+	} else {
+		die('fork() failed');
+	}
+}
+
+sub gitsend
+{
+	return print({$gitfds->[0]} @_);
+}
+
+sub gitrecv()
+{
+	return $_=readline(${$gitfds}[1]);
+}
+
+sub gitls($$)
+{
+	my ($commit, $name)=@_;
+	local $/="\n";
+	gitsend("ls $commit $name\n");
+	gitrecv();
+
+	die('git ls failed') unless(/^([0-7]+)\s+[a-z]+\s+([0-9a-z]+)\s+.+$/);
+
+	return [$1, $2];
+}
+
+sub gitcommit($$$$)
+{
+	my ($dest, $message, $mark, $branch)=@_;
+	local $/="\n";
+	local $|=1;
+	state $author=undef;
+	unless($author) {
+		$author=['', ''];
+		open(my $user, '-|', 'git', 'config', '--get', 'user.name');
+		while(<$user>) {
+			chomp;
+			$author->[0].=$_;
+		}
+		$author->[0]=[split(/,/, [getpwuid($<)]->[6])]->[0] unless($author->[0]);
+
+		open(my $email, '-|', 'git', 'config', '--get', 'user.email');
+		while(<$email>) {
+			chomp;
+			$author->[1].=$_;
+		}
+		$author->[1]='anonymous@example.com' unless($author->[1]);
+
+		$author=$author->[0].' <'.$author->[1].'>';
+	}
+	gitsend("commit $branch\n");
+	gitsend("mark $mark\n");
+	gitsend("committer $author ".time()." +0000\n");
+
+	$_=length($message);
+	gitsend("data $_\n");
+	gitsend($message);
+	gitsend("from $dest\n");
+}
+
+sub gitdone()
+{
+	local $/="\n";
+	gitsend("done\n");
+	close($gitfds->[0]);
+	$gitfds->[0]=undef;
+	0 while(waitpid($gitpid, 0) != $gitpid);
+	print(STDERR "WARNING: git returned error exit status\n") if($?);
+	close($gitfds->[1]);
+	$gitfds->[1]=undef;
+}
+
+my $list=load();
+
+die("no files matching \"$from\" found") unless(@$list);
+
+startgit();
+
+
+gitcommit($start, <<"__TMP__", ':1', 'tmp');
+kernel: add configs and patches for $to
+
+Copy the configuration and patches from $from to $to.
+
+This is a special tool-generated commit.
+__TMP__
+
+foreach my $name (@$list) {
+	my $new=gitls($start, "$name$from");
+	gitsend("M $new->[0] $new->[1] $name$to\n");
+	gitsend("D $name$from\n");
+}
+gitsend("\n");
+
+
+gitcommit(':1', <<"__TMP__", ':2', "end");
+kernel: finish update from $from to $to
+
+Merge the add commit into HEAD to create all files with full history.
+
+This is a special tool-generated commit.
+__TMP__
+
+gitsend("merge $start\n");
+
+foreach my $name (@$list) {
+	my $new=gitls($start, "$name$from");
+	gitsend("M $new->[0] $new->[1] $name$from\n");
+}
+gitsend("\n");
+
+
+gitsend("get-mark :2\n");
+chomp(my $result=gitrecv());
+
+gitdone();
+
+print("Result is commit $result\n");
+
+exec('git', 'merge', '--ff-only', $result);
+
+exit(0);