
[OpenWrt-Devel] Migration in ath79 for swapped ethernet

Message ID 020101d563fa$a14539a0$e3cface0$@adrianschmutzler.de
State Changes Requested
Delegated to: John Crispin

Commit Message

Adrian Schmutzler Sept. 5, 2019, 3 p.m. UTC
Hi,

if I remember correctly, there is still no mechanism to fix eth0/eth1 for devices where those have been swapped from ar71xx to ath79.

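In principle, this can be done with a relatively small piece of code (not tested):

diff --git a/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration b/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration
new file mode 100644
index 0000000000..d6b519d25a
--- /dev/null
+++ b/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration
@@ -0,0 +1,28 @@
+#!/bin/sh
+
+rename_all_eth() {
+       local before="$1"
+       local after="$2"
+
+       sed -i "s/$before/$after/" /etc/board.json
+       for e in $(ls /etc/config/* 2>/dev/null); do
+               sed -i "s/$before/$after/" "$e"
+       done
+       for e in $(ls /etc/sysctl.d/* 2>/dev/null); do
+               sed -i "s/$before/$after/" "$e"
+       done
+}
+
+case $(board_name) in
+glinet,gl-ar150|\
+tplink,archer-c58-v1|\
+tplink,archer-c59-v1|\
+tplink,archer-c60-v1|\
+tplink,archer-c60-v2)
+       rename_all_eth "eth0" "ethX"
+       rename_all_eth "eth1" "eth0"
+       rename_all_eth "ethX" "eth1"
+       ;;
+esac
+
+exit 0
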
However, this will obviously swap eth0/eth1 on EVERY upgrade, not just when coming from ar71xx.
So, does anyone have an idea how to limit this to run only when updated from ar71xx?

Besides, given the abstraction of "rename_all_eth", I wonder whether it would be possible and desirable to do all renames in one step:
sed -i -e 's/eth0/ethX/' -e 's/eth1/eth0/' -e 's/ethX/eth1/' $file
or even
sed -i -e 's/eth0/eth1/' -e 's/eth1/eth0/' $file
depending on how sed handles this internally. These options would mean fewer flash writes (although this might not be too important here).

One might also need to add the 'g' modifier to sed to account for multiple ethX found per line, e.g. 'eth0.1 eth0.2'.
I will test the latter cases when I have more time; I just wanted to start the discussion with this proposal.
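
For reference, a minimal sketch of how sed treats these variants. sed applies its -e expressions in order to each line, so the two-expression form ends up mapping everything to eth0, while the placeholder form performs the swap. The function names are made up for illustration, and the sketch assumes that the string "ethX" does not already occur in the input.

```shell
#!/bin/sh

# Swap eth0 and eth1 on stdin in a single sed call, using ethX as a
# temporary placeholder. The 'g' flag handles several names per line.
# Assumption: "ethX" does not occur in the input.
swap_eth_stream() {
	sed -e 's/eth0/ethX/g' -e 's/eth1/eth0/g' -e 's/ethX/eth1/g'
}

# The two-expression variant, for comparison. The second expression also
# rewrites everything the first one just produced, so no swap happens.
naive_swap_eth_stream() {
	sed -e 's/eth0/eth1/g' -e 's/eth1/eth0/g'
}
```

Feeding the line 'eth0.1 eth0.2 eth1' to swap_eth_stream gives 'eth1.1 eth1.2 eth0', whereas naive_swap_eth_stream gives 'eth0.1 eth0.2 eth0'. Note that neither variant distinguishes eth1 from names like eth10.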

Best

Adrian

Comments

David Bauer Sept. 7, 2019, 9:39 a.m. UTC | #1
Hello Adrian,

On 9/5/19 5:00 PM, Adrian Schmutzler wrote:
> Hi,
> 
> if I remember correctly, there is still no mechanism to fix eth0/eth1 for devices where those have been swapped from ar71xx to ath79.
> 
> In principle, this can be done with a relatively small piece of code (not tested):
> 
> diff --git a/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration b/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration
> new file mode 100644
> index 0000000000..d6b519d25a
> --- /dev/null
> +++ b/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration
> @@ -0,0 +1,28 @@
> +#!/bin/sh
> +
> +rename_all_eth() {
> +       local before="$1"
> +       local after="$2"
> +
> +       sed -i "s/$before/$after/" /etc/board.json
> +       for e in $(ls /etc/config/* 2>/dev/null); do
> +               sed -i "s/$before/$after/" "$e"
> +       done
> +       for e in $(ls /etc/sysctl.d/* 2>/dev/null); do
> +               sed -i "s/$before/$after/" "$e"
> +       done
> +}
> +
> +case $(board_name) in
> +glinet,gl-ar150|\
> +tplink,archer-c58-v1|\
> +tplink,archer-c59-v1|\
> +tplink,archer-c60-v1|\
> +tplink,archer-c60-v2)
> +       rename_all_eth "eth0" "ethX"
> +       rename_all_eth "eth1" "eth0"
> +       rename_all_eth "ethX" "eth1"
> +       ;;
> +esac
> +
> +exit 0
> 
> However, this will obviously swap eth0/eth1 on EVERY upgrade, not just when coming from ar71xx.
> So, does anyone have an idea how to limit this to run only when updated from ar71xx?

I was thinking about the same. As we have no information about the previously installed platform,
I was thinking about abusing the wmac path we already use to migrate the WiFi configuration.
However, I think this is not the most elegant way to solve this issue.

> Besides, given the abstraction of "rename_all_eth", I wonder whether it would be possible and desirable to do all renames in one step:
> sed -i -e 's/eth0/ethX/' -e 's/eth1/eth0/' -e 's/ethX/eth1/' $file
> or even
> sed -i -e 's/eth0/eth1/' -e 's/eth1/eth0/' $file
> depending on how sed handles this internally. These options would mean fewer flash writes (although this might not be too important here).

A rewrite with sed is not sufficient, as we would possibly rewrite UCI section names, which may be
referenced elsewhere. We have to loop through all interface values and lists, rewriting each occurrence.
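
A rough sketch of what rewriting values instead of whole files could look like. The helper names and the two section names (network.lan, network.wan) are assumptions for illustration only; a real script would have to iterate over every section and list that can hold an interface name.

```shell
#!/bin/sh

# Swap eth0 and eth1 (including VLAN suffixes such as eth0.1) in a
# space-separated list of interface names. Other names, e.g. eth10 or
# wlan0, are left untouched.
swap_eth_value() {
	swapped=""
	for name in $1; do
		case "$name" in
		eth0|eth0.*) name="eth1${name#eth0}" ;;
		eth1|eth1.*) name="eth0${name#eth1}" ;;
		esac
		swapped="${swapped:+$swapped }$name"
	done
	printf '%s\n' "$swapped"
}

# Illustration only (never called here): apply the swap to the ifname
# option of two assumed sections through the uci CLI, leaving section
# names and all other options alone.
migrate_network_ifnames() {
	for section in network.lan network.wan; do
		old=$(uci -q get "$section.ifname") || continue
		uci set "$section.ifname=$(swap_eth_value "$old")"
	done
	uci commit network
}
```

Because only option values pass through swap_eth_value, a section that happens to be named eth0 keeps its name.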

Best wishes
David
Adrian Schmutzler Sept. 7, 2019, 10:15 a.m. UTC | #2
Hi,

> > However, this will obviously swap eth0/eth1 on EVERY upgrade, not just
> > when coming from ar71xx.
> > So, does anyone have an idea how to limit this to run only when updated
> > from ar71xx?
>
> I was thinking about the same. As we have no information about the
> previously installed platform, I was thinking about abusing the wmac path we
> already use to migrate the WiFi configuration.
> However, I think this is not the most elegant way to solve this issue.

I have to think about that. I recently thought one could just check whether the lan/wan assignment matches the one expected for ar71xx, but that would obviously also catch cases where the user modified it to be like this.

>
> > Besides, given the abstraction of "rename_all_eth", I wonder
> > whether it would be possible and desirable to do all renames in one step:
> > sed -i -e 's/eth0/ethX/' -e 's/eth1/eth0/' -e 's/ethX/eth1/' $file
> > or even
> > sed -i -e 's/eth0/eth1/' -e 's/eth1/eth0/' $file
> > depending on how sed handles this internally. These options would mean
> > fewer flash writes (although this might not be too important here).
>
> A rewrite with sed is not sufficient, as we would possibly rewrite UCI section
> names, which may be referenced elsewhere. We have to loop through all interface
> values and lists, rewriting each occurrence.

Actually, I could well live with that. What kind of references are you referring to?
If someone really just named a section with ethX, it will be renamed consistently throughout all uci files (unless they are stored in another location).
Only in case someone uses a section name with ethX and refers to it e.g. in a custom script, this will be a problem.
And this is where I think we do not have to account for every tiny possibility. If someone upgrades to another architecture, I think it's fair to expect him to check whether his custom scripts still work. We do not have to overdo it.
But that's just my point of view at the moment.

Best

Adrian
Piotr Dymacz Jan. 20, 2020, 11:34 p.m. UTC | #3
Hi Adrian, David, Chuanhong,

I'm in the middle of migrating some devices from soon-to-be-obsolete 
ar71xx to ath79 target and was wondering about status of the eth0/eth1 
vs. LAN/WAN assignment issue.

I'm aware of 8dde11d521 ("ath79: dts: drop "simple-mfd" for gmacs in 
SoC dtsi") [0] and the changes that followed, but that "fixed" the problem 
only for devices which already used the reversed SoC GMACx <> ethX 
assignment under the ar71xx target (I wouldn't call it wrong or incorrect, 
I also prefer to have LAN on the eth0 interface), e.g. LAN on eth0, which 
is in fact the SoC's GMAC1, and WAN on eth1, which is the SoC's GMAC0. A good 
explanation of that inverted assignment can be found in Jeff's patch 
here: [1].

I have a feeling that the idea with migration script got abandoned 
(Adrian?), so I was wondering if there is any other way we could 
preserve ar71xx LAN/WAN <> ethX assignment in ath79?

For example, I have a QCA9531 based device with PHY4 (connected directly 
to GMAC0) labeled as LAN (and registered as eth0 in the kernel) and PHY3 
(connected to GMAC1 over the internal switch) labeled as WAN. On ath79, due 
to the change introduced in 8dde11d521, the LAN and WAN order gets swapped (as 
expected), but partially reverting that change (adding "simple-mfd" back 
to eth1 in the device's DTS, see below) brings back the "old" order of 
interfaces.

&eth1 {
	compatible = "qca,ar9330-eth", "syscon", "simple-mfd";
	mtd-mac-address = <&art 0x6>;
};

But it doesn't seem like a proper fix to me (maybe I'm wrong?), hence the 
question: is there any other, better approach?

[0] https://github.com/openwrt/openwrt/commit/8dde11d521
[1] https://www.mail-archive.com/openwrt-devel@lists.openwrt.org/msg48526.html
Adrian Schmutzler Jan. 21, 2020, 2:10 p.m. UTC | #4
Hi,

> I'm in the middle of migrating some devices from soon-to-be-obsolete
> ar71xx to ath79 target and was wondering about status of the eth0/eth1
> vs. LAN/WAN assignment issue.

To start with the end: I've decided to stop working on this.

The two major problems are obvious:
1. How to make sure we find every possible location of eth0/eth1 in user code

This is a problem which can be solved, and if it does not cover every single special case I could live with it.

2. How to find out whether we are updating from ar71xx or not.

This is a hard one: We cannot rely on the ethernet setup itself, as the user might have changed it for whatever reason. We could rely on some other parameters as suggested (wmac path etc.), but that would not be generally applicable and still would impose some boundary conditions (e.g. start before the wmac migration, as then config would be "old" and paths on the device would already be "new", making identification of the update possible).

An alternative way would be to exploit /etc/board.json for that, provided that it is not updated during sysupgrade (I'm not sure what happens here). If it is not updated, it would give us access to the configuration from when the user installed the device, without the changes the user has since made to /etc/config/network. One could then parse /etc/board.json, compare it to some (device-specific) reference (e.g. wan=eth0), and base the decision to apply the migration on that. Afterwards, a new /etc/board.json is generated, so the condition is not met anymore. Apart from the device-specific condition, this would also be a generally applicable concept.
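
A minimal sketch of that check, under stated assumptions: jsonfilter is available, board.json stores the WAN interface under network.wan.ifname, and the file still holds the pre-upgrade content when the script runs. The helper names are made up for illustration.

```shell
#!/bin/sh

# Decide whether the eth0/eth1 swap should run.
#   $1: WAN interface recorded in the (assumed pre-upgrade) board.json
#   $2: device-specific value that ar71xx used for WAN, e.g. "eth0"
# Returns success only if a value was found and it matches the reference.
needs_eth_migration() {
	[ -n "$1" ] && [ "$1" = "$2" ]
}

# Hypothetical wrapper (never called here): read the WAN interface from
# a board.json file. Assumes jsonfilter and the network.wan.ifname layout.
board_wan_ifname() {
	jsonfilter -i "$1" -e '@.network.wan.ifname'
}

# Intended use inside a migration script, for a board whose ar71xx
# default was wan=eth0:
#   needs_eth_migration "$(board_wan_ifname /etc/board.json)" "eth0" && do_swap
# where do_swap stands for whatever rename routine is chosen.
```

Since the decision is based on board.json rather than /etc/config/network, later manual changes by the user would not trigger it. Whether board.json really survives sysupgrade is the open question raised above.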

All in all, this second problem (when to migrate) is the bigger problem. We also have a similar case in https://github.com/openwrt/openwrt/pull/2649 

So much for the technical aspects. From the organizational point of view, for a long time I thought I was the only one caring about this topic. Since there was not much interest in bringing this to 19.07 before the release, I do not see much use in adding it afterwards now.

In any case, the migration script will be a complicated task and will certainly introduce corner cases as well. All in all, I do not think it's worth it, and we should keep advising people to flash with "-n" that single time when upgrading from ar71xx to ath79. For the pros who will change their Ethernet setup by hand later without using "-n", I'd still provide the "easy" migrations like e.g. LED names.

> 
> I'm aware of the 8dde11d521 ("ath79: dts: drop "simple-mfd" for gmacs in
> SoC dtsi") [0] and following changes but that "fixed" the problem only
> for devices which were following already reversed (I wouldn't call it
> wrong or incorrect, I also prefer to have LAN on eth0 interface) SOC's
> GMACx <> ethx assignment/register under ar71xx target - e.g. LAN on eth0
> which is in fact SOC's GMAC1 and WAN on eth1 which is SOC's GMAC0. Good
> explanation of that inverted assignment can be found in Jeff's patch
> here: [1].

Well, effectively a lot of devices match ar71xx order again, but also several do not match anymore after that.

For the underlying logic, I think Chuanhong will be the best person to discuss with.

I've tried to start a list of devices where eth0/eth1 are swapped compared to ar71xx _now_ here:
https://openwrt.org/docs/guide-user/installation/ar71xx.to.ath79#devices_with_known_config_changes_without_migration_available

> 
> I have a feeling that the idea with migration script got abandoned
> (Adrian?), so I was wondering if there is any other way we could
> preserve ar71xx LAN/WAN <> ethX assignment in ath79?

See above, yes, I effectively abandoned that.

> 
> For example, I have a QCA9531 based device with PHY4 (connected directly
> to GMAC0) labeled as LAN (and registered as eth0 in kernel) and PHY3
> (connected to GMAC1 over internal switch) labeled as WAN. On ath79, due
> to change introduced in 8dde11d521, LAN and WAN order gets swapped (as
> expected) but partially reverting above change (adding back "simple-mfd"
> to eth1 in device's DTS, see below) brings back the "old" order of
> interfaces.
> 
> &eth1 {
> 	compatible = "qca,ar9330-eth", "syscon", "simple-mfd";
> 	mtd-mac-address = <&art 0x6>;
> };
> 
> But it doesn't seem as a proper fix to me (maybe I'm wrong?) thus the
> question about any other, better approach?

That's how I feel, too. To me, this always looked like a hack (based on my shallow level of understanding, though).
There might be special cases where this is necessary (e.g. forcing a device to be eth0 due to failsafe), but I still do not like it.

With the first device where I observed the swapped eth0/eth1, the GL.iNet AR150, Chuanhong explained to me that the setup in ath79 would be more correct than the one in ar71xx.
After all, if we advise to flash with -n anyway, I would prefer to have the "more correct" setup in ath79 over having to stick to the setup from ar71xx where that applies.

So, no matter what we do, there is no easy way forward.

Best

Adrian

> 
> [0] https://github.com/openwrt/openwrt/commit/8dde11d521
> [1] https://www.mail-archive.com/openwrt-devel@lists.openwrt.org/msg48526.html
> 
> --
> Cheers,
> Piotr
Christian Marangi Jan. 21, 2020, 2:32 p.m. UTC | #5
Why not add an additional step to the sysupgrade function? Something
to alert people that the switch configuration needs to be migrated and
give them the option to make the change themselves, use a migration script or use
the default option? I think something like that should be implemented
anyway, since it will be needed if for some reason we switch
targets like mvebu or ipq to DSA drivers in the future...

Piotr Dymacz Jan. 27, 2020, 6:21 p.m. UTC | #6
Hi Adrian,

On 21.01.2020 15:10, Adrian Schmutzler wrote:

[...]

>> I'm in the middle of migrating some devices from soon-to-be-obsolete
>> ar71xx to ath79 target and was wondering about status of the eth0/eth1
>> vs. LAN/WAN assignment issue.
> 
> To start with the end: I've decided to stop working on this.
> 
> The two major problems are obvious:
> 1. How to make sure we find every possible location of eth0/eth1 in user code
> 
> This is a problem which can be solved, and if it does not cover every single special case I could live with it.
> 
> 2. How to find out whether we are updating from ar71xx or not.
> 
> This is a hard one: We cannot rely on the ethernet setup itself, as the user might have changed it for whatever reason. We could rely on some other parameters as suggested (wmac path etc.), but that would not be generally applicable and still would impose some boundary conditions (e.g. start before the wmac migration, as then config would be "old" and paths on the device would already be "new", making identification of the update possible).
> 
> An alternative way would be to exploit /etc/board.json for that, given that it is not updated during sysupgrade (I'm not sure what's happening here). If it is not updated, it would give us access to the configuration when the user installed the device, and without the changes the user would have made to /etc/config/network. One could then parse and compare /etc/board.json to some (device-specific) reference (e.g. wan=eth0) and base the decision to apply migration on that. Afterwards, a new /etc/board.json is generated, so the condition is not met anymore. Despite for the device-specific condition, this would also be a generally applicable concept.

IMHO, that would never look like a clean and nice solution, and we would 
need to carry it in the code for who knows how long (imagine some ar71xx 
board getting migrated after the 20.x release).

> All in all, this second problem (when to migrate) is the bigger problem. We also have a similar case in https://github.com/openwrt/openwrt/pull/2649
> 
> So far for the technical aspects. From the organizational point of view, for a long time I thought I'm the only one caring about this topic. Since there was not much interest in bringing this to 19.07 before the release, I do not see much use of adding it afterwards now.

As 19.07 was released with ar71xx, I didn't consider that important 
at the time. Now it's time to treat it as a problem and prepare 
a solution _before_ the next release, which won't include ar71xx.

> In any case, the migration script will be a complicated task and will certainly introduce cornercases as well. All in all, I do not think it's worth it, and we should keep to advise people to flash with "-n" that single time when upgrading from ar71xx to ath79. For the pros that will change their Ethernet setup by hand later without using "-n", I'd still provide the "easy" migrations like e.g. LED names.

At the very beginning, ath79 was considered as a brand new target 
without _any_ concerns about migration path from ar71xx. But then, 
things got complicated (broken).

Either we support seamless ar71xx -> ath79 migration for _all_ devices 
supported in both targets or we just... don't. There shouldn't be cases 
where a user has to check or ask whether their device can be upgraded while 
preserving settings.

And I really don't consider LED naming migration as important as the network 
interface naming swap (the LED naming convention upstream got changed 
anyway, so we are expecting another change/migration at some point in the 
future). Also, LED names aren't the only problem; in some cases the type of 
trigger has to be changed (e.g. netdev vs. switch).

>> I'm aware of the 8dde11d521 ("ath79: dts: drop "simple-mfd" for gmacs in
>> SoC dtsi") [0] and following changes but that "fixed" the problem only
>> for devices which were following already reversed (I wouldn't call it
>> wrong or incorrect, I also prefer to have LAN on eth0 interface) SOC's
>> GMACx <> ethx assignment/register under ar71xx target - e.g. LAN on eth0
>> which is in fact SOC's GMAC1 and WAN on eth1 which is SOC's GMAC0. Good
>> explanation of that inverted assignment can be found in Jeff's patch
>> here: [1].
> 
> Well, effectively a lot of devices match ar71xx order again, but also several do not match anymore after that.
> 
> For the underlying logic, I think Chuanhong will be the best person to discuss with.

Chuanhong, could you join the discussion?

> I've tried to start a list of devices where eth0/eth1 are swapped compared to ar71xx _now_ here:
> https://openwrt.org/docs/guide-user/installation/ar71xx.to.ath79#devices_with_known_config_changes_without_migration_available

There is an easy way to check the GMACx <> ethX assignment order in the mach-*.c 
files. Just check the order of the ath79_register_eth() calls:

ath79_register_eth(0);
ath79_register_eth(1);

Will register GMAC0 as eth0, GMAC1 as eth1

ath79_register_eth(1);
ath79_register_eth(0);

Will register GMAC1 as eth0, GMAC0 as eth1 (current ath79 "order")
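
As a rough sketch, that order could also be read out of a mach file with a small shell helper. The function name is made up, and the helper assumes each ath79_register_eth() call uses a literal single-digit index and sits on its own line.

```shell
#!/bin/sh

# Print the GMAC indices in the order a mach-*.c file registers them.
# "0 1" means GMAC0 becomes eth0 and GMAC1 becomes eth1;
# "1 0" means GMAC1 becomes eth0 and GMAC0 becomes eth1.
gmac_register_order() {
	order=""
	for idx in $(sed -n 's/.*ath79_register_eth(\([0-9]\)).*/\1/p' "$1"); do
		order="${order:+$order }$idx"
	done
	printf '%s\n' "$order"
}
```

Running it over all mach-*.c files in a loop would give a quick overview of which boards use which order.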

>> I have a feeling that the idea with migration script got abandoned
>> (Adrian?), so I was wondering if there is any other way we could
>> preserve ar71xx LAN/WAN <> ethX assignment in ath79?
> 
> See above, yes, I effectively abandoned that.

Got it, so an alternative solution is required.

>> For example, I have a QCA9531 based device with PHY4 (connected directly
>> to GMAC0) labeled as LAN (and registered as eth0 in kernel) and PHY3
>> (connected to GMAC1 over internal switch) labeled as WAN. On ath79, due
>> to change introduced in 8dde11d521, LAN and WAN order gets swapped (as
>> expected) but partially reverting above change (adding back "simple-mfd"
>> to eth1 in device's DTS, see below) brings back the "old" order of
>> interfaces.
>> 
>> &eth1 {
>> 	compatible = "qca,ar9330-eth", "syscon", "simple-mfd";
>> 	mtd-mac-address = <&art 0x6>;
>> };
>> 
>> But it doesn't seem as a proper fix to me (maybe I'm wrong?) thus the
>> question about any other, better approach?
> 
> That's how I feel. For me, this always looked like a hack to me (based on my shallow level of understanding, though).
> There might be special cases where this is necessary (e.g. force a device to be eth0 due to failsafe), but I still do not like it.

I was also considering aliases in DTSes.

> With the first device where I observed the swapped eth0/eth1, the GLinet AR150, Chuanhong explained me that the setup in ath79 would be more correct than the one in ar71xx.
> After all, if we advise to flash with -n anyway, I would prefer to have the "more correct" setup in ath79 compared to having to stick to the setup from ar71xx where that applies.

It's just semantics. I don't think there is a "more correct" setup here. 
And what's more, there is no single "correct" setup in ar71xx either as 
you could register GMACs in two different orders (see above comment 
about mach-*.c files).

> So, no matter what we do, there is no easy way forward.

We could remove all ar71xx -> ath79 migration helper scripts, drop the ar71xx
board names from the supported devices lists in ath79 images and make the
target brand new, without any concerns about the soon-to-be obsolete ar71xx ;)
Adrian Schmutzler Jan. 27, 2020, 6:35 p.m. UTC | #7
Just a quick one:

> > So, no matter what we do, there is no easy way forward.
> 
> We could remove all ar71xx -> ath79 migration helper scripts, ar71xx
> board names from supported devices lists in ath79 images and make the
> target a brand new, without any concerns about soon-to-be obsolete ar71xx ;)

At the moment, I'm actually mostly inclined towards this solution.

However, for me personally SUPPORTED_DEVICES was always more a "don't brick entirely" flag, so I never expected it to ensure 100 % config compatibility. More like preventing me from flashing ubnt,unifi image onto tplink,wdr-4300-v1. This impression might have been wrong, though.

But as mentioned by Ansuel, there are other incompatible switches to come (and some are already waiting), and unless we want to create new targets or rename devices in these cases, we have to think about different "levels" of compatibility anyway beyond ar71xx->ath79.

Best

Adrian
Peter Geis Jan. 27, 2020, 6:57 p.m. UTC | #8
On Mon, Jan 27, 2020 at 1:35 PM Adrian Schmutzler
<mail@adrianschmutzler.de> wrote:
>
> Just a quick one:
>
> > > So, no matter what we do, there is no easy way forward.
> >
> > We could remove all ar71xx -> ath79 migration helper scripts, ar71xx
> > board names from supported devices lists in ath79 images and make the
> > target a brand new, without any concerns about soon-to-be obsolete ar71xx ;)
>
> At the moment, I'm actually mostly inclined towards this solution.
>
> However, for me personally SUPPORTED_DEVICES was always more a "don't brick entirely" flag, so I never expected it to ensure 100 % config compatibility. More like preventing me from flashing ubnt,unifi image onto tplink,wdr-4300-v1. This impression might have been wrong, though.
>
> But as mentioned by Ansuel, there are other incompatible switches to come (and some are already waiting), and unless we want to create new targets or rename devices in these cases, we have to think about different "levels" of compatibility anyway beyond ar71xx->ath79.

Wouldn't it be reasonable to put up a warning that migrating from
ar71xx->ath79 will require a reset of networking configuration?
Then all you need to do is detect when that sort of upgrade is
occurring and nuke the requisite files.

Also I don't know bout y'all, but I usually take a major revision
upgrade as an opportunity to hard reset and rebuild anyways. (Years of
ingrained ddwrt habits)

>
> Best
>
> Adrian
>
>
>
> _______________________________________________
> openwrt-devel mailing list
> openwrt-devel@lists.openwrt.org
> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Piotr Dymacz Jan. 27, 2020, 8:45 p.m. UTC | #9
Hi Adrian,

On 27.01.2020 19:35, Adrian Schmutzler wrote:
> Just a quick one:
> 
>> > So, no matter what we do, there is no easy way forward.
>> 
>> We could remove all ar71xx -> ath79 migration helper scripts, ar71xx
>> board names from supported devices lists in ath79 images and make the
>> target a brand new, without any concerns about soon-to-be obsolete ar71xx ;)
> 
> At the moment, I'm actually mostly inclined towards this solution.

I'm afraid it's a bit late for that as 19.07 is already out and it 
supports (at least partially) ar71xx -> ath79 migration path/s.
Wouldn't that look unprofessional? Am I overreacting here?

> However, for me personally SUPPORTED_DEVICES was always more a "don't brick entirely" flag, so I never expected it to ensure 100 % config compatibility. More like preventing me from flashing ubnt,unifi image onto tplink,wdr-4300-v1. This impression might have been wrong, though.

I think device to image matching was the main reason behind the idea. 
IIRC, a mismatched image doesn't prevent you from upgrading with
preserved settings.

> But as mentioned by Ansuel, there are other incompatible switches to come (and some are already waiting), and unless we want to create new targets or rename devices in these cases, we have to think about different "levels" of compatibility anyway beyond ar71xx->ath79.

I believe ar71xx -> ath79 is a special case here. First of all, that's a 
new DTS-enabled target and it was supposed to _replace_ ar71xx but 19.07
went out with both of them and I'm pretty sure there are users who got 
confused with that (some devices are supported only in one of the 
targets, some in both, some with seamless migration possible). On the 
other hand, when ar71xx gets abandoned, we (as a project) should make it 
clear if ath79 is a replacement (thus providing seamless upgrade from 
ar71xx) or a new target, without any relationship with ar71xx (thus a 
clean sysupgrade is required). Keeping anything in between would just 
confuse people.

DSA is a slightly different topic as it will touch many different targets
(also ath79, think about qca8k) so probably a project-wide solution 
would be required.
Piotr Dymacz Jan. 27, 2020, 9 p.m. UTC | #10
Hi Peter,

On 27.01.2020 19:57, Peter Geis wrote:
> On Mon, Jan 27, 2020 at 1:35 PM Adrian Schmutzler
> <mail@adrianschmutzler.de> wrote:
>>
>> Just a quick one:
>>
>> > > So, no matter what we do, there is no easy way forward.
>> >
>> > We could remove all ar71xx -> ath79 migration helper scripts, ar71xx
>> > board names from supported devices lists in ath79 images and make the
>> > target a brand new, without any concerns about soon-to-be obsolete ar71xx ;)
>>
>> At the moment, I'm actually mostly inclined towards this solution.
>>
>> However, for me personally SUPPORTED_DEVICES was always more a "don't brick entirely" flag, so I never expected it to ensure 100 % config compatibility. More like preventing me from flashing ubnt,unifi image onto tplink,wdr-4300-v1. This impression might have been wrong, though.
>>
>> But as mentioned by Ansuel, there are other incompatible switches to come (and some are already waiting), and unless we want to create new targets or rename devices in these cases, we have to think about different "levels" of compatibility anyway beyond ar71xx->ath79.
> 
> Wouldn't it be reasonable to put up a warning that migrating from
> ar71xx->ath79 will require a reset of networking configuration?
> Then all you need to do is detect when that sort of upgrade is
> occurring and nuke the requisite files.

I believe we already have such a 'warning' on the Wiki: [0]. The exact 
problem is 'detecting that sort of upgrade' (what about a user migrating a
device within 19.07 from ar71xx to ath79 and then back to ar71xx?).
Also, the problem doesn't affect all the devices so the users have to 
first check whether the device they migrate/upgrade has to be 
(sys)upgraded without preserved settings or not.
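
As an illustration of such a check, a user (or a wrapper script) could compare the board name against a list of known-swapped devices. The sketch below is hypothetical: eth_swapped_board is an invented name, and the board list is copied from the 05_eth_migration proposal in this thread, so it is probably incomplete and may be outdated.

```shell
#!/bin/sh
# Hypothetical check: return 0 if the given ath79 board name is in the
# list of devices whose eth0/eth1 differ from ar71xx.
# The list is taken from the proposed patch and is illustrative only.
eth_swapped_board() {
	case "$1" in
	glinet,gl-ar150|\
	tplink,archer-c58-v1|\
	tplink,archer-c59-v1|\
	tplink,archer-c60-v1|\
	tplink,archer-c60-v2)
		return 0
		;;
	esac
	return 1
}

# Possible use on a running system:
#   eth_swapped_board "$(cat /tmp/sysinfo/board_name)" && echo "use sysupgrade -n"
```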

> Also I don't know bout y'all, but I usually take a major revision
> upgrade as an opportunity to hard reset and rebuild anyways. (Years of
> ingrained ddwrt habits)

But it's not a general rule and, at least in case of generic/basic 
settings, user shouldn't be worried about upgrading between major 
versions with preserving settings. Otherwise, the whole idea doesn't 
make much sense and we should just prevent settings backup during major 
versions switch.

[0] 
https://openwrt.org/docs/guide-user/installation/ar71xx.to.ath79#upgrade_from_ar71xx_to_ath79
Peter Geis Jan. 27, 2020, 9:05 p.m. UTC | #11
On Mon, Jan 27, 2020 at 4:00 PM Piotr Dymacz <pepe2k@gmail.com> wrote:
>
> Hi Peter,
>
> On 27.01.2020 19:57, Peter Geis wrote:
> > On Mon, Jan 27, 2020 at 1:35 PM Adrian Schmutzler
> > <mail@adrianschmutzler.de> wrote:
> >>
> >> Just a quick one:
> >>
> >> > > So, no matter what we do, there is no easy way forward.
> >> >
> >> > We could remove all ar71xx -> ath79 migration helper scripts, ar71xx
> >> > board names from supported devices lists in ath79 images and make the
> >> > target a brand new, without any concerns about soon-to-be obsolete ar71xx ;)
> >>
> >> At the moment, I'm actually mostly inclined towards this solution.
> >>
> >> However, for me personally SUPPORTED_DEVICES was always more a "don't brick entirely" flag, so I never expected it to ensure 100 % config compatibility. More like preventing me from flashing ubnt,unifi image onto tplink,wdr-4300-v1. This impression might have been wrong, though.
> >>
> >> But as mentioned by Ansuel, there are other incompatible switches to come (and some are already waiting), and unless we want to create new targets or rename devices in these cases, we have to think about different "levels" of compatibility anyway beyond ar71xx->ath79.
> >
> > Wouldn't it be reasonable to put up a warning that migrating from
> > ar71xx->ath79 will require a reset of networking configuration?
> > Then all you need to do is detect when that sort of upgrade is
> > occurring and nuke the requisite files.
>
> I believe we already have such a 'warning' on the Wiki: [0]. The exact
> problem is 'detecting that sort of upgrade' (what about user migrating
> device under 19.07 but between ar71xx -> ath79 and then back to ar71xx?).
> Also, the problem doesn't affect all the devices so the users have to
> first check whether the device they migrate/upgrade has to be
> (sys)upgraded without preserved settings or not.
>
> > Also I don't know bout y'all, but I usually take a major revision
> > upgrade as an opportunity to hard reset and rebuild anyways. (Years of
> > ingrained ddwrt habits)
>
> But it's not a general rule and, at least in case of generic/basic
> settings, user shouldn't be worried about upgrading between major
> versions with preserving settings. Otherwise, the whole idea doesn't
> make much sense and we should just prevent settings backup during major
> versions switch.

Excellent point!
Here's an odd possibility, just to throw at the wall.
What about building a special transition image, specifically for
moving from ar71xx to ath79.
If you want to retain the ability to return to ar71xx have an image to
go the opposite way.

Or a metadata package to do the conversion post flash.

Because the option (which seems pretty drastic, unless the size could
be minimized) of having a near permanent conversion script built into
the firmware seems rather costly.

>
> [0]
> https://openwrt.org/docs/guide-user/installation/ar71xx.to.ath79#upgrade_from_ar71xx_to_ath79
>
> --
> Cheers,
> Piotr
>
Adrian Schmutzler Jan. 28, 2020, 3:48 p.m. UTC | #12
Hi Piotr,

> -----Original Message-----
> From: openwrt-devel [mailto:openwrt-devel-bounces@lists.openwrt.org] On
> Behalf Of Piotr Dymacz
> Sent: Montag, 27. Januar 2020 21:45
> To: Adrian Schmutzler <mail@adrianschmutzler.de>
> Cc: openwrt-devel@lists.openwrt.org; gch981213@gmail.com;
> ansuelsmth@gmail.com; 'David Bauer' <mail@david-bauer.net>
> Subject: Re: [OpenWrt-Devel] Migration in ath79 for swapped ethernet
> 
> Hi Adrian,
> 
> On 27.01.2020 19:35, Adrian Schmutzler wrote:
> > Just a quick one:
> >
> >> > So, no matter what we do, there is no easy way forward.
> >>
> >> We could remove all ar71xx -> ath79 migration helper scripts, ar71xx
> >> board names from supported devices lists in ath79 images and make the
> >> target a brand new, without any concerns about soon-to-be obsolete ar71xx ;)
> >
> > At the moment, I'm actually mostly inclined towards this solution.
> 
> I'm afraid it's a bit late for that as 19.07 is already out and it
> supports (at least partially) ar71xx -> ath79 migration path/s.
> Wouldn't that look unprofessional? Am I overreacting here?

One didn't have to use -F during sysupgrade, but the release notes gave the
clear advice to upgrade without keeping settings.
So, IMO we actually didn't "support" any migration in 19.07.0.

> 
> > However, for me personally SUPPORTED_DEVICES was always more a "don't
> > brick entirely" flag, so I never expected it to ensure 100 % config
> > compatibility. More like preventing me from flashing ubnt,unifi image onto
> > tplink,wdr-4300-v1. This impression might have been wrong, though.
> 
> I think device to image matching was the main reason behind the idea.
> IIRC, mismatched image doesn't prevent you against upgrading with
> preserved settings.
> 
> > But as mentioned by Ansuel, there are other incompatible switches to come
> > (and some are already waiting), and unless we want to create new targets or
> > rename devices in these cases, we have to think about different "levels" of
> > compatibility anyway beyond ar71xx->ath79.
> 
> I believe ar71xx -> ath79 is a special case here. First of all, that's a
> new DTS-enabled target and it was suppose to _replace_ ar71xx but 19.07
> went out with both of them and I'm pretty sure there are users who got
> confused with that (some devices are supported only in one of the
> targets, some in both, some with seamless migration possible). On the
> other hand, when ar71xx gets abandoned, we (as a project) should make it
> clear if ath79 is a replacement (thus providing seamless upgrade from
> ar71xx) or a new target, without any relationship with ar71xx (thus a
> clean sysupgrade is required). Keeping anything in between would just
> confuse people.

I do not really see a viable/desirable option to support eth migration at the
moment. And I'm not really a fan of adding lots of migration stuff which spoils
the new ath79 target already, so after all I think I also do not _want_ to add
eth migration either.

So, I'd prefer to see the ath79 new and clean.

However, from the wording perspective, I do not think that a "replacement" means
seamless upgrade. I'd thus keep the SUPPORTED_DEVICES just as a device-matching
measure, but wouldn't implement any sophisticated migration despite that. Having
SUPPORTED_DEVICES might actually be valuable for certain third parties, like I'm
involved in a downstream project that regenerates the system/network config at
each upgrade, but still exploits SUPPORTED_DEVICES for having the correct image.

And I could well live with keeping the already present migration scripts, as
having them as "undocumented feature" won't hurt. If we do not advertise it, it
won't confuse people ...

Best

Adrian



> 
> DSA is slightly different topic as it will touch many different targets
> (also ath79, think about qca8k) so probably a project-wide solution
> would be required.
> 
> --
> Cheers,
> Piotr
> 
Adrian Schmutzler Jan. 28, 2020, 3:59 p.m. UTC | #13
Hi,

> There is easy way to check GMACx <> ethX assignment order in mach-*.c
> files. Just check order of ath79_register_eth() calls:
> 
> ath79_register_eth(0);
> ath79_register_eth(1);
> 
> Will register GMAC0 as eth0, GMAC1 as eth1
> 
> ath79_register_eth(1);
> ath79_register_eth(0);
> 
> Will register GMAC1 as eth0, GMAC0 as eth1 (current ath79 "order")

I thought that once as well, but then found several cases where I couldn't rely
on it for eth0/eth1 order on running device. (But it's too long ago for me to be
more specific.)

Besides, from what I understand, our current problem is the driver implementation
in ath79, which needs to skip/delay initialization in certain cases.
So, it's not so much about finding out the situation on ar71xx, but
understanding the situation in ath79 as well. All-in-all, I think this will come
down to having to check each device manually.

> 
> >> I have a feeling that the idea with migration script got abandoned
> >> (Adrian?), so I was wondering if there is any other way we could
> >> preserve ar71xx LAN/WAN <> ethX assignment in ath79?
> >
> > See above, yes, I effectively abandoned that.
> 
> Got it, so alternative solution is required.
> 
> >> For example, I have a QCA9531 based device with PHY4 (connected directly
> >> to GMAC0) labeled as LAN (and registered as eth0 in kernel) and PHY3
> >> (connected to GMAC1 over internal switch) labeled as WAN. On ath79, due
> >> to change introduced in 8dde11d521, LAN and WAN order gets swapped (as
> >> expected) but partially reverting above change (adding back "simple-mfd"
> >> to eth1 in device's DTS, see below) brings back the "old" order of
> >> interfaces.
> >>
> >> &eth1 {
> >> 	compatible = "qca,ar9330-eth", "syscon", "simple-mfd";
> >> 	mtd-mac-address = <&art 0x6>;
> >> };
> >>
> >> But it doesn't seem as a proper fix to me (maybe I'm wrong?) thus the
> >> question about any other, better approach?
> >
> > That's how I feel. For me, this always looked like a hack to me (based on my
> > shallow level of understanding, though).
> > There might be special cases where this is necessary (e.g. force a device to
> > be eth0 due to failsafe), but I still do not like it.
> 
> I was considering also aliases in DTSes.

One could use that for failsafe (actually quite an interesting idea) and for
specifying the corresponding ethX in ar71xx. However, this still won't help us
with the migration script itself.

Best

Adrian
Jeffery To Jan. 28, 2020, 6:01 p.m. UTC | #14
On Tue, Jan 28, 2020 at 11:48 PM Adrian Schmutzler <mail@adrianschmutzler.de>
wrote:

> I do not really see a viable/desirable option to support eth migration at
> the moment. And I'm not really a fan of adding lots of migration stuff which
> spoils the new ath79 target already, so after all I think I also do not
> _want_ to add eth migration either.
>

(I should first say that I don't know enough about the ar71xx-ath79
migration to know if this idea will work, but I don't recall seeing it
mentioned before.)

What if we add a migration package for 18.06 that ar71xx users can opt-in
and install, which (when the user initiates the process) will migrate their
config and perform an upgrade to 19.07 (ath79)? Their config would be
broken for 18.06 after the first step, but if the sysupgrade completes
successfully then their config works for 19.07 after the reboot. (Would be
nice if there is a way to roll back the config changes if the sysupgrade
fails.)

This is perhaps a variation of the transition image idea from Peter Geis,
but this would be less intrusive to the overall upgrade process (at least
for me).
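
To make the idea a bit more concrete, here is a rough, untested sketch of what the core of such an opt-in helper might look like. Everything in it is hypothetical: migrate_and_upgrade, the ".pre-ath79" backup suffix and the two command arguments are placeholders. Since a successful sysupgrade normally does not return to the caller, the rollback here could only cover an upgrade command that fails early (e.g. image validation).

```shell
#!/bin/sh
# Hypothetical sketch: back up the config directory, run a migration
# step on it, start the upgrade, and restore the backup if either the
# migration or the upgrade command fails.
#   $1 = config directory (e.g. /etc/config)
#   $2 = migration command, called with the config directory
#   $3 = upgrade command
migrate_and_upgrade() {
	conf_dir="$1"
	migrate_cmd="$2"
	upgrade_cmd="$3"
	backup="$conf_dir.pre-ath79"

	# Keep a pristine copy so the old config still works on failure.
	rm -rf "$backup"
	cp -a "$conf_dir" "$backup" || return 1

	if $migrate_cmd "$conf_dir" && $upgrade_cmd; then
		rm -rf "$backup"
		return 0
	fi

	# Roll back to the untouched config.
	rm -rf "$conf_dir"
	mv "$backup" "$conf_dir"
	return 1
}
```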

Jeff
Adrian Schmutzler Jan. 29, 2020, 2:03 p.m. UTC | #15
Hi,

> (I should first say that I don't know enough about the ar71xx-ath79 migration to know if this idea will work, but I don't recall seeing it mentioned before.)
>
> What if we add a migration package for 18.06 that ar71xx users can opt-in and install, which (when the user initiates the process) will migrate their config and perform an upgrade to 19.07 (ath79)? Their config would be broken for 18.06 after the first step, but if the sysupgrade completes successfully then their config works for 19.07 after the reboot. (Would be nice if there is a way to roll back the config changes if the sysupgrade fails.)
>
> This is perhaps a variation of the transition image idea from Peter Geis, but this would be less intrusive to the overall upgrade process (at least for me).

I like that idea in general, because it will spare us from determining whether the system needs to be updated (we will outsource that to the user for this particular case). Disadvantages are obviously that the user has to be (made) aware of that solution, so it's still not a "seamless" upgrade. And we still would have to prepare a suitable upgrade script, which we do not have at all ATM (though this might make it easier to write one).

Best

Adrian
Piotr Dymacz Jan. 29, 2020, 2:27 p.m. UTC | #16
Hi Peter,

On 27.01.2020 22:05, Peter Geis wrote:
> On Mon, Jan 27, 2020 at 4:00 PM Piotr Dymacz <pepe2k@gmail.com> wrote:

[...]

>> But it's not a general rule and, at least in case of generic/basic
>> settings, user shouldn't be worried about upgrading between major
>> versions with preserving settings. Otherwise, the whole idea doesn't
>> make much sense and we should just prevent settings backup during major
>> versions switch.
> 
> Excellent point!
> Here's an odd possibility, just to throw at the wall.
> What about building a special transition image, specifically for
> moving from ar71xx to ath79.
> If you want to retain the ability to return to ar71xx have an image to
> go the opposite way.

Simply, no.
Please don't make it way more complicated than it truly is.

> Or a metadata package to do the conversion post flash.

My goal is to solve the problem without re-inventing the wheel. I prefer 
to solve the problem with what we already have.

> Because the option (which seems pretty drastic, unless the size could
> be minimized) of having a near permanent conversion script built into
> the firmware seems rather costly.

Conversion scripts would be my last solution. I prefer to deal with the 
problem without too much overhead.
Piotr Dymacz Jan. 29, 2020, 3:07 p.m. UTC | #17
Hi Adrian,

On 28.01.2020 16:48, Adrian Schmutzler wrote:
> Hi Piotr,
> 
>> -----Original Message-----
>> From: openwrt-devel [mailto:openwrt-devel-bounces@lists.openwrt.org] On
>> Behalf Of Piotr Dymacz
>> Sent: Montag, 27. Januar 2020 21:45
>> To: Adrian Schmutzler <mail@adrianschmutzler.de>
>> Cc: openwrt-devel@lists.openwrt.org; gch981213@gmail.com;
>> ansuelsmth@gmail.com; 'David Bauer' <mail@david-bauer.net>
>> Subject: Re: [OpenWrt-Devel] Migration in ath79 for swapped ethernet
>> 
>> Hi Adrian,
>> 
>> On 27.01.2020 19:35, Adrian Schmutzler wrote:
>> > Just a quick one:
>> >
>> >> > So, no matter what we do, there is no easy way forward.
>> >>
>> >> We could remove all ar71xx -> ath79 migration helper scripts, ar71xx
>> >> board names from supported devices lists in ath79 images and make the
>> >> target a brand new, without any concerns about soon-to-be obsolete ar71xx ;)
>> >
>> > At the moment, I'm actually mostly inclined towards this solution.
>> 
>> I'm afraid it's a bit late for that as 19.07 is already out and it
>> supports (at least partially) ar71xx -> ath79 migration path/s.
>> Wouldn't that look unprofessional? Am I overreacting here?
> 
> One didn't have to use -F during sysupgrade, but the release notes gave the
> clear advice to upgrade without keeping settings.
> So, IMO we actually didn't "support" any migration in 19.07.0.

Fair point.

>> > However, for me personally SUPPORTED_DEVICES was always more a "don't
>> > brick entirely" flag, so I never expected it to ensure 100 % config
>> > compatibility. More like preventing me from flashing ubnt,unifi image onto
>> > tplink,wdr-4300-v1. This impression might have been wrong, though.
>> 
>> I think device to image matching was the main reason behind the idea.
>> IIRC, mismatched image doesn't prevent you against upgrading with
>> preserved settings.
>> 
>> > But as mentioned by Ansuel, there are other incompatible switches to come
>> > (and some are already waiting), and unless we want to create new targets or
>> > rename devices in these cases, we have to think about different "levels" of
>> > compatibility anyway beyond ar71xx->ath79.
>> 
>> I believe ar71xx -> ath79 is a special case here. First of all, that's a
>> new DTS-enabled target and it was suppose to _replace_ ar71xx but 19.07
>> went out with both of them and I'm pretty sure there are users who got
>> confused with that (some devices are supported only in one of the
>> targets, some in both, some with seamless migration possible). On the
>> other hand, when ar71xx gets abandoned, we (as a project) should make it
>> clear if ath79 is a replacement (thus providing seamless upgrade from
>> ar71xx) or a new target, without any relationship with ar71xx (thus a
>> clean sysupgrade is required). Keeping anything in between would just
>> confuse people.
> 
> I do not really see a viable/desirable option to support eth migration at the
> moment. And I'm not really a fan of adding lots of migration stuff which spoils
> the new ath79 target already, so after all I think I also do not _want_ to add
> eth migration either.

I wouldn't want to introduce _now_ any extra 'migration' scripts, code, 
etc. All I really want is to keep mapping of physical interface to 
kernel interface name consistent with previous releases as this is the 
most important thing when you perform an upgrade on e.g. remote-only
accessible devices (or those on a mast connected with a single eth cable).
And it's not only about 19.07 and 18.06. There are devices in ar71xx 
which got supported before the LEDE 17.01 release and I'm working on 
keeping them alive within ath79 target.

> So, I'd prefer to see the ath79 new and clean.
> 
> However, from the wording perspective, I do not think that a "replacement" means
> seamless upgrade. I'd thus keep the SUPPORTED_DEVICES just as a device-matching
> measure, but wouldn't implement any sophisticated migration despite that. Having
> SUPPORTED_DEVICES might actually be valuable for certain third parties, like I'm
> involved in a downstream project that regenerates the system/network config at
> each upgrade, but still exploits SUPPORTED_DEVICES for having the correct image.

If you prefer 'new and clean' ath79 then _IMHO_ ar71xx board names must 
be removed from SUPPORTED_DEVICES lists together with migration scripts 
in ath79. If downstream projects want to keep that 'mess', it would be 
up to them. It should be clear that the ath79 target isn't associated 
with ar71xx. IMHO, anything in between would be only an unmaintainable 
mess (just see your recent fixes regarding SUPPORTED_DEVICES in ath79: 
[0], [1]).

> And I could well live with keeping the already present migration scripts, as
> having them as "undocumented feature" won't hurt. If we do not advertise it, it
> won't confuse people ...

This smells to me like something easy to forget, which would then get
removed a few years later during some gardening performed by a newcomer
without any knowledge about the very-long-time-ago-dropped ar71xx :)

[0] https://git.openwrt.org/47935940d67147e3ec8dbfcb56ae14f1235369c5
[1] https://git.openwrt.org/da5b5ae9b9647f50853bff96309d1435cddcefce
Piotr Dymacz Jan. 29, 2020, 3:18 p.m. UTC | #18
Hi Adrian,

On 28.01.2020 16:59, Adrian Schmutzler wrote:
> Hi,
> 
>> There is easy way to check GMACx <> ethX assignment order in mach-*.c
>> files. Just check order of ath79_register_eth() calls:
>> 
>> ath79_register_eth(0);
>> ath79_register_eth(1);
>> 
>> Will register GMAC0 as eth0, GMAC1 as eth1
>> 
>> ath79_register_eth(1);
>> ath79_register_eth(0);
>> 
>> Will register GMAC1 as eth0, GMAC0 as eth1 (current ath79 "order")
> 
> I thought that once as well, but then found several cases where I couldn't rely
> on it for eth0/eth1 order on running device. (But it's too long ago for me to be
> more specific.)

There are two things here:
1. Mapping of physical ports to kernel interface naming.
2. Mapping of kernel interfaces to 'our' LAN/WAN system interfaces.

AFAIK, there wasn't any official or general rule about what should be 
LAN and WAN (in terms of mapping ethX to 'our' LAN/WAN), so it was 
always up to the device support author (personally I preferred to have 
LAN on eth0). And as you can see above, ar71xx allowed two different 
orders in which ethX interfaces were registered in the kernel.

> Despite, from what I understand our current problem is the driver implementation
> in ath79, which needs to skip/delay initialization in certain cases.
> So, it's not so much about finding out the situation on ar71xx, but
> understanding the situation in ath79 as well. All-in-all, I think this will come
> down to having to check each device manually.

More or less, yes.

>> >> I have a feeling that the idea with migration script got abandoned
>> >> (Adrian?), so I was wondering if there is any other way we could
>> >> preserve ar71xx LAN/WAN <> ethX assignment in ath79?
>> >
>> > See above, yes, I effectively abandoned that.
>> 
>> Got it, so alternative solution is required.
>> 
>> >> For example, I have a QCA9531 based device with PHY4 (connected directly
>> >> to GMAC0) labeled as LAN (and registered as eth0 in kernel) and PHY3
>> >> (connected to GMAC1 over internal switch) labeled as WAN. On ath79, due
>> >> to change introduced in 8dde11d521, LAN and WAN order gets swapped (as
>> >> expected) but partially reverting above change (adding back "simple-mfd"
>> >> to eth1 in device's DTS, see below) brings back the "old" order of
>> >> interfaces.
>> >>
>> >> &eth1 {
>> >> 	compatible = "qca,ar9330-eth", "syscon", "simple-mfd";
>> >> 	mtd-mac-address = <&art 0x6>;
>> >> };
>> >>
>> >> But it doesn't seem as a proper fix to me (maybe I'm wrong?) thus the
>> >> question about any other, better approach?
>> >
>> > That's how I feel. For me, this always looked like a hack to me (based on my
>> > shallow level of understanding, though).
>> > There might be special cases where this is necessary (e.g. force a device to
>> > be eth0 due to failsafe), but I still do not like it.
>> 
>> I was considering also aliases in DTSes.
> 
> One could use that for failsafe (actually quite an interesting idea) and for
> specifying the corresponding ethX in ar71xx. However, this still won't help us
> with the migration script itself.

Let's forget about migration scripts now and try to find a way to keep
the old ar71xx interface mapping the same in ath79, using DTS only.
As I wrote already, it's just semantics.
Chuanhong Guo Jan. 30, 2020, 5:19 a.m. UTC | #19
Hi!
Sorry for the late reply.

On Wed, Jan 29, 2020 at 11:18 PM Piotr Dymacz <pepe2k@gmail.com> wrote:
> AFAIK, there wasn't any official or general rule about what should be
> LAN and WAN (in terms of mapping ethX to 'our' LAN/WAN), so it was
> always up to the device support author (personally I preferred to have
> LAN on eth0). And as you can see above, ar71xx allowed two different
> orders in which ethX interfaces where registered in kernel.
>
> > Despite, from what I understand our current problem is the driver implementation
> > in ath79, which needs to skip/delay initialization in certain cases.
> > So, it's not so much about finding out the situation on ar71xx, but
> > understanding the situation in ath79 as well. All-in-all, I think this will come
> > down to having to check each device manually.
>
> More or less, yes.

On ath79 chips with a built-in switch, the PHY used by gmac0 is wired to the
MDIO bus on gmac1, so we need to get mdio1 ready before we can probe gmac0.

ar71xx does peripheral resets in mach-xxx.c code and has mdio/gmac in separate
drivers. This inspired me to make a commit:
83d2dbc599 ath79: ag71xx: Split mdio driver into an independent platform device.

But it was later discovered that even though Atheros provided separate
mdio/gmac resets, we still have to assert both resets together, or the gmac
register values won't be restored to their defaults. Some devices got broken
ethernet because of this.

In order to do this reset sequence in ath79, mdio and gmac can't be treated as
two separate drivers, so gmac1 must be probed before gmac0 to have mdio1
ready.

Since probing order is fixed in ath79, I think the only way to specify interface
names is to add a device tree property and fill the corresponding name field in
netdev structure. This has no chance to be accepted upstream and can only
be our local hack forever.

Regards,
Chuanhong Guo
diff mbox series

Patch

diff --git a/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration b/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration
new file mode 100644
index 0000000000..d6b519d25a
--- /dev/null
+++ b/target/linux/ath79/base-files/etc/uci-defaults/05_eth_migration
@@ -0,0 +1,28 @@ 
+#!/bin/sh
+
+rename_all_eth() {
+       local before="$1"
+       local after="$2"
+
+       sed -i "s/$before/$after/" /etc/board.json
+       for e in $(ls /etc/config/* 2>/dev/null); do
+               sed -i "s/$before/$after/" "$e"
+       done
+       for e in $(ls /etc/sysctl.d/* 2>/dev/null); do
+               sed -i "s/$before/$after/" "$e"
+       done
+}
+
+case $(board_name) in
+glinet,gl-ar150|\
+tplink,archer-c58-v1|\
+tplink,archer-c59-v1|\
+tplink,archer-c60-v1|\
+tplink,archer-c60-v2)
+       rename_all_eth "eth0" "ethX"
+       rename_all_eth "eth1" "eth0"
+       rename_all_eth "ethX" "eth1"
+       ;;
+esac
+
+exit 0
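
For reference, the one-step rename mentioned in the commit message could look roughly like the sketch below. It is not part of the patch: swap_eth is a hypothetical helper, the "ethX" placeholder is assumed not to occur in the file already, and names such as eth10 would be mangled, so this only fits devices with exactly eth0 and eth1.

```shell
#!/bin/sh
# Hypothetical helper: swap eth0 and eth1 in a file with a single sed
# call, going through a temporary placeholder. The 'g' flag covers
# lines with several matches, e.g. 'eth0.1 eth0.2'.
swap_eth() {
	sed -i -e 's/eth0/ethX/g' -e 's/eth1/eth0/g' -e 's/ethX/eth1/g' "$1"
}
```

Compared with three separate rename passes, this rewrites each file once instead of three times.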